
Bridgy

Bridgy connects your web site to social media. Likes, reposts, mentions, cross-posting, and more. See the user docs for more details, or the developer docs if you want to contribute.

https://brid.gy/

Bridgy is part of the IndieWeb ecosystem. In IndieWeb terminology, Bridgy offers backfeed, POSSE, and webmention support as a service.

License: This project is placed in the public domain.

Development

Pull requests are welcome! Feel free to ping me in #indieweb-dev with any questions.

First, fork and clone this repo. Then, install the Google Cloud SDK and run gcloud components install beta cloud-datastore-emulator to install the datastore emulator. Once you have them, set up your environment by running these commands in the repo root directory:

gcloud config set project brid-gy
python3 -m venv local
source local/bin/activate
pip install -r requirements.txt
# needed to serve static files locally
ln -s local/lib/python3*/site-packages/oauth_dropins/static oauth_dropins_static

Now, you can fire up the gcloud emulator and run the tests:

gcloud beta emulators datastore start --use-firestore-in-datastore-mode --no-store-on-disk --host-port=localhost:8089 --quiet < /dev/null >& /dev/null &
python3 -m unittest discover -s tests -t .
kill %1

If you send a pull request, please include or update a test for your new code!

To run the app locally, use flask run:

gcloud beta emulators datastore start --use-firestore-in-datastore-mode --no-store-on-disk --host-port=localhost:8089 --quiet < /dev/null >& /dev/null &
GAE_ENV=localdev FLASK_ENV=development flask run -p 8080

Open localhost:8080 and you should see the Bridgy home page!

To test a poll or propagate task, find the relevant Would add task line in the logs, eg:

INFO:root:Would add task: projects//locations/us-central1/queues/poll {'app_engine_http_request': {'http_method': 'POST', 'relative_uri': '/_ah/queue/poll', 'app_engine_routing': {'service': 'background'}, 'body': b'source_key=agNhcHByFgsSB1R3aXR0ZXIiCXNjaG5hcmZlZAw&last_polled=1970-01-01-00-00-00', 'headers': {'Content-Type': 'application/x-www-form-urlencoded'}}, 'schedule_time': seconds: 1591176072

...pull out the relative_uri and body, and then put them together in a curl command against localhost:8080 (but don't run it yet!), eg:

curl -d 'source_key=agNhcHByFgsSB1R3aXR0ZXIiCXNjaG5hcmZlZAw&last_polled=1970-01-01-00-00-00' \
  http://localhost:8080/_ah/queue/poll

Then, restart the app with FLASK_APP=background to run the background task processing service, eg:

gcloud beta emulators datastore start --consistency=1.0 --host-port=localhost:8089 --quiet
GAE_ENV=localdev FLASK_APP=background FLASK_ENV=development flask run -p 8080

Now, run the curl command you constructed above.
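
The extraction step above can also be scripted. Here's a minimal sketch, assuming the log line has the same shape as the example above. curl_for_task and its host parameter are hypothetical names for illustration, not part of Bridgy.

```python
import re
import shlex


def curl_for_task(log_line, host='http://localhost:8080'):
    """Builds a curl command from a 'Would add task' log line.

    Assumes relative_uri and body appear as quoted values, as in the
    example log line above.
    """
    uri = re.search(r"'relative_uri': '([^']+)'", log_line)
    body = re.search(r"'body': b'([^']*)'", log_line)
    if not uri or not body:
        raise ValueError('no relative_uri or body in log line')
    # shlex.quote keeps the shell from interpreting the & in the form body
    return f'curl -d {shlex.quote(body.group(1))} {host}{uri.group(1)}'


line = ("INFO:root:Would add task: projects//locations/us-central1/queues/poll "
        "{'app_engine_http_request': {'http_method': 'POST', "
        "'relative_uri': '/_ah/queue/poll', "
        "'body': b'source_key=abc&last_polled=1970-01-01-00-00-00'}}")
print(curl_for_task(line))
```

As with the hand-built command, don't run the result until the background service is up.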

If you hit an error during setup, check out the oauth-dropins Troubleshooting/FAQ section. For searchability, here are a handful of error messages that have solutions there:

bash: ./bin/easy_install: ...bad interpreter: No such file or directory

ImportError: cannot import name certs

ImportError: cannot import name tweepy

File ".../site-packages/tweepy/auth.py", line 68, in _get_request_token
  raise TweepError(e)
TweepError: must be _socket.socket, not socket

error: option --home not recognized

There's a good chance you'll need to make changes to granary or oauth-dropins at the same time as bridgy. To do that, clone their repos elsewhere, then install them in "source" mode with:

pip uninstall -y oauth-dropins
pip install -e <path-to-oauth-dropins-repo>
ln -sf <path-to-oauth-dropins-repo>/oauth_dropins/static oauth_dropins_static

pip uninstall -y granary
pip install -e <path to granary>

To deploy to App Engine, run scripts/deploy.sh.

remote_api_shell is a useful interactive Python shell that can interact with the production app's datastore, memcache, etc. To use it, create a service account, download its JSON credentials file, put the file somewhere safe, and put its path in your GOOGLE_APPLICATION_CREDENTIALS environment variable.
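
If the shell can't authenticate, a quick sanity check of the credentials setup can help. This is a minimal sketch using only the standard library. credentials_path is a hypothetical helper, not part of Bridgy or remote_api_shell, and it only checks that the variable points at a service account key file.

```python
import json
import os


def credentials_path(environ=os.environ):
    """Returns the service account JSON path, or raises if it looks wrong."""
    path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
    if not path:
        raise RuntimeError('GOOGLE_APPLICATION_CREDENTIALS is not set')
    with open(path) as f:
        creds = json.load(f)
    # Service account key files downloaded from Google Cloud have this type.
    if creds.get('type') != 'service_account':
        raise RuntimeError(f'{path} is not a service account key file')
    return path
```

Call credentials_path() in the same shell session where you plan to start remote_api_shell.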

Deploying to your own App Engine project can be useful for testing, but is not recommended for production. To do it, create a project in the Google Cloud console and activate the Tasks API. Initialize the project on the command line using gcloud config set project <project-name> followed by gcloud app create. You will need to update TASKS_LOCATION in util.py to match your project's location. You will also need to add your "background" domain (eg background.YOUR-APP-NAME.appspot.com) to OTHER_DOMAINS in util.py and set host_url in tasks.py to your base app URL (eg app-dot-YOUR-APP-NAME.wn.r.appspot.com). Finally, deploy (after testing) with gcloud -q beta app deploy --no-cache --project YOUR-APP-NAME *.yaml

To work on the browser extension:

cd browser-extension
npm install
npm run test

To run just one test:

npm run test -- -t 'part of test name'

Browser extension: logs in the JavaScript console

If you're working on the browser extension, or you're sending in a bug report for it, its JavaScript console logs are invaluable for debugging. Here's how to get them in Firefox:

  1. Open about:debugging
  2. Click This Firefox on the left
  3. Scroll down to Bridgy
  4. Click Inspect
  5. Click on the Console tab

Here's how to send them in with a bug report:

  1. Right click, Export Visible Messages To, File, save the file.
  2. Email the file to bridgy @ ryanb.org. Do not post or attach it to a GitHub issue, or anywhere else public, because it contains sensitive tokens and cookies.

Browser extension: release

Here's how to cut a new release of the browser extension and publish it to addons.mozilla.org:

  1. ln -fs manifest.firefox.json manifest.json
  2. Load the extension in Firefox (about:debugging). Check that it works.
  3. Bump the version in browser-extension/manifest.json.
  4. Update the Changelog in the README.md section below this one.
  5. Build and sign the artifact:
    cd browser-extension/
    npm test
    ./node_modules/web-ext/bin/web-ext build
  6. Submit it to AMO.
    # get API secret from Ryan if you don't have it
    ./node_modules/web-ext/bin/web-ext sign --api-key user:14645521:476 --api-secret ...
    
    # If this succeeds, it will say:
    ...
    Your add-on has been submitted for review. It passed validation but could not be automatically signed because this is a listed add-on.
    FAIL
    ...
    It's usually auto-approved within minutes. Check the public listing here.

Here's how to publish it to the Chrome Web Store:

  1. ln -fs manifest.chrome.json manifest.json
  2. Load the extension in Chrome (chrome://extensions/, Developer mode on). Check that it works.
  3. Build and sign the artifact:
    cd browser-extension/
    npm test
    ./node_modules/web-ext/bin/web-ext build
  4. Open the console.
  5. Open the Bridgy item.
  6. Choose Package on the left.
  7. Click the Upload new package button.
  8. Upload the new version's zip file from browser-extension/web-ext-artifacts/.
  9. Update the Changelog in the Description box. Leave the rest unchanged.
  10. Click Save draft, then Submit for review.

Browser extension: Changelog

0.6.1, 2022-09-18

  • Don't open silo login pages if they're not logged in. This ran at extension startup time, which was mostly harmless in manifest v2 since the background page was persistent and stayed loaded, but in manifest v3 it's a service worker or non-persistent background page, which gets unloaded and then reloaded every 5m.

0.6.0, 2022-09-17

0.5, 2022-07-21

  • Update Instagram scraping.

0.4, 2022-01-30

  • Fix Instagram comments. Add extra client side API fetch, forward to new Bridgy endpoint.
  • Expand error messages in options UI.

0.3.5, 2021-03-04

  • Dynamically adjust polling frequency per silo based on how often we're seeing new comments and reactions, how recent the last successful webmention was, etc.

0.3.4, 2021-02-22

  • Allow individually enabling or disabling Instagram and Facebook.

0.3.3, 2021-02-20

  • Only override requests from the browser extension, not all requests to the silos' domains.

0.3.2, 2021-02-18

  • Fix compatibility with Facebook Container Tabs.

0.3.1, 2021-02-17

  • Add Facebook support!

0.2.1, 2021-01-09

  • Add more details to extensions option page: Instagram login, Bridgy IndieAuth registration, etc.
  • Support Firefox's Facebook Container Tabs addon.

0.2, 2021-01-03

  • Add IndieAuth login on https://brid.gy/ and token handling.
  • Add extension settings page with status info and buttons to login again and poll now.
  • Better error handling.

0.1.5, 2020-12-25

  • Initial beta release!

Adding a new silo

So you want to add a new silo? Maybe MySpace, or Friendster, or even Tinder? Great! Here are the steps to do it. It looks like a lot, but it's not that bad, honest.

  1. Find the silo's API docs and check that it can do what Bridgy needs. At minimum, it should be able to get a user's posts and their comments, likes, and reposts, depending on which of those the silo supports. If you want publish support, it should also be able to create posts, comments, likes, reposts, and/or RSVPs.
  2. Fork and clone this repo.
  3. Create an app (aka client) in the silo's developer console, grab your app's id (aka key) and secret, put them into new local files in the repo root dir, following this pattern. You'll eventually want to send them to @snarfed too, but no hurry.
  4. Add the silo to oauth-dropins if it's not already there:
    1. Add a new .py file for your silo with an auth model and handler classes. Follow the existing examples.
    2. Add a 100 pixel tall button image named [NAME]_2x.png, where [NAME] is your start handler class's NAME constant, eg 'twitter'.
    3. Add it to the app front page and the README.
  5. Add the silo to granary:
    1. Add a new .py file for your silo. Follow the existing examples. At minimum, you'll need to implement get_activities_response and convert your silo's API data to ActivityStreams.
    2. Add a new unit test file and write some tests!
    3. Add it to api.py (specifically Handler.get), app.py, index.html, and the README.
  6. Add the silo to Bridgy:
    1. Add a new .py file for your silo with a model class. Follow the existing examples.
    2. Add it to app.py and handlers.py (just import the module).
    3. Add a 48x48 PNG icon to static/.
    4. Add a new [SILO]_user.html file in templates/ and add the silo to index.html. Follow the existing examples.
    5. Add the silo to about.html and this README.
    6. If users' profile picture URLs can change, add a cron job that updates them to cron.py.
  7. Optionally add publish support:
    1. Implement create and preview_create for the silo in granary.
    2. Add the silo to publish.py: import its module, add it to SOURCES, and update this error message.
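
To give a feel for the conversion work in the granary step above, here's a minimal sketch of turning one silo API post into an ActivityStreams 1 activity. The input field names (permalink, text, created_at, user) are hypothetical stand-ins for whatever your silo's API returns, and post_to_activity is not a granary function. The real granary source classes do more, so follow the existing examples there.

```python
def post_to_activity(post):
    """Converts a hypothetical silo API post dict to an ActivityStreams 1 activity."""
    obj = {
        'objectType': 'note',
        'id': post['id'],
        'url': post['permalink'],
        'content': post['text'],
        'published': post['created_at'],
        'author': {
            'objectType': 'person',
            'displayName': post['user']['name'],
            'url': post['user']['profile_url'],
        },
    }
    # An activity wraps the object with a verb; 'post' means the user created it.
    return {'verb': 'post', 'published': obj['published'], 'object': obj}


example = {
    'id': '123',
    'permalink': 'https://silo.example/alice/123',
    'text': 'Hello world',
    'created_at': '2022-09-18T12:00:00+00:00',
    'user': {'name': 'Alice', 'profile_url': 'https://silo.example/alice'},
}
activity = post_to_activity(example)
```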

Good luck, and happy hacking!

Monitoring

App Engine's built in dashboard and log browser are pretty good for interactive monitoring and debugging.

For alerting, we've set up Google Cloud Monitoring (née Stackdriver). Background in issue 377. It sends alerts by email and SMS when HTTP 4xx responses average >.1qps or 5xx >.05qps, latency averages >15s, or instance count averages >5 over the last 15m window.
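
To make the thresholds concrete, here's a small sketch of the arithmetic. It is only an illustration of the numbers above, not the actual Cloud Monitoring configuration, and the function and argument names are made up for this example.

```python
WINDOW_SECS = 15 * 60  # alerts are evaluated over the last 15m window


def alerts(count_4xx, count_5xx, avg_latency_secs, avg_instances):
    """Returns the names of the alert conditions that a window's numbers trip."""
    tripped = []
    if count_4xx / WINDOW_SECS > 0.1:   # 4xx responses, average qps
        tripped.append('4xx')
    if count_5xx / WINDOW_SECS > 0.05:  # 5xx responses, average qps
        tripped.append('5xx')
    if avg_latency_secs > 15:
        tripped.append('latency')
    if avg_instances > 5:
        tripped.append('instances')
    return tripped
```

For example, .1qps over 15m works out to 90 responses, so a window with 91 4xx responses trips the first condition and a window with 80 does not.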

Stats

I occasionally generate stats and graphs of usage and growth from the BigQuery dataset (#715). Here's how.

  1. Export the full datastore to Google Cloud Storage. Include all entities except *Auth, Domain and others with credentials or internal details. Check to see if any new kinds have been added since the last time this command was run.

    gcloud datastore export --async gs://brid-gy.appspot.com/stats/ --kinds Activity,Blogger,BlogPost,BlogWebmention,Facebook,FacebookPage,Flickr,GitHub,GooglePlusPage,Instagram,Mastodon,Medium,Meetup,Publish,PublishedPage,Reddit,Response,SyndicatedPost,Tumblr,Twitter,WordPress
    

    Note that --kinds is required. From the export docs: "Data exported without specifying an entity filter cannot be loaded into BigQuery." Also, expect this to cost around $10.

  2. Wait for it to be done with gcloud datastore operations list | grep done or by watching the Datastore Import/Export page.

  3. Import it into BigQuery:

    for kind in Activity BlogPost BlogWebmention Publish SyndicatedPost; do
      bq load --replace --nosync --source_format=DATASTORE_BACKUP datastore.$kind gs://brid-gy.appspot.com/stats/all_namespaces/kind_$kind/all_namespaces_kind_$kind.export_metadata
    done
    
    for kind in Blogger Facebook FacebookPage Flickr GitHub GooglePlusPage Instagram Mastodon Medium Meetup Reddit Tumblr Twitter WordPress; do
      bq load --replace --nosync --source_format=DATASTORE_BACKUP sources.$kind gs://brid-gy.appspot.com/stats/all_namespaces/kind_$kind/all_namespaces_kind_$kind.export_metadata
    done
    

Open the Datastore entities page for the Response kind, sorted by updated ascending, and check out the first few rows: https://console.cloud.google.com/datastore/entities;kind=Response;ns=__$DEFAULT$__;sortCol=updated;sortDir=ASCENDING/query/kind?project=brid-gy

Open the existing Response table in BigQuery: https://console.cloud.google.com/bigquery?project=brid-gy&ws=%211m10%211m4%214m3%211sbrid-gy%212sdatastore%213sResponse%211m4%211m3%211sbrid-gy%212sbquxjob_371f97c8_18131ff6e69%213sUS

Update the year in the queries below to two years before today. Query for the same first few rows sorted by updated ascending, check that they're the same:

SELECT * FROM `brid-gy.datastore.Response`
WHERE updated >= TIMESTAMP('202X-11-01T00:00:00Z')
ORDER BY updated ASC
LIMIT 10

Delete those rows:

DELETE FROM `brid-gy.datastore.Response`
WHERE updated >= TIMESTAMP('202X-11-01T00:00:00Z')

Load the new Response entities into a temporary table:

bq load --replace=false --nosync --source_format=DATASTORE_BACKUP datastore.Response-new gs://brid-gy.appspot.com/stats/all_namespaces/kind_Response/all_namespaces_kind_Response.export_metadata

Append that table to the existing Response table:

SELECT
leased_until,
original_posts,
type,
updated,
error,
sent,
skipped,
unsent,
created,
source,
status,
failed,

ARRAY(
  SELECT STRUCT<`string` string, text string, provided string>(a, null, 'string')
  FROM UNNEST(activities_json) as a
 ) AS activities_json,

IF(urls_to_activity IS NULL, NULL,
   STRUCT<`string` string, text string, provided string>
     (urls_to_activity, null, 'string')) AS urls_to_activity,

IF(response_json IS NULL, NULL,
   STRUCT<`string` string, text string, provided string>
     (response_json, null, 'string')) AS response_json,

ARRAY(
  SELECT STRUCT<`string` string, text string, provided string>(x, null, 'string')
  FROM UNNEST(old_response_jsons) as x
) AS old_response_jsons,

__key__,
__error__,
__has_error__

FROM `brid-gy.datastore.Response-new`

More => Query settings, Set a destination table for query results, dataset brid-gy.datastore, table Response, Append, check Allow large results, Save, Run.

Open sources.Facebook, edit schema, add a url field, string, nullable.

  4. Check the jobs with bq ls -j, then wait for them with bq wait.
  5. Run the full stats BigQuery query. Download the results as CSV.
  6. Open the stats spreadsheet. Import the CSV, replacing the data sheet.
  7. Change the underscores in column headings to spaces.
  8. Open each sheet, edit the chart, and extend the data range to include all of the new rows.
  9. Check out the graphs! Save full size images with OS or browser screenshots, thumbnails with the Download Chart button. Then post them!

Final cleanup: delete the temporary Response-new table.
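
When new kinds are added, the export list in step 1 and the load loops in step 3 can drift apart. Here's a minimal sketch that generates the bq load commands from one mapping and reports exported kinds that neither loop covers. The kind lists are copied from the commands above; the function names are hypothetical.

```python
BUCKET = 'gs://brid-gy.appspot.com/stats'

# Kinds passed to gcloud datastore export --kinds in step 1.
EXPORTED = ('Activity,Blogger,BlogPost,BlogWebmention,Facebook,FacebookPage,'
            'Flickr,GitHub,GooglePlusPage,Instagram,Mastodon,Medium,Meetup,'
            'Publish,PublishedPage,Reddit,Response,SyndicatedPost,Tumblr,'
            'Twitter,WordPress').split(',')

# BigQuery dataset => kinds, copied from the two loops in step 3.
DATASETS = {
    'datastore': ['Activity', 'BlogPost', 'BlogWebmention', 'Publish',
                  'SyndicatedPost'],
    'sources': ['Blogger', 'Facebook', 'FacebookPage', 'Flickr', 'GitHub',
                'GooglePlusPage', 'Instagram', 'Mastodon', 'Medium', 'Meetup',
                'Reddit', 'Tumblr', 'Twitter', 'WordPress'],
}


def bq_load_commands(datasets=DATASETS, bucket=BUCKET):
    """Returns one bq load command per kind, in the same form as step 3."""
    cmds = []
    for dataset, kinds in datasets.items():
        for kind in kinds:
            metadata = (f'{bucket}/all_namespaces/kind_{kind}/'
                        f'all_namespaces_kind_{kind}.export_metadata')
            cmds.append('bq load --replace --nosync '
                        f'--source_format=DATASTORE_BACKUP {dataset}.{kind} {metadata}')
    return cmds


def not_bulk_loaded(exported=EXPORTED, datasets=DATASETS):
    """Returns exported kinds that neither load loop covers."""
    loaded = {kind for kinds in datasets.values() for kind in kinds}
    return sorted(set(exported) - loaded)


print('\n'.join(bq_load_commands()))
print(not_bulk_loaded())
```

With the lists above, not_bulk_loaded() returns PublishedPage and Response. Response is loaded separately through the temporary Response-new table, as described above.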

Delete old responses

Bridgy's online datastore only keeps responses for a year or two. I garbage collect (ie delete) older responses manually, generally just once a year when I generate statistics (above). All historical responses are kept in BigQuery for long term storage.

I use the Datastore Bulk Delete Dataflow template with a GQL query like this. (Update the years below to two years before today.)

SELECT * FROM Response WHERE updated < DATETIME('202X-11-01T00:00:00Z')
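
To avoid editing the year by hand, the query can be generated. This is a minimal sketch, assuming the cutoff is always November 1 of the year two years back, as in the example above. delete_query is a hypothetical helper, not part of Bridgy.

```python
from datetime import date


def delete_query(today=None, years_back=2):
    """Builds the GQL query for responses last updated before the cutoff."""
    today = today or date.today()
    cutoff = f'{today.year - years_back}-11-01T00:00:00Z'
    return f"SELECT * FROM Response WHERE updated < DATETIME('{cutoff}')"


print(delete_query())
```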

I either use the interactive web UI or this command line:

gcloud dataflow jobs run 'Delete Response datastore entities over 1y old' \
  --gcs-location gs://dataflow-templates-us-central1/latest/Datastore_to_Datastore_Delete \
  --region us-central1 \
  --staging-location gs://brid-gy.appspot.com/tmp-datastore-delete \
  --parameters datastoreReadGqlQuery="SELECT * FROM \`Response\` WHERE updated < DATETIME('202X-11-01T00:00:00Z'),datastoreReadProjectId=brid-gy,datastoreDeleteProjectId=brid-gy"

Expect this to take at least a day or so.

Once it's done, update the stats constants in admin.py.

Misc

The datastore is exported to BigQuery (#715) twice a year.

We use this command to set a Cloud Storage lifecycle policy on our buckets to prune older backups and other files:

gsutil lifecycle set cloud_storage_lifecycle.json gs://brid-gy.appspot.com
gsutil lifecycle set cloud_storage_lifecycle.json gs://brid-gy_cloudbuild
gsutil lifecycle set cloud_storage_lifecycle.json gs://staging.brid-gy.appspot.com
gsutil lifecycle set cloud_storage_lifecycle.json gs://us.artifacts.brid-gy.appspot.com

See how much space we're currently using in this dashboard. Run this to download a single complete backup:

gsutil -m cp -r gs://brid-gy.appspot.com/weekly/datastore_backup_full_YYYY_MM_DD_\* .
