• This repository has been archived on 21/Sep/2021
  • Stars: 159
  • Rank 234,661 (Top 5 %)
  • Language: JavaScript
  • License: MIT License
  • Created over 11 years ago
  • Updated almost 8 years ago

Repository Details

Home-made Google Reader replacement powered by Node.js and Cloudant

BirdReader

Introduction

In March 2013, Google announced that Google Reader was to be closed. I used Google Reader every day, so I set out to find a replacement. I started with other online offerings, but then I thought "I could build one". So I created BirdReader, which I have released to the world in its unpolished "alpha" state.

BirdReader is designed to be installed on your own webserver or laptop running Node.js, e.g.

  • on an old PC
  • on a cloud server e.g. AWS Micro server (free!)
  • on a Raspberry Pi

desktop screenshot

Features

  • import your old Google Reader subscriptions
  • fetches RSS every 5 minutes
  • web-based aggregated newsfeed
    • mark articles as read
    • delete articles without reading
    • 'star' articles
    • add a new feed
    • sorted in newest-first order
    • bootstrap-based, responsive layout
    • tagging/untagging of feeds
    • Twitter/Facebook sharing
    • basic HTTP authentication (optional)
    • filter read/unread/starred streams by tag
    • filter read/unread/starred streams by feed
    • full-text search (only works when using Cloudant as the CouchDB storage engine)
    • icons for feeds and articles
    • expand all
    • browse-mode - go through unread articles one-by-one, full screen
    • live stats via WebSockets (NEW!)

As of July 2013, the web client also makes a WebSockets connection back to the server, so that when new articles are added to the database, the numbers of read, unread and starred articles can be 'pushed' to the client without the client having to poll. This also offers other advantages:

  • when fetching an article or list of articles, we no longer have to also fetch the article counts, making fetches faster
  • article counts arrive at the client asynchronously
  • article counts are always up to date
  • url scheme has changed to a 'hash-bang' scheme, so that all page updates are via Ajax, to prevent frequent disconnection of the WebSocket and to reduce network traffic
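
The push described above might look roughly like the following sketch. It is not taken from the BirdReader source: it assumes a socket.io server instance is passed in as `io`, and the function name `pushStats`, the event name `stats` and the shape of the `counts` object are all hypothetical.

```javascript
// Hypothetical sketch: push article counts to every connected browser.
// `io` is assumed to be a socket.io server instance; the "stats" event
// name and the shape of `counts` are illustrative assumptions.
function pushStats(io, counts) {
  const payload = {
    read: counts.read || 0,
    unread: counts.unread || 0,
    starred: counts.starred || 0
  };
  // io.sockets.emit broadcasts the event to all connected clients
  io.sockets.emit("stats", payload);
  return payload;
}
```

A browser client would then listen for the same event name and update its counters when a message arrives, rather than requesting the counts with every page fetch.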

N.B. if you have a previous installation of BirdReader, you will have to run 'npm install' to pick up the socket.io package.

How does it work?

BirdReader doesn't store anything locally other than its source code and your configuration. The data is stored in a Cloudant (CouchDB) database in the cloud. You will need to sign up for a free Cloudant account (disclaimer: other hosted CouchDB services are available, and this code should work with any CouchDB server e.g. your own).

Two databases are used:

feeds database

The 'feeds' database stores a document per RSS feed you are subscribed to e.g.

{
    "_id": "f1cf38b2f6ffbbb69e75df476310b3a6",
    "_rev": "8-6ad06e42183368bd696aec8d25eb03a1",
    "text": "The GitHub Blog",
    "title": "The GitHub Blog",
    "type": "rss",
    "xmlUrl": "http://feeds.feedburner.com/github",
    "htmlUrl": "http://pipes.yahoo.com/pipes/pipe.info?_id=13d93fdc3d1fb71d8baa50a1e8b50381",
    "tags": ["OpenSource"],
    "lastModified": "2013-03-14 15:06:03 +00:00",
    "icon": "http://www.bbc.co.uk/favicon.ico"
}

This data is directly imported from the Google Takeout OPML file and crucially stores:

  • the url which contains the feed data (xmlUrl)
  • the last modification date of the newest article on that feed (lastModified)

articles database

The 'articles' database stores a document per article e.g. :

{
    "_id": "3c582426df29863513500a736111fa4e",
    "_rev": "1-b49944fd0edf8f50fc17c6562d75169e",
    "feedName": "BBC Entertainment",
    "tags": ["BBC"],
    "title": "VIDEO: Iran planning to sue over Argo",
    "description": "Best Picture winner Argo has been criticised by the Iranian authorities over its portrayal of the 1979 Iran hostage crisis.",
    "pubDate": "2013-03-15T15:20:31.000Z",
    "link": "http://www.bbc.co.uk/news/entertainment-arts-21805140#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa",
    "pubDateTS": 1363360831,
    "read": false,
    "starred": false,
    "icon": "http://www.bbc.co.uk/favicon.ico"
}

The _id and _rev fields are generated by CouchDB. The feedName and tags come from the feed where the article originated. The rest of the fields come from the RSS article itself, apart from 'read' and 'starred', which we add to record whether an article has been consumed or favourited.
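
As an illustration of how such a document could be assembled, here is a minimal sketch. The helper name `buildArticleDoc` is hypothetical, and it assumes that feedName is copied from the feed's title and that the parsed RSS item exposes `title`, `description`, `link` and `pubDate`; the real code may differ.

```javascript
// Hypothetical helper: combine a feed document and a parsed RSS item
// into an article document shaped like the example above.
// _id and _rev are left out because CouchDB generates them on insert.
function buildArticleDoc(feed, item) {
  const published = new Date(item.pubDate);
  return {
    feedName: feed.title, // assumption: feedName mirrors the feed's title
    tags: feed.tags || [],
    title: item.title,
    description: item.description,
    pubDate: published.toISOString(),
    link: item.link,
    // seconds since the Unix epoch, used for sorting
    pubDateTS: Math.floor(published.getTime() / 1000),
    read: false,
    starred: false,
    icon: feed.icon
  };
}
```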

Views

Cloudant/CouchDB only allows data to be retrieved by its "_id" unless you define a "view". We have one view on the articles database called "byts" that allows us to query our data:

  • unread articles, sorted by timestamp
  • read articles, sorted by timestamp
  • starred articles, sorted by timestamp
  • counts of the number of read/unread/starred articles

The view creates a "map" function which emits keys like

  ["string",12345]

where "string" can be "unread", "read" or "starred", and "12345" is the timestamp of the article.
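
A map function producing keys of this shape could look like the sketch below. It is an assumption rather than the view BirdReader actually ships: in particular it assumes that a starred article is emitted under "starred" in addition to its read/unread key, and that the value 1 is emitted so that a counting reduce can total the rows. In CouchDB, `emit` is provided by the view server; here it is a parameter only so that the sketch can run on its own.

```javascript
// Sketch of a "byts"-style map function. CouchDB supplies emit() when it
// runs a view; it is passed in here only so the sketch can run standalone.
function bytsMap(doc, emit) {
  if (typeof doc.pubDateTS !== "number") {
    return; // ignore documents that are not articles
  }
  // every article is either read or unread
  emit([doc.read ? "read" : "unread", doc.pubDateTS], 1);
  // assumption: starred articles get an extra "starred" row
  if (doc.starred) {
    emit(["starred", doc.pubDateTS], 1);
  }
}
```

Paired with CouchDB's built-in `_count` reduce and queried with `group_level=1`, a view like this would return one count per state.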

Another view "bytag" has a different key:

  ["string","tag",12345]

where "string" can be "unread", "read" or "starred", the "tag" is the user supplied tag and "12345" is the timestamp of the article. This allows us to get unread articles tagged by "BBC" in newest first order, for example.
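
To make the "newest first" query concrete, the sketch below builds view query parameters for one state and tag. The helper name `bytagQuery` is hypothetical; the parameters themselves (`descending`, `startkey`, `endkey`, `limit`, `include_docs`) are standard CouchDB view options.

```javascript
// Hypothetical helper: build CouchDB view query parameters that select
// one state/tag combination from a "bytag"-style view, newest first.
function bytagQuery(state, tag, limit) {
  return {
    descending: true,
    // with descending=true the range runs from the high key to the low key;
    // {} collates after every number, so it stands in for "latest timestamp"
    startkey: [state, tag, {}],
    // the shorter array collates before any [state, tag, timestamp] key
    endkey: [state, tag],
    limit: limit,
    include_docs: true
  };
}
```

For example, `bytagQuery("unread", "BBC", 20)` describes the 20 most recent unread articles tagged "BBC". When the parameters are sent over HTTP, `startkey` and `endkey` need to be JSON-encoded.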

Full-text search

BirdReader supports full-text search of articles by utilising Cloudant's full-text capability. A Lucene index is created to allow the articles' titles and descriptions to be searchable. A simple form on the top bar allows the user to search the collected articles with ease. N.B. if you are using a non-Cloudant backend (e.g. plain CouchDB), then the search facility will not work.
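
For readers unfamiliar with Cloudant Search, an index function covering titles and descriptions might look like the sketch below. This is an illustration, not necessarily the index BirdReader defines. In a Cloudant design document the function takes only `doc` and calls a global `index()`; here `index` is a parameter purely so the sketch can run standalone.

```javascript
// Sketch of a Cloudant Search index function. Cloudant supplies index()
// when it builds the index; it is a parameter here so the sketch runs alone.
function searchIndexer(doc, index) {
  // only index fields that are actually present on the document
  if (doc.title) {
    index("title", doc.title);
  }
  if (doc.description) {
    index("description", doc.description);
  }
}
```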

Scraping articles

Every 5 minutes, BirdReader fetches all the feeds using the feedparser library. Any articles newer than the feed's previous newest article are saved to the articles database.
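
The "newer than the previous newest article" test can be sketched as follows. The function names are hypothetical, and the sketch assumes the feed's lastModified string has the format shown in the feeds database example and that each parsed item carries a `pubDate`.

```javascript
// Convert a feed's lastModified string, e.g. "2013-03-14 15:06:03 +00:00",
// into milliseconds since the Unix epoch by rewriting it as ISO 8601.
function parseLastModified(value) {
  const iso = value.replace(
    /^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) ([+-]\d{2}:\d{2})$/,
    "$1T$2$3"
  );
  return Date.parse(iso);
}

// Keep only the items published after the feed's newest known article.
function selectNewItems(feed, items) {
  const cutoff = parseLastModified(feed.lastModified);
  return items.filter(function (item) {
    return new Date(item.pubDate).getTime() > cutoff;
  });
}
```

After saving the selected items, the feed's lastModified would be moved forward to the newest item's date so that the next fetch skips them.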

Adding new feeds

New feeds can be added by filling in a web form with the url of the page that has an RSS link tag. We use the extractor library to pull back the page, find the title, meta description and link tags and add the data to our feeds database.

What does it look like?

The site is built with Bootstrap so that it provides a decent interface on desktop and mobile browsers.

Mobile Screenshot

mobile browse screenshot

mobile screenshot

Key technologies

  • Node.js - Server side JavaScript
  • Express - Application framework for node
  • feedparser - RSS feed parser
  • Cloudant - Hosted CouchDB
  • async - Control your parallelism in Node.js
  • Bootstrap - Twitter responsive HTML framework
  • sax - XML parser for Node.js
  • extractor - HTML scraper, to find RSS links in HTML pages
  • socket.io - WebSockets library

Installation

You will need Node.js and npm installed on your computer (version 0.8.x or version 0.10.x). Unpack the BirdReader repository and install its dependencies e.g.

  git clone git@github.com:glynnbird/birdreader.git
  cd birdreader
  npm install

N.B. on Mac, you're likely to need development tools (https://developer.apple.com/xcode/) installed.

Copy the sample configuration into place

  cd includes
  cp _config.json config.json

Edit the sample configuration to point to your CouchDB server.

Run BirdReader with

  node birdreader.js

See the website by pointing your browser to port 3000:

  http://localhost:3000/

Importing Google Reader subscriptions

You can export your Google Reader subscriptions using Google Takeout. Download the file, unpack it and locate the subscriptions.xml file.

You can import this into BirdReader with:

  node import_opml.js subscriptions.xml

Securing your BirdReader server

BirdReader allows you to protect your webserver with a username and password by adding an "authentication" section to your includes/config.json:

{
  "cloudant": {
    .
    .
  },
  "authentication": {
    "on": true,
    "username": "myusername",
    "password": "mypassword"
  }
}

Authentication will only be enforced if "authentication.on" is set to true. A restart of BirdReader is required to pick up the config.
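
To show how a check like this could be enforced, here is a minimal sketch of Express-style middleware driven by the "authentication" section. It is an assumption about the approach, not BirdReader's actual code; the function name `basicAuth` and the realm string are hypothetical.

```javascript
// Hypothetical Express-style middleware enforcing HTTP Basic authentication
// from the "authentication" section of the configuration.
function basicAuth(auth) {
  return function (req, res, next) {
    if (!auth || auth.on !== true) {
      return next(); // authentication is switched off
    }
    const header = req.headers.authorization || "";
    const match = /^Basic (.+)$/.exec(header);
    // the credentials arrive as base64("username:password")
    const decoded = match ? Buffer.from(match[1], "base64").toString("utf8") : "";
    if (decoded === auth.username + ":" + auth.password) {
      return next();
    }
    // ask the browser to prompt for credentials
    res.statusCode = 401;
    res.setHeader("WWW-Authenticate", 'Basic realm="BirdReader"');
    res.end("Unauthorized");
  };
}
```

Because Basic authentication sends credentials base64-encoded rather than encrypted, a setup like this is best used behind HTTPS. The plain string comparison is kept for brevity and is not timing-safe.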

Purging older articles

If you don't want to keep articles older than x days, then you can add the following to your config:

"purgeArticles": {
  "on": true,
  "purgeBefore": 15
}

The above will instruct BirdReader to purge articles older than 15 days every 24 hours.
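
The arithmetic behind the cut-off can be sketched as below. The helper names are hypothetical, and the sketch assumes that articles are compared by their pubDateTS field (seconds since the Unix epoch).

```javascript
// Hypothetical helper: work out the cut-off timestamp (in seconds) for a
// purge, given the "purgeBefore" setting in days and the current time in ms.
function purgeCutoff(purgeBefore, nowMs) {
  const SECONDS_PER_DAY = 24 * 60 * 60;
  return Math.floor(nowMs / 1000) - purgeBefore * SECONDS_PER_DAY;
}

// An article is purgeable when it was published before the cut-off.
function isPurgeable(article, cutoff) {
  return article.pubDateTS < cutoff;
}
```

With "purgeBefore": 15, `purgeCutoff(15, Date.now())` gives the timestamp 15 days ago; a timer such as `setInterval` with a 24-hour period could then trigger the purge once a day.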

Benchmarks

BirdReader has been tested on a Mac, Amazon EC2 and Raspberry Pi. Benchmarks here.

Icons

When a feed is added to BirdReader, we attempt to get an "icon" for the feed based on the "Favicon" of the blog. This is stored in the feeds database, and every article that is fetched subsequently inherits the feed's icon.

This feature was added after launch. To retro-fit icons to your existing feeds, run:

  node retrofit_favicons.js

Tests

There are some tests in the 'test' directory. To run them, you'll need Mocha:

  sudo npm install -g mocha

and then run the tests

  mocha
