Auto-News: An Automatic News Aggregator with LLM
A personal news aggregator that pulls information from multiple sources and uses an LLM (ChatGPT) to help us read efficiently with less noise. Sources include Tweets, RSS, YouTube, web articles, Reddit, and journal notes.
Why do we need it?
In this age of information explosion, we live with noise every day, and it has become even worse since generative AI arrived. Time is a precious resource for each of us, and using it efficiently is more challenging than ever. Think about how much time we spend pulling, searching, and filtering content from different sources, how often we park an article, paper, or long video in a side tab and never get back to it, and how much effort it takes to organize what we have read. We need a better way to cut the noise, read efficiently based on our interests, and stay on track with the goals we have defined.
See this Blog post and this YouTube video for more details.
Features
- Pull from feed sources (including RSS, Reddit, Tweets, etc.) and summarize them
- Summarize YouTube videos (generate transcript if needed)
- Filter content based on personal interests and remove 80%+ of the noise
- Generate TODO list from Takeaways/Journal-notes
- Weekly top-k aggregations
- Journal note summarization and insight generation
- A unified/central reading experience (e.g., RSS reader-like style, Notion based)
Architecture
- UI: Notion-based, cross-platform (Web browser, iOS/Android app)
- Backend: Runs on Linux/MacOS
Backend System Requirements
Component | Minimum Requirements | Recommended |
---|---|---|
OS | Linux, MacOS | Linux, MacOS |
CPU | 2 cores | 4 cores |
Memory | 6GB | 12GB |
Disk | 20GB | 50GB |
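As a rough pre-flight check against the minimum requirements in the table above, a sketch like the following could be run on the backend host. It is not part of the project: the `check` helper is hypothetical, the disk figure is interpreted here as free space (an assumption), and the memory requirement is not checked because the Python standard library has no portable call for it.

```python
import os
import shutil

# Minimum values taken from the requirements table above.
MIN_CORES = 2
MIN_DISK_GB = 20


def check(cores: int, free_disk_gb: float) -> list[str]:
    """Return a list of human-readable problems; empty means the minimums are met."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"CPU: {cores} core(s), need at least {MIN_CORES}")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {free_disk_gb:.1f}GB free, need at least {MIN_DISK_GB}GB")
    return problems


if __name__ == "__main__":
    cores = os.cpu_count() or 0  # cpu_count() may return None
    free_gb = shutil.disk_usage(".").free / 1024**3  # free space on the current volume
    for line in check(cores, free_gb) or ["CPU and disk meet the minimum requirements"]:
        print(line)
```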
Installation
Preparation
- [Required] Notion token
- [Required] OpenAI token
- [Required] Docker
- [Optional] Highly Recommended! Notion Web Clipper
- [Optional] Reddit Tokens
- [Optional] Twitter Developer Tokens, Paid Account Only
[UI] Create Notion Entry Page
Go to Notion, create a page as the main entry (for example, a Readings page), and enable Notion Integration for this page.
[Backend] Create Environment File
Check out the repo and copy .env.template to build/.env, then fill in the environment variables:
- NOTION_TOKEN
- NOTION_ENTRY_PAGE_ID
- OPENAI_API_KEY
- [Optional] REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET
- [Optional] Vars with TWITTER_ prefix
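Before building, it can help to confirm that the required variables are actually set. The following is a minimal sketch, not part of the project: it assumes build/.env uses plain `KEY=VALUE` lines, and the helper names `parse_env` and `missing_required` are hypothetical.

```python
from pathlib import Path

# Required variable names as listed in this README.
REQUIRED_VARS = ("NOTION_TOKEN", "NOTION_ENTRY_PAGE_ID", "OPENAI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(text: str) -> list[str]:
    """Return the required variables that are absent or empty."""
    env = parse_env(text)
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    env_path = Path("build/.env")
    missing = missing_required(env_path.read_text()) if env_path.exists() else list(REQUIRED_VARS)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it from the repository root; an output of `missing: none` means all three required variables have a non-empty value.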
[Backend] Build Services
make deps && make build && make deploy && make init
[Backend] Start Services
make start
The services are now up and running and will pull from the sources every hour.
[UI] Set up Notion Tweet/RSS/Reddit list
Go to the Notion entry page we created before, and we will see the following folder structure has been created automatically:
Readings
├── Inbox
│  ├── Inbox - Article
│  ├── Inbox - YouTube
│  └── Inbox - Journal
├── Index
│  ├── Index - Inbox
│  ├── Index - ToRead
│  ├── RSS_List
│  ├── Tweet_List
│  └── Reddit_List
└── ToRead
   └── ToRead
- Go to the RSS_List page and fill in the RSS name and URL
- Go to the Reddit_List page and fill in the subreddit names
- Go to the Tweet_List page and fill in the Tweet screen names (Tip: paid account only)
[UI] Set up Notion database views
Go to the Notion ToRead database page. All data will flow into this database later on, so create database views for the different sources (e.g., Tweets, Articles, YouTube, RSS) to make the flows easier to organize.
Now, enjoy and have fun.
Operations
[Monitoring] Control Panel
For troubleshooting, we can use the URLs below to access the services and check their logs and data:
Service | Role | Panel URL |
---|---|---|
Airflow | Orchestration | http://localhost:8080 |
Milvus | Vector Database | http://localhost:9100 |
Adminer | DB accessor | http://localhost:8070 |
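To see quickly which panels are reachable, a small probe like the one below can be used. This is only a sketch under assumptions: it takes the URLs from the table above, treats an open TCP port as "up" (it does not verify that the service behind the port is healthy), and the helper names `host_port` and `is_listening` are hypothetical.

```python
import socket
from urllib.parse import urlparse

# Panel URLs copied from the table above.
PANELS = {
    "Airflow": "http://localhost:8080",
    "Milvus": "http://localhost:9100",
    "Adminer": "http://localhost:8070",
}


def host_port(url: str) -> tuple[str, int]:
    """Extract (host, port) from a URL, falling back to the scheme's default port."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_listening(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or host not resolvable
        return False


if __name__ == "__main__":
    for name, url in PANELS.items():
        status = "up" if is_listening(url) else "unreachable"
        print(f"{name:8} {url:24} {status}")
```

If a panel shows as unreachable right after `make start`, the containers may still be starting; checking again after a short wait is a reasonable first step.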
Stop/Restart Services
When needed, run the following commands from the codebase folder.
# stop
make stop
# restart
make stop && make start
Redeploy .env and DAGs
make stop && make init && make start
Upgrade to the latest code
make upgrade && make stop && make init && make start
Rebuild Docker Images
make stop && make build && make init && make start