backend
Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media.

sentence-splitter
Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.

cliff-annotator
A lightweight server to allow HTTP requests to the Stanford Named Entity Recognizer and a heavily modified CLAVIN geoparser.

api-client
Public client for consuming content from the Media Cloud Online News Archive & Directory.

web-tools
The shared repository for Media Cloud web apps (Explorer, Source Manager, Topic Mapper).

date_guesser
A library to extract a publication date from a web page, along with a measure of the accuracy.

nyt-news-labeler
Tag news stories based on models trained on the NYT corpus.

api-tutorial-notebooks
A set of Jupyter notebooks demonstrating how to use the Media Cloud API.

feed_seeker
Find RSS, Atom, XML, and RDF feeds on web pages.

metadata-lib
How Media Cloud approaches extracting metadata from online news stories.

web-search
Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory.

copy-kvs
Copy a lot of objects between various key-value stores (MongoDB GridFS, PostgreSQL BLOBs, Amazon S3).

rss-fetcher
Intelligently fetch lists of URLs from a large collection of RSS feeds as part of the Media Cloud Directory.

cliff-api-client
A Python client for the CLIFF geoparsing tool.

email-templates
Templates for emails that Media Cloud sends.

wayback-news-client
A client library to access the Wayback Machine news archive search.

word-embeddings-server
Helpful micro-service to return results from word2vec models.

glimpse
Get a glimpse of attention to a topic on social media.

docker-compose-just-quieter
Docker Compose CLI utility wrapper which makes `docker-compose` quieter.

postgresql-citus-aws-graviton2
PostgreSQL built for AWS Graviton2.

sitemap-tools
A simple toolkit for consuming sitemaps.

fernandos-csv-randomizer
Fernando's CSV randomizer: reads a CSV file, picks a specified number of random rows, and writes them to a separate file.

cliff-homepage
A simple homepage for the CLIFF project.

hausastemmer
Hausa language stemmer (Bimba et al., 2015).

clavin-build-geonames-index
Builds and releases the CLAVIN GeoNames.org index as a binary.

sous-chef
Configurable data analytics pipeline.

news-search-api
Internal API server that offers search access to the Media Cloud Online News Archive (in Elasticsearch).

story-indexer
The core pipeline used to ingest online news stories in the Media Cloud archive.
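
The fernandos-csv-randomizer entry describes reading a CSV file, picking a specified number of random rows, and writing them to a separate file. The following is a minimal sketch of that idea using only the Python standard library. It is not the repository's actual code: the function name `sample_csv_rows`, its parameters, and the assumption that the first row is a header to be preserved are all hypothetical choices made for illustration.

```python
import csv
import io
import random


def sample_csv_rows(src, dst, count, seed=None):
    """Copy the header plus up to `count` randomly chosen rows from src to dst.

    `src` and `dst` are open text-mode file objects (hypothetical interface).
    Returns the number of data rows written.
    """
    reader = csv.reader(src)
    header = next(reader, None)
    if header is None:
        return 0  # empty input: nothing to sample

    rows = list(reader)  # assumption: the file fits in memory
    rng = random.Random(seed)  # a seed makes the selection reproducible
    # Never request more rows than the file contains.
    picked = rng.sample(rows, min(count, len(rows)))

    writer = csv.writer(dst)
    writer.writerow(header)
    writer.writerows(picked)
    return len(picked)


if __name__ == "__main__":
    source = io.StringIO("id,title\n1,a\n2,b\n3,c\n4,d\n")
    target = io.StringIO()
    sample_csv_rows(source, target, count=2)
    print(target.getvalue())
```

With real files, the same function could be called on handles opened with `newline=""`, as the `csv` module documentation recommends. Loading all rows into memory keeps the sketch short; a very large file would call for a streaming approach such as reservoir sampling instead.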