• Stars
    star
    18
  • Rank 1,208,065 (Top 24 %)
  • Language
    Python
  • Created over 14 years ago
  • Updated over 14 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Groups a company's news articles from Yahoo Finance and Google Finance using non-negative matrix factorization (NMF)
This is a Python script that fetches news articles from Yahoo Finance and Google Finance RSS by stock symbol (e.g. MSFT, ORCL, USO), and groups the articles using non-negative matrix factorization (NMF).  Sometimes the groupings are wonky, but relevant themes throughout the stories do arise.  This can at least be used for a quick perusal of news articles, FWIW.

Beautiful Soup is used to extract the text from each article.  Stop words are removed and only English words are used in the word matrices for each article.  

The generated article groupings are rendered as an HTML file via a Cheetah template.  The HTML output is sent to ./output/<symbol_name>.html by default.

Usage:
gen_features.py -s <symbol> [-o <outputfile>] [-n <numfeatures>] [-i <iterations>]

Example:
python gen_features.py -s msft -n 10 -i 50

Requirements:
Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/)
Cheetah (http://www.cheetahtemplate.org/)
feedparser (http://www.feedparser.org/)
NumPy (http://numpy.scipy.org/)
PyWordNet (http://osteele.com/projects/pywordnet/)