• Stars
    star
    1,349
  • Rank 34,821 (Top 0.7 %)
  • Language
    Python
  • Created about 6 years ago
  • Updated 9 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Returns latest research results by crawling arxiv papers and summarizing abstracts. Helps you stay afloat with so many new papers everyday.

sotawhat

License

Read more about SOTAWHAT here.

You can use sotawhat through a web interface here. Thanks hmchuong!

This script runs using Python 3. It requires nltk, six, and pyspellchecker. To install it as a Python package, follow the following steps:

Step 1: clone this repo, and go inside that repo:

$ git clone [HTTPS or SSH linnk to this repo]
$ cd sotawhat

Step 2: install using pip

$ pip3 install .

On Windows, due to encoding errors, the script may cause issues when run on the command line. It is recommended to use pip install win-unicode-console --upgrade prior to launching the script. If you get UnicodeEncodingError, you must install the above.

In MacOS, you can get the SSL error

[nltk_data] Error loading punkt: <urlopen error [SSL:
[nltk_data]     CERTIFICATE_VERIFY_FAILED] certificate verify failed:
[nltk_data]     unable to get local issuer certificate (_ssl.c:1045)>

this will be fixed by reinstalling certificates

$ /Applications/Python\ 3.x/Install\ Certificates.command

Usage

This project adds the sotawhat script for you to run globally on Terminal or commandline.

To query for a certain keyword, run:

$ sotawhat [keyword] [number of results]

For example:

$ sotawhat perplexity 10

or

$ sotawhat language model 10

If you don't specify the number of results, by default, the script returns 5 results. Each result contains the title of the paper with author and published date, a summary of the abstract, and link to the paper.

We've found that this script works well with keywords that are:

  • a model (e.g. transformer, wavenet, ...)
  • a dataset (e.g. wikitext, imagenet, ...)
  • a task (e.g. language model, machine translation, fuzzing, ...)
  • a metric (e.g. BLEU, perplexity, ...)
  • random stuff