  • Stars
    9
  • Rank 1,939,727 (top 39%)
  • Language
    Python
  • Created about 7 years ago
  • Updated over 1 year ago

Repository Details

Examples of code in Spark

More Repositories

1. nlp-in-practice
   Starter code to solve real-world text data problems. Includes Gensim Word2Vec, phrase embeddings, text classification with logistic regression, word count with PySpark, simple text preprocessing, pre-trained embeddings, and more.
   Jupyter Notebook • 1,068 stars

2. ROUGE-2.0
   ROUGE automatic summarization evaluation toolkit. Supports ROUGE-[N, L, S, SU], stemming and stopwords in different languages, Unicode text evaluation, and CSV output.
   Java • 194 stars

3. phrase-at-scale
   Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary size. Can be used for languages other than English.
   Python • 125 stars

4. opinosis-summarization
   Code and dataset for the Opinosis Summarization Framework.
   50 stars

5. OpinRank
   OpinRank Dataset. Contains user reviews of cars and hotels: full reviews from Tripadvisor (~259,000 reviews) and Edmunds (~42,230 reviews).
   34 stars

6. clinical-concepts
   Discovering Related Clinical Concepts using Large Amounts of Clinical Notes. An unsupervised graphical approach to mine related concepts by leveraging the volume within large amounts of clinical notes.
   22 stars

7. stop-words
   Stop word lists.
   4 stars

8. hashtags_test
   Test hashtags.
   2 stars

9. Micropinion-Generation-Dataset
   Dataset for Micropinion Generation, based on user reviews from CNET. The reviews cover products from various categories such as TVs, cell phones, GPS, etc.
   2 stars

10. JavaPractice
    Practice, practice, practice. Bubble sort, factorial, powerset, subarray, mergesort, remove duplicates, etc.
    Java • 1 star