Repository Details

Set of Machine Learning and Stochastic Optimization tools based on Hadoop, Spark and Storm. https://pkghosh.wordpress.com/

Introduction

Set of predictive and exploratory machine learning tools with Spark and Python

Philosophy

  • Simple to use
  • Input and output in CSV format
  • Metadata defined in a simple JSON file
  • Highly configurable, with many configuration parameters
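
As a rough illustration of the CSV plus JSON metadata idea, here is a minimal Python sketch. The schema layout, the field names (`customer_id`, `age`, `monthly_spend`), the `ordinal` and `type` keys, and the helper functions are all hypothetical and are not taken from avenir itself. The wiki and blogs describe the real metadata format.

```python
import csv
import io
import json

# Hypothetical metadata: field names, positions, and type names are invented
# for illustration and do not reproduce avenir's actual schema format.
SCHEMA_JSON = """
{
  "fields": [
    {"name": "customer_id", "ordinal": 0, "type": "string"},
    {"name": "age", "ordinal": 1, "type": "int"},
    {"name": "monthly_spend", "ordinal": 2, "type": "float"}
  ]
}
"""

# Headerless CSV input; column meaning comes only from the metadata.
CSV_DATA = "c001,34,120.50\nc002,51,89.99\n"

# Map the type names used in the metadata to Python converters.
CONVERTERS = {"string": str, "int": int, "float": float}


def load_records(csv_text, schema_text):
    """Parse headerless CSV rows into typed dicts, driven by the JSON metadata."""
    fields = json.loads(schema_text)["fields"]
    records = []
    for row in csv.reader(io.StringIO(csv_text)):
        record = {
            f["name"]: CONVERTERS[f["type"]](row[f["ordinal"]]) for f in fields
        }
        records.append(record)
    return records


def dump_records(records, schema_text):
    """Write typed records back out as headerless CSV in ordinal order."""
    fields = sorted(json.loads(schema_text)["fields"], key=lambda f: f["ordinal"])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for record in records:
        writer.writerow([record[f["name"]] for f in fields])
    return out.getvalue()
```

Keeping column names and types in the metadata file rather than in the data means the CSV stays headerless and the same parsing code can serve any data set by swapping the JSON.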

Solution

  • Exploratory Analytics
  • KNN Cluster
  • Naive Bayes
  • Discriminant Analysis
  • Nearest Neighbor
  • Decision Tree and Random Forest
  • SVM
  • Association Mining
  • Reinforcement learning
  • Multi-Armed Bandit
  • Stochastic Optimization
  • Feedforward Network
  • LSTM
  • Autoencoder
  • Deep Reinforcement Learning
  • NLP and Neural Language Model
  • Graph Convolution Network
  • MLOps

Blogs

My blogs are a good source of details about avenir. They are the only source of detailed documentation.

Getting started

The project's resource directory contains tutorial documents for the use cases described in the blogs.

Configuration

All configuration parameters are described in the wiki page https://github.com/pranab/avenir/wiki/Configuration

Build

Please refer to resource/dependency.txt for build-time and run-time dependencies.

For Hadoop 1

  • mvn clean install

For Hadoop 2 (non-YARN)

  • git checkout nuovo
  • mvn clean install

For Hadoop 2 (YARN)

  • git checkout nuovo
  • mvn clean install -P yarn
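
If it helps to script the three build variants above, here is a minimal Python sketch. The target names (`hadoop1`, `hadoop2`, `hadoop2-yarn`) and the helper functions are assumptions made for this example and are not part of the project. The commands themselves are the ones listed above.

```python
import subprocess

# Build steps copied from this README, keyed by hypothetical target names.
BUILD_STEPS = {
    "hadoop1": [["mvn", "clean", "install"]],
    "hadoop2": [["git", "checkout", "nuovo"], ["mvn", "clean", "install"]],
    "hadoop2-yarn": [
        ["git", "checkout", "nuovo"],
        ["mvn", "clean", "install", "-P", "yarn"],
    ],
}


def build_commands(target):
    """Return the ordered command list for a target, or raise on an unknown name."""
    if target not in BUILD_STEPS:
        raise ValueError(
            f"unknown target {target!r}; expected one of {sorted(BUILD_STEPS)}"
        )
    return BUILD_STEPS[target]


def run_build(target, repo_dir="."):
    """Run each step in order inside repo_dir, stopping at the first failure."""
    for cmd in build_commands(target):
        # check=True raises CalledProcessError if git or mvn exits non-zero.
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

For example, `run_build("hadoop2-yarn", repo_dir="avenir")` would check out the nuovo branch and then build with the yarn profile, assuming `git` and `mvn` are on the PATH and `avenir` is a local clone.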

Help

Please feel free to email me at [email protected]

Contribution

Contributors are welcome. Please email me at [email protected]