
What is Studio-9?

Studio9 is an open-source platform for collaborative Data Management & AI/ML anywhere. Whether your data is trapped in silos or you're generating data at the edge, Studio9 gives you the flexibility to create AI and data engineering pipelines wherever your data is, and you can share your AI, data, and pipelines with anyone, anywhere. With Studio9, you gain newfound agility to move effortlessly between compute environments, while all your data and your work replicate automatically to wherever you want.

The major components of Studio9 are described below.

1. *ORION* - A service consisting of three components: the Job Dispatcher, the Job Supervisor, and the Job Resource Cleaner. The Job Dispatcher forwards messages from RabbitMQ to the proper Job Supervisor, instantiating one for each new job request. The Job Supervisor is responsible for instantiating a job master for each new job, each of which gets its own job supervisor setup. The Job Resource Cleaner consumes messages from RabbitMQ and spins up a new JobResourcesCleanerWorker for each message, which then executes the resource-cleanup tasks.

2. *ARIES* - A microservice that allows read/write access to ElasticSearch. It stores Job Metadata, Heartbeats and Job Results in ElasticSearch as documents. 

3. *TAURUS* - This service works as a message dispatcher using SQS/SNS.

4. *BAILE* - It receives messages from the UI service, Salsa, and then sends them to Cortex if the request is not an Online Prediction. In the case of Online Prediction, Salsa sends messages to Taurus, which then forwards them to Cortex.

5. *ARGO* - A service designed to capture all configuration parameters for all job types or services. These parameters are saved by Argo in ElasticSearch. 

6. *PEGASUS* - A prediction storage service that receives messages from Taurus via Orion to upload data to Redshift. The messages contain metadata for the online prediction job and a CSV file with the prediction results.

What are the use cases?

Computational Data Core that Automatically Scales and Adapts to You

Imagine never having to worry about how to keep your data organized, keep track of how, when, and where it was manipulated, keep track of where it came from, or keep track of all its meta-data. Now imagine being able to effortlessly and securely share your data and its lineage with your colleagues. Finally, imagine being able to do any Analytics or Machine Learning right where your data is. The Studio9 Computational Data Core makes this all possible.

The Data Science Replication Engine

Every step you perform in the Analytics & AI Lifecycle results in a valuable asset – a snippet of code, or a data transformation pipeline, or a table of newly engineered data, or an album of images or a new algorithm. Imagine having the power to instantly use any asset anyone creates to build bigger and better AI models that constantly expand your power to generate breakthroughs. Studio9 gives your team the frictionless ability to organize, track, share, and re-use all your Analytics & AI assets.

Automated Model Governance & Compliance

Studio9 allows the Model Risk Management, Regulatory Constraints, and Documentation Policies that your models must abide by to be encoded right into the Pipeline and automatically reproduced every time a model is refreshed by Studio9. This includes Model Explainability, Model Fairness & Bias Analytics, Model Uncertainty, and Model Drift analytics, all of which are performed automatically.

We don't think AI makes machines smarter. It exists to make you smarter. The easier it is for you to make AI, the greater your ability to make breakthroughs. Whether you have unlimited compute resources in the cloud or you are limited at the edge, your ability to make breakthroughs should be unencumbered. We are committed to giving you the breakthrough Data Management & AI/ML capabilities you need so you can create the breakthroughs you want, anywhere, anytime, and with anyone.

What can Studio9 do?

Reduce Your AI Workload 120x

Studio9 provides a large inventory of building blocks from which you can stitch together custom AI and Data Engineering pipelines. Rapidly assemble and test many different pipelines to create the AI you need. Turn your data into AI with near-zero effort and cost. Since Studio9 is an open platform, newer cutting-edge AI building blocks that are emerging every day are put right at your fingertips.

Studio9 helps you find the breakthroughs hidden in your data

Studio9 streamlines your burden of wrangling data. With its continuously expanding portfolio of building blocks, Studio9 makes it easier for you to clean, integrate, enrich, and harmonize your data. Do it all within your own infinitely scalable database environment without any of the hassle of managing your own database.

Push-button Model Deployment

You now have the power to deploy and run your Data Processing pipelines, Models, and AI anywhere – from infinitely scalable Cloud computing infrastructure to your own laptop to ultra-low power edge computing devices – with no additional programming or engineering effort required. We designed Studio9 for deployment flexibility so you can build, train, share, and execute your AI anywhere you want.

Flow of Studio-9

(Studio9 flow diagram)

How to deploy Studio9 on Local?

To deploy Studio9 on local, we have to understand the sequence in which the services are deployed. But before deploying the services, we need to look at some prerequisites for the application.

Prerequisites:

Mesos-Marathon Cluster Setup

Apache Zookeeper

     Version: 3.7.1

Deploying Zookeeper on local

Apache Mesos

     Version: 1.7.2

Deploying Mesos on local

Marathon

     Version: 1.5.0

Deploying Marathon on local

Here we need one more machine, since the Mesos-Marathon cluster works on a master-slave architecture. For this we will create a VM on the local machine using Vagrant.

Vagrant

Deploying Vagrant on local

Note: We will run Mesos-Master on base machine and Marathon as well as Mesos-Slave on VM.
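As an illustrative sketch, a minimal Vagrantfile for that VM might look like the following. The box name, hostname, private IP, and resource sizes below are assumptions; adjust them for your own setup.

```ruby
# Minimal Vagrantfile sketch for the VM running Mesos-Slave and Marathon.
# Box name, hostname, IP, and resources are assumptions -- adjust as needed.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"               # Ubuntu 16.04, matching the prerequisites
  config.vm.hostname = "mesos-slave"
  config.vm.network "private_network", ip: "192.168.56.10"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 4096                              # Mesos-Slave + Marathon need a few GB
    vb.cpus = 2
  end
end
```

After `vagrant up`, the base machine's Mesos-Master should be reachable from this VM at `<base-machine IP>:5050`.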

Mesos-Slave:

The process to run mesos-slave on the slave machine is the same as specified above; the only difference is the command we use to run it:

    ./bin/mesos-slave.sh --master=<base-machine IP>:5050 --work_dir=/var/run/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --image_providers=appc,docker --isolation=filesystem/linux,docker/runtime

Now, we will deploy the below services:

Elastic Search

Deploying Elastic Search on local

MongoDB

Deploying MongoDB on local

RabbitMQ

Deploying RabbitMQ on local

Postgres

Deploying Postgres on local

After the deployment of above services, we will deploy the below services in the same sequence as they are listed below:

Aries

Deploying Aries Service on local

Argo

Deploying Argo Service on local

Orion

Deploying Orion Service on local

Cortex

Deploying Cortex Service on local

Pegasus

Deploying Pegasus Service on local

Taurus

Deploying Taurus Service on local

UM-Service

Deploying UM-Service on local

Baile

Deploying Baile Service on local

Salsa

Deploying Salsa Service on local
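Each of the services above is submitted to Marathon as an application definition. As an illustrative sketch (the image name, resources, and port mapping are assumptions; the actual definitions for each service will differ), a Marathon app JSON for one service might look like:

```json
{
  "id": "/studio9/aries",
  "cpus": 1,
  "mem": 2048,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "studio9/aries:latest",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 9000, "hostPort": 0 }
      ]
    }
  }
}
```

It can then be submitted to Marathon's REST API, e.g. `curl -X POST http://<marathon-host>:8080/v2/apps -H "Content-Type: application/json" -d @aries.json`.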


How to create a Docker image?

Step 1: Whenever we change something in the code, we need to build a new Docker image.

Step 2: We just need to run the below command to build the image from the Dockerfile.

If you are in the same directory as your Dockerfile:

     docker build -t <image_name>:<version> .

Example:

    docker build -t python:1.0 .         
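The build command above assumes a Dockerfile in the current directory. As a minimal illustrative sketch (the base image and script name are assumptions, not part of Studio9):

```dockerfile
# Minimal illustrative Dockerfile -- base image and script name are assumptions.
FROM python:3.9-slim
WORKDIR /app
COPY . .
CMD ["python", "app.py"]
```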

If you are building the image from a directory other than the one containing your Dockerfile, you can simply pass the path at the end of the command:

    docker build -t <image_name>:<version> ./<PATH to file>

Example:

    docker build -t python:1.0 ./<PATH to dockerfile>

Step 3: Tag the image according to your preference:

    docker tag <image_name>:<version> <user_name>/<repo_name>:<version>

Example:

    docker tag python:1.0 username/python:1.0

Step 4: Now we can push the image to Docker Hub or another container registry:

    docker push username/python:1.0 

Step 5: Now we can update the image name in the code or wherever we are using this particular image.


How to deploy Studio9 using Docker-Compose?

We'll be deploying Studio9 on local using a docker-compose file.

Prerequisites

  • OS: Ubuntu 16.04 LTS - 4vCPUs and 16GB memory.
  • Mesos-marathon Cluster
  • AWS account
  • AWS IAM
  • AWS S3 buckets
  • AWS S3 buckets accessible to AWS IAM
  • Docker should be installed on your local system.
  • If you don't have docker installed in your system, kindly refer to this link
  • After successfully installing Docker, clone the Repository.
  • Run the Docker Compose file by running the below command:

    docker-compose up -d

or

    docker compose up -d

  • If you want to see the logs, use the below command:

    docker-compose up

  • To stop the services, use the below commands:

    docker compose down

NOTE: Use the above commands in the directory where the docker compose file exists.

Explanation of Docker Compose

For running Studio9 on local, we are using docker-compose.

  • We are using a single network, 'studio9', for all the services that run for Studio9.
  • Here we have 17 services that will be deployed on the local machine to run Studio9.
  • There are four volumes being used in Studio9: three for Elasticsearch and one for MongoDB.
  • The Elasticsearch master node is accessible at port 9200.
  • The Kibana service runs after the Elasticsearch nodes are up and is accessible at port 5601.
  • The Mongo Express service depends on Mongo and is accessible at port 8081.
  • Zookeeper uses the same 'studio9' network and is accessible at port 2181.
  • RabbitMQ is accessible at ports 5672 and 15672.
  • Next we have the Aries service; it depends on the Elasticsearch nodes and is accessible at port 9000.
  • The Cortex service depends on Aries and RabbitMQ and is accessible at port 9000.
  • The Argo service also depends on the Elasticsearch nodes and is accessible at port 9000.
  • The Gemini service depends on Zookeeper and SQL-Server and is accessible at port 9000.
  • The Taurus service depends on RabbitMQ, Cortex, Baile, Argo, and Aries and is accessible at port 9000.
  • The Orion service depends on Cortex, Zookeeper, and RabbitMQ and is accessible at port 9000.
  • The Pegasus service depends on Taurus, RabbitMQ, and Postgres and is accessible at port 9000.
  • The UM service depends on Mongo and is accessible at port 9000.
  • The Baile service depends on Mongo, the UM service, Aries, Cortex, SQL-Server, and Zookeeper and is accessible at port 9000.
  • SQL-Server depends on the UM service and is accessible at port 9000.
  • The Salsa service is responsible for the UI of Studio9; it depends on Baile and uses port 80.
  • The Postgres service depends on postgres-db and is accessible at port 8080.
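The structure described above can be sketched as a docker-compose skeleton. This is an illustrative fragment only, not the actual Studio9 compose file: the image names and Elasticsearch/Mongo versions are assumptions, and most of the 17 services are omitted for brevity.

```yaml
# Skeleton sketch of the Studio9 docker-compose layout -- image names are assumptions.
version: "3"

networks:
  studio9:              # single shared network for all services

volumes:
  es-data-1:            # three volumes for Elasticsearch ...
  es-data-2:
  es-data-3:
  mongo-data:           # ... and one for MongoDB

services:
  elasticsearch:
    image: elasticsearch:7.17.0
    networks: [studio9]
    ports:
      - "9200:9200"     # master node accessible at 9200
    volumes:
      - es-data-1:/usr/share/elasticsearch/data

  mongo:
    image: mongo:4.4
    networks: [studio9]
    volumes:
      - mongo-data:/data/db

  rabbitmq:
    image: rabbitmq:3-management
    networks: [studio9]
    ports:
      - "5672:5672"
      - "15672:15672"

  aries:
    image: studio9/aries:latest   # assumption: the real image name may differ
    networks: [studio9]
    depends_on: [elasticsearch]
```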

πŸ‘¨β€πŸ’» Author

🀝 Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

⭐️ Show your support

Give a ⭐️ if this project helped you!

πŸ“ License

Copyright Β© 2022 knoldus, Inc (https://www.knoldus.com).
This project is licensed under the MIT license.
