  • Stars 2,029
  • Rank 22,818 (Top 0.5%)
  • Language Makefile
  • License Other
  • Created almost 3 years ago
  • Updated 3 months ago

Repository Details

🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications

Versatile Data Pipeline: unstructured data ETL

Licenses: MIT, ELv2

Versatile Data Pipeline (VDP) is a source-available unstructured data ETL tool that streamlines the end-to-end unstructured data processing pipeline:

  • Extract unstructured data from pre-built data sources such as cloud/on-prem storage, or IoT devices

  • Transform it into analysable or meaningful data representations using AI models

  • Load the transformed data into warehouses, applications, or other destinations

VDP Concept

Demo playground

An online demo VDP instance has been provisioned. You can try out the basic features directly in its Console at https://demo.instill.tech.

Want to showcase your ML/DL models? We offer fully-managed VDP on Instill Cloud. Please fill in the sign-up form and we will reach out to you.

Prerequisites

  • macOS or Linux - VDP works on macOS or Linux, but does not support Windows yet.

  • Docker and Docker Compose - VDP uses Docker Compose (specifically, Compose V2 and the Compose specification) to run all services locally. Please install the latest stable Docker and Docker Compose before using VDP.

  • yq v4.x or later. Please follow the installation guide.

  • (Optional) NVIDIA Container Toolkit - To enable GPU support in VDP, please refer to the NVIDIA Cloud Native Documentation to install the NVIDIA Container Toolkit. To allocate specific GPUs to VDP, set the environment variable NVIDIA_VISIBLE_DEVICES. For example, NVIDIA_VISIBLE_DEVICES=0,1 makes the triton-server use GPU devices 0 and 1 only. By default, NVIDIA_VISIBLE_DEVICES is set to all, which uses all available GPUs on the machine.
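
The prerequisites above can be checked from a terminal before launching anything. The following is a minimal sketch, not part of VDP itself: the helper names (is_supported_os, have_cmd, check_prereqs, gpu_devices) are hypothetical and exist only for illustration. It only checks that the tools are present; it does not verify the yq version.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these ship with VDP.

# Returns 0 if the OS name (as printed by `uname -s`) is macOS or Linux.
is_supported_os() {
  case "$1" in
    Darwin|Linux) return 0 ;;
    *) return 1 ;;
  esac
}

# Returns 0 if the given command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Checks the OS, Docker, Docker Compose V2 and yq; prints the first problem found.
check_prereqs() {
  is_supported_os "$(uname -s)" || { echo "unsupported OS: $(uname -s)"; return 1; }
  have_cmd docker || { echo "docker not found"; return 1; }
  # `docker compose` (with a space) is the Compose V2 plugin.
  docker compose version >/dev/null 2>&1 || { echo "Docker Compose V2 not found"; return 1; }
  have_cmd yq || { echo "yq not found"; return 1; }
  echo "all prerequisites found"
}

# Builds a value for NVIDIA_VISIBLE_DEVICES from a list of GPU ids.
# With no arguments it prints "all", matching the documented default.
gpu_devices() {
  if [ "$#" -eq 0 ]; then
    echo all
    return 0
  fi
  # Inside the subshell, "$*" joins the arguments with the first character of IFS.
  (IFS=,; echo "$*")
}
```

After sourcing these functions, running check_prereqs reports the first missing tool, and export NVIDIA_VISIBLE_DEVICES="$(gpu_devices 0 1)" sets the variable to 0,1 for the current shell. Whether an exported value reaches the containers depends on how the project's Compose setup reads its environment, so treat that part as an assumption to verify.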

Quick start

Execute the following commands to start pre-built images with all the dependencies:

$ git clone https://github.com/instill-ai/vdp.git && cd vdp

# Launch all services
$ make all

🚀 That's it! Once all the services are up and healthy, the UI is ready to go at http://localhost:3000!
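
Because the services take some time to become ready, it can help to poll the UI address instead of refreshing the browser. The sketch below is a generic retry helper; the name wait_until and the attempt and delay values are illustrative assumptions, not part of VDP.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the probe command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

For example, wait_until 60 5 curl -fsS -o /dev/null http://localhost:3000 probes the UI every 5 seconds for up to 60 attempts. A successful response only shows that the Console is answering on that port; it does not by itself confirm that every backend service is healthy.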

VDP Console

Jump right into VDP 101: Create your first pipeline on VDP, and explore other VDP tutorials.

Note

The images of model-backend (~2GB) and Triton Inference Server (~23GB) can take a while to pull, but this should be a one-time effort during the first setup.

Shut down VDP

To shut down all running services:

$ make down

Guidance philosophy

VDP is built with an open heart, and we expect VDP to be exposed to more MLOps integrations. It is implemented with microservice and API-first design principles. Instead of building all components from scratch, we've decided to adopt sophisticated open-source tools.

We hope VDP can also enrich the open-source communities by bringing more practical use cases in unstructured data processing.

Documentation

📔 Documentation

Check out the documentation & tutorials to learn VDP!

📘 API Reference

The gRPC protocols in protobufs provide the single source of truth for the VDP APIs. The genuine protobuf documentation can be found in our Buf Schema Registry (BSR).

For the OpenAPI documentation, access http://localhost:3001 after make all, or simply run make doc.

Model Hub

We curate a list of ready-to-use models for VDP. These models come from different sources and have been tested by our team. Want to contribute a new model? Please create an issue; we are happy to test it and add it to the list 👐.

| Model | Task | Sources | Framework | CPU | GPU |
| --- | --- | --- | --- | --- | --- |
| MobileNet v2 | Image Classification | GitHub-DVC | ONNX | | |
| Vision Transformer (ViT) | Image Classification | Hugging Face | ONNX | | |
| YOLOv4 | Object Detection | GitHub-DVC | ONNX | | |
| YOLOv7 | Object Detection | GitHub-DVC | ONNX | | |
| YOLOv7 W6 Pose | Keypoint Detection | GitHub-DVC | ONNX | | |
| PSNet + EasyOCR | Optical Character Recognition (OCR) | GitHub-DVC | ONNX | | |
| Mask RCNN | Instance Segmentation | GitHub-DVC | PyTorch | | |
| Lite R-ASPP based on MobileNetV3 | Semantic Segmentation | GitHub-DVC | ONNX | | |
| Stable Diffusion | Text to Image | GitHub-DVC, Local-CPU, Local-GPU | ONNX | | |
| Megatron GPT2 | Text Generation | GitHub-DVC | FasterTransformer | | |

Note: The GitHub-DVC source in the table means importing a model into VDP from a GitHub repository that uses DVC to manage large files.

Community support

For general help using VDP, you can use one of these channels:

  • GitHub - bug reports, feature requests, project discussions and contributions

  • Discord - live discussion with the community and our team

  • Newsletter & Twitter - get the latest updates

If you are interested in a hosted VDP service, we've started signing up users for our private alpha. Get early access and we'll contact you when we're ready.

Contributing

We welcome contributions to VDP in any form.

Note
Code in the main branch tracks under-development progress towards the next release and may not work as expected. If you are looking for a stable alpha version, please use the latest release.

License

See the LICENSE file for licensing information.

We're hiring 🚀

Interested in building VDP with us? Join our remote team and build the future for unstructured data ETL. Check out our open roles.
