  • Stars: 20,716
  • Rank: 1,208 (top 0.03%)
  • Language: Python
  • License: Apache License 2.0
  • Created: almost 5 years ago
  • Updated: 4 months ago


Repository Details

☁️ Build multimodal AI applications with cloud-native technologies

Jina lets you build multimodal AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production. You can focus on your logic and algorithms, without worrying about the infrastructure complexity.

Jina provides a smooth Pythonic experience for serving ML models, transitioning from local deployment to advanced orchestration frameworks like Docker Compose, Kubernetes, or Jina AI Cloud. Jina makes advanced solution engineering and cloud-native technologies accessible to every developer.

Wait, how is Jina different from FastAPI? Jina's value proposition may seem quite similar to that of FastAPI. However, there are several fundamental differences:

Data structure and communication protocols

  • FastAPI communication relies on Pydantic, while Jina relies on DocArray, which lets Jina expose its services over multiple protocols. Support for the gRPC protocol is especially useful for data-intensive applications, such as embedding services, where embeddings and tensors can be serialized more efficiently.
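
As a minimal sketch of that difference (the EmbeddingDoc class below is hypothetical, not part of Jina or DocArray), a DocArray schema can carry tensors as first-class fields:

from docarray import BaseDoc
from docarray.typing import NdArray


class EmbeddingDoc(BaseDoc):
    text: str
    # a fixed-shape tensor field; DocArray serializes it efficiently over gRPC
    embedding: NdArray[768]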

Advanced orchestration and scaling capabilities

  • Jina allows you to easily containerize and orchestrate your services and models, providing concurrency and scalability.
  • Jina lets you deploy applications formed from multiple microservices that can be containerized and scaled independently.

Journey to the cloud

  • Jina provides a smooth transition from local development (using DocArray), to local serving (using Deployment and Flow), to production-ready services that use Kubernetes' capacity to orchestrate the lifetime of containers.
  • With Jina AI Cloud you get access to scalable and serverless deployments of your applications with one command.

Documentation

Install

pip install jina

Find more install options for Apple Silicon and Windows.

Get Started

Basic Concepts

Jina has three fundamental layers:

  • Data layer: BaseDoc and DocList (from DocArray) are the input/output formats in Jina.
  • Serving layer: An Executor is a Python class that transforms and processes Documents. By simply wrapping your models into an Executor, you allow them to be served and scaled by Jina. The Gateway is the service that connects all Executors inside a Flow.
  • Orchestration layer: Deployment serves a single Executor, while a Flow serves Executors chained into a pipeline.
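
To make these layers concrete, here is a minimal sketch (MyExec is a placeholder Executor for illustration, not one of the examples below):

from jina import Deployment, Executor, Flow, requests


class MyExec(Executor):
    @requests
    def process(self, docs, **kwargs):
        ...  # transform the incoming Documents here


# a Deployment serves a single Executor
dep = Deployment(uses=MyExec, port=12345)

# a Flow chains Executors into a pipeline behind a Gateway
flow = Flow(port=12346).add(uses=MyExec).add(uses=MyExec)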

The full glossary is explained here.

Serve AI models

Let's build a fast, reliable and scalable gRPC-based AI service. In Jina we call this an Executor. Our simple Executor will wrap the StableLM LLM from Stability AI. We'll then use a Deployment to serve it.

Note
A Deployment serves just one Executor. To combine multiple Executors into a pipeline and serve that, use a Flow.

Let's implement the service's logic:

executor.py
from jina import Executor, requests
from docarray import DocList, BaseDoc

from transformers import pipeline


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            # the pipeline returns a list of candidate dicts per prompt;
            # keep the generated text of the first candidate
            generations.append(
                Generation(prompt=prompt, text=output[0]['generated_text'])
            )
        return generations

Then we deploy it with either the Python API or YAML:

Python API (deployment.py):

from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()

YAML (deployment.yml):

jtype: Deployment
with:
  uses: StableLM
  py_modules:
    - executor.py
  timeout_ready: -1
  port: 12345

And run the YAML Deployment with the CLI: jina deployment --uses deployment.yml

Use Jina Client to make requests to the service:

from jina import Client
from docarray import DocList, BaseDoc


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


prompt = Prompt(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use port from output above
response = client.post(on='/', inputs=[prompt], return_type=DocList[Generation])

print(response[0].text)
a steampunk version of the Mona Lisa, incorporating mechanical gears, brass elements, and Victorian era clothing details

Note
In a notebook, you can't use deployment.block() and then make requests to the client. Please refer to the Colab link above for reproducible Jupyter Notebook code snippets.

Build a pipeline

Sometimes you want to chain microservices together into a pipeline. That's where a Flow comes in.

A Flow is a DAG pipeline composed of a set of steps. It orchestrates a set of Executors and a Gateway to offer an end-to-end service.

Note
If you just want to serve a single Executor, you can use a Deployment.

For instance, let's combine our StableLM language model with a Stable Diffusion image generation model. Chaining these services together into a Flow gives us a service that generates images based on a prompt produced by the LLM.

text_to_image.py
import numpy as np
from jina import Executor, requests
from docarray import BaseDoc, DocList
from docarray.documents import ImageDoc


class Generation(BaseDoc):
    prompt: str
    text: str


class TextToImage(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        from diffusers import StableDiffusionPipeline
        import torch

        self.pipe = StableDiffusionPipeline.from_pretrained(
            "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
        ).to("cuda")

    @requests
    def generate_image(self, docs: DocList[Generation], **kwargs) -> DocList[ImageDoc]:
        # the diffusion pipeline returns images in PIL format
        # (https://pillow.readthedocs.io/en/stable/)
        images = self.pipe(docs.text).images
        # wrap each PIL image in an ImageDoc so the output matches the return type
        return DocList[ImageDoc]([ImageDoc(tensor=np.array(image)) for image in images])

Build the Flow with either Python or YAML:

Python API (flow.py):

from jina import Flow
from executor import StableLM
from text_to_image import TextToImage

flow = (
    Flow(port=12345)
    .add(uses=StableLM, timeout_ready=-1)
    .add(uses=TextToImage, timeout_ready=-1)
)

with flow:
    flow.block()

YAML (flow.yml):

jtype: Flow
with:
    port: 12345
executors:
  - uses: StableLM
    timeout_ready: -1
    py_modules:
      - executor.py
  - uses: TextToImage
    timeout_ready: -1
    py_modules:
      - text_to_image.py

Then run the YAML Flow with the CLI: jina flow --uses flow.yml

Then, use Jina Client to make requests to the Flow:

from jina import Client
from docarray import DocList, BaseDoc
from docarray.documents import ImageDoc


class Prompt(BaseDoc):
    text: str


prompt = Prompt(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use port from output above
response = client.post(on='/', inputs=[prompt], return_type=DocList[ImageDoc])

response[0].display()

Easy scalability and concurrency

Why not just use standard Python to build that service and pipeline? Jina accelerates your application's time to market by making it more scalable and cloud-native. Jina also handles the infrastructure complexity in production and other Day-2 operations so that you can focus on the data application itself.

Increase your application's throughput with scalability features out of the box, like replicas, shards and dynamic batching.

Let's scale a Stable Diffusion Executor deployment with replicas and dynamic batching:

  • Create two replicas, with a GPU assigned for each.
  • Enable dynamic batching, so that parallel incoming requests are batched together into the same model inference. Both configurations are shown side by side below.
Normal Deployment:

jtype: Deployment
with:
  uses: TextToImage
  timeout_ready: -1
  py_modules:
    - text_to_image.py

Scaled Deployment:

jtype: Deployment
with:
  uses: TextToImage
  timeout_ready: -1
  py_modules:
    - text_to_image.py
  env:
   CUDA_VISIBLE_DEVICES: RR
  replicas: 2
  uses_dynamic_batching: # configure dynamic batching
    /default:
      preferred_batch_size: 10
      timeout: 200

Assuming your machine has two GPUs, using the scaled deployment YAML will give better throughput compared to the normal deployment.

These features apply to both Deployment YAML and Flow YAML. Thanks to the YAML syntax, you can inject deployment configurations regardless of Executor code.
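
The same scaling options can also be set through the Python API; a minimal sketch, assuming the Deployment keyword arguments mirror the YAML keys above:

from jina import Deployment
from text_to_image import TextToImage

dep = Deployment(
    uses=TextToImage,
    timeout_ready=-1,
    replicas=2,  # two replicas, one GPU each via round-robin (RR) assignment
    env={'CUDA_VISIBLE_DEVICES': 'RR'},
    uses_dynamic_batching={
        '/default': {'preferred_batch_size': 10, 'timeout': 200}
    },
)

with dep:
    dep.block()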

Deploy to the cloud

Containerize your Executor

In order to deploy your solutions to the cloud, you need to containerize your services. Jina provides the Executor Hub, a tool that streamlines this process and takes a lot of the trouble off your hands. It also lets you share these Executors publicly or privately.

You just need to structure your Executor in a folder:

TextToImage/
├── executor.py
├── config.yml
└── requirements.txt

config.yml:

jtype: TextToImage
py_modules:
  - executor.py
metas:
  name: TextToImage
  description: Text to Image generation Executor based on StableDiffusion
  url:
  keywords: []

requirements.txt:

diffusers
accelerate
transformers

Then push the Executor to the Hub by doing: jina hub push TextToImage.

This will give you a URL that you can use in your Deployments and Flows to run the pushed Executor as a container.

jtype: Flow
with:
    port: 12345
executors:
  - uses: jinaai+docker://<user-id>/StableLM
  - uses: jinaai+docker://<user-id>/TextToImage
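
The same reference works in the Python API; a minimal sketch, assuming the same <user-id> placeholder:

from jina import Deployment

# pull and run the containerized Executor from the Hub
dep = Deployment(uses='jinaai+docker://<user-id>/TextToImage')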

Get on the fast lane to cloud-native

Using Kubernetes with Jina is easy:

jina export kubernetes flow.yml ./my-k8s
kubectl apply -R -f my-k8s

And so is Docker Compose:

jina export docker-compose flow.yml docker-compose.yml
docker-compose up

Note
You can also export Deployment YAML to Kubernetes and Docker Compose.

That's not all. We also support OpenTelemetry, Prometheus, and Jaeger.

What cloud-native technology is still challenging to you? Tell us and we'll handle the complexity and make it easy for you.

Deploy to JCloud

You can also deploy a Flow to JCloud, where you can easily enjoy autoscaling, monitoring and more with a single command.

First, turn the flow.yml file into a JCloud-compatible YAML by specifying resource requirements and using containerized Hub Executors.
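
As a sketch of what that can look like (the resource values below are illustrative assumptions; check the JCloud documentation for the exact fields your Executors need):

jtype: Flow
executors:
  - uses: jinaai+docker://<user-id>/TextToImage
    jcloud:
      resources:
        gpu: shared
        memory: 16G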

Then, use the jina cloud deploy command to deploy to the cloud:

wget https://raw.githubusercontent.com/jina-ai/jina/master/.github/getting-started/jcloud-flow.yml
jina cloud deploy jcloud-flow.yml

Warning

Make sure to delete/clean up the Flow once you are done with this tutorial to save resources and credits.

Read more about deploying Flows to JCloud.

Streaming for LLMs

Large Language Models can power a wide range of applications from chatbots to assistants and intelligent systems. However, these models can be heavy and slow and your users want systems that are both intelligent and fast!

Large language models work by turning your questions into tokens and then generating new tokens one at a time until the model decides that generation should stop. This means you want to stream the output tokens generated by a large language model to the client. In this tutorial, we will discuss how to achieve this with Streaming Endpoints in Jina.

Service Schemas

The first step is to define the streaming service schemas, as you would do in any other service framework. The input to the service is the prompt and the maximum number of tokens to generate, while the output is the latest token ID along with the text generated so far:

from docarray import BaseDoc


class PromptDocument(BaseDoc):
    prompt: str
    max_tokens: int


class ModelOutputDocument(BaseDoc):
    token_id: int
    generated_text: str

Service initialization

Our service depends on a large language model. As an example, we will use the gpt2 model. This is how you would load such a model in your Executor:

from jina import Executor, requests
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')


class TokenStreamingExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model = GPT2LMHeadModel.from_pretrained('gpt2')

Implement the streaming endpoint

Our streaming endpoint accepts a PromptDocument as input and streams ModelOutputDocuments. To stream a document back to the client, use the yield keyword in the endpoint implementation. Therefore, we use the model to generate up to max_tokens tokens and yield them until the generation stops:

class TokenStreamingExecutor(Executor):
    ...

    @requests(on='/stream')
    async def task(self, doc: PromptDocument, **kwargs) -> ModelOutputDocument:
        input = tokenizer(doc.prompt, return_tensors='pt')
        input_len = input['input_ids'].shape[1]
        for _ in range(doc.max_tokens):
            output = self.model.generate(**input, max_new_tokens=1)
            if output[0][-1] == tokenizer.eos_token_id:
                break
            yield ModelOutputDocument(
                token_id=output[0][-1],
                generated_text=tokenizer.decode(
                    output[0][input_len:], skip_special_tokens=True
                ),
            )
            input = {
                'input_ids': output,
                'attention_mask': torch.ones(1, len(output[0])),
            }

Learn more about streaming endpoints from the Executor documentation.

Serve and send requests

The final step is to serve the Executor and send requests using the client. To serve the Executor using gRPC:

from jina import Deployment

with Deployment(uses=TokenStreamingExecutor, port=12345, protocol='grpc') as dep:
    dep.block()

To send requests from a client:

import asyncio
from jina import Client


async def main():
    client = Client(port=12345, protocol='grpc', asyncio=True)
    async for doc in client.stream_doc(
        on='/stream',
        inputs=PromptDocument(prompt='what is the capital of France ?', max_tokens=10),
        return_type=ModelOutputDocument,
    ):
        print(doc.generated_text)


asyncio.run(main())
The output streams back token by token:

The
The capital
The capital of
The capital of France
The capital of France is
The capital of France is Paris
The capital of France is Paris.

Support

Jina is backed by Jina AI and licensed under Apache-2.0.

More Repositories

1. clip-as-service — 🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP (Python, 12,150 stars)
2. reader — Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ (TypeScript, 6,640 stars)
3. dalle-flow — 🌊 A Human-in-the-Loop workflow for creating HD images from text (Python, 2,831 stars)
4. dev-gpt — Your Virtual Development Team (Python, 1,756 stars)
5. langchain-serve — ⚡ Langchain apps in production using Jina & FastAPI (Python, 1,601 stars)
6. finetuner — 🎯 Task-oriented embedding tuning for BERT, CLIP, etc. (Python, 1,455 stars)
7. thinkgpt — Agent techniques to augment your LLM and push it beyond its limits (Python, 1,402 stars)
8. auto-gpt-web — Set Your Goals, AI Achieves Them. (TypeScript, 749 stars)
9. agentchain — Chain together LLMs for reasoning & orchestrate multiple large models for accomplishing complex tasks (Python, 583 stars)
10. docarray — The data structure for unstructured data (Python, 522 stars)
11. vectordb — A Python vector database you just need - no more, no less. (Python, 519 stars)
12. jcloud — Simplify deploying and managing Jina projects on Jina Cloud (Python, 294 stars)
13. jina-video-chat (Python, 266 stars)
14. jinabox.js — A lightweight, customizable omnibox in JavaScript, for use with a Jina backend (JavaScript, 219 stars)
15. annlite — ⚡ A fast embedded library for approximate nearest neighbor search (Python, 216 stars)
16. rungpt — An open-source cloud-native serving framework for large multi-modal models (LMMs) (Python, 147 stars)
17. fastapi-serve — FastAPI to the Cloud, Batteries Included! ☁️🔋🚀 (Python, 139 stars)
18. jina-hub — An open registry for hosting Jina Executors via container images (Python, 103 stars)
19. dashboard — Interactive UI for analyzing Jina logs, designing Flows and viewing Hub images (TypeScript, 100 stars)
20. GoldRetriever — Create and host retrieval plugins for ChatGPT in one click (Python, 63 stars)
21. jinaai-py (Python, 48 stars)
22. example-multimodal-fashion-search — Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP (Python, 45 stars)
23. streamlit-jina — Streamlit component for Jina neural search (Python, 37 stars)
24. docs — Jina V1 official documentation. For the latest version, check out https://docs.jina.ai (HTML, 35 stars)
25. jinaai-js (TypeScript, 28 stars)
26. executors — internal-only (Python, 28 stars)
27. jerboa — LLM finetuning (Python, 27 stars)
28. jina-ai.github.io — Homepage of Jina AI Limited (HTML, 27 stars)
29. example-meme-search — Meme search engine built with the Jina neural search framework. Search with captions or image files to find matching memes. (Python, 23 stars)
30. example-app-store — App store search example, using Jina as backend and Streamlit as frontend (Python, 21 stars)
31. docsQA-ui — Web UI for docsQA. Main branch: https://jina-docqa-ui.netlify.app/ (TypeScript, 20 stars)
32. example-speech-to-image — An example of building a speech-to-image generation pipeline with Jina, Whisper and StableDiffusion (Python, 20 stars)
33. workshops (Jupyter Notebook, 19 stars)
34. jina-hubble-sdk — Python API for authentication and resource management with Hubble (Python, 19 stars)
35. product-recommendation-redis-docarray (Python, 18 stars)
36. career — Find out about job opportunities at Jina AI (17 stars)
37. executor-3d-encoder — An Executor that wraps 3D mesh models and encodes 3D content documents to d-dimensional vectors (Python, 16 stars)
38. client-go — Golang client for Jina (https://github.com/jina-ai/jina) (Go, 16 stars)
39. benchmark — Benchmark environment and results of different versions of Jina (Python, 14 stars)
40. action-hub-builder — Simple interface for building & validating Jina Hub Executors (Python, 12 stars)
41. inference-client (Python, 12 stars)
42. executor-hnsw-postgres — A production-ready, scalable indexer for the Jina neural search framework, based on HNSW and PSQL (Python, 12 stars)
43. now (Python, 11 stars)
44. cookiecutter-jina — Cookiecutter template for a Jina project (Python, 10 stars)
45. simple-jina-examples (Python, 9 stars)
46. executor-simpleindexer — Simple indexer (Python, 9 stars)
47. executor-clip-encoder — Encoder that embeds documents using either the CLIP vision encoder or the CLIP text encoder, depending on the content type of the document (Python, 9 stars)
48. cloud-ops (Python, 8 stars)
49. good-first-issues — Issues that don't fit under Jina's other repos! (8 stars)
50. api — API schema of the Jina command line interface exposed as JSON and YAML files (HTML, 8 stars)
51. inference-client-js (TypeScript, 7 stars)
52. executor-text-transformers-dprreader-ranker — DPRReaderRanker (Python, 7 stars)
53. executor-video-loader (Python, 7 stars)
54. executor-image-clip-encoder — CLIPImageEncoder is an image encoder that wraps the image embedding functionality using CLIP (Python, 7 stars)
55. .github — This repository stores GitHub Actions templates as described at https://docs.github.com/en/actions/learn-github-actions/sharing-workflows-with-your-organization (7 stars)
56. GSoC — Google Summer of Code (7 stars)
57. example-wikipedia-recommendation — An example of graph embeddings for Wikipedia page recommendations (Jupyter Notebook, 6 stars)
58. executor-U100KIndexer — An indexer that works out-of-the-box when you have fewer than 100K stored Documents (Python, 6 stars)
59. devrel-heartmaker — Heart mosaics of your GitHub contributors (Python, 6 stars)
60. executor-text-transformers-torch-encoder — TransformerTorchEncoder wraps the torch version of transformers from Hugging Face and encodes text data into dense vectors (Python, 6 stars)
61. executor-cases — Summarize all Executor patterns for Hubble (Python, 5 stars)
62. executor-normalizer — Jina Executor package normalizer (Python, 5 stars)
63. auth — deprecated, use `jina-hubble-sdk` (Python, 5 stars)
64. jina-commons — A collection of shared functions for Jina Executors (Python, 5 stars)
65. tutorial-notebooks (Jupyter Notebook, 5 stars)
66. jina-paddle-hackathon — Jina x Baidu PaddlePaddle hackathon (Python, 5 stars)
67. executor-image-preprocessor — An Executor that performs standard pre-processing and normalization on images (Python, 5 stars)
68. jina-hackathon — Support repo for the Jina X Hackathon, Sep 2020 (5 stars)
69. executor-featurehasher — FeatureHasher (Python, 4 stars)
70. jina-sagemaker — Jina embedding models on AWS SageMaker (Jupyter Notebook, 4 stars)
71. stress-test — A collection of stress tests of Jina infrastructure (Python, 4 stars)
72. executor-image-clip-classifier (Python, 4 stars)
73. executor-text-transformerqa — TransformerQAExecutor wraps a question-answering model from Hugging Face and returns relevant answers given questions and contexts/paragraphs (Python, 4 stars)
74. executor-faissindexer — A similarity search indexer based on Faiss. https://hub.jina.ai/executor/8gsd0tts (Python, 4 stars)
75. hub-integration — Integration tests for Hub (Python, 4 stars)
76. example-audio-search (Python, 3 stars)
77. example-video-qa — An example of building video QA with Jina (TypeScript, 3 stars)
78. jinad — Management of Jina on remote machines (Python, 3 stars)
79. executor-indexers — Indexer Executors for Jina (Python, 3 stars)
80. executor-text-dpr-encoder — Encode text into embeddings using the DPR model (Python, 3 stars)
81. legacy-examples — Unmaintained examples for Jina (Python, 3 stars)
82. executor-clip-image — Executor for the pre-trained CLIP model. https://openai.com/blog/clip/ (Python, 3 stars)
83. executor-weaviate-indexer (Python, 3 stars)
84. executor-doc2query (Python, 3 stars)
85. executor-image-paddle-encoder (Python, 3 stars)
86. jupyter-notebooks (Jupyter Notebook, 3 stars)
87. executor-evaluator-ranking (Python, 3 stars)
88. executor-yolov5 (Python, 3 stars)
89. executor-lightgbm-ranker (Python, 3 stars)
90. terraform-jina-jinad-aws — Module for deploying JinaD on AWS (HCL, 3 stars)
91. encoder-image-torch — The ImageTorchEncoder encodes Document content from an ndarray to a d-dimensional vector (Python, 3 stars)
92. example-odqa (Roff, 2 stars)
93. executor-text-clip-encoder — Encode text into embeddings using the CLIP model (Python, 2 stars)
94. jina-ui — Monorepo for JinaJS and frontend projects (TypeScript, 2 stars)
95. executor-audio-clip-encoder — Wraps the AudioCLIP model for generating embeddings for audio data for the Jina framework (Python, 2 stars)
96. executor-matchmerger — MatchMerger merges the results of shards by appending all matches (Python, 2 stars)
97. executor-image-niireader (Python, 2 stars)
98. executor-image-normalizer — Executor that reads, resizes, crops and normalizes images (Python, 2 stars)
99. executor-vgg-audio-encoder (Python, 2 stars)
100. executor-image-hasher — An Executor to encode images using comparable hashing techniques. Useful for duplicate detection (Python, 2 stars)