• This repository has been archived on 19/Oct/2023
  • Stars
    star
    1,573
  • Rank 28,884 (Top 0.6 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created about 1 year ago
  • Updated 9 months ago


Repository Details

⚡ LangChain apps in production using Jina & FastAPI

⚡ LangChain Apps on Production with Jina & FastAPI 🚀


Jina is an open-source framework for building scalable multimodal AI apps in production. LangChain is another open-source framework for building applications powered by LLMs.

langchain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own infrastructure to ensure data privacy. With langchain-serve, you can craft REST/Websocket APIs, spin up LLM-powered conversational Slack bots, or wrap your LangChain apps into FastAPI packages on cloud or on-premises.

Give us a ⭐ and tell us what more you'd like to see!

☁️ LLM Apps as-a-service

langchain-serve currently wraps the following apps as services that can be deployed on Jina AI Cloud with one command.

🔮 AutoGPT-as-a-service

AutoGPT is an "AI agent" that, given a goal in natural language, will attempt to achieve it by breaking it into sub-tasks and using the internet and other tools in an automatic loop.

  • Deploy autogpt on Jina AI Cloud with one command

    lc-serve deploy autogpt
    ╭──────────────┬──────────────────────────────────────────────────────╮
    │ App ID       │ autogpt-6cbd489454                                   │
    ├──────────────┼──────────────────────────────────────────────────────┤
    │ Phase        │ Serving                                              │
    ├──────────────┼──────────────────────────────────────────────────────┤
    │ Endpoint     │ wss://autogpt-6cbd489454.wolf.jina.ai                │
    ├──────────────┼──────────────────────────────────────────────────────┤
    │ App logs     │ dashboards.wolf.jina.ai                              │
    ├──────────────┼──────────────────────────────────────────────────────┤
    │ Swagger UI   │ https://autogpt-6cbd489454.wolf.jina.ai/docs         │
    ├──────────────┼──────────────────────────────────────────────────────┤
    │ OpenAPI JSON │ https://autogpt-6cbd489454.wolf.jina.ai/openapi.json │
    ╰──────────────┴──────────────────────────────────────────────────────╯

  • Integrate autogpt with external services using APIs. Get a flavor of the integration on your CLI with

    lc-serve playground autogpt

🧠 Babyagi-as-a-service

Babyagi is a task-driven autonomous agent that uses LLMs to create, prioritize, and execute tasks. It is a general-purpose AI agent that can be used to automate a wide variety of tasks.

  • Deploy babyagi on Jina AI Cloud with one command

    lc-serve deploy babyagi
  • Integrate babyagi with external services using our Websocket API. Get a flavor of the integration on your CLI with

    lc-serve playground babyagi

🐼 pandas-ai-as-a-service

pandas-ai integrates LLM capabilities into Pandas, to make dataframes conversational in Python code. Thanks to langchain-serve, we can now expose pandas-ai APIs on Jina AI Cloud in just a matter of seconds.

  • Deploy pandas-ai on Jina AI Cloud

    lc-serve deploy pandas-ai
    ╭──────────────┬───────────────────────────────────────────────────────╮
    │ App ID       │ pandasai-06879349ca                                   │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Phase        │ Serving                                               │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Endpoint     │ wss://pandasai-06879349ca.wolf.jina.ai                │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ App logs     │ dashboards.wolf.jina.ai                               │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Swagger UI   │ https://pandasai-06879349ca.wolf.jina.ai/docs         │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ OpenAPI JSON │ https://pandasai-06879349ca.wolf.jina.ai/openapi.json │
    ╰──────────────┴───────────────────────────────────────────────────────╯

  • Upload your DataFrame to Jina AI Cloud (Optional - you can also use a publicly available CSV)

    • Define your DataFrame in a Python file

      # dataframe.py
      import pandas as pd

      # some_data is a placeholder: replace it with your own data
      df = pd.DataFrame(some_data)
    • Upload your DataFrame to Jina AI Cloud using <module>:<variable> syntax

      lc-serve util upload-df dataframe:df
  • Conversationalize your DataFrame using pandas-ai APIs. Get a flavor of the integration with a local playground on your CLI with

    lc-serve playground pandas-ai <host>
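The <module>:<variable> syntax used for upload-df (and by --app for FastAPI apps) names a Python module and an object inside it. As a rough illustration of how such a spec can be resolved, here is a minimal sketch. It is not lc-serve's actual implementation, and resolve_spec is a hypothetical helper name.

```python
import importlib
from typing import Any


def resolve_spec(spec: str) -> Any:
    """Resolve a '<module>:<variable>' string to the object it names."""
    module_name, sep, variable_name = spec.partition(":")
    if not sep or not module_name or not variable_name:
        raise ValueError(f"expected '<module>:<variable>', got {spec!r}")
    module = importlib.import_module(module_name)  # e.g. dataframe.py -> 'dataframe'
    return getattr(module, variable_name)          # e.g. the 'df' object


if __name__ == "__main__":
    # A standard-library module stands in for your own dataframe.py here.
    print(resolve_spec("math:pi"))
```

With a dataframe.py on the import path, resolve_spec("dataframe:df") would return the DataFrame object in the same way.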

💬 Question Answer Bot on PDFs

pdfqna is a simple question answering bot that uses LLMs to answer questions on PDF documents, showcasing how easy it is to integrate LangChain apps on Jina AI Cloud.

  • Deploy pdf_qna on Jina AI Cloud with one command

    lc-serve deploy pdf-qna
  • Get a flavor of the integration with Streamlit playground on your CLI with

    lc-serve playground pdf-qna
  • Expand the Q&A bot to multiple languages, different document types & integrate with external services using simple REST APIs.

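    from typing import List, Union
    from langchain.llms import OpenAI
    from lcserve import serving

    @serving
    def ask(urls: Union[List[str], str], question: str) -> str:
        # load_pdf_content and get_qna_chain are helpers assumed to be defined elsewhere in the pdf-qna app
        content = load_pdf_content(urls)
        chain = get_qna_chain(OpenAI())
        return chain.run(input_document=content, question=question)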

💪 Features

🎉 LLM Apps on production

🔥 Secure, Scalable, Serverless, Streaming REST/Websocket APIs on Jina AI Cloud.

  • 🌎 Globally available REST/Websocket APIs with automatic TLS certs.
  • 🌊 Stream LLM interactions in real-time with Websockets.
  • 👥 Enable human in the loop for your agents.
  • 💬 Build, deploy & distribute Slack bots built with langchain.
  • 🔑 Protect your APIs with API authorization using Bearer tokens.
  • 📄 Swagger UI, and OpenAPI spec included with your APIs.
  • ⚡️ Serverless, autoscaling apps that scale automatically with your traffic.
  • 📁 Persistent storage (EFS) mounted on your app for your data.
  • 📊 Builtin logging, monitoring, and traces for your APIs.
  • 🤖 No need to change your code to manage APIs, or manage dockerfiles, or worry about infrastructure!

🏠 Self-host LLM Apps with Docker Compose or Kubernetes

🧰 Usage

Let's first install langchain-serve using pip.

pip install langchain-serve

🔄 REST APIs using @serving decorator

👉 Let's go through a step-by-step guide to build, deploy and use a REST API using the @serving decorator.


🤖💬 Build, Deploy & Distribute Slack bots built with LangChain

langchain-serve exposes a @slackbot decorator to quickly build, deploy & distribute LLM-powered Slack bots without worrying about the infrastructure. It provides a simple interface to any LangChain app and makes it accessible to users on a platform they're already comfortable with.

✨ Ready to dive in?

  • There's a step-by-step guide in the repository to help you build your own bot for helping with reasoning.
  • Here's another step-by-step guide to help you chat over your own internal HR-related documents (like onboarding, policies etc.) with your employees right inside your Slack workspace.

🔐 Authorize your APIs

To add an extra layer of security, we can integrate any custom API authorization by adding an auth argument to the @serving decorator.

from typing import Any

from lcserve import serving

def authorizer(token: str) -> Any:
    if not token == 'mysecrettoken':            # Change this to add your own authorization logic
        raise Exception('Unauthorized')         # Raise an exception if the request is not authorized

    return 'userid'                             # Return any user id or object

@serving(auth=authorizer)
def ask(question: str, **kwargs) -> str:
    auth_response = kwargs['auth_response']     # This will be 'userid'
    return ...

@serving(websocket=True, auth=authorizer)
async def talk(question: str, **kwargs) -> str:
    auth_response = kwargs['auth_response']     # This will be 'userid'
    return ...
🤔 Gotchas about the auth function
  • Should accept only one argument token.
  • Should raise an Exception if the request is not authorized.
  • Can return any object, which will be passed to the auth_response object under kwargs to the functions.
  • Expects Bearer token in the Authorization header of the request.
  • Sample HTTP request with curl:
    curl -X 'POST' 'http://localhost:8080/ask' -H 'Authorization: Bearer mysecrettoken' -d '{ "question": "...", "envs": {} }'
  • Sample WebSocket request with wscat:
    wscat -H "Authorization: Bearer mysecrettoken" -c ws://localhost:8080/talk
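To make the contract above concrete, the following minimal sketch mimics how a Bearer token could be pulled out of the Authorization header and handed to the auth function before the endpoint runs. It is not langchain-serve's actual implementation: call_with_auth and the header parsing are assumptions made for illustration, useful mainly for exercising your own authorizer locally.

```python
from typing import Any, Callable, Dict


def authorizer(token: str) -> Any:
    # Same toy check as in the example above; replace with real logic.
    if token != "mysecrettoken":
        raise Exception("Unauthorized")
    return "userid"


def call_with_auth(
    func: Callable[..., Any],
    auth: Callable[[str], Any],
    headers: Dict[str, str],
    **params: Any,
) -> Any:
    """Hypothetical stand-in for what happens before an endpoint is invoked."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        raise PermissionError("missing Bearer token")
    auth_response = auth(token)  # raises if the token is rejected
    return func(**params, auth_response=auth_response)


def ask(question: str, **kwargs: Any) -> str:
    # The value returned by the authorizer arrives under kwargs['auth_response'].
    return f"{kwargs['auth_response']} asked: {question}"


if __name__ == "__main__":
    ok = {"Authorization": "Bearer mysecrettoken"}
    print(call_with_auth(ask, authorizer, ok, question="hi"))
```

Keeping the authorizer a plain function of one token argument, as the gotchas require, is what makes it easy to test in isolation like this.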

🙋‍♂️ Enable streaming & human-in-the-loop (HITL) with WebSockets

HITL for LangChain agents on production can be challenging since the agents are typically running on servers where humans don't have direct access. langchain-serve bridges this gap by enabling websocket APIs that allow for real-time interaction and feedback between the agent and a human operator.

Check out this example to see how you can enable HITL for your agents.

📁 Persistent storage on Jina AI Cloud

Every app deployed on Jina AI Cloud gets persistent storage (EFS) mounted locally, which can be accessed via the workspace kwarg in the @serving function.

import aiofiles
from fastapi import WebSocket
from lcserve import serving

@serving
def store(text: str, **kwargs):
    workspace: str = kwargs.get('workspace')
    path = f'{workspace}/store.txt'
    print(f'Writing to {path}')
    with open(path, 'a') as f:
        f.write(text + '\n')
    return 'OK'


@serving(websocket=True)
async def stream(**kwargs):
    workspace: str = kwargs.get('workspace')
    websocket: WebSocket = kwargs.get('websocket')
    path = f'{workspace}/store.txt'
    print(f'Streaming {path}')
    async with aiofiles.open(path, 'r') as f:
        async for line in f:
            await websocket.send_text(line)
    return 'OK'

Here, we are using the workspace to store the incoming text in a file via the REST endpoint and streaming the contents of the file via the WebSocket endpoint.
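Because workspace arrives as a plain directory path, the file logic can be tried locally before deploying. The sketch below assumes a temporary directory as a stand-in for the mounted EFS volume and uses plain functions instead of @serving endpoints; store_text and read_lines are hypothetical names, not part of langchain-serve.

```python
import os
import tempfile
from typing import List


def store_text(text: str, workspace: str) -> str:
    """Append one line to store.txt inside the workspace directory."""
    path = os.path.join(workspace, "store.txt")
    with open(path, "a") as f:
        f.write(text + "\n")
    return "OK"


def read_lines(workspace: str) -> List[str]:
    """Return the stored lines in file order, without trailing newlines."""
    path = os.path.join(workspace, "store.txt")
    with open(path, "r") as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    # A temporary directory plays the role of the mounted workspace.
    with tempfile.TemporaryDirectory() as workspace:
        store_text("first", workspace)
        store_text("second", workspace)
        print(read_lines(workspace))
```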

🚀 Bring your own FastAPI app

If you already have a FastAPI app with pre-defined endpoints, you can use lc-serve to deploy it on Jina AI Cloud.

lc-serve deploy jcloud --app filename:app

Let's take an example of a simple FastAPI app with the following directory structure:

.
└── endpoints.py

# endpoints.py
from typing import Union

from fastapi import FastAPI

app = FastAPI()


@app.get("/status")
def read_root():
    return {"Hello": "World"}


@app.get("/items/{item_id}")
def read_item(item_id: int, q: Union[str, None] = None):
    return {"item_id": item_id, "q": q}

lc-serve deploy jcloud --app endpoints:app

💻 lc-serve CLI

lc-serve is a simple CLI that helps you deploy your agents on Jina AI Cloud (JCloud).

Description | Command
Deploy your app locally | lc-serve deploy local app
Export your app as Kubernetes YAML | lc-serve export app --kind kubernetes --path .
Export your app as Docker Compose YAML | lc-serve export app --kind docker-compose --path .
Deploy your app on JCloud | lc-serve deploy jcloud app
Deploy FastAPI app on JCloud | lc-serve deploy jcloud --app <app-name>:<app-object>
Update existing app on JCloud | lc-serve deploy jcloud app --app-id <app-id>
Get app status on JCloud | lc-serve status <app-id>
List all apps on JCloud | lc-serve list
Remove app on JCloud | lc-serve remove <app-id>
Pause app on JCloud | lc-serve pause <app-id>
Resume app on JCloud | lc-serve resume <app-id>

💡 JCloud Deployment

⚙️ Configurations

For JCloud deployment, you can configure your application infrastructure by providing a YAML configuration file using the --config option. The supported configurations are:

  • Instance type (instance), as defined by Jina AI Cloud.
  • Minimum number of replicas for your application (autoscale_min). Setting it to 0 enables serverless.
  • Disk size (disk_size), in GB. The default value is 1 GB.

For example:

instance: C4
autoscale_min: 0
disk_size: 1.5G

You can alternatively include a jcloud.yaml file in your application directory with the desired configurations. However, please note that if the --config option is explicitly used in the command line interface, the local jcloud.yaml file will be disregarded. The command line provided configuration file will take precedence.

If you don't provide a configuration file or a specific configuration isn't specified, the following default settings will be applied:

instance: C3
autoscale_min: 1
disk_size: 1G
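The precedence and default rules described above can be summarized in a few lines. This is a sketch of the described behavior, not code from langchain-serve: resolve_config is a hypothetical helper, and the configuration files are represented as already-parsed dictionaries.

```python
from typing import Dict, Optional

# Defaults as listed above.
DEFAULTS: Dict[str, object] = {"instance": "C3", "autoscale_min": 1, "disk_size": "1G"}


def resolve_config(
    cli_config: Optional[Dict[str, object]] = None,
    local_jcloud_yaml: Optional[Dict[str, object]] = None,
) -> Dict[str, object]:
    """Pick the file that wins, then fill unspecified keys with defaults."""
    # A file passed via --config takes precedence; jcloud.yaml is then ignored.
    chosen = cli_config if cli_config is not None else (local_jcloud_yaml or {})
    return {**DEFAULTS, **chosen}


if __name__ == "__main__":
    print(resolve_config(cli_config={"instance": "C4", "autoscale_min": 0}))
    print(resolve_config(local_jcloud_yaml={"disk_size": "2G"}))
    print(resolve_config())
```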

💰 Pricing

Applications hosted on JCloud are priced in two categories:

Base credits

  • Base credits are charged to ensure high availability for your application by maintaining at least one instance running continuously, ready to handle incoming requests. If you wish to stop the serving application, you can either remove the app completely or pause it. Pausing lets you resume serving later based on the persisted configuration (refer to the lc-serve CLI section for more information). Both options will halt the consumption of credits.
  • Actual credits charged for base credits are calculated based on the instance type as defined by Jina AI Cloud.
  • By default, instance type C3 is used with a minimum of 1 instance and an Amazon EFS disk of size 1G, which means that if your application is served on JCloud, you will be charged ~10 credits per hour.
  • You can change the instance type and the minimum number of instances by providing a YAML configuration file using the --config option. For example, if you want to use instance type C4 with a minimum of 0 replicas, and 2G EFS disk, you can provide the following configuration file:
    instance: C4
    autoscale_min: 0
    disk_size: 2G

Serving credits

  • Serving credits are charged when your application is actively serving incoming requests.
  • Actual credits charged for serving credits are calculated based on the credits for the instance type multiplied by the duration for which your application serves requests.
  • You are charged for each second your application is serving requests.

Total credits charged = Base credits + Serving credits. (Jina AI Cloud defines each credit as €0.005)

Examples

Example 1

Consider an HTTP application that has served requests for 10 minutes in the last hour and uses a custom config:

instance: C4
autoscale_min: 0
disk_size: 2G

Total credits per hour charged would be about 3.54. The calculation is as follows:

C4 instance has an hourly credit rate of 20.
EFS has an hourly credit rate of 0.104 per GB.
Base credits = 0 + 2 * 0.104 = 0.208 (since `autoscale_min` is 0)
Serving credits = 20 * 10/60 ≈ 3.333
Total credits per hour = 0.208 + 3.333 ≈ 3.54

Example 2

Consider a WebSocket application that had active connections for 20 minutes in the last hour and uses the default configuration.

instance: C3
autoscale_min: 1
disk_size: 1G

Total credits per hour charged would be about 13.44. The calculation is as follows:

C3 instance has an hourly credit rate of 10.
EFS has an hourly credit rate of 0.104 per GB.
Base credits = 10 + 1 * 0.104 = 10.104 (since `autoscale_min` is 1)
Serving credits = 10 * 20/60 ≈ 3.333
Total credits per hour = 10.104 + 3.333 ≈ 13.44
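
The arithmetic in both examples can be reproduced with a small calculator. This is a sketch built only from the rates quoted in this section (C3 = 10, C4 = 20, EFS = 0.104 per GB, 1 credit = €0.005); hourly_credits is a hypothetical helper, and multiplying the instance rate by autoscale_min is an assumption, since the examples only cover values 0 and 1. Without intermediate rounding it gives roughly 3.54 and 13.44 credits per hour.

```python
# Hourly credit rates quoted in the examples above.
INSTANCE_RATES = {"C3": 10.0, "C4": 20.0}
EFS_RATE_PER_GB = 0.104
EUR_PER_CREDIT = 0.005


def hourly_credits(instance: str, autoscale_min: int, disk_gb: float, serving_minutes: float) -> float:
    """Total credits for one hour: base credits plus serving credits."""
    rate = INSTANCE_RATES[instance]
    # Assumption: always-on cost scales with autoscale_min (the examples only show 0 and 1).
    base = rate * autoscale_min + EFS_RATE_PER_GB * disk_gb
    serving = rate * serving_minutes / 60
    return base + serving


if __name__ == "__main__":
    examples = {
        "Example 1": ("C4", 0, 2, 10),
        "Example 2": ("C3", 1, 1, 20),
    }
    for name, args in examples.items():
        credits = hourly_credits(*args)
        print(f"{name}: {credits:.2f} credits (~EUR {credits * EUR_PER_CREDIT:.4f})")
```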

❓ Frequently Asked Questions

lc-serve command not found

The lc-serve command is registered during langchain-serve installation. If you get a "command not found: lc-serve" error, replace lc-serve with python -m lcserve and retry.

My client that connects to the JCloud hosted App gets timed-out, what should I do?

If you make long HTTP/WebSocket requests, the default timeout value (2 minutes) might not be suitable for your use case. You can provide a custom timeout value during JCloud deployment by using the --timeout argument.

Additionally, for HTTP, you may also experience timeouts due to limitations in the OSS we used in langchain-serve. While we are working to permanently address this issue, we recommend using HTTP/1.1 in your client as a temporary workaround.

For WebSocket, please note that the connection will be closed if idle for more than 5 minutes.

How to pass environment variables to the app?

We provide 2 options to pass environment variables:

  1. Use --env during app deployment to load env variables from a .env file. For example, lc-serve deploy jcloud app --env some.env will load all env variables from the some.env file and pass them to the app. These env variables will be available in the app as os.environ['ENV_VAR_NAME'].

  2. You can also pass env variables while sending requests to the app, over both HTTP and WebSocket. The envs field in the request body is used to pass env variables. For example:

    {
        "question": "What is the meaning of life?",
        "envs": {
            "ENV_VAR_NAME": "ENV_VAR_VALUE"
        }
    }
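
As an illustration of the second option, the sketch below builds (without sending) an HTTP request whose JSON body carries the envs field. The /ask path and localhost:8080 host are borrowed from the curl sample in the authorization section; the Content-Type header and the build_ask_request helper are assumptions for this example, not part of langchain-serve.

```python
import json
import urllib.request
from typing import Dict


def build_ask_request(base_url: str, question: str, envs: Dict[str, str]) -> urllib.request.Request:
    """Build (but do not send) a POST request whose JSON body carries an 'envs' field."""
    body = json.dumps({"question": question, "envs": envs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_ask_request(
        "http://localhost:8080",
        "What is the meaning of life?",
        {"ENV_VAR_NAME": "ENV_VAR_VALUE"},
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it against a running app: urllib.request.urlopen(req, timeout=120)
```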

JCloud deployment failed at pushing image to Jina Hubble, what should I do?

Please use --verbose and retry to get more information. If you are working on a computer with the arm64 architecture, please retry with --platform linux/amd64 so the image can be built correctly.

Debug babyagi playground request/response for external integration

  1. Start the textual console in a terminal (excluding the following groups reduces noise in the logs).

    textual console -x EVENT -x SYSTEM -x DEBUG
  2. Start the playground with the --verbose flag. Start interacting and watch the logs in the console.

    lc-serve playground babyagi --verbose

📣 Reach out to us

Want to deploy your LLM apps on your own infrastructure with all capabilities of Jina AI Cloud?

  • Serverless
  • Autoscaling
  • TLS certs
  • Persistent storage
  • End to end LLM observability
  • and more on auto-pilot!

Join us on Discord and we'd be happy to hear more about your use case.
