Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses vector databases to fetch relevant documents to enhance the quality and relevance of the output.

Akcio: Enhancing LLM-Powered ChatBot with CVP Stack

OSSChat | Documentation | Contact | LICENSE


ChatGPT has constraints due to its limited knowledge base, sometimes resulting in hallucinated answers when asked about unfamiliar topics. We introduce a new AI stack, ChatGPT + vector database + prompt-as-code, or the CVP Stack, to overcome this constraint.

We have built OSSChat as a working demonstration of the CVP stack. Now we are presenting the technology behind OSSChat in this repository, under the code name Akcio.

With this project, you can build a knowledge-enhanced chatbot using an LLM service such as ChatGPT. By the end, you will learn how to start a backend service using FastAPI, which provides ready-to-use APIs to support further applications. Alternatively, we show how to use Gradio to build an online demo.

Overview

Akcio allows you to create a ChatGPT-like system with added intelligence obtained through semantic search of a customized knowledge base. Instead of sending the user query directly to the LLM service, the system first retrieves relevant information from its stores via semantic search or keyword match, then feeds both the user query and the retrieved information to the LLM. This allows the LLM to better tailor its response to the user's needs and provide more accurate and helpful information.
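The retrieve-then-generate flow described above can be sketched in a few lines of Python. The `embed`, `vector_search`, and `call_llm` callables below are hypothetical placeholders for your embedding model, vector store, and LLM service, not Akcio's actual API:

```python
def answer(query, embed, vector_search, call_llm, top_k=3):
    """Retrieve relevant chunks, then ask the LLM with them as context."""
    # Semantic search: embed the query and fetch the closest document chunks.
    hits = vector_search(embed(query), top_k=top_k)
    context = "\n\n".join(hits)
    # Feed both the retrieved context and the user query to the LLM.
    prompt = (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return call_llm(prompt)
```

Plugging in a real embedding model, a Milvus search call, and an LLM client in place of the stubs yields the core of the system; everything else in Akcio (prompt templates, memory, rerank) refines this loop.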

You can find more details and instructions at our documentation.

Akcio offers two AI platforms to choose from: Towhee or LangChain. It also supports a range of LLM services and databases:

| Module | Provider | Towhee | LangChain |
| --- | --- | --- | --- |
| LLM | OpenAI | ✓ | ✓ |
| | Llama-2 | ✓ | |
| | Dolly | ✓ | ✓ |
| | Ernie | ✓ | ✓ |
| | MiniMax | ✓ | ✓ |
| | DashScope | ✓ | |
| | ChatGLM | ✓ | |
| | SkyChat | ✓ | |
| Embedding | OpenAI | ✓ | ✓ |
| | HuggingFace | ✓ | ✓ |
| Vector Store | Zilliz Cloud | ✓ | ✓ |
| | Milvus | ✓ | ✓ |
| Scalar Store (Optional) | Elastic | ✓ | ✓ |
| Memory Store | Postgresql | ✓ | ✓ |
| | MySQL and MariaDB | ✓ | |
| | SQLite | ✓ | |
| | Oracle | ✓ | |
| | Microsoft SQL Server | ✓ | |
| Rerank | MS MARCO Cross-Encoders | ✓ | |

Option 1: Towhee

The Towhee option simplifies system building by providing predefined pipelines. These built-in pipelines require less coding and make the system much easier to build. If you need customization, you can either modify the configuration or create your own pipeline from the rich set of Towhee Operators.

  • Pipelines

    • Insert: The insert pipeline builds a knowledge base by saving documents and corresponding data in database(s).
    • Search: The search pipeline enables the question-answering capability powered by information retrieval (semantic search and optional keyword match) and LLM service.
    • Prompt: A prompt operator prepares messages for the LLM by assembling the system message, chat history, and the user's query processed by a template.
  • Memory: The memory storage stores chat history to support context in conversation. (available: most SQL databases)

Option 2: LangChain

The LangChain option uses an Agent to let the LLM invoke specific tools, which places greater demands on the LLM's ability to comprehend tasks and make informed decisions.

  • Agent
    • ChatAgent: The agent assembles all modules together to build up the QA system.
    • Other agents (todo)
  • LLM
    • ChatLLM: A large language model or service that generates answers.
  • Embedding
    • TextEncoder: An encoder that converts each text input to a vector.
    • Other encoders (todo)
  • Store
    • VectorStore: A vector database that stores document chunks as embeddings and performs document retrieval via semantic search.
    • ScalarStore: Optional; a database that stores metadata for each document chunk to support additional information retrieval. (available: Elastic)
    • MemoryStore: Memory storage that stores chat history to support context in conversation.
  • DataLoader
    • DataParser: A tool that loads data from a given source and splits documents into processed doc chunks.
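How these modules fit together can be illustrated with a small sketch. The class names mirror the list above, but the constructor signatures, method names, and the toy length-based "embedding" are illustrative assumptions, not Akcio's actual API:

```python
class TextEncoder:
    def encode(self, text):
        return [float(len(text))]  # toy embedding: one dimension, text length


class VectorStore:
    def __init__(self, encoder):
        self.encoder, self.docs = encoder, []

    def insert(self, chunk):
        self.docs.append((chunk, self.encoder.encode(chunk)))

    def search(self, query, top_k=1):
        # Toy "semantic search": rank stored chunks by embedding distance.
        qv = self.encoder.encode(query)
        ranked = sorted(self.docs, key=lambda d: abs(d[1][0] - qv[0]))
        return [chunk for chunk, _ in ranked[:top_k]]


class ChatAgent:
    def __init__(self, store, llm):
        self.store, self.llm = store, llm

    def answer(self, question):
        context = self.store.search(question)  # retrieve relevant chunks
        return self.llm(question, context)     # generate with context
```

In the real system, TextEncoder wraps an embedding service, VectorStore wraps Milvus or Zilliz Cloud, and the llm callable wraps ChatLLM; the wiring pattern stays the same.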

Deployment

  1. Download the repository

    $ git clone https://github.com/zilliztech/akcio.git
    $ cd akcio
  2. Install dependencies

    $ pip install -r requirements.txt
  3. Configure modules

    You can configure all arguments by modifying config.py to set up your system with default modules.

    • LLM

      By default, the system will use the OpenAI service as the LLM option. To set your OpenAI API key without modifying the configuration file, you can pass it as an environment variable.

      $ export OPENAI_API_KEY=your_keys_here
      If you want to use another supported LLM service, change the LLM option and set it up accordingly. Besides directly modifying the configuration file, you can also configure it via environment variables.
      • For example, to use Llama-2 locally, which does not require any account, you just need to change the LLM option:

        $ export LLM_OPTION=llama_2
      • For example, to use Ernie instead of OpenAI, you need to change the option and set up Ernie API key & secret key:

        $ export LLM_OPTION=ernie
        $ export ERNIE_API_KEY=your_ernie_api_key
        $ export ERNIE_SECRET_KEY=your_ernie_secret_key
    • Embedding

      By default, the embedding module uses Sentence Transformers to convert text inputs to vectors.

    • Store

      Before getting started, all database services used for storage must be running and configured with write and create access.

      • Vector Store: You need to prepare the vector database service in advance. For example, you can refer to the Milvus documentation or Zilliz Cloud to learn how to start a Milvus service.
      • Scalar Store (Optional): This store only takes effect when USE_SCALAR is set to True in the configuration. If enabled, the default scalar store is Elastic, and you need to prepare the Elasticsearch service in advance.
      • Memory Store: You need to prepare the database for memory storage as well. By default, LangChain mode supports Postgresql, while Towhee mode can work with any database supported by SQLAlchemy 2.0.

      The system will use default store configs. To set up your special connections for each database, you can also export environment variables instead of modifying the configuration file.

      For the Vector Store, set MILVUS_URI:

      $ export MILVUS_URI=https://localhost:19530

      For the Memory Store, set SQL_URI:

      $ export SQL_URI={database_type}://{user}:{password}@{host}/{database_name}

      LangChain mode only supports Postgresql as database type.
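The SQL_URI template above can be filled in programmatically. The helper below is hypothetical convenience code (not part of Akcio) using only the Python standard library; it percent-encodes the password so special characters survive in the URI:

```python
from urllib.parse import quote_plus


def build_sql_uri(database_type, user, password, host, database_name):
    """Assemble a SQL_URI of the form {database_type}://{user}:{password}@{host}/{database_name}."""
    # quote_plus escapes characters like '@' and ' ' that would break the URI.
    return f"{database_type}://{user}:{quote_plus(password)}@{host}/{database_name}"
```

For example, `build_sql_uri("postgresql", "akcio", "p@ss word", "localhost", "chat_history")` yields a URI you can export as SQL_URI.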

      By default, the scalar store (Elastic) is disabled. To enable it, set the following environment variables to connect your Elastic cloud:

      $ export USE_SCALAR=True
      $ export ES_CLOUD_ID=your_elastic_cloud_id
      $ export ES_USER=your_elastic_username
      $ export ES_PASSWORD=your_elastic_password

      To use host & port instead of cloud id, you can manually modify the VECTORDB_CONFIG in config.py.


  4. Start service

    The main script will run a FastAPI service with default address localhost:8900.

    • Option 1: using Towhee
      $ python main.py --towhee
    • Option 2: using LangChain
      $ python main.py --langchain
  5. Access via browser

    You can open http://localhost:8900/docs in a browser to access the web service.

    /: Check service status

    /answer: Generate answer for the given question, with assigned session_id and project

    /project/add: Add data to a project (will create the project if it does not exist)

    /project/drop: Drop a project, including deleting data in both vector and memory storage.

    Check Online Operations to learn more about these APIs.
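As an illustration, the /answer endpoint can be called from Python with only the standard library. The helper below just builds the request object; the payload field names (session_id, project, question) follow the parameters described above, but you should verify them against the live /docs page:

```python
import json
import urllib.request


def build_answer_request(base_url, session_id, project, question):
    """Build (but do not send) a POST request for the /answer endpoint."""
    payload = {"session_id": session_id, "project": project, "question": question}
    return urllib.request.Request(
        url=f"{base_url}/answer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send it (requires the FastAPI service to be running):
# req = build_answer_request("http://localhost:8900", "s1", "demo", "What is Akcio?")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```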

Load data

The insert function in operations loads project data from url(s) or file(s).

There are 2 options to load project data:

Option 1: Offline

We recommend this method, which loads data in separate steps. There are also advanced options for loading documents, for example generating and inserting potential questions for each doc chunk. Refer to offline_tools for instructions.

Option 2: Online

When the FastAPI service is up, you can send a POST request to http://localhost:8900/project/add to load data.

Parameters:

{
  "project": "project_name",
  "data_src": "path_to_doc",
  "source_type": "file"
}

or

{
  "project": "project_name",
  "data_src": "doc_url",
  "source_type": "url"
}

This method is recommended only for loading small amounts of data.
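A small helper can build either payload shape shown above and reject invalid source types; this is hypothetical convenience code, not part of Akcio:

```python
def build_add_payload(project, data_src, source_type):
    """Build the /project/add request body for a file path or a URL."""
    if source_type not in ("file", "url"):
        raise ValueError("source_type must be 'file' or 'url'")
    return {"project": project, "data_src": data_src, "source_type": source_type}
```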


LICENSE

Akcio is published under the Server Side Public License (SSPL) v1.
