An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)

LLM As Database Administrator


Demo • QuickStart • Alerts And Anomalies • Knowledge And Tools • FAQ • Community • Citation • Contributors

👫 Join Us on WeChat! 🏆 Top 100 Open Project!

【English | 中文】

🦾 Build your personal database administrator (D-Bot) 🧑‍💻, which is good at reading documents, using various tools, and writing analysis reports!

A major version update is coming soon. Stay tuned! 📣 🔜

🗺 Online Demo

On the online demo website (http://dbgpt.dbmind.cn), you can browse all the historical diagnosis results, the metrics used, and the detailed diagnosis processes.

[Screenshots: frontend_v2]

Demo Video

Here is the Old Version of D-Bot.

📰 Updates

  • New Frontend

    • Knowledgebase + Diagnosis + Report Replay new branch
  • Extreme Speed Version for localized LLMs

    • 4-bit quantized LLM (reducing inference time by 1/3)

    • Synchronous Concurrency Mechanism during LLM inference

    • Document knowledge embedding and matching

      • Vector database (ChromaDB) new branch

      • RESTful Search Engine (Elasticsearch)

  • Expert prompt generation using document knowledge new branch

  • Upgrade the LLM-based diagnosis mechanism:

    • Task Dispatching -> Concurrent Diagnosis -> Cross Review -> Report Generation
  • Support monitoring and optimization tools in multiple levels πŸ”— link

    • Monitoring metrics (Prometheus)
    • Flame graphs at the code level
    • Diagnosis knowledge retrieval (dbmind)
    • Logical query transformations (Calcite)
    • Index optimization algorithms (for PostgreSQL)
    • Physical operator hints (for PostgreSQL)
    • Backup and Point-in-time Recovery (Pigsty)
  • Papers and experimental reports are continuously updated
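
The upgraded LLM-based diagnosis mechanism above (Task Dispatching -> Concurrent Diagnosis -> Cross Review -> Report Generation) can be sketched as a simple orchestration loop. This is an illustrative reconstruction, not the project's code: all function names and the selection/review rules are assumptions, with stubs standing in for the actual LLM calls.

```python
from concurrent.futures import ThreadPoolExecutor

def dispatch(alert: dict, experts: list[str]) -> list[str]:
    """Task dispatching: pick the experts relevant to the alert (toy rule)."""
    return [e for e in experts if e in alert.get("scope", experts)]

def diagnose(expert: str, alert: dict) -> dict:
    """Concurrent diagnosis: each expert proposes a root cause (LLM-call stub)."""
    return {"expert": expert, "root_cause": f"{expert}: cause of {alert['name']}"}

def cross_review(diagnoses: list[dict]) -> list[dict]:
    """Cross review: drop diagnoses the reviewers reject (toy criterion)."""
    return [d for d in diagnoses if "unknown" not in d["root_cause"]]

def generate_report(diagnoses: list[dict]) -> str:
    """Report generation: merge the surviving diagnoses into one report."""
    return "\n".join(f"- {d['root_cause']}" for d in diagnoses)

def run_pipeline(alert: dict, experts: list[str]) -> str:
    selected = dispatch(alert, experts)
    with ThreadPoolExecutor() as pool:  # the diagnoses run concurrently
        diagnoses = list(pool.map(lambda e: diagnose(e, alert), selected))
    return generate_report(cross_review(diagnoses))
```

In the real system each stage is an LLM-driven agent; the skeleton only shows how the four stages hand results to each other.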

This project is evolving with new features 👫👫
Don't forget to star ⭐ and watch 👀 to stay up to date :)

🕹 QuickStart

Folder Structure

.
├── multiagents
│   ├── agent_conf                        # Settings of each agent
│   ├── agents                            # Implementation of different agent types
│   ├── environments                      # E.g., diag orders / diag update / terminal conditions
│   ├── knowledge                         # Diagnosis experience from documents
│   ├── llms                              # Supported models
│   ├── memory                            # The content and summary of chat history
│   ├── reasoning_algorithms              # Available algorithms for single-LLM reasoning
│   ├── response_formalize_scripts        # Remove useless content from model responses
│   ├── tools                             # External monitoring/optimization tools for models
│   └── utils                             # Other functions (e.g., database/json/yaml operations)
├── web_service                           # Web services to view diagnostic reports
│   ├── backend                           # Web services backend
│   └── frontend                          # Web services frontend
└── webhook                               # Use the webhook to save the results of alerts to files

Website Side

We provide a local website to browse historical diagnosis reports and procedures. You can launch it in three steps:

  • install Node.js (16.13.1 or above is recommended)
npm install -g n
sudo n 16.18.1
  • install dependencies for the first run.
# install frontend environment
cd web_service/frontend
rm -rf node_modules/
rm -f package-lock.json
# install dependencies for the first run
npm install  --legacy-peer-deps
  • configure the backend address for frontend requests
vim web_service/frontend/.env.development
# Replace 127.0.0.1 with the backend IP address; the 8024 port must match the backend port.
VUE_APP_BASE_API = 'http://127.0.0.1:8024/'

  • run
```shell
# cd service directory
cd web_service
# launch the local server and open the website
sh run_service.sh
```

Modify the "python app.py" command within run_service.sh if there are multiple Python versions (e.g., "python3.9 app.py").

After successfully launching the local server, visit http://127.0.0.1:8025/ to browse the diagnosis reports.

Diagnosis Side

1. Prerequisites

  • PostgreSQL v12 or higher

    Make sure your database supports remote connection (link)

    Additionally, install extensions like pg_stat_statements (track frequent queries), pg_hint_plan (optimize physical operators), and hypopg (create hypothetical indexes).

    Note that pg_stat_statements continuously accumulates query statistics over time, so you need to clear them periodically: 1) to discard all statistics, execute "SELECT pg_stat_statements_reset();"; 2) to discard the statistics of a specific query, execute "SELECT pg_stat_statements_reset(userid, dbid, queryid);".

  • Enable slow query log in PostgreSQL (link)

    (1) For "systemctl restart postgresql", the service name can be different (e.g., postgresql-12.service);

    (2) Use absolute log path name like "log_directory = '/var/lib/pgsql/12/data/log'";

    (3) Set "log_line_prefix = '%m [%p] [%d]'" in postgresql.conf (to record the database names of different queries).

  • Prometheus

    Check prometheus.md for detailed installation guides.
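
The slow-query-log notes above can be collected into a single postgresql.conf fragment. This is an illustrative sketch: the log path follows the example above, and log_min_duration_statement (the standard PostgreSQL setting that triggers slow-query logging) is set to an assumed 1000 ms threshold that you should tune for your workload.

```
# postgresql.conf -- example values, adjust to your installation
logging_collector = on
log_directory = '/var/lib/pgsql/12/data/log'    # absolute path, as noted above
log_line_prefix = '%m [%p] [%d]'                # %d records the database name
log_min_duration_statement = 1000               # log statements slower than 1000 ms (assumed)
```

After editing, restart the service (e.g., systemctl restart postgresql-12.service).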

2. Package Installation

Step 1: Install python packages.

pip install -r requirements.txt

You can comment out the Llama2Chat/CodeLlamaChat/Baichuan2Chat imports in ./llms/__init__.py if you are using OpenAI models only.

If google-colab fails to install, try conda install -c conda-forge google-colab

Step 2: Configure environment variables.

  • Export your OpenAI API key
# macos
export OPENAI_API_KEY="your_api_key_here"
# windows
set OPENAI_API_KEY="your_api_key_here"

Step 3: Add database/anomaly/prometheus settings into tool_config_example.yaml and rename it to tool_config.yaml:

```yaml
POSTGRESQL:
  host: 182.92.xxx.x
  port: 5432
  user: xxxx
  password: xxxxx
  dbname: postgres

DATABASESERVER:
  server_address: 182.92.xxx.x
  username: root
  password: xxxxx
  remote_directory: /var/lib/pgsql/12/data/log

PROMETHEUS:
  api_url: http://8.131.xxx.xx:9090/
  postgresql_exporter_instance: 172.27.xx.xx:9187
  node_exporter_instance: 172.27.xx.xx:9100
```

remote_directory in the DATABASESERVER setting indicates where the slow query log file is located (link).

  • If you access the OpenAI service via a VPN/proxy, execute this command:
# macos
export https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890 all_proxy=socks5://127.0.0.1:7890
  • Test your openai key
cd others
python openai_test.py

Step 4: Download the Sentence Transformer model parameters.

  • Create a new directory ./localized_llms/sentence_embedding

  • Move the downloaded sentence-transformer.zip into the ./localized_llms/sentence_embedding/ directory, and unzip it.

3. Generate New Diagnosis Report

  • Test single case
python main.py
  • Test in batch
python batch_main.py

🎩 Alerts And Anomalies

Alert Management

We support AlertManager for Prometheus. You can find more information about how to configure alertmanager here: alertmanager.md.

  • We provide AlertManager-related configuration files, including alertmanager.yml, node_rules.yml, and pgsql_rules.yml. They are located in the config folder in the root directory; deploy them to your Prometheus server to retrieve the associated alerts.
  • We also provide a webhook server that receives alerts. It lives in the webhook folder in the root directory; deploy it to your server to receive Prometheus alerts and store them in files.
  • Currently, the alert file is fetched over SSH. You need to configure your server information in tool_config.yaml in the config folder.
  • node_rules.yml and pgsql_rules.yml are adapted from the open-source project Pigsty (https://github.com/Vonng/pigsty); their monitoring is excellent, and we thank them for their effort.
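
The store-alerts-to-files idea can be illustrated with a minimal webhook sketch (an assumed reconstruction, not the project's webhook code): Alertmanager POSTs a JSON payload, and each payload is written to its own timestamped file.

```python
import json
import os
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

def save_alerts(payload: dict, out_dir: str) -> str:
    """Write one Alertmanager payload to a timestamped JSON file; return its path."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"alerts_{time.time_ns()}.json")
    with open(path, "w") as f:
        json.dump(payload, f, indent=2)
    return path

class AlertHandler(BaseHTTPRequestHandler):
    out_dir = "./alert_results"  # hypothetical output directory

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        save_alerts(payload, self.out_dir)
        self.send_response(200)
        self.end_headers()

# To serve: HTTPServer(("0.0.0.0", 8000), AlertHandler).serve_forever()
# and point Alertmanager's webhook_config url at http://<host>:8000/
```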

Anomaly Simulation

Script-Triggered Anomalies

We offer scripts that can trigger typical anomalies. Check out the different anomaly cases at http://dbgpt.dbmind.cn

| Root Cause | Description |
| --- | --- |
| INSERT_LARGE_DATA | Long execution time for large data inserts |
| FETCH_LARGE_DATA | Long execution time for large data fetches |
| REDUNDANT_INDEX | Unnecessary and redundant indexes in tables |
| VACUUM | Unused space caused by data modifications |
| POOR_JOIN_PERFORMANCE | Poor performance of join operators |
| CORRELATED_SUBQUERY | Non-promotable subqueries in SQL statements |
| LOCK_CONTENTION | Lock contention issues |
| CPU_CONTENTION | Severe CPU resource contention |
| IO_CONTENTION | IO resource contention affecting SQL performance |
| COMMIT_CONTENTION | Highly concurrent commits affecting SQL execution |
| SMALL_MEMORY_ALLOC | Too small an allocated memory space |
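
As an illustration of what a script-triggered anomaly can look like, here is a hedged sketch (not one of the project's actual scripts) that builds a bulk-insert workload of the INSERT_LARGE_DATA flavor as plain SQL:

```python
import random
import string

def build_bulk_insert(table: str, n_rows: int, width: int = 100) -> str:
    """Build one multi-row INSERT statement with n_rows random text rows.

    Executing this with a large n_rows (typically millions, in batches)
    produces the long execution times of the INSERT_LARGE_DATA anomaly.
    """
    rows = []
    for i in range(n_rows):
        payload = "".join(random.choices(string.ascii_lowercase, k=width))
        rows.append(f"({i}, '{payload}')")
    return f"INSERT INTO {table} (id, payload) VALUES\n" + ",\n".join(rows) + ";"
```

The table name and (id, payload) schema are assumptions; feed the generated statement to your test database to reproduce the slow-insert symptom.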

Manually Designed Anomalies

Click to check 29 typical anomalies together with expert analysis (supported by the DBMind team)

📎 Customize Knowledge And Tools

1. Knowledge Extraction

(Basic version by Zui Chen)

Step 1. Rename doc2knowledge/config_template.json to doc2knowledge/config.json, and add the value for "api_key" ("organization" is optional).

GPT-4 is currently required for the function-calling feature. We will try to remove this limitation.

Step 2. Split documents into separate section files by the section indexes (e.g., section1, section1.1, section2 ...), and copy the section files into docs/<your_document_name>/raw/. For example:

.
├── docs
│   ├── report_example
│   │   ├── raw
│   │   │   ├── 1 title.txt
│   │   │   ├── 1.1 category.txt

This is laborious work, and it is hard to find a better way than manually splitting the given document.

You can skip this step and directly run the report_example case.
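
If your sections carry numeric indexes in their headings, the splitting can be partially automated. Here is a minimal sketch (an assumed helper, not part of doc2knowledge) that cuts a document on lines such as "1 title" or "1.1 category" and writes one file per section:

```python
import os
import re

SECTION_RE = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")  # e.g. "1.1 category"

def split_sections(text: str, out_dir: str) -> list[str]:
    """Split text on numeric section headings; write one '<index> <title>.txt' per section."""
    os.makedirs(out_dir, exist_ok=True)
    files, current, lines = [], None, []

    def flush():
        if current is not None:
            path = os.path.join(out_dir, f"{current}.txt")
            with open(path, "w") as f:
                f.write("\n".join(lines))
            files.append(path)

    for line in text.splitlines():
        m = SECTION_RE.match(line.strip())
        if m:
            flush()  # close the previous section
            current, lines = f"{m.group(1)} {m.group(2)}", []
        elif current is not None:
            lines.append(line)
    flush()
    return files
```

Review the output afterwards: any body line that merely starts with a number would be misdetected as a heading, which is why the manual pass remains the safe default.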

Step 3. Modify the arguments in doc2knowledge.py script and run the script:

cd doc2knowledge/
python doc2knowledge.py

The summaries of document sections are cached. Delete the cache file if you do not want to reuse previous summaries.
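
A cache like that is typically keyed on the section content itself, so unchanged sections are never re-summarized. A minimal sketch of the idea (an assumed design, not the project's implementation):

```python
import hashlib
import json
import os

class SummaryCache:
    """Persist section summaries keyed by a hash of the section text."""

    def __init__(self, path: str = "summary_cache.json"):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)

    def _key(self, section_text: str) -> str:
        return hashlib.sha256(section_text.encode("utf-8")).hexdigest()

    def get(self, section_text: str):
        """Return the cached summary, or None if the section is new or changed."""
        return self.data.get(self._key(section_text))

    def put(self, section_text: str, summary: str) -> None:
        self.data[self._key(section_text)] = summary
        with open(self.path, "w") as f:
            json.dump(self.data, f)
```

Because the key is a content hash, editing a section invalidates only that section's cached summary.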

Step 4. With the extracted knowledge, you can visualize their clustering results:

cd doc2knowledge/
python knowledge_clustering.py

2. Tool Preparation

  • Tool APIs (for optimization)

    | Module | Functions |
    | --- | --- |
    | index_selection (equipped) | heuristic algorithm |
    | query_rewrite (equipped) | 45 rules |
    | physical_hint (equipped) | 15 parameters |

    For functions within [query_rewrite, physical_hint], you can use the api_test.py script to verify their effectiveness.

    If a function actually works, append it to the api.py of the corresponding module.
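
As an illustration of what a single rewrite rule can look like (a hypothetical toy rule, not one of the 45 shipped ones), here is a sketch that strips the always-true predicate 1 = 1, a pattern often left behind by query generators; api_test.py-style verification would then compare the plans or runtimes of the original and rewritten statements:

```python
import re

def remove_trivial_true_predicate(sql: str) -> str:
    """Hypothetical rewrite rule: drop the always-true predicate 1 = 1.

    'WHERE 1 = 1 AND p'  ->  'WHERE p'
    'WHERE 1 = 1'        ->  (WHERE clause removed entirely)
    """
    rewritten = re.sub(r"(?i)\bWHERE\s+1\s*=\s*1\s+AND\b", "WHERE", sql)
    rewritten = re.sub(r"(?i)\bWHERE\s+1\s*=\s*1\b", "", rewritten)
    return re.sub(r"\s+", " ", rewritten).strip()
```

Real rewrite rules operate on parsed query trees rather than strings; the string form is only meant to show the input/output contract a rule satisfies.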

πŸ’ FAQ

🤨 The '.sh' script cannot be executed on a Windows system. Switch the shell to *git bash* or use *git bash* to execute the '.sh' script.
🤨 "No module named 'xxx'" on a Windows system. This error is caused by issues with the Python runtime environment path. Perform the following steps:

Step 1: Check Environment Variables.

You must add the Python "Scripts" directory to your PATH environment variable.

Step 2: Check IDE Settings.

For VS Code, install the Python extension. For PyCharm, specify the Python interpreter for the current project.

⏱ Todo

  • Project cleaning
  • Support more anomalies
  • Support more knowledge sources
  • Query log option (potential to take up disk space and we need to consider it carefully)
  • Add more communication mechanisms
  • Localized model that reaches D-Bot (GPT-4)'s capability
  • Localized LLMs tailored with domain knowledge that can generate precise and straightforward analyses
  • Prometheus-as-a-Service
  • Support other databases (e.g., mysql/redis)

👫 Community

🤗 Relevant Projects

https://github.com/OpenBMB/AgentVerse

https://github.com/Vonng/pigsty

https://github.com/UKPLab/sentence-transformers

📒 Citation

Feel free to cite us (paper link) if you like this project.

@misc{zhou2023llm4diag,
      title={D-Bot: Database Diagnosis System using Large Language Models},
      author={Xuanhe Zhou and Guoliang Li and Zhaoyan Sun and Zhiyuan Liu and Weize Chen and Jianming Wu and Jiesi Liu and Ruohang Feng and Guoyang Zeng},
      year={2023},
      eprint={2312.01454},
      archivePrefix={arXiv},
      primaryClass={cs.DB}
}
@misc{zhou2023dbgpt,
      title={DB-GPT: Large Language Model Meets Database},
      author={Xuanhe Zhou and Zhaoyan Sun and Guoliang Li},
      year={2023},
      journal={Data Science and Engineering},
}

📧 Contributors

Other Collaborators: Wei Zhou, Kunyi Li.

We thank all the contributors to this project. Do not hesitate if you would like to get involved or contribute!

Contact Information

πŸ‘πŸ»Welcome to our wechat group!
