ChatDocs

Chat with your documents offline using AI. No data leaves your system. An internet connection is required only to install the tool and download the AI models. It is based on PrivateGPT but has more features.

Features

  • Supports GGML models via C Transformers
  • Supports 🤗 Transformers models
  • Supports GPTQ models
  • Web UI
  • GPU support
  • Highly configurable via chatdocs.yml

Supported document types:

Extension      Format
.csv           CSV
.docx, .doc    Word Document
.enex          EverNote
.eml           Email
.epub          EPub
.html          HTML
.md            Markdown
.msg           Outlook Message
.odt           Open Document Text
.pdf           Portable Document Format (PDF)
.pptx, .ppt    PowerPoint Document
.txt           Text file (UTF-8)
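
Before adding a directory, it can be useful to see which files fall outside the supported extensions listed above. The following is a minimal sketch, not part of ChatDocs: the names SUPPORTED_EXTENSIONS and split_by_support are hypothetical, and the case-insensitive comparison is an assumption of this sketch rather than documented tool behavior.

```python
from pathlib import Path

# Extensions copied from the list of supported document types.
# The set and the helper below are hypothetical, not a ChatDocs API.
SUPPORTED_EXTENSIONS = {
    ".csv", ".docx", ".doc", ".enex", ".eml", ".epub", ".html",
    ".md", ".msg", ".odt", ".pdf", ".pptx", ".ppt", ".txt",
}


def split_by_support(directory):
    """Return (supported, unsupported) lists of files found under directory."""
    supported, unsupported = [], []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        # Assumption of this sketch: compare extensions case-insensitively.
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported
```

Calling split_by_support("/path/to/documents") returns two lists of paths; the second list shows files that would probably need converting first.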

Installation

Install the tool using:

pip install chatdocs

Download the AI models using:

chatdocs download

After that, it can run offline, without an internet connection.

Usage

Add a directory containing documents to chat with using:

chatdocs add /path/to/documents

The processed documents are stored in the db directory by default.

Chat with your documents using:

chatdocs ui

Open http://localhost:5000 in your browser to access the web UI.

It also has a nice command-line interface:

chatdocs chat

Configuration

All configuration options can be changed in the chatdocs.yml config file. Create a chatdocs.yml file in a directory of your choice and run all commands from that directory. For reference, see the default chatdocs.yml file.

You don't have to copy the entire file. Add only the config options you want to change; they are merged with the default config. For example, see tests/fixtures/chatdocs.yml, which changes only some of the config options.
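
To illustrate what merging with the default config means, here is a minimal sketch that assumes a recursive, key-by-key merge of nested mappings. The function merge_config and the default values shown are hypothetical; the tool's actual merge logic and defaults may differ.

```python
import copy


def merge_config(defaults, overrides):
    """Recursively merge overrides into a copy of defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright.
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only; the real defaults live in the default chatdocs.yml.
defaults = {
    "llm": "ctransformers",
    "ctransformers": {"model_type": "llama", "config": {"gpu_layers": 0}},
}
# What a small user-supplied chatdocs.yml might contain once parsed.
overrides = {"ctransformers": {"config": {"gpu_layers": 50}}}

merged = merge_config(defaults, overrides)
print(merged)
```

Under this assumption, only gpu_layers changes in the merged result, while llm and model_type keep their default values.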

Embeddings

To change the embeddings model, add the following to your chatdocs.yml and adjust it as needed:

embeddings:
  model: hkunlp/instructor-large

Note: When you change the embeddings model, delete the db directory and add documents again.
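
The reset step in the note above can be scripted. This is a minimal sketch: the helper reset_index is hypothetical, and it assumes the processed documents live in the default db directory relative to where the commands are run.

```python
import shutil
from pathlib import Path


def reset_index(db_dir="db"):
    """Delete the processed-documents directory; return True if it existed."""
    path = Path(db_dir)
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False
```

After calling reset_index(), run chatdocs add /path/to/documents again so the documents are processed with the new embeddings model.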

C Transformers

To change the C Transformers GGML model, add the following to your chatdocs.yml and adjust it as needed:

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama

Note: When you add a new model for the first time, run chatdocs download to download the model before using it.

You can also use an existing local model file:

ctransformers:
  model: /path/to/ggml-model.bin
  model_type: llama

🤗 Transformers

To use 🤗 Transformers models, add the following to your chatdocs.yml:

llm: huggingface

To change the 🤗 Transformers model, add the following to your chatdocs.yml and adjust it as needed:

huggingface:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-HF

Note: When you add a new model for the first time, run chatdocs download to download the model before using it.

GPTQ

To use GPTQ models, install the auto-gptq package using:

pip install chatdocs[gptq]

and add the following to your chatdocs.yml:

llm: gptq

To change the GPTQ model, add the following to your chatdocs.yml and adjust it as needed:

gptq:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
  model_file: Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors

Note: When you add a new model for the first time, run chatdocs download to download the model before using it.

GPU

Embeddings

To enable GPU (CUDA) support for the embeddings model, add the following to your chatdocs.yml:

embeddings:
  model_kwargs:
    device: cuda

You may have to reinstall PyTorch with CUDA enabled by following the official PyTorch installation instructions.
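
Before setting device: cuda, it may help to check whether the installed PyTorch build can see a CUDA device. The sketch below uses torch.cuda.is_available(); the wrapper name cuda_available is hypothetical, and it returns False when PyTorch is not installed at all.

```python
def cuda_available():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


print("CUDA available:", cuda_available())
```

If this reports False on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only and a reinstall with CUDA enabled is likely needed.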

C Transformers

Note: Currently only LLaMA GGML models have GPU support.

To enable GPU (CUDA) support for the C Transformers GGML model, add the following to your chatdocs.yml:

ctransformers:
  config:
    gpu_layers: 50

You should also reinstall the ctransformers package with CUDA enabled:

pip uninstall ctransformers --yes
CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers

On Windows PowerShell run:

$env:CT_CUBLAS=1
pip uninstall ctransformers --yes
pip install ctransformers --no-binary ctransformers

On Windows Command Prompt run:

set CT_CUBLAS=1
pip uninstall ctransformers --yes
pip install ctransformers --no-binary ctransformers
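
The three shell variants above differ only in how CT_CUBLAS is set. As a platform-independent alternative, the same two pip commands can be run from Python with the variable passed through the environment. This is a sketch under the assumption that pip is available for the current interpreter; the function names are hypothetical.

```python
import os
import subprocess
import sys


def reinstall_commands():
    """Return the two pip commands shown above as argument lists."""
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "ctransformers", "--yes"],
        pip + ["install", "ctransformers", "--no-binary", "ctransformers"],
    ]


def cublas_env(base=None):
    """Return a copy of the environment with CT_CUBLAS=1 set."""
    env = dict(os.environ if base is None else base)
    env["CT_CUBLAS"] = "1"
    return env


def reinstall_with_cuda():
    """Run both commands with CT_CUBLAS=1; raises CalledProcessError on failure."""
    env = cublas_env()
    for command in reinstall_commands():
        subprocess.run(command, env=env, check=True)
```

Call reinstall_with_cuda() from the Python environment in which ChatDocs is installed. Using sys.executable keeps pip tied to that interpreter rather than to whichever pip happens to be first on the PATH.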

🤗 Transformers

To enable GPU (CUDA) support for the 🤗 Transformers model, add the following to your chatdocs.yml:

huggingface:
  device: 0

You may have to reinstall PyTorch with CUDA enabled by following the official PyTorch installation instructions.

GPTQ

To enable GPU (CUDA) support for the GPTQ model, add the following to your chatdocs.yml:

gptq:
  device: 0

You may have to reinstall PyTorch with CUDA enabled by following the official PyTorch installation instructions.

After installing PyTorch with CUDA enabled, you should also reinstall the auto-gptq package:

pip uninstall auto-gptq --yes
pip install chatdocs[gptq]

License

MIT
