• This repository has been archived on 22/Jun/2024
  • Language
    Python
  • License
    MIT License
  • Created over 1 year ago
  • Updated 5 months ago

Repository Details

Starter App to Build Your Own App to Query Doc Collections with Large Language Models (LLMs) using LlamaIndex, Langchain, OpenAI and more (MIT Licensed)

Delphic

A simple framework to use LlamaIndex to build and deploy LLM agents that can be used to analyze and manipulate text data from documents.

Built with Cookiecutter Django. Code style: Black.

License: MIT

Getting Setup

Word of Caution / Note

The initial release of Delphic is based solely on OpenAI's API. We fully plan to support other large language models (LLMs), whether self-hosted or powered by a third-party API. As of April 2023, however, OpenAI's API remains perhaps the most capable and easiest to deploy. Since this framework is based on LlamaIndex and is fully compatible with Langchain, it will be pretty easy to use other LLMs. For now, though, your text WILL be processed with OpenAI, even if you're self-hosting this tool. If OpenAI's terms of service present a problem for you, we leave that to you to resolve. WE ARE NOT RESPONSIBLE FOR ANY ISSUES ARISING FROM YOUR USE OF THIS TOOL AND THE OPENAI API.

Getting Started Locally

Just Run the Application

The fastest way to get up and running is to clone this repo and then deploy the application locally.

You will need Docker and Docker Compose to follow these instructions. DigitalOcean, besides being an excellent cloud host, has some of the easiest-to-follow guides for setting these up. Alternatively, see the official Docker instructions.

  1. First, clone the repo:
git clone
  2. Then, change into the directory:
cd delphic
  3. Copy the sample env files into place: the backend files go in ./.envs/.local/ and the frontend file goes in ./frontend (you may need to be a superuser / use sudo depending on your desired location):
mkdir -p ./.envs/.local/
cp -a ./docs/sample_envs/local/.frontend ./frontend
cp -a ./docs/sample_envs/local/.django ./.envs/.local
cp -a ./docs/sample_envs/local/.postgres ./.envs/.local
  4. Next, update your .django configuration to include your OpenAI API key. You'll probably want to edit .postgres as well to give your database user a unique password.
  5. Then, build the Docker images:
sudo docker-compose --profile fullstack -f local.yml build
  6. Finally, to launch the application, type:
sudo docker-compose --profile fullstack -f local.yml up

Go to localhost:3000 to see the frontend.
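
If the build or launch fails, a common cause is a missing env file or an unset API key. The following is a minimal sketch of a pre-flight check, not part of Delphic itself. It assumes the two backend env files sit where the cp commands above put them, and it assumes the key is stored under the variable name OPENAI_API_KEY in plain KEY=VALUE form; check your own .django file for the actual name.

```python
from pathlib import Path

# Paths produced by the cp commands above, relative to the repo root.
EXPECTED_ENV_FILES = (".envs/.local/.django", ".envs/.local/.postgres")


def missing_env_files(repo_root="."):
    """Return the expected env files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_ENV_FILES if not (root / rel).is_file()]


def env_value(env_path, key):
    """Return the value assigned to key in a KEY=VALUE env file, or None."""
    for line in Path(env_path).read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip()
    return None


if __name__ == "__main__":
    missing = missing_env_files()
    if missing:
        print("Missing env files:", ", ".join(missing))
    # OPENAI_API_KEY is an assumed variable name, not confirmed by this README.
    elif not env_value(".envs/.local/.django", "OPENAI_API_KEY"):
        print("OPENAI_API_KEY looks unset in .envs/.local/.django")
    else:
        print("Env files look ready.")
```

Run it from the repository root before the build step. It only reads files and never prints the key itself.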

I Want to Develop / Modify the Frontend

If you want to actively develop the frontend, we suggest you NOT use the --profile=fullstack flag, as every change will require a full container rebuild. Instead, follow the Development Environment section below rather than launching the full stack as described above.

Production Deploy

This assumes you want to make the application available to the internet at some kind of fully qualified domain like delphic.opensource.legal. In order to do this, you need to update a couple of configuration settings.

TODO - insert documentation

Using the Application

Setup Users

In order to actually use the application, you currently need a login (we intend to make it possible to share certain models with unauthenticated users). You can use either a superuser or a non-superuser account. In either case, someone first needs to create a superuser using the console, as described below.

Why set up a Django superuser? A Django superuser has all the permissions in the application and can manage all aspects of the system, including creating, modifying, and deleting users, collections, and other data. Setting up a superuser allows you to fully control and manage the application.

Warning / Disclaimer

At the moment, any user who is logged in will have full permissions. We plan to implement the more precise, role-based access control module we developed for OpenContracts, but for now be aware that anyone with any type of login credentials can create and delete collections. Creating collections uses OpenAI credits / costs money.

First, Setup a Django superuser:

  1. Run the following command to create a superuser:
sudo docker-compose -f local.yml run django python manage.py createsuperuser
  2. You will be prompted to provide a username, email address, and password for the superuser. Enter the required information.
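
For scripted setups where answering prompts is inconvenient, Django's createsuperuser command also accepts --noinput and then reads DJANGO_SUPERUSER_USERNAME, DJANGO_SUPERUSER_EMAIL, and DJANGO_SUPERUSER_PASSWORD from the environment. The sketch below only builds the equivalent docker-compose command. The helper name createsuperuser_command is hypothetical, and whether Delphic's user model accepts exactly these fields is an assumption worth verifying.

```python
import shlex


def createsuperuser_command(username, email, password, compose_file="local.yml"):
    """Build a docker-compose command that creates a superuser without prompts."""
    env = {
        "DJANGO_SUPERUSER_USERNAME": username,
        "DJANGO_SUPERUSER_EMAIL": email,
        "DJANGO_SUPERUSER_PASSWORD": password,
    }
    cmd = ["docker-compose", "-f", compose_file, "run"]
    # Pass each variable into the one-off container with -e NAME=VALUE.
    for name, value in env.items():
        cmd += ["-e", f"{name}={value}"]
    cmd += ["django", "python", "manage.py", "createsuperuser", "--noinput"]
    return cmd


if __name__ == "__main__":
    # Placeholder credentials; replace them before running the printed command.
    print(shlex.join(createsuperuser_command("admin", "admin@example.com", "change-me")))
```

Note that a password passed this way is visible in your shell history and process list, so the interactive prompt above remains the safer default.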

Second (if desired), Setup Additional Users

Start your Delphic application locally following the deployment instructions.

  1. Visit the Django admin interface by navigating to http://localhost:8000/admin in your browser.
  2. Log in with the superuser credentials you created earlier.
  3. Click on the “Users” link in the “Users” section.
  4. Click on the “Add User +” button in the top right corner.
  5. Enter the required information for the new user, such as username and password. Click “Save” to create the user.
  6. To grant the new user additional permissions or make them a superuser, click on their username in the user list, scroll down to the “Permissions” section, and configure their permissions accordingly. Save your changes.

Creating and Querying a Collection

WARNING - If you're using OpenAI as your LLM engine, any Collection interaction will use API credits / cost money. If you're using your own OpenAI API key, you've also accepted their terms of service which may not be suitable for your use-case. Please do your own diligence.

To access the question-answering interface, bring up the full stack and go to http://localhost:3000

Development Environment

If you want to contribute to Delphic or roll your own version, you'll want to set up the development environment.

Backend Setup

On the backend, you'll need a working Python environment to run the pre-commit formatting checks. You can use your system Python interpreter, but we recommend using pyenv and creating a virtual env based on Python>=3.10.
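
To confirm that the interpreter in your active virtual env meets the Python>=3.10 recommendation, a quick check along these lines may help. This is an illustrative sketch; the function name python_ok is hypothetical and not part of the repo.

```python
import sys

# Minimum version recommended above.
MIN_PYTHON = (3, 10)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_ok():
        print(f"Python {current} is new enough.")
    else:
        print(f"Python {current} is older than 3.10; switch interpreters first.")
```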

Pre-Commit Setup

Then, in the root of your local repo, run these commands:

pip install -r ./requirements/local.txt
pre-commit install

Now, when you commit your staged changes, our code formatting and style checks will run automatically.

Running Tests

We have a basic test suite in ./tests. You can run the tests by typing:

sudo docker-compose -f local.yml run django python manage.py test

Frontend Setup

On the frontend, we're using node v18.15.0. We assume you're using nvm. We don't have any frontend tests yet (sorry).

Setup and Launch Node Development Server

Change into the frontend directory, install your frontend dependencies, and start a development server (note: we assume you have nvm installed; if you don't, install it now):

cd frontend
nvm use
npm install yarn
yarn install

Typing yarn start will bring up your frontend development server at http://localhost:3000. You still need to launch the backend in order for it to work properly.

Run Backend Compose Stack Without fullstack profile flag

Launch the backend without the fullstack flag:

sudo docker-compose -f local.yml up
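
Since the frontend dev server and the backend stack are started separately in this workflow, it can be handy to check that both are listening before opening the browser. The sketch below assumes the ports mentioned in this README (3000 for the frontend, 8000 for the Django backend) and only tests that a TCP connection succeeds, not that the application is healthy.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    for name, port in (("frontend", 3000), ("backend", 8000)):
        state = "up" if port_open("localhost", port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```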
