  • Stars: 905
  • Rank: 48,693 (Top 1.0%)
  • Language: Python
  • License: MIT License
  • Created 11 months ago
  • Updated 7 months ago

Repository Details

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

A clearly explained guide to running quantized open-source LLM applications on CPUs using Llama 2, C Transformers, GGML, and LangChain

Step-by-step guide on TowardsDataScience: https://towardsdatascience.com/running-llama-2-on-cpu-inference-for-document-q-a-3d636037a3d8


Context

  • Third-party commercial large language model (LLM) providers such as OpenAI, with models like GPT-4, have democratized LLM use via simple API calls.
  • However, some teams require self-managed or private model deployment for reasons such as data privacy and residency rules.
  • The proliferation of open-source LLMs has opened up a vast range of options, reducing our reliance on these third-party providers.
  • When we host open-source LLMs on-premises or in the cloud, dedicated compute capacity becomes a key issue. While GPU instances may seem the obvious choice, their costs can easily exceed the budget.
  • In this project, we will discover how to run quantized versions of open-source LLMs locally using CPU inference for document question-and-answer (Q&A).
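
To make "quantized" more concrete, the following is a minimal sketch in plain Python of symmetric block quantization, together with a rough memory estimate. It is a simplified illustration only and not GGML's actual storage format: the function names, the 8-bit setting, and the round figure of 7 billion parameters are assumptions made for this example, and the estimate ignores the small overhead of storing per-block scales.

```python
def quantize_block(weights, bits=8):
    """Map one block of float weights to signed integers plus a shared scale."""
    qmax = 2 ** (bits - 1) - 1                       # 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax or 1.0  # fall back to 1.0 for an all-zero block
    return [round(w / scale) for w in weights], scale


def dequantize_block(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


def model_size_gb(n_params, bits):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits / 8 / 1e9


block = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_block(block)
restored = dequantize_block(quantized, scale)
print("max round-trip error:", max(abs(a - b) for a, b in zip(block, restored)))

# Assumed round figure of 7 billion parameters, for illustration only.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{model_size_gb(7e9, bits)} GB")
```

Fewer bits per weight means a smaller file and less RAM at inference time, at the cost of a small rounding error per weight. That trade-off is what can make CPU-only inference practical.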

Quickstart

  • Ensure you have downloaded the GGML binary file from https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML and placed it into the models/ folder
  • To start passing user queries to the application, open a terminal in the project directory and run the following command: poetry run python main.py "<user query>"
  • For example, poetry run python main.py "What is the minimum guarantee payable by Adidas?"
  • Note: Omit the prepended poetry run if you are NOT using Poetry
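
As a rough illustration of how a script can accept the quoted query shown in the commands above, here is a minimal sketch using the standard library's argparse. It is not the repository's actual main.py: the function name parse_query and the argument name query are hypothetical.

```python
import argparse


def parse_query(argv=None):
    """Return the single positional query string from the command line."""
    parser = argparse.ArgumentParser(
        description="Ask a question about the ingested documents."
    )
    # Quoting the query in the shell makes it arrive as one argument.
    parser.add_argument("query", type=str, help="User query, wrapped in quotes")
    return parser.parse_args(argv).query


# Passing argv explicitly here mimics:
#   python main.py "What is the minimum guarantee payable by Adidas?"
query = parse_query(["What is the minimum guarantee payable by Adidas?"])
print(f"Received query: {query}")
```

Without the quotes, the shell would split the question into several arguments and argparse would reject the extra ones.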

Tools

  • LangChain: Framework for developing applications powered by language models
  • C Transformers: Python bindings for Transformer models implemented in C/C++ using the GGML library
  • FAISS: Open-source library for efficient similarity search and clustering of dense vectors.
  • Sentence-Transformers (all-MiniLM-L6-v2): Open-source pre-trained transformer model for embedding text to a 384-dimensional dense vector space for tasks like clustering or semantic search.
  • Llama-2-7B-Chat: Open-source fine-tuned Llama 2 model designed for chat dialogue. Leverages publicly available instruction datasets and over 1 million human annotations.
  • Poetry: Tool for dependency management and Python packaging
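
The tools above fit together as a retrieve-then-answer flow: text is embedded as vectors, the vectors closest to the query are looked up, and the matching passages are placed into a prompt for the LLM. The sketch below illustrates that flow in plain Python with toy 3-dimensional vectors. It is a brute-force stand-in and does not use the FAISS, Sentence-Transformers, or LangChain APIs; the function names, the cosine metric, and the prompt wording are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k document vectors most similar to the query."""
    ranked = sorted(
        enumerate(doc_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [index for index, _ in ranked[:k]]


# Hypothetical prompt wording, for illustration only.
PROMPT_TEMPLATE = (
    "Use the following context to answer the question.\n"
    "Context: {context}\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question, passages):
    """Insert the retrieved passages and the question into the template."""
    return PROMPT_TEMPLATE.format(context="\n".join(passages), question=question)


# Toy data: real embeddings from all-MiniLM-L6-v2 would have 384 dimensions.
passages = ["Sponsorship terms", "Stadium capacity", "Kit supplier agreement"]
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
best = top_k([1.0, 0.0, 0.0], vectors, k=2)
print(build_prompt("Who supplies the kit?", [passages[i] for i in best]))
```

A dedicated vector index exists because this brute-force scan compares the query against every stored vector, which becomes slow as the document collection grows.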

Files and Content

  • /assets: Images relevant to the project
  • /config: Configuration files for LLM application
  • /data: Dataset used for this project (i.e., Manchester United FC 2022 Annual Report - 177-page PDF document)
  • /models: Binary file of GGML quantized LLM model (i.e., Llama-2-7B-Chat)
  • /src: Python code for the key components of the LLM application, namely llm.py, utils.py, and prompts.py
  • /vectorstore: FAISS vector store for documents
  • db_build.py: Python script to ingest dataset and generate FAISS vector store
  • main.py: Main Python script to launch the application and to pass user query via command line
  • pyproject.toml: TOML file specifying the dependency versions used (Poetry)
  • requirements.txt: List of Python dependencies (and their versions)
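
Before a long document such as the 177-page report can be indexed in a vector store, an ingestion script typically splits it into smaller, overlapping chunks. The sketch below shows one simple character-based way to do that in plain Python. It is an assumption about a common approach, not the actual logic of db_build.py: the function name split_text and the default chunk_size and chunk_overlap values are hypothetical.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters so that a sentence
    cut at a boundary still appears intact in at least one chunk.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
        start += step
    return chunks


sample = "abcdefghij"
print(split_text(sample, chunk_size=4, chunk_overlap=1))
```

Each chunk would then be embedded and stored, so that a query retrieves only the few passages relevant to it rather than the whole document.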
