  • Stars: 552
  • Rank: 77,859 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created about 1 year ago
  • Updated 10 months ago

Repository Details

Large Language Models (LLMs) tutorials & sample scripts, ft. langchain, openai, llamaindex, gpt, chromadb & pinecone

llm-python

A set of instructional materials, code samples and Python scripts featuring LLMs (GPT etc) through interfaces like llamaindex, langchain, Chroma (Chromadb), Pinecone etc. Mainly used to store reference code for my LangChain tutorials on YouTube.

LangChain YouTube tutorials

Learn LangChain from my YouTube channel (~8 hours of hands-on LLM building tutorials). Each lesson is accompanied by the corresponding code in this repo and is designed to be self-contained, while still focusing on key concepts in LLM (large language model) development and tooling.

Feel free to pick and choose your starting point based on your learning goals:

Part | LLM Tutorial | Link | Video Duration
1 | OpenAI tutorial and video walkthrough | Tutorial Video | 26:56
2 | LangChain + OpenAI tutorial: Building a Q&A system w/ own text data | Tutorial Video | 20:00
3 | LangChain + OpenAI to chat w/ (query) own Database / CSV | Tutorial Video | 19:30
4 | LangChain + HuggingFace's Inference API (no OpenAI credits required!) | Tutorial Video | 24:36
5 | Understanding Embeddings in LLMs | Tutorial Video | 29:22
6 | Query any website with LlamaIndex + GPT3 (ft. Chromadb, Trafilatura) | Tutorial Video | 11:11
7 | Locally-hosted, offline LLM w/ LlamaIndex + OPT (open source, instruction-tuning LLM) | Tutorial Video | 32:27
8 | Building an AI Language Tutor: Pinecone + LlamaIndex + GPT-3 + BeautifulSoup | Tutorial Video | 51:08
9 | Building a queryable journal 💬 w/ OpenAI, markdown & LlamaIndex 🦙 | Tutorial Video | 40:29
10 | Making a Sci-Fi game w/ Cohere LLM + Stability.ai: Generative AI tutorial | Tutorial Video | 1:02:20
11 | GPT builds entire party invitation app from prompt (ft. SMOL Developer) | Tutorial Video | 41:33
12 | A language for LLM prompt design: Guidance | Tutorial Video | 43:15
13 | You should use LangChain's Caching! | Tutorial Video | 25:37
14 | Build Chat AI apps with Streamlit + LangChain | Tutorial Video | 32:11

The full lesson playlist can be found here.

Quick Start

  1. Clone this repo
  2. Install requirements: pip install -r requirements.txt
  3. Some sample data is provided in the news folder, but you can use your own data by replacing the content (or adding to it) with your own text files.
  4. Create a .env file which contains your OpenAI API key. You can get one from here. HUGGINGFACEHUB_API_TOKEN and PINECONE_API_KEY are optional, but they are used in some of the lessons.
    • Lesson 10 uses Cohere and Stability AI, both of which offer a free tier (no credit card required). You can add the respective keys as COHERE_API_KEY and STABILITY_API_KEY in the .env file.

The .env file should look like this:

OPENAI_API_KEY=your_api_key_here

# optionals (not required for most of the series)
HUGGINGFACEHUB_API_TOKEN=your_api_token_here
PINECONE_API_KEY=your_api_key_here

HuggingFace and Pinecone are optional but recommended if you want to use the Inference API and explore models outside of the OpenAI ecosystem. The Inference API is demonstrated in Part 4 of the tutorial series.

  5. Run the examples in any order you want. For example, python 6_team.py will run the website Q&A example, which uses GPT-3 to answer questions about a company and the team of people working at Supertype.ai. Watch the corresponding video to follow along with each of the examples.
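
The .env setup from step 4 can be sketched in plain Python. This is a minimal, standard-library-only illustration and not necessarily how the scripts in this repo load their keys (they may rely on a helper package such as python-dotenv). The function names parse_env_file, load_env and missing_keys are hypothetical; the key names are the ones listed above.

```python
import os
from pathlib import Path

# Key names taken from the Quick Start notes; only the OpenAI key is required.
REQUIRED_KEYS = ["OPENAI_API_KEY"]
OPTIONAL_KEYS = [
    "HUGGINGFACEHUB_API_TOKEN",
    "PINECONE_API_KEY",
    "COHERE_API_KEY",
    "STABILITY_API_KEY",
]


def parse_env_file(path):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env(path=".env"):
    """Copy parsed values into os.environ without overriding existing ones."""
    values = parse_env_file(path)
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


def missing_keys(values):
    """Return (missing required keys, missing optional keys)."""
    required = [k for k in REQUIRED_KEYS if not values.get(k)]
    optional = [k for k in OPTIONAL_KEYS if not values.get(k)]
    return required, optional


if __name__ == "__main__":
    if Path(".env").exists():
        required, optional = missing_keys(load_env())
        if required:
            print("Missing required keys:", ", ".join(required))
        if optional:
            print("Optional keys not set:", ", ".join(optional))
    else:
        print("No .env file found in the current directory.")
```

Run from the repo root before starting a lesson, this gives a quick confirmation that OPENAI_API_KEY is set and shows which optional keys would still be needed for the HuggingFace, Pinecone, Cohere or Stability AI lessons.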

Dependencies

💡 Thanks to the work of @VanillaMacchiato, this project is updated as of 2023-06-30 to use the latest version of LlamaIndex (0.6.31) and LangChain (0.0.209). Installing the dependencies should be as simple as pip install -r requirements.txt. If you encounter any issues, please let me know.

If you're watching the LLM video tutorials, they may differ very slightly (typically 1-2 lines of code that need to be changed) from the code in this repo, since the videos were recorded with the library versions current at the time (LlamaIndex 0.5.7 and LangChain 0.0.157). Please refer to the code in this repo for the latest version.
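
If a script behaves differently from the video, checking the installed library versions is a sensible first step. The sketch below compares what is installed against the versions named above. It is an illustration only: check_versions is a hypothetical helper, and the distribution names "llama-index" and "langchain" are assumed to match what requirements.txt installs.

```python
from importlib import metadata

# Versions named in the Dependencies note; distribution names are assumptions.
PINNED = {"llama-index": "0.6.31", "langchain": "0.0.209"}


def check_versions(pinned):
    """Map each package name to 'ok', 'not installed', or a mismatch message."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if installed == expected:
            report[name] = "ok"
        else:
            report[name] = f"installed {installed}, expected {expected}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(PINNED).items():
        print(f"{name}: {status}")
```

A mismatch is not necessarily a problem, but it narrows down whether a failing example is caused by an API change between library versions.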

I will try to keep this repo up to date with the latest version of the libraries, but if you encounter any issues, please: (1) raise a discussion through Issues or (2) volunteer a PR to update the code.

Mentorship and Support

I run a mentorship program under Supertype Fellowship. The program is self-paced and free, with a community of other learners and practitioners around the world (English-speaking). You can optionally book a 1-on-1 session with my team of mentors to help you through video tutoring and code reviews.

License

MIT © Supertype 2023

More Repositories

  1. cvessentials (HTML, 165 stars): Tutorial Series (60 hour course): Essentials of computer vision
  2. elang (Python, 39 stars): Word Embedding utilities for Language Models (English & Indonesian)
  3. dataanalysis (Jupyter Notebook, 32 stars): Course Materials for Practical Data Analysis with Python and SQL
  4. pedagogy (Python, 21 stars): Pedagogy is a feedback-driven performance management app for education professionals built with Flask, Altair (Altair-viz) and pandas
  5. textmining (HTML, 21 stars): Beginner's Introduction to Text Mining: An App Store Reviews Exercise
  6. tacticaldataprep (HTML, 19 stars): Knowledge Review: Tactical Data Preparation (Python and R)
  7. emailnetwork (Python, 17 stars): Network graphing utilities for email/mailbox (.mbox) data
  8. darkershiny (R, 12 stars): A Shiny web app template using a dark theme with support for custom CSS
  9. taskquant (Python, 12 stars): A python CLI that extends taskwarrior for productivity scoreboard & gamification (quantified self)
  10. safeskies (HTML, 11 stars): Reproduce an Economist graph found on the article: [Safe Skies]
  11. youtube_api_python (Python, 10 stars): Working with the official YouTube's API in python
  12. ggplot2cheatsheet (HTML, 10 stars): A reproduction of the Beautiful Plotting in R: A ggplot2 cheatsheet by Zev Ross
  13. coronavirus (R, 9 stars): A Shiny Web App tutorial inspecting the COVID-19 (2019-nCoV) epidemic, data from https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series
  14. sqlalchemy-tutorial (HTML, 8 stars): Ground Up tutorial to SQLAlchemy
  15. tokopedia-fundamentals (Jupyter Notebook, 8 stars): Data Science Fundamentals in Python
  16. elangdev (Python, 8 stars): Development: Word Embedding utilities for Indonesian Language Models (NLP)
  17. miband (HTML, 7 stars): One-pager dashboard to visualize my runs from miband (mi fit) using python
  18. steganography (Python, 6 stars): Implementation of Least Significant Bits in Steganography (YouTube tutorial)
  19. summer (HTML, 6 stars): A summary bot that retrieves summarized wiki article on any topic, built with pyscript
  20. dsf2019 (HTML, 6 stars): Data Science Fundamentals (EDA, Data Visualization and Machine Learning in R) 2019 edition
  21. pyscript-demo (HTML, 6 stars): A demo of pyscript (python in the browser)
  22. automatetheboringstuff (Python, 6 stars): Python 3.6 code references and solutions for projects in Automate The Boring Stuff with Python
  23. py-networking (Python, 5 stars): Networking with Python
  24. infratools (HTML, 5 stars): Kickstart Session: Infrastructure and Tools for Data Science workshop materials
  25. python4bankers (Jupyter Notebook, 5 stars): Python for the Banking industry (Learning path and resources)
  26. pricemate (Python, 5 stars): A simple scraper for departure time and prices from Jakarta to Bandung from Tiket.com
  27. logisticregressionPy (HTML, 4 stars): Logistic Regression in Python
  28. nblite-pyscript (Jupyter Notebook, 4 stars): A demo of Eduardo's NBLite pyscript app
  29. blockchain (JavaScript, 4 stars): Interactive workbook on core blockchain concepts
  30. textcomplete (R, 4 stars): A next word prediction app ala Swiftkey
  31. soliditydocs (Solidity, 4 stars): Implementation of examples from docs.soliditylang.org
  32. rgraphics (HTML, 4 stars): Recreating an Economist-style plot with materials from Harvard's IQSS workshop
  33. datavisualization (Jupyter Notebook, 4 stars): Code notebooks and reference materials for the Data Visualization series on YouTube
  34. finhacks_bandung (HTML, 4 stars): Materials for the workshop conducted for Finhack 18
  35. automate2019 (3 stars): A python course on office automation w/ data science
  36. ballotapp (JavaScript, 3 stars): Ballot DApp (decentralized app) with React 18, web3.js and usedapp
  37. lebaran (R, 3 stars): Kickstart Data Science workshops: Lebaran theme
  38. stockmonitor (Python, 3 stars): A lightweight CLI script that pulls stock performance data and chart them
  39. generations-frontend (JavaScript, 3 stars): Front end for Fellowship by @supertypeai
  40. pyscript-guestbook (HTML, 3 stars): Building a guestbook with pyscript
  41. Medicare (R, 3 stars): Examining US medical expenditures dataset to identify the difference in costs for different medical conditions and in different areas of the country
  42. socialanalytics (JavaScript, 2 stars): Social Media Analytics dashboard (front end)
  43. firsto (Python, 2 stars): Django 2.0 tutorial from official documentation
  44. TFDL (Python, 2 stars): Companion notes for the TensorFlow for Deep Learning book by Ramsundar and Zadeh
  45. assessment (HTML, 2 stars): For Algoritma's pre-interview assessment
  46. advisory (R, 2 stars): Advisory investigates the underlying pattern of YouTube trending videos
  47. cybersec (HTML, 2 stars): Materials for Workshop: Cybersecurity and EDA on security incidents
  48. verisr2 (R, 2 stars): Convenience functions for exploratory analysis on VERIS database
  49. academy-da (2 stars): Data Analytics Specialization offered by Algoritma
  50. tensorflow (Jupyter Notebook, 2 stars): TensorFlow Tutorials
  51. webscraping (Python, 2 stars): Web scraping practice + exercise
  52. WebAnalytics (Jupyter Notebook, 2 stars): Data Analysis with Hotjar Web Analytics
  53. asciify (Python, 2 stars): Reference code and materials for the asciify video tutorial on my youtube channel
  54. googlecc (Jupyter Notebook, 2 stars): Google Machine Learning Crash Course
  55. accomplish (HTML, 2 stars): A multi-series tutorial walking through the development of a task manager app, CRUD operations, and a cohesive UI design using the latest from Bootstrap and Material Design.
  56. clfords (HTML, 1 star): Command Line for Data Science
  57. pyscript-altair (JavaScript, 1 star): A demo of a live PyScript dashboard made with Altair
  58. revconnexion (JavaScript, 1 star): RevConnextion is a RESTful API application built on top of Connexion and can be used as a standalone post-workshop survey system
  59. chained (JavaScript, 1 star): Understanding blockchain
  60. learnaltair (Jupyter Notebook, 1 star): Learning altair
  61. newsflash (HTML, 1 star): Following the latest announcement from the central bank of Indonesia
  62. onlyphantom (1 star)
  63. python-api-service (Python, 1 star)
  64. covidRT (HTML, 1 star): Code answers, references for a real-time covid 19 dashboard tutorial series in R