Large Language Models
This repo contains the notebooks and slides for the Large Language Models: Application through Production course on edX & Databricks Academy.
Notebooks
How to Import the Repo into Databricks?
- You first need to add Git credentials to Databricks. Refer to documentation here.
- Click Repos in the sidebar. Click Add Repo on the top right.
- Clone the "HTTPS" URL from GitHub, or copy https://github.com/databricks-academy/large-language-models.git and paste it into the Git repository URL box. The rest of the fields, i.e. Git provider and Repository name, will be automatically populated. Click Create Repo on the bottom right.
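If you would rather script the clone than click through the UI, the same step can be sketched against the Databricks Repos REST API. This is a minimal sketch, not part of the course material: the `/api/2.0/repos` endpoint and the `gitHub` provider string are assumed from the public Databricks REST API, and `host`, `token`, and `workspace_path` are hypothetical values you supply yourself. The function only builds the request; nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Repository URL taken from the instructions above.
REPO_URL = "https://github.com/databricks-academy/large-language-models.git"


def build_repo_request(host, token, workspace_path):
    """Build (but do not send) a request asking Databricks to clone the repo.

    host, token and workspace_path are hypothetical placeholders:
    your workspace URL, a personal access token, and a target path
    such as /Repos/<your-user>/large-language-models.
    """
    payload = {
        "url": REPO_URL,
        "provider": "gitHub",  # assumption: provider name expected by the Repos API
        "path": workspace_path,
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/repos",  # assumption: Repos API endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually create the repo, call `urllib.request.urlopen(build_repo_request(host, token, workspace_path))` with real values. Git credentials must already be configured in Databricks, as in the first step.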
How to Import the files from .dbc releases on GitHub
- You can download the notebooks from a release by navigating to the releases section on the GitHub page.
- From the releases page, download the .dbc file. This contains all of the course notebooks, with the structure and metadata.
- In your Databricks workspace, navigate to the Workspace menu, click on Home and select Import.
- Using the import tool, navigate to the location on your computer where you downloaded the .dbc file. Once you select the file, click Import, and the files will be loaded and extracted to your workspace.
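The import step can also be scripted. The sketch below assumes the Databricks Workspace REST API (`/api/2.0/workspace/import` with `format` set to `DBC` and base64-encoded `content`); `host`, `token`, and `target_path` are hypothetical values, and the course itself only describes the UI route. As written, the function builds the request without sending it.

```python
import base64
import json
import urllib.request


def build_dbc_import_request(host, token, dbc_bytes, target_path):
    """Build (but do not send) a Workspace import request for a .dbc archive.

    dbc_bytes is the raw content of the downloaded .dbc file;
    target_path is a hypothetical workspace folder that does not exist yet,
    for example /Users/<your-user>/large-language-models.
    """
    payload = {
        "path": target_path,
        "format": "DBC",  # assumption: format name used by the Workspace API
        "content": base64.b64encode(dbc_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/workspace/import",  # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A typical call would read the archive with `open(path, "rb").read()` and pass the resulting request to `urllib.request.urlopen`.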
Cluster settings
Which Databricks cluster should I use?
- First, select Single Node.
- This courseware has been tested on Databricks Runtime 13.1 for Machine Learning. If you do not have access to a 13.1 ML Runtime cluster, you will need to install many additional libraries (as the ML Runtime pre-installs many commonly used machine learning packages), and this courseware is not guaranteed to run.
- All of the notebooks except LLM 04a - Fine-tuning LLMs and LLM04L - Fine-tuning LLMs Lab run fine on a CPU. We recommend either i3.xlarge or i3.2xlarge (i3.2xlarge will have slightly faster performance).
- For LLM 04a - Fine-tuning LLMs and LLM04L - Fine-tuning LLMs Lab, you will need the Databricks Runtime 13.1 for Machine Learning with GPU. Select the GPU instance type g5.2xlarge.
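The cluster choices above can be summarized as a small helper that returns a cluster specification for a given notebook. This is only an illustrative sketch: the runtime version strings (`13.1.x-cpu-ml-scala2.12`, `13.1.x-gpu-ml-scala2.12`) and the single-node settings are assumptions based on the usual Databricks Clusters API conventions, so check them against the runtime list in your own workspace. The function name and the `faster_cpu` flag are hypothetical.

```python
# Notebooks that need a GPU, as named in the section above.
GPU_NOTEBOOKS = {
    "LLM 04a - Fine-tuning LLMs",
    "LLM04L - Fine-tuning LLMs Lab",
}


def cluster_spec(notebook_name, faster_cpu=False):
    """Return a single-node cluster spec matching the recommendations above."""
    if notebook_name in GPU_NOTEBOOKS:
        # Assumed version key for Databricks Runtime 13.1 ML with GPU.
        runtime = "13.1.x-gpu-ml-scala2.12"
        node_type = "g5.2xlarge"
    else:
        # Assumed version key for Databricks Runtime 13.1 ML (CPU).
        runtime = "13.1.x-cpu-ml-scala2.12"
        node_type = "i3.2xlarge" if faster_cpu else "i3.xlarge"
    return {
        "spark_version": runtime,
        "node_type_id": node_type,
        # Assumed single-node settings: no workers, driver-only Spark.
        "num_workers": 0,
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }
```

For example, `cluster_spec("LLM 04a - Fine-tuning LLMs")` selects the GPU runtime with `g5.2xlarge`, while any other notebook name falls back to `i3.xlarge` on the CPU runtime.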
Install datasets and models
How do I install the datasets and models locally?
Slides
Where do I download course slides?
Please click the latest version under the Releases section. You will be able to download the slides in PDF format.