  • Stars: 220
  • Rank: 175,276 (Top 4 %)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created over 2 years ago
  • Updated over 1 year ago

Enterprise Scale NLP with Hugging Face & SageMaker Workshop series

Workshop: Enterprise-Scale NLP with Hugging Face & Amazon SageMaker

Earlier this year we announced a strategic collaboration with Amazon to make it easier for companies to use Hugging Face Transformers in Amazon SageMaker, and ship cutting-edge Machine Learning features faster. We introduced new Hugging Face Deep Learning Containers (DLCs) to train and deploy Hugging Face Transformers in Amazon SageMaker.

In addition to the Hugging Face Inference DLCs, we created a Hugging Face Inference Toolkit for SageMaker. This Inference Toolkit leverages the pipelines from the transformers library to allow zero-code deployments of models, without requiring any code for pre- or post-processing.
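As a rough illustration of what "zero-code" means here, the sketch below configures a deployment purely through environment variables that the Inference Toolkit reads (`HF_MODEL_ID` and `HF_TASK`). The model ID, the instance type, and the container version combination are assumptions chosen for the example, not values taken from the workshop. The actual deployment is wrapped in a function because it needs the `sagemaker` SDK, AWS credentials, and an IAM role.

```python
def hub_model_env(model_id: str, task: str) -> dict:
    """Environment variables read by the Hugging Face Inference Toolkit."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_from_hub(role_arn: str, model_id: str, task: str,
                    instance_type: str = "ml.m5.xlarge"):
    """Create a real-time endpoint. Needs AWS credentials and the sagemaker SDK."""
    from sagemaker.huggingface import HuggingFaceModel  # imported lazily

    model = HuggingFaceModel(
        env=hub_model_env(model_id, task),
        role=role_arn,
        # Assumed DLC version combination; check which ones are currently available.
        transformers_version="4.26",
        pytorch_version="1.13",
        py_version="py39",
    )
    # No inference script is passed: the toolkit builds a transformers pipeline.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Example model ID and task, assumed for illustration only.
env = hub_model_env(
    "distilbert-base-uncased-finetuned-sst-2-english", "text-classification"
)
print(env)
```

With an endpoint created this way, a request would typically look like `predictor.predict({"inputs": "I love this workshop"})`, where `predictor` is the object returned by `deploy_from_hub`.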

In October and November, we held a workshop series on “Enterprise-Scale NLP with Hugging Face & Amazon SageMaker”. This workshop series consisted of 3 parts and covered:

  • Getting Started with Amazon SageMaker: Training your first NLP Transformer model with Hugging Face and deploying it
  • Going Production: Deploying, Scaling & Monitoring Hugging Face Transformer models with Amazon SageMaker
  • MLOps: End-to-End Hugging Face Transformers with the Hub & SageMaker Pipelines

We recorded all of them, so you can now work through the whole workshop series on your own to enhance your Hugging Face Transformers skills with Amazon SageMaker, or vice versa.

Below you can find all the details of each workshop and how to get started.

🧑🏻‍💻 GitHub Repository: https://github.com/philschmid/huggingface-sagemaker-workshop-series

📺 YouTube Playlist: https://www.youtube.com/playlist?list=PLo2EIpI_JMQtPhGR5Eo2Ab0_Vb89XfhDJ

Note: The repository contains instructions on how to access a temporary AWS account, which was available during the workshops. To do the workshop now, you need to use your own or your company's AWS account.

In addition to the workshop, we created dedicated documentation for Hugging Face and Amazon SageMaker, which includes all the necessary information. If the workshop is not enough for you, we also have a GitHub repository with 15 additional getting-started sample notebooks, which cover topics like distributed training or leveraging Spot Instances.

Workshop 1: Getting Started with Amazon SageMaker: Training your first NLP Transformer model with Hugging Face and deploying it

In Workshop 1 you will learn how to use Amazon SageMaker to train a Hugging Face Transformer model and deploy it afterwards.

  • Prepare and upload a test dataset to S3
  • Prepare a fine-tuning script to be used with Amazon SageMaker Training jobs
  • Launch a training job and store the trained model into S3
  • Deploy the model after successful training
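The steps above can be sketched roughly as follows. This is a minimal outline, not the workshop's actual notebook: the bucket name, prefix, script name (`train.py`), source directory, hyperparameter names, instance type, and container versions are all assumptions for illustration. The training launch is wrapped in a function because it needs the `sagemaker` SDK and AWS credentials.

```python
def training_config(bucket: str, prefix: str, model_name: str, epochs: int = 1) -> dict:
    """Collect the hyperparameters and S3 input channels for a training job."""
    hyperparameters = {
        "model_name": model_name,  # hypothetical argument of the training script
        "epochs": epochs,
        "train_batch_size": 32,
    }
    channels = {
        "train": f"s3://{bucket}/{prefix}/train",
        "test": f"s3://{bucket}/{prefix}/test",
    }
    return {"hyperparameters": hyperparameters, "channels": channels}


def launch_training(role_arn: str, config: dict):
    """Start a SageMaker training job. Needs AWS credentials and the sagemaker SDK."""
    from sagemaker.huggingface import HuggingFace  # imported lazily

    estimator = HuggingFace(
        entry_point="train.py",        # assumed name of the fine-tuning script
        source_dir="./scripts",        # assumed location of the script
        instance_type="ml.p3.2xlarge",
        instance_count=1,
        role=role_arn,
        transformers_version="4.26",   # assumed DLC version combination
        pytorch_version="1.13",
        py_version="py39",
        hyperparameters=config["hyperparameters"],
    )
    # Each channel is made available inside the container,
    # e.g. via the SM_CHANNEL_TRAIN and SM_CHANNEL_TEST environment variables.
    estimator.fit(config["channels"])
    return estimator


config = training_config("my-bucket", "workshop", "distilbert-base-uncased")
print(config["channels"])
```

After `fit` returns, the trained model artifact is stored in S3 (its location is available as `estimator.model_data`), and `estimator.deploy(...)` can be used for the final step.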

🧑🏻‍💻 Code Assets: https://github.com/philschmid/huggingface-sagemaker-workshop-series/tree/main/workshop_1_getting_started_with_amazon_sagemaker

📺 YouTube: https://www.youtube.com/watch?v=pYqjCzoyWyo&list=PLo2EIpI_JMQtPhGR5Eo2Ab0_Vb89XfhDJ&index=6&t=5s&ab_channel=HuggingFace

Workshop 2: Going Production: Deploying, Scaling & Monitoring Hugging Face Transformer models with Amazon SageMaker

In Workshop 2 you will learn how to use Amazon SageMaker to deploy, scale & monitor your Hugging Face Transformer models for production workloads.

  • Run Batch Prediction on JSON files using a Batch Transform
  • Deploy a model from hf.co/models to Amazon SageMaker and run predictions
  • Configure autoscaling for the deployed model
  • Monitor the model to see avg. request time and set up alarms
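For the autoscaling step, one common approach is a target-tracking policy registered through Application Auto Scaling. The sketch below only builds the two request payloads; the endpoint name, capacity limits, target value, and cooldowns are assumptions for illustration, and `AllTraffic` is assumed to be the variant name (the default when deploying with the SageMaker Python SDK). Applying the requests needs `boto3` and AWS credentials.

```python
def autoscaling_requests(endpoint_name: str, variant: str = "AllTraffic",
                         min_capacity: int = 1, max_capacity: int = 4,
                         invocations_per_instance: float = 10.0):
    """Build the payloads for register_scalable_target and put_scaling_policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }
    policy = {
        "PolicyName": f"{endpoint_name}-invocations-scaling",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,   # seconds, assumed
            "ScaleOutCooldown": 60,   # seconds, assumed
        },
    }
    return target, policy


def apply_autoscaling(target: dict, policy: dict) -> None:
    """Send both requests. Needs AWS credentials and boto3."""
    import boto3  # imported lazily

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


target, policy = autoscaling_requests("my-hf-endpoint")
print(target["ResourceId"])
```

Keeping the payload construction separate from the API calls makes the configuration easy to review before anything is changed in the account.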

🧑🏻‍💻 Code Assets: https://github.com/philschmid/huggingface-sagemaker-workshop-series/tree/main/workshop_2_going_production

📺 YouTube: https://www.youtube.com/watch?v=whwlIEITXoY&list=PLo2EIpI_JMQtPhGR5Eo2Ab0_Vb89XfhDJ&index=6&t=61s

Workshop 3: MLOps: End-to-End Hugging Face Transformers with the Hub & SageMaker Pipelines

In Workshop 3 you will learn how to build an End-to-End MLOps Pipeline for Hugging Face Transformers from training to production using Amazon SageMaker.

We are going to create an automated SageMaker Pipeline which:

  • processes a dataset and uploads it to S3
  • fine-tunes a Hugging Face Transformer model with the processed dataset
  • evaluates the model against an evaluation set
  • deploys the model if it performed better than a certain threshold
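The control flow of such a pipeline can be illustrated in plain Python. This is not the SageMaker Pipelines SDK and not the workshop's code: the four step functions are hypothetical stand-ins, and in a real pipeline they would correspond to processing, training, evaluation, and condition steps. The point is only to show the threshold gate in front of deployment.

```python
from typing import Any, Callable, Optional


def run_pipeline(process: Callable[[], Any],
                 train: Callable[[Any], Any],
                 evaluate: Callable[[Any], float],
                 deploy: Callable[[Any], Any],
                 threshold: float) -> Optional[Any]:
    """Run the four stages in order and deploy only above the threshold."""
    dataset = process()          # process the dataset (and upload it)
    model = train(dataset)       # fine-tune on the processed dataset
    score = evaluate(model)      # score against the evaluation set
    if score > threshold:        # "better than a certain threshold"
        return deploy(model)
    return None                  # model is not deployed


# Stand-in steps with made-up values, for illustration only.
result = run_pipeline(
    process=lambda: ["example text"],
    train=lambda dataset: {"name": "demo-model", "samples": len(dataset)},
    evaluate=lambda model: 0.91,
    deploy=lambda model: f"endpoint-for-{model['name']}",
    threshold=0.8,
)
print(result)
```

A model scoring exactly at the threshold is not deployed here, since the comparison is strict; whether the gate should be strict is a design choice for the pipeline author.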

🧑🏻‍💻 Code Assets: https://github.com/philschmid/huggingface-sagemaker-workshop-series/tree/main/workshop_3_mlops

📺 YouTube: https://www.youtube.com/watch?v=XGyt8gGwbY0&list=PLo2EIpI_JMQtPhGR5Eo2Ab0_Vb89XfhDJ&index=7

Access Workshop AWS Account

For this workshop you’ll get access to a temporary AWS Account already pre-configured with Amazon SageMaker Notebook Instances. Follow the steps in this section to log in to your AWS Account and download the workshop material.

1. To get started, navigate to https://dashboard.eventengine.run/login and click on Accept Terms & Login

2. Click on Email One-Time OTP (Allow for up to 2 mins to receive the passcode)

3. Provide your email address

4. Enter your OTP code

5. Click on AWS Console

6. Click on Open AWS Console

7. In the AWS Console click on Amazon SageMaker

8. Click on Notebook and then on Notebook instances

9. Create a new Notebook instance

10. Configure Notebook instances

  • Make sure to increase the volume size of the Notebook instance if you want to work with big models and datasets
  • Add your IAM role with permissions to run your SageMaker training and inference jobs
  • Add the workshop GitHub repository to the Notebook to preload the notebooks: https://github.com/philschmid/huggingface-sagemaker-workshop-series.git
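The same three settings can also be applied programmatically instead of through the console. The sketch below builds a request for the SageMaker `create_notebook_instance` API; the instance name, instance type, volume size, and role ARN are assumptions for illustration. Sending the request needs `boto3` and AWS credentials.

```python
WORKSHOP_REPO = "https://github.com/philschmid/huggingface-sagemaker-workshop-series.git"


def notebook_instance_request(name: str, role_arn: str,
                              volume_size_gb: int = 50,
                              instance_type: str = "ml.t3.medium") -> dict:
    """Build the request for a notebook instance with the workshop repo preloaded."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,           # assumed instance type
        "RoleArn": role_arn,                     # IAM role with SageMaker permissions
        "VolumeSizeInGB": volume_size_gb,        # larger volume for big models and datasets
        "DefaultCodeRepository": WORKSHOP_REPO,  # cloned into the instance on start
    }


def create_notebook_instance(request: dict):
    """Send the request. Needs AWS credentials and boto3."""
    import boto3  # imported lazily

    return boto3.client("sagemaker").create_notebook_instance(**request)


# Placeholder name and role ARN, for illustration only.
request = notebook_instance_request(
    "hf-workshop", "arn:aws:iam::111122223333:role/example-sagemaker-role"
)
print(request["VolumeSizeInGB"])
```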

11. Open the Lab, select the right kernel, and have fun!

Open the folder of the workshop you want to do (for example workshop_1_getting_started_with_amazon_sagemaker/) and select the pytorch kernel.
