TorchServe
TorchServe is a flexible and easy-to-use tool for serving and scaling PyTorch models in production.
Requires Python >= 3.8

Once a model is deployed, running inference is a single HTTP call:

curl http://127.0.0.1:8080/predictions/bert -T input.txt
🚀 Quick start with TorchServe
# Install dependencies
# cuda is optional
python ./ts_scripts/install_dependencies.py --cuda=cu102
# Latest release
pip install torchserve torch-model-archiver torch-workflow-archiver
# Nightly build
pip install torchserve-nightly torch-model-archiver-nightly torch-workflow-archiver-nightly
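Once installed, the usual workflow is to package a trained model into a model archive (.mar) with torch-model-archiver and point torchserve at it. The sketch below assumes the densenet161 image-classification example from the TorchServe repository; the model, weight and label file names are illustrative:

# Package a trained model into a .mar archive
# (model.py, the .pth weights and index_to_name.json come from the
# densenet161 example; substitute your own files)
mkdir -p model_store
torch-model-archiver --model-name densenet161 --version 1.0 \
    --model-file model.py --serialized-file densenet161-8d451a50.pth \
    --handler image_classifier --extra-files index_to_name.json \
    --export-path model_store
# Start TorchServe and load the archive from the model store
torchserve --start --ncs --model-store model_store --models densenet161.mar
# Stop the server when done
torchserve --stop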
🚀 Quick start with TorchServe (conda)
# Install dependencies
# cuda is optional
python ./ts_scripts/install_dependencies.py --cuda=cu102
# Latest release
conda install -c pytorch torchserve torch-model-archiver torch-workflow-archiver
# Nightly build
conda install -c pytorch-nightly torchserve torch-model-archiver torch-workflow-archiver
🐳 Quick Start with Docker
# Latest release
docker pull pytorch/torchserve
# Nightly build
docker pull pytorch/torchserve-nightly
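A minimal sketch of running the container, assuming a local model_store directory; it publishes TorchServe's default inference (8080), management (8081) and metrics (8082) ports and mounts the store where the official image looks for it:

# Run TorchServe in a container with the default ports exposed
# (the local model_store path is illustrative)
docker run --rm -it -p 8080:8080 -p 8081:8081 -p 8082:8082 \
    -v $(pwd)/model_store:/home/model-server/model-store \
    pytorch/torchserve:latest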
Refer to the torchserve docker documentation for details.
⚡ Why TorchServe
- Model Management API: multi-model management with optimized worker-to-model allocation (see the curl sketch after this list)
- Inference API: REST and gRPC support for batched inference
- TorchServe Workflows: deploy complex DAGs with multiple interdependent models
- Default way to serve PyTorch models in Kubeflow, MLflow, Sagemaker, Kserve and Vertex AI
- Export your model for optimized inference: TorchScript out of the box, plus ORT/ONNX, IPEX, TensorRT and FasterTransformer
- Performance Guide: built-in support to optimize, benchmark and profile PyTorch and TorchServe performance
- Expressive handlers: a handler architecture that makes it trivial to support inference for your use case, with many handlers supported out of the box
- Metrics API: out-of-the-box support for system-level metrics with Prometheus exports, custom metrics and PyTorch profiler support
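To make the Management, Inference and Metrics API bullets above concrete, here is a minimal sketch against TorchServe's default ports (8080 inference, 8081 management, 8082 metrics), assuming a densenet161.mar archive sits in the configured model store:

# Register a model and give it one initial worker (Management API)
curl -X POST "http://localhost:8081/models?url=densenet161.mar&initial_workers=1"
# Scale the model to three workers
curl -X PUT "http://localhost:8081/models/densenet161?min_worker=3"
# Run a prediction (Inference API)
curl http://localhost:8080/predictions/densenet161 -T kitten.jpg
# Scrape Prometheus-format metrics (Metrics API)
curl http://localhost:8082/metrics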
🤔 How does TorchServe work
- Model Server for PyTorch Documentation: Full documentation
- TorchServe internals: How TorchServe was built
- Contributing guide: How to contribute to TorchServe
🏆 Highlighted Examples
- 🤗 HuggingFace Transformers with a Better Transformer Integration
- Model parallel inference
- MultiModal models with MMF combining text, audio and video
- Dual Neural Machine Translation for a complex workflow DAG
- TorchServe Integrations
- TorchServe Internals
- TorchServe UseCases
For more examples, browse the examples directory of the TorchServe repository.
🤓 Learn More
💖 Contributing
We welcome all contributions!
To learn more about how to contribute, see the contributor guide.
📰 News
- Torchserve Performance Tuning, Animated Drawings Case-Study
- Walmart Search: Serving Models at a Scale on TorchServe
- 🎥 Scaling inference on CPU with TorchServe
- 🎥 TorchServe C++ backend
- Grokking Intel CPU PyTorch performance from first principles: a TorchServe case study
- Grokking Intel CPU PyTorch performance from first principles (Part 2): a TorchServe case study
- Case Study: Amazon Ads Uses PyTorch and AWS Inferentia to Scale Models for Ads Processing
- Optimize your inference jobs using dynamic batch inference with TorchServe on Amazon SageMaker
- Using AI to bring children's drawings to life
- 🎥 Model Serving in PyTorch
- Evolution of Cresta's machine learning architecture: Migration to AWS and PyTorch
- 🎥 Explain Like I'm 5: TorchServe
- 🎥 How to Serve PyTorch Models with TorchServe
- How to deploy PyTorch models on Vertex AI
- Quantitative Comparison of Serving Platforms
- Efficient Serverless deployment of PyTorch models on Azure
- Deploy PyTorch models with TorchServe in Azure Machine Learning online endpoints
- Dynaboard moving beyond accuracy to holistic model evaluation in NLP
- An MLOps Tale about operationalising MLFlow and PyTorch
- Operationalize, Scale and Infuse Trust in AI Models using KFServing
- How Wadhwani AI Uses PyTorch To Empower Cotton Farmers
- TorchServe Streamlit Integration
- Dynabench aims to make AI models more robust through distributed human workers
- Announcing TorchServe
👏 All Contributors
Made with contrib.rocks.
⚖️ Disclaimer
This repository is jointly operated and maintained by Amazon, Meta, and a number of individual contributors listed in the CONTRIBUTORS file. For questions directed at Meta, please send an email to [email protected]. For questions directed at Amazon, please send an email to [email protected]. For all other questions, please open an issue in this repository.
TorchServe acknowledges the Multi Model Server (MMS) project from which it was derived.