TorchServe v0.3.0 Release Notes (Beta)
This is the release of TorchServe v0.3.0.
Highlights:
- Native Windows support - Added support for TorchServe on Windows 10 Pro and Windows Server 2019
- KFServing Integration - Added support for v1 KFServing predict and explain APIs with auto-scaling and canary deployments for serving models in Kubeflow/KFServing
- MLflow-TorchServe - New MLflow TorchServe deployment plugin for serving models as part of the MLflow MLOps lifecycle
- Captum explanations - Added an explain API for Captum-based model interpretability across different models
- AKS Support - Added support for TorchServe deployment on Azure Kubernetes Service
- GKE Support - Added support for TorchServe deployment on Google Kubernetes Engine
- gRPC support - Added support for gRPC based management and inference APIs
- Request Envelopes - Added support for request envelopes, which parse requests from multiple model-serving frameworks such as Seldon and KFServing without any modifications to the handler code
- PyTorch 1.7.1 support - TorchServe is now certified working with torch 1.7.1, torchvision 0.8.2, torchtext 0.8.1, and torchaudio 0.7.2
- TorchServe Profiling - Added end-to-end profiling of inference requests. The time taken for different events by TorchServe for an inference request is captured in TorchServe metrics logs
- Serving SDK - Released TorchServe Serving SDK 0.4.0 on Maven with contracts/interfaces for the Metrics Endpoint plugin and Snapshot plugins
- Naked DIR support - Added support for Model Archives as naked DIRs via the `--archive-format no-archive` option
- Local file URL support - Added support for registering models through local file (`file:///`) URLs
- Install dependencies - Added a more robust install-dependency script certified across different OS platforms (Ubuntu 18.04, macOS, Windows 10 Pro, Windows Server 2019)
- Link Checker - Added link checker in sanity script to report any broken links in documentation
- Enhanced model description - Added GPU usage info and worker PID in model description
- FAQ guides - Added answers to the questions most frequently asked by community users
- Troubleshooting guide - Added documentation for troubleshooting common problems related to model serving by TorchServe
- Use case guide - Added a reference guide covering the different ways TorchServe can be deployed to serve different types of PyTorch models
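As a sketch of how the naked-DIR and local file URL features can be used (the model name, file paths, and handler below are hypothetical, not from this release), a model can be archived without compression, and a `.mar` file can be registered through the management API with a `file:///` URL:

```shell
# Archive a model as a naked DIR instead of a compressed .mar file.
# Model name, weights file, and handler here are illustrative placeholders.
torch-model-archiver --model-name my_model \
    --version 1.0 \
    --serialized-file model.pt \
    --handler image_classifier \
    --export-path model_store \
    --archive-format no-archive

# Register a model archive through a local file URL on the default
# management API port (8081); the path is a hypothetical example.
curl -X POST "http://localhost:8081/models?url=file:///path/to/my_model.mar"
```

Both snippets assume a running TorchServe instance with default ports and a populated model store; adjust paths and model names to your setup.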
Other PRs since v0.2.0
Bug Fixes:
- Fixed unbound variable issue while creating binaries from script #595
- Fixed model latency calculation logic #630
- Treat application/x-www-form-urlencoded as binary data #705
- Fixed handling of `socket.send`, which does not guarantee that all data is sent #765
- Fixed bug in create_mar.sh script of Text_to_Speech_Synthesizer #704
- Docker fixes #709 #724 #642 #823 #839 #853 #880
- Unit and regression test fixes #774 #775 #827 #845 #858 #852
- Install scripts fixes #798 #837 #844 #836
- Benchmark fixes #768
- Dependency fixes #757 #820
- Temp path fixes #877 #638
- Migrate model urls #697 #696 #695
Others
- Added metrics endpoint to cfn templates and k8s setup #670 #747
- Environment information header in regression and sanity suite #622 #865 #863
- Documentation changes and fixes #754 #470 #816 #584 #872 #871 #879 #739
- FairSeq language translation example #592
- Additional regression tests for KFServing #855
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04)
Getting Started with TorchServe
You can get started at https://pytorch.org/serve/ with installation instructions, tutorials, and docs.
Lastly, if you have questions, please post them in the PyTorch discussion forums using the 'deployment' tag, or file an issue on GitHub with steps to reproduce.