DJLServing v0.26.0 Release

@siddvenk siddvenk released this 27 Feb 21:02
· 6 commits to 0.26.0-dlc since this release

Key Changes

  • TensorRT-LLM 0.7.1 upgrade, including support for the Mixtral 8x7B MoE model
  • Optimum Neuron support
  • Transformers-NeuronX 2.16 upgrade, including support for continuous batching
  • LlamaCPP support
  • Many documentation updates, including updated model deployment configurations
  • Refactor of configuration management across different backends
  • CUDA 12.1 support for DeepSpeed and TensorRT-LLM containers

Enhancements

Bug Fixes

Documentation Updates

CI/CD Updates

Full Changelog: v0.25.0...v0.26