Releases · microsoft/DeepSpeed-MII
DeepSpeed-MII v0.1.0
DeepSpeed MII v0.0.9
What's Changed
- Update version.txt after 0.0.8 release by @loadams in #246
- Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support by @ringohoffman in #240 (see the sketch after this list)
- Improved the code quality to ease future maintenance by @blackmambaza in #248
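The pydantic_v1 module from #240 papers over the pydantic 1.x/2.x API split. A minimal sketch of such a compatibility shim, assuming the usual re-export pattern (the exported names are illustrative, not the exact contents of MII's module):

```python
# pydantic_v1.py -- illustrative compatibility shim, not MII's exact module.
# pydantic>=2.0.0 re-exports the legacy 1.x API under `pydantic.v1`.
try:
    from pydantic.v1 import BaseModel, Field, validator  # pydantic >= 2.0.0
except ImportError:
    from pydantic import BaseModel, Field, validator  # pydantic < 2.0.0
```

Callers then import these names from the shim instead of from `pydantic` directly, so the same code runs against either major version.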
New Contributors
- @ringohoffman made their first contribution in #240
- @blackmambaza made their first contribution in #248
Full Changelog: v0.0.8...v0.0.9
DeepSpeed MII v0.0.8
What's Changed
- Fixes for AML metatensor loading by @mrwyattii in #244 (see the sketch after this list)
- Update autoPR creation in release script by @loadams in #245
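For context on #244: meta tensors let a model skeleton be built without allocating real weights, which a checkpoint loader (e.g. DeepSpeed-Inference) fills in afterwards. A general sketch of the pattern, not the AML-specific fix:

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Construct the architecture on the "meta" device: parameters carry shape/dtype only,
# so no host or GPU memory is allocated until real weights are injected later.
config = AutoConfig.from_pretrained("gpt2")
with torch.device("meta"):  # torch.device as a context manager requires torch >= 2.0
    model = AutoModelForCausalLM.from_config(config)
```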
Full Changelog: v0.0.7...v0.0.8
DeepSpeed MII v0.0.7
What's Changed
- Cache HF API results by @mrwyattii in #233 (see the sketch after this list)
- Refactor Configs by @mrwyattii in #218
- Re-enable non persistent test cases by @mrwyattii in #238
- Add PyPI release workflow by @mrwyattii in #239
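Regarding #233: querying the Hugging Face Hub for model metadata on every call is slow, so repeated lookups are cached. A rough sketch of the idea using in-memory memoization (the function name and the cached fields are hypothetical, not MII's actual implementation):

```python
from functools import lru_cache
from huggingface_hub import HfApi

@lru_cache(maxsize=None)
def list_model_files(model_name: str) -> tuple:
    """Return the repo's file names; repeated calls are served from the cache."""
    return tuple(f.rfilename for f in HfApi().model_info(model_name).siblings)
```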
Full Changelog: v0.0.6...v0.0.7
DeepSpeed MII v0.0.6
What's Changed
- Fix error of creating session by @tohtana in #217
- Auto-generate host files by @mrwyattii in #228
- update txt2img example to sd 1.5 by @lalalune in #73
- refactor(example): conversational-query by @Thytu in #79
- Free torch cache after loading model by @mrwyattii in #121 (see the sketch after this list)
- Add distilbert support by @jeffra in #126
- Update Stable Diffusion to match latest DeepSpeed-Inference by @mrwyattii in #229
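On #121: after model weights are moved onto the GPU, the caching allocator can still hold blocks that are no longer needed. A minimal sketch of releasing them, using standard PyTorch calls rather than MII-specific code:

```python
import gc
import torch

def free_torch_cache() -> None:
    # Drop unreachable Python objects first, then return cached CUDA blocks to the driver.
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```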
Full Changelog: v0.0.5...v0.0.6
DeepSpeed MII v0.0.5
What's Changed
- Tests use transformers cache for model storage by @mrwyattii in #122
- Decouple conversions for gRPC from server/client code by @tohtana in #138
- Separate server and client by @tohtana in #142
- Add max_tokens option to mii_config by @mallorbc in #129
- Fix for CPU device error by @mrwyattii in #148
- Load balancing and multiple replicas by @tohtana in #147
- RESTful API support by @tohtana in #154
- Add Apache 2.0 License by @mrwyattii in #165
- Fix condition to terminate RESTful API gateway by @tohtana in #175
- Add lock to serialize pipeline execution by @tohtana in #176
- Add session to enable multi-turn conversation by @tohtana in #177
- Update CI by @mrwyattii in #196
- Fix hostfile generation for replicas by @tohtana in #192
- Fix deployment name in AML examples by @novaturient95 in #193
- Refactored all grpc methods in method_table by @TosinSeg in #202
- Add Non-persistent deployment type by @TosinSeg in #197
- add llama and update readme counts by @jeffra in #206
- Generalize meta tensor pipeline by @mrwyattii in #199
- Always enable load balancing by @TosinSeg in #205
- Improve unit tests by @mrwyattii in #209
- Adding trust_remote_code support by @msinha251 in #203 (see the sketch after this list)
- Update AML Deployment by @mrwyattii in #211
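For #203: `trust_remote_code` allows Hugging Face models that ship their own modeling code to be loaded; MII presumably forwards the flag to the underlying loader. A plain transformers illustration (the model name is just an example of a repo with custom code, not something mandated by MII):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Only enable trust_remote_code for repositories you have reviewed:
# it executes Python code downloaded from the model repo.
model_name = "mosaicml/mpt-7b"  # example of a repo that ships custom modeling code
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
```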
New Contributors
- @tohtana made their first contribution in #138
- @mallorbc made their first contribution in #129
- @novaturient95 made their first contribution in #193
- @TosinSeg made their first contribution in #202
- @msinha251 made their first contribution in #203
Full Changelog: v0.0.4...v0.0.5
DeepSpeed MII v0.0.4
What's Changed
- Add community tutorial by @jeffra in #104
- Add xlm-roberta by @jeffra in #113
- Use new DS-Inference config API by @mrwyattii in #108
- Improvements to VRAM usage when loading HF models by @mrwyattii in #105 (see the sketch after this list)
- Add Bloom model config checks for HF provider by @mrwyattii in #107
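On #105: a common way to cut peak memory when loading HF checkpoints is to avoid materializing randomly initialized weights before the checkpoint is read. A generic sketch using standard transformers options (not necessarily the exact change in #105):

```python
import torch
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage builds the model with empty weights and streams the checkpoint in,
# avoiding a second full copy of the parameters in host memory (requires `accelerate`).
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
)
```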
Full Changelog: v0.0.3...v0.0.4
DeepSpeed MII v0.0.3
What's Changed
- fix typo by @TahaBinhuraib in #70
- move deserialize logic into wrapper to match diffusers api by @jeffra in #77
- bump min deepspeed to 0.7.4 by @jeffra in #82
- docs(deploy): update handled tasks by @Thytu in #81
- fix(example): QnA model name by @Thytu in #80
- Fix aio config dict in ZeRO text-gen example by @mrwyattii in #88
- Support latest Transformers and new cache design by @mrwyattii in #69
- use hf_auth_token in mii_config to huggingface provider by @l-yohai in #76 (see the sketch after this list)
- SD tutorial and benchmarks by @jeffra in #90
- enable cuda graph for diffusers models by @jeffra in #96
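On #76: gated or private Hub models need an access token, which the Hugging Face provider reads from `mii_config`. A sketch assuming the legacy (pre-0.1.0) `mii.deploy` API of this era; the exact signature and the placeholder model name are assumptions, only the `hf_auth_token` key comes from the PR title:

```python
import mii

# Sketch of the legacy deployment API; argument names may differ from your MII version.
mii.deploy(
    task="text-generation",
    model="<org>/<private-or-gated-model>",
    deployment_name="private_model_deployment",
    mii_config={"hf_auth_token": "<your HF access token>"},
)
```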
New Contributors
- @TahaBinhuraib made their first contribution in #70
- @Thytu made their first contribution in #81
- @l-yohai made their first contribution in #76
Full Changelog: v0.0.2...v0.0.3
DeepSpeed MII v0.0.2
What's Changed
- Add pkg, versioning, formatting, etc. by @jeffra in #2
- [CI] add formatter job by @jeffra in #3
- Local deployment support by @jeffra in #4
- [docs] add basic install instructions by @jeffra in #5
- [docs] add contribution guide by @jeffra in #6
- Samyam/generic server client score by @samyam in #7
- add cache-dir, detect crash, clean-up by @jeffra in #8
- Azure local deployment by @jeffra in #9
- Changing task names to ENUM by @samyam in #10
- Samyamr/model registration by @samyam in #11
- Samyamr/add features by @samyam in #12
- Samyamr/parallelism by @samyam in #13
- Samyamr/non grpc deployment test by @samyam in #14
- Samyamr/aks examples by @samyam in #15
- Samyamr/core documentation by @samyam in #17
- Add huggingface models and tasks that use gpt2 and roberta model type by @cli99 in #16
- refactor model type check and add bert by @cli99 in #19
- [docs] readme updates by @jeffra in #20
- add gpt_neo and gptj support by @cli99 in #21
- Add Pydantic for checking configurations by @mrwyattii in #28 (see the sketch after this list)
- [config] add dtype+validation for config and pass to ds-init by @jeffra in #30
- gpt-neox support by @jeffra in #22
- Add unit tests for local deployment by @mrwyattii in #31
- Bug Fix for FP16 models by @mrwyattii in #32
- GPT-NeoX docs by @mrwyattii in #33
- Fix for GPT-NeoX model by @mrwyattii in #34
- force hf to use pytorch and not use tf by @jeffra in #36
- Enable zero inference in a separate code path from ds_optimize by @cli99 in #24
- Remove non-local deployments for upcoming release by @mrwyattii in #37
- Doc and api update by @jeffra in #39
- Add Bloom Models by @mrwyattii in #38
- Multi-batch support for text-generation by @jeffra in #40
- expose all kwargs for generate by @jeffra in #42
- AML deployment + major refactor by @jeffra in #41
- fix hyperlink to paper by @kamalkraj in #48
- Enable generation of scripts and configs for AML deployments by @mrwyattii in #55
- Fixed bigscience examples by @cderinbogaz in #53
- Allow pinning deployments to specific rank(s) by @jeffra in #56
- various clean-up by @jeffra in #59
- AML deployment improvements by @mrwyattii in #62
- update supported models by @jeffra in #63
- ds-azure as preview release by @samyam in #64
- New landing + plots by @jeffra in #65
- Landing page changes by @jeffra in #67
- Fix examples with wrong param names by @mrwyattii in #68
- add support for txt2img task by @jeffra in #57
- use the fp16 revision of SD by @jeffra in #72
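On #28: pydantic models give typed, validated configuration objects that fail fast on bad input. A toy sketch of the pattern (the field names are hypothetical, not MII's actual config schema):

```python
from pydantic import BaseModel, validator  # pydantic 1.x style, matching this era

class DeploymentConfig(BaseModel):
    # Hypothetical fields for illustration only.
    tensor_parallel: int = 1
    dtype: str = "fp16"

    @validator("tensor_parallel")
    def _check_tp(cls, value: int) -> int:
        if value < 1:
            raise ValueError("tensor_parallel must be >= 1")
        return value

config = DeploymentConfig(tensor_parallel=2)  # raises ValidationError on bad values
```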
New Contributors
- @cli99 made their first contribution in #16
- @kamalkraj made their first contribution in #48
- @cderinbogaz made their first contribution in #53
Full Changelog: https://github.com/microsoft/DeepSpeed-MII/commits/v0.0.2