What's Changed
- bump dev version by @winglian in #2342
- Doc fix: TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL not necessary to use Triton kernel patches by @djsaunde in #2343
- make sure chatml dpo dataset loading works by @winglian in #2333
- Fix sample packing producing longer sequences than specified by sequence_len by @tobmi1 in #2332
- quick formatting fix for LoRA optims doc by @djsaunde in #2349
- calculate sample length fixes and SFT splitting fixes by @winglian in #2351
- feat: update transformers version to 4.49.0 by @NanoCode012 in #2340
- Bumping 0.15.1 TRL version for GRPO+PEFT fix by @SalmanMohammadi in #2344
- support for passing init_lora_weights to lora_config by @winglian in #2352
- fix(doc): add missing auto_find_batch_size by @NanoCode012 in #2339
- don't install extraneous old version of pydantic in ci and make sure to run multigpu ci by @winglian in #2355
- Relicense the logprob KD loss functions as Apache 2.0 by @winglian in #2358
- Correctly reference mount paths by @reissbaker in #2347
- bump liger to 0.5.3 by @winglian in #2353
- feat: add deepseek_v3 sample packing by @NanoCode012 in #2230
- Feat(doc): Reorganize documentation, fix broken syntax, update notes by @NanoCode012 in #2348
- Fix(doc): address missing doc changes by @NanoCode012 in #2362
New Contributors
- @tobmi1 made their first contribution in #2332
- @reissbaker made their first contribution in #2347
Full Changelog: v0.7.0...v0.7.1