v0.2.0 Release Note

@yundai424 yundai424 released this 29 Aug 18:51
· 230 commits to main since this release
c6fb35e

Opening Thoughts 🫢

Thank You!

We'd love to take this chance to express our sincere gratitude to the community! 2500+ ⭐, 10+ new contributors, 50+ PRs, plus integration into Hugging Face 🤗, axolotl and LLaMA-Factory in less than one week since open-sourcing is totally beyond our expectations. Being able to work together with all the cool people in the community is a joy, and we can't wait for further collaborations down the road!

Looking Ahead

We look forward to deepening our collaboration with the community and working together on a lot of cool stuff: supporting more model families, squeezing out every optimization opportunity in the kernels, and, why not, llama.triton? 😉

Get Involved and Stay Tuned

Please feel free to join our Discord channel, hosted on the CUDA MODE server, and follow our repo's official account on X: https://x.com/liger_kernel

Welcome Phi3 and Qwen2 🚀

This release adds support for more popular models, including Phi3 and Qwen2. All existing kernels in the Liger repo can now be used to speed up training of models from these families. Please refer to our API guide for usage.

Even Easier API ❤️

Experimenting with different model families and tired of having if-else everywhere just to switch between kernel patching functions? You can now try out our new model-agnostic API to apply Liger kernels. Still a one-liner, but more elegant :) For example:

from liger_kernel.transformers import AutoLigerKernelForCausalLM

# This AutoModel wrapper class automatically monkey-patches the
# model with the optimized Liger kernels if the model is supported.
model = AutoLigerKernelForCausalLM.from_pretrained(...)
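
To make the if-else problem concrete, here is a minimal sketch of the kind of dispatch a model-agnostic entry point can hide from the caller. It is an illustration only, not the Liger implementation: `patch_llama`, `patch_qwen2`, `patch_phi3`, `PATCHERS` and `apply_patch_for` are hypothetical names, and the stand-in functions just return a string instead of patching anything.

```python
# Hypothetical stand-ins for per-model patching functions.
# They only return a marker string; nothing is actually patched here.
def patch_llama():
    return "llama patched"


def patch_qwen2():
    return "qwen2 patched"


def patch_phi3():
    return "phi3 patched"


# A registry keyed by model type replaces a chain of if/elif branches.
PATCHERS = {
    "llama": patch_llama,
    "qwen2": patch_qwen2,
    "phi3": patch_phi3,
}


def apply_patch_for(model_type):
    """Run the patch registered for model_type, or do nothing if unsupported."""
    patcher = PATCHERS.get(model_type)
    if patcher is None:
        return None  # unsupported model: leave it unpatched
    return patcher()
```

With a lookup like this, switching model families means changing one string rather than editing branches, and an unsupported model simply stays unpatched.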

More Features

  • Support optional bias term in FusedLinearCrossEntropy (#144)
  • Mistral now gets the humongous memory reduction from FusedLinearCrossEntropy (#93)
  • Gemma now gets the humongous memory reduction from FusedLinearCrossEntropy (#111)
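
For reference, the math that a linear layer followed by cross-entropy computes, including the optional bias term, can be written out in plain Python. This is only a slow reference sketch under our own assumptions, not the Liger kernel: the function name `linear_cross_entropy` and its arguments are hypothetical.

```python
import math


def linear_cross_entropy(hidden, weight, targets, bias=None):
    """Mean cross-entropy of the logits hidden @ weight^T (+ bias).

    hidden:  list of N rows, each of length H
    weight:  list of V rows, each of length H
    targets: list of N class indices in [0, V)
    bias:    optional list of length V
    """
    total = 0.0
    for row, target in zip(hidden, targets):
        # Linear projection for one row: one logit per vocabulary entry.
        logits = [sum(h * w for h, w in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            logits = [logit + b for logit, b in zip(logits, bias)]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(logit - m) for logit in logits))
        total += log_sum_exp - logits[target]
    return total / len(hidden)
```

Passing `bias=None` reproduces the bias-free case, so a single code path covers models with and without a bias on the output projection.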

Bug Fixes

  • Fixed import error when using triton>=3.0.0 on NGC containers (#79)
  • Fixed the missing offset in Gemma RMSNorm (#85) oops
  • Added back missing dataclass entries in efficiency callback (#116)
  • There was some confusion about which Gemma variants we support; we now support all of them! (#125)
  • Fall back to torch native linear + CrossEntropy when no labels are provided (#128)
  • Match the exact dtype upcasting and downcasting in Llama & Gemma for RMSNorm (#92)
  • Fixed the bug where RoPE becomes very slow when using dynamic sequence lengths (#149)
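
As background for the Gemma RMSNorm offset fix, here is a small sketch of RMSNorm with an optional offset added to the weight. It assumes the Gemma-style convention of scaling by (1 + weight) rather than by weight alone. The function `rms_norm` and its `offset` and `eps` parameters are illustrative names in plain Python, not the Liger Triton kernel.

```python
import math


def rms_norm(x, weight, eps=1e-6, offset=0.0):
    """RMSNorm over one vector.

    offset=0.0 scales by weight; offset=1.0 scales by (1 + weight),
    which is the Gemma-style convention assumed in this sketch.
    """
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * (offset + w) for v, w in zip(x, weight)]
```

With an all-zero weight, the two conventions diverge sharply: `offset=0.0` zeroes the output, while `offset=1.0` returns the plain normalized vector, which is why a missing offset changes model outputs.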

Full Changelog: v0.1.1...v0.2.0