v1.12

@oobabooga oobabooga released this 25 Jul 15:19
· 95 commits to main since this release
dd97a83

Backend updates

  • Transformers: bump to 4.43 (adds Llama 3.1 support).
  • ExLlamaV2: bump to 0.1.8 (adds Llama 3.1 support).
  • AutoAWQ: bump to 0.2.6 (adds Llama 3.1 support).

UI updates

  • Color text between quotation marks in chat and chat-instruct modes.
  • Prevent LaTeX from being rendered for inline "$", as that caused problems for phrases like "apples cost $1, oranges cost $2".
  • Make the markdown cache unbounded and clear it when switching to another chat. The cache exists because markdown conversion is CPU-intensive; with no size limit, every message in a full 128k context stays cached, keeping the UI responsive in long conversations.
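
The caching idea in the last bullet can be sketched as a memoized conversion function with no eviction, flushed when the active chat changes. This is a minimal illustration, not the project's actual code; the names `convert_markdown` and `on_chat_switch` are hypothetical, and the conversion body is a stand-in for the real CPU-intensive markdown-to-HTML step.

```python
import functools

@functools.lru_cache(maxsize=None)  # "infinite" cache: entries are never evicted
def convert_markdown(message: str) -> str:
    # Stand-in for the real, expensive markdown-to-HTML conversion.
    return "<p>" + message.replace("\n\n", "</p><p>") + "</p>"

def on_chat_switch() -> None:
    # Dropping the whole cache when the user opens another chat keeps
    # memory proportional to one conversation while repeated re-renders
    # of the same messages stay free.
    convert_markdown.cache_clear()
```

Because `lru_cache(maxsize=None)` never evicts, every message in the current conversation is converted at most once between chat switches.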

Bug fixes

  • Fix a race condition that caused the default character to not be loaded correctly on startup.
  • Fix Linux shebangs (#6110). Thanks @LuNeder.

Other changes

  • Make the Google Colab notebook use the one-click installer instead of its own Python environment for better stability.
  • Disable flash-attention on Google Colab by default, as its GPU models do not support it.