Releases: OpenNMT/CTranslate2
CTranslate2 4.5.0
Note: The CTranslate2 Python package now supports cuDNN 9 and is no longer compatible with cuDNN 8.
New features
Fixes and improvements
CTranslate2 4.4.0
Removed: Flash Attention support in the Python package due to significant package size increase with minimal performance gain.
Note: Flash Attention remains supported in the C++ package with the WITH_FLASH_ATTN option.
Flash Attention may be re-added in the future if substantial improvements are made.
New features
- Support Llama3 (#1751)
- Support Gemma2 (#1772)
- Add log probs for all tokens in vocab (#1755)
- Grouped conv1d (#1749 + #1758)
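As an illustration of what "log probs for all tokens in vocab" means numerically, here is a minimal sketch, assuming the log-probabilities are a log-softmax over one decoding step's logits. The vocabulary, the logits, and the `log_softmax` helper are made up for this example and are not part of the CTranslate2 API.

```python
import math


def log_softmax(logits):
    """Return the log-probability of every entry in a list of logits."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


# Hypothetical 4-token vocabulary and made-up logits for one decoding step.
vocab = ["<s>", "hello", "world", "</s>"]
logits = [0.5, 2.0, 1.0, -1.0]

log_probs = log_softmax(logits)
for token, lp in zip(vocab, log_probs):
    print(f"{token}: {lp:.4f}")
```

Exponentiating the returned values gives a probability distribution over the whole vocabulary, not just over the token that was selected.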
Fixes and improvements
CTranslate2 4.3.1
CTranslate2 4.3.0
CTranslate2 4.2.1
CTranslate2 4.2.0
CTranslate2 4.1.1
Fixes and improvements
- Fix classifiers in setup.py so that the package can be pushed to PyPI
CTranslate2 4.1.0
CTranslate2 4.0.0
This major version introduces a breaking change: the update to CUDA 12.
Breaking changes
Python
- Support CUDA 12
New features
- Add the to_device() method to the Python StorageView class to move data between host and device
Fixes and improvements
- Implement Conv1D with im2col and GEMM to improve performance
- Get tokens in the range of the vocab size for Llama models
- Fix loss of performance
- Update cibuildwheel to 2.16.5
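The im2col plus GEMM formulation of Conv1D mentioned in the list above can be sketched in pure Python. This is only an illustration of the general technique, assuming no padding and no dilation. It is not the CTranslate2 C++ implementation, and all function names here are hypothetical.

```python
def im2col_1d(x, kernel_size, stride=1):
    """Unfold x (in_channels x length) into a matrix of shape
    (in_channels * kernel_size) x out_length."""
    length = len(x[0])
    out_length = (length - kernel_size) // stride + 1
    cols = []
    for channel in x:
        for k in range(kernel_size):
            # Row (c * kernel_size + k) holds the input seen by tap k of channel c.
            cols.append([channel[t * stride + k] for t in range(out_length)])
    return cols


def matmul(a, b):
    """Plain GEMM: (m x n) times (n x p)."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)] for row in a]


def conv1d_im2col(x, weight, stride=1):
    """weight has shape out_channels x in_channels x kernel_size."""
    kernel_size = len(weight[0][0])
    # Flatten each filter into one row, in the same order as the im2col rows.
    w_mat = [[w for channel in filt for w in channel] for filt in weight]
    return matmul(w_mat, im2col_1d(x, kernel_size, stride))


def conv1d_direct(x, weight, stride=1):
    """Reference implementation with explicit loops, used to check the result."""
    kernel_size = len(weight[0][0])
    out_length = (len(x[0]) - kernel_size) // stride + 1
    return [
        [
            sum(filt[c][k] * x[c][t * stride + k]
                for c in range(len(x)) for k in range(kernel_size))
            for t in range(out_length)
        ]
        for filt in weight
    ]


# Made-up input: 2 channels, length 4; 2 output filters with kernel size 2.
x = [[1, 2, 3, 4], [0, 1, 0, 1]]
weight = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]

print(conv1d_im2col(x, weight))  # [[2, 2, 4], [4, 6, 8]]
```

The point of the reformulation is that the whole convolution becomes a single matrix product, which can then be handed to an optimized GEMM routine instead of nested loops.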
CTranslate2 3.24.0
New features
- Support a new option, offset, to ignore the token scores of special tokens