`bitsandbytes`

The bitsandbytes library is a lightweight Python wrapper around custom CUDA functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and 8-bit and 4-bit quantization functions.

The library includes quantization primitives for 8-bit and 4-bit operations through `bitsandbytes.nn.Linear8bitLt` and `bitsandbytes.nn.Linear4bit`, and 8-bit optimizers through the `bitsandbytes.optim` module.
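As a rough illustration of how these pieces are typically used, the sketch below constructs one layer of each quantized type and an 8-bit Adam optimizer. The helper names `build_quantized_layers` and `build_8bit_optimizer` and the layer sizes are hypothetical, chosen only for this example; the imports are placed inside the functions so the file can be loaded on a machine without bitsandbytes installed.

```python
def build_quantized_layers(in_features=64, out_features=128):
    """Return an 8-bit and a 4-bit linear layer with the given shape."""
    import torch
    import bitsandbytes as bnb

    # has_fp16_weights=False keeps the weights in int8 (inference-style use).
    int8_layer = bnb.nn.Linear8bitLt(
        in_features, out_features, has_fp16_weights=False
    )
    # quant_type may be "fp4" or "nf4"; compute_dtype is the matmul dtype.
    int4_layer = bnb.nn.Linear4bit(
        in_features, out_features, compute_dtype=torch.float16, quant_type="nf4"
    )
    return int8_layer, int4_layer


def build_8bit_optimizer(model, lr=1e-3):
    """Return an 8-bit Adam optimizer over the parameters of `model`."""
    import bitsandbytes as bnb

    # Adam8bit is used like torch.optim.Adam but stores its state in 8 bits.
    return bnb.optim.Adam8bit(model.parameters(), lr=lr)
```

In typical use the weights are quantized when a layer is moved to the GPU, for example with `int8_layer.to("cuda")`. ROCm builds of PyTorch address AMD GPUs through the same `"cuda"` device string.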

This fork is actively developed for ROCm, and its updates are being pushed to the `multi-backend-refactor` branch of upstream bitsandbytes. Either this fork or that upstream branch can be used to run bitsandbytes on AMD GPUs.

Installation for ROCm:

For the latest development version:

git clone --recurse https://github.com/ROCm/bitsandbytes
cd bitsandbytes
git checkout rocm_enabled
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=hip -S . # Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target specific GPU architectures
make
pip install .
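
After the build, a quick check can confirm that the package is discoverable and that PyTorch is a ROCm build. This is a minimal sketch and not part of the official instructions; `check_install` is a hypothetical helper name. It relies on `torch.version.hip`, which is a version string on ROCm builds of PyTorch and `None` otherwise.

```python
import importlib.util


def check_install():
    """Report whether bitsandbytes and a ROCm build of PyTorch are present."""
    report = {
        "bitsandbytes_found": importlib.util.find_spec("bitsandbytes") is not None,
        "torch_found": importlib.util.find_spec("torch") is not None,
        "hip_version": None,
        "gpu_available": False,
    }
    if report["torch_found"]:
        import torch

        # A version string on ROCm builds, None on CUDA or CPU-only builds.
        report["hip_version"] = getattr(torch.version, "hip", None)
        # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
        report["gpu_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_install().items():
        print(f"{key}: {value}")
```

If `hip_version` is `None` or `gpu_available` is `False`, the installed PyTorch is probably not a ROCm build or cannot see the GPU, and the quantized kernels are unlikely to run on the AMD device.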

For more details, please head to the official documentation page:

https://huggingface.co/docs/bitsandbytes/main

License

bitsandbytes is MIT licensed.

We thank Fabio Cannizzo for his work on FastBinarySearch which we use for CPU quantization.
