We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence.
— Noam Shazeer, in GLU Variants Improve Transformer
Keras layers without using matrix multiplications.
This is a Keras-based implementation of some layers mentioned in the papers The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits and Scalable MatMul-free Language Modeling. Find the documentation here.
Traditional layers based on matrix multiplication suffer from a few issues.
- They have high inference and computational costs due to the use of matrix multiplications. This hinders the speed at which inference is performed on GPU-less machines.
- The memory use for storing full precision weights is very high.
- The energy cost of running matrix multiplications is very high.
Matrix-multiplication-free layers address these pain points by removing the key source of these costs: the matrix multiplications themselves.
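To make the idea concrete, here is a minimal NumPy sketch of how a dense layer can avoid multiplications once its weights are restricted to -1, 0, and +1, following the absmean quantization described in the BitNet b1.58 paper. This is not Keras-MML's actual implementation: the function names `ternary_quantize` and `matmul_free_dense` are hypothetical, activation quantization is omitted, and the Python loop is written for clarity rather than speed.

```python
import numpy as np


def ternary_quantize(w, eps=1e-8):
    """Absmean quantization of a weight matrix to {-1, 0, +1} (hypothetical helper)."""
    scale = np.mean(np.abs(w)) + eps  # one full-precision scale per matrix
    w_ternary = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return w_ternary, scale


def matmul_free_dense(x, w_ternary, scale):
    """Compute x @ (w_ternary * scale) using only sums and differences of inputs,
    followed by a single rescale of the output."""
    out = np.zeros((x.shape[0], w_ternary.shape[1]), dtype=x.dtype)
    for j in range(w_ternary.shape[1]):
        plus = x[:, w_ternary[:, j] == 1].sum(axis=1)    # inputs whose weight is +1
        minus = x[:, w_ternary[:, j] == -1].sum(axis=1)  # inputs whose weight is -1
        out[:, j] = plus - minus                         # weights equal to 0 are skipped
    return out * scale


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16)).astype(np.float32)
w = rng.normal(size=(16, 8)).astype(np.float32)

w_ternary, scale = ternary_quantize(w)
reference = x @ (w_ternary.astype(np.float32) * scale)  # ordinary matrix multiplication
result = matmul_free_dense(x, w_ternary, scale)
print(np.allclose(result, reference, atol=1e-5))
```

The two results should agree up to floating-point rounding. Because each weight takes one of three values, it can also be stored in far fewer bits than a full-precision float, which is where the memory saving comes from.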
Keras-MML has a few requirements, namely:
- Python 3.9 (or above);
- Keras; and
- a Keras backend (TensorFlow, PyTorch, or JAX).
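The requirements above can be checked from Python before installing. The sketch below is an illustration only: `check_requirements` is a hypothetical helper, and the use of the `KERAS_BACKEND` environment variable assumes Keras 3, which reads it when Keras is first imported.

```python
import importlib.util
import os
import sys

# Importable module names for the three backends.
BACKEND_MODULES = {"tensorflow": "tensorflow", "torch": "torch", "jax": "jax"}


def check_requirements():
    """Report which of the listed requirements are satisfied (hypothetical helper)."""
    return {
        "python>=3.9": sys.version_info >= (3, 9),
        "keras": importlib.util.find_spec("keras") is not None,
        "backends": [
            name
            for name, module in BACKEND_MODULES.items()
            if importlib.util.find_spec(module) is not None
        ],
    }


report = check_requirements()
print(report)

# Assumption: Keras 3 picks its backend from KERAS_BACKEND at import time.
# Only set it if the user has not already chosen one.
if report["backends"]:
    os.environ.setdefault("KERAS_BACKEND", report["backends"][0])
```

An empty `backends` list means that none of the three backends is installed yet.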
Instructions on how to install Keras can be found here.
If you use pip, you can install Keras-MML using the command
pip install keras-matmulless
To install pre-release versions, use the command
pip install --pre keras-matmulless
Nightly releases for Keras-MML are primarily found on the TestPyPI page. To install them, use the command
pip install -i https://test.pypi.org/simple/ keras-matmulless
First, clone the repository using
git clone https://github.com/PhotonicGluon/Keras-MatMulLess.git
cd Keras-MatMulLess
We recommend creating a virtual environment in which to install Poetry and the other dependencies.
python -m venv venv # If `python` doesn't work, try `python3`
Activate the virtual environment using
source venv/bin/activate
or, if you are on Windows,
venv\Scripts\activate
Now we install Poetry.
pip install poetry
Finally, install the development dependencies. The development dependencies are split into several groups.
- The `test` group contains dependencies that are used to perform testing.
- The `docs` group contains dependencies that are used to generate the documentation.
- The `build` group contains dependencies that are used to create a distributable.
- The `notebook` group is required to run the Jupyter notebooks in the documentation folder.
Simply include the desired groups in the `install.py` call. For example, to install `test`, `docs`, and `build` (the main development dependencies), run the following command.
python install.py test docs build
If you have not yet installed a backend (TensorFlow, PyTorch, or JAX), you can install one in the same step by adding the `--backend` option.
python install.py test docs build --backend BACKEND_NAME
Note that the `BACKEND_NAME` to be specified here is
- `tensorflow` for the TensorFlow backend;
- `torch` for the PyTorch backend; and
- `jax` for the JAX backend.
If you need to install with CUDA support, run
python install.py test docs build --backend BACKEND_NAME --with-cuda
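For scripted setups, the `install.py` invocations above can be assembled programmatically. The sketch below only builds the argument list from the groups and options named in this section. `build_install_command` is a hypothetical helper, not part of the repository, and it does not know about any other options `install.py` may accept.

```python
import sys

VALID_GROUPS = ("test", "docs", "build", "notebook")
VALID_BACKENDS = ("tensorflow", "torch", "jax")


def build_install_command(groups, backend=None, with_cuda=False):
    """Assemble the argument list for install.py (hypothetical helper)."""
    unknown = [group for group in groups if group not in VALID_GROUPS]
    if unknown:
        raise ValueError(f"Unknown dependency groups: {unknown}")
    if backend is not None and backend not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend: {backend!r}")

    # Use the interpreter of the active virtual environment.
    command = [sys.executable, "install.py", *groups]
    if backend is not None:
        command += ["--backend", backend]
    if with_cuda:
        command.append("--with-cuda")
    return command


command = build_install_command(["test", "docs", "build"], backend="torch", with_cuda=True)
print(command[1:])
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root.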
That's it! You should now have access to the `keras_mml` package.
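One way to confirm that the installation worked is to ask Python whether the package can be found, without importing it (and therefore without loading a backend). This is a small sketch: `describe_install` is a hypothetical helper, and it assumes the distribution is registered under the same name used for the pip install, `keras-matmulless`.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name="keras_mml", dist_name="keras-matmulless"):
    """Return (importable, version) without importing the package itself."""
    importable = importlib.util.find_spec(module_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # not installed, or installed under a different name
    return importable, version


importable, version = describe_install()
print(f"keras_mml importable: {importable}; keras-matmulless version: {version}")
```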
Read the tutorial.
We welcome contributions! Please read more about contributing to Keras-MML in the contribution guidelines.
Keras-MML is licensed under the Apache 2.0 license.