This repository contains example Jupyter notebooks demonstrating how to use the quantized versions of the PLLuM-8x7B-chat model in GGUF format.
PLLuM (Polish Large Language Model) is an advanced family of Polish language models developed by the CYFRAGOVPL consortium for the Polish Ministry of Digital Affairs. The quantized versions in GGUF format run efficiently on consumer hardware with reduced memory requirements while maintaining good quality of the generated text.
This repository provides practical examples of how to use these models with different libraries and approaches.
To run the examples, install the dependencies:

```bash
pip install -r requirements.txt
```
The `requirements.txt` file includes:
- jupyter
- llama-cpp-python
- transformers
- accelerate
- sentencepiece
- matplotlib
- pandas
- gradio (optional, for web UI examples)
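
After installing, you can quickly verify that the core libraries import correctly. A minimal sanity check (both packages expose a standard `__version__` attribute):

```python
# Sanity check: confirm the core dependencies are installed
import llama_cpp
import transformers

print("llama-cpp-python:", llama_cpp.__version__)
print("transformers:", transformers.__version__)
```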
The repository contains the following example notebooks:
- `basic_inference.ipynb`: Basic text generation using llama-cpp-python
- `chat_completion.ipynb`: Chat completion interface with conversation history
- `model_comparison.ipynb`: Comparison between different quantization levels
- `gradio_web_ui.ipynb`: Simple web interface using Gradio
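
As a taste of what `basic_inference.ipynb` and `chat_completion.ipynb` cover, here is a minimal sketch using llama-cpp-python. The model path assumes you downloaded the q4_k_m file into `./models/` as described in the download section below:

```python
from llama_cpp import Llama

# Load a quantized GGUF model (path assumes the q4_k_m file was
# downloaded into ./models/ as shown in the download section below)
llm = Llama(
    model_path="./models/PLLuM-8x7B-chat-gguf-q4_k_m.gguf",
    n_ctx=4096,    # context window size
    n_threads=8,   # tune to your CPU core count
)

# Chat-style generation with conversation history
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Jesteś pomocnym polskim asystentem."},
        {"role": "user", "content": "Czym jest model PLLuM?"},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```

The notebooks walk through these calls in more detail, including multi-turn conversation history.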
The examples work with all available quantized versions:
| Quantization | Size | Recommended for |
|---|---|---|
| Q2_K | 17 GB | Minimal resources, lower quality |
| IQ3_S | 20.4 GB | Low resources with acceptable quality |
| Q3_K_M | 22.5 GB | Good balance for CPU usage |
| Q4_K_M | 28.4 GB | Recommended for most applications |
| Q5_K_M | 33.2 GB | High quality with reasonable size |
| Q8_0 | 49.6 GB | Highest quantized quality |
| F16/BF16 | ~85 GB | Reference without quantization |
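
Whichever file you choose, llama-cpp-python can split the load between CPU RAM and GPU VRAM. A hedged sketch (requires a CUDA- or Metal-enabled build of llama-cpp-python; the layer count is an assumption to tune for your hardware):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/PLLuM-8x7B-chat-gguf-q4_k_m.gguf",
    n_ctx=4096,
    n_gpu_layers=20,  # layers to offload to the GPU; -1 offloads all, 0 stays on CPU
)
```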
You can download the models from Hugging Face using:
```bash
# Install huggingface-cli
pip install -U "huggingface_hub[cli]"

# Download a specific model (e.g., q4_k_m)
huggingface-cli download piotrmaciejbednarski/PLLuM-8x7B-chat-GGUF --include "PLLuM-8x7B-chat-gguf-q4_k_m.gguf" --local-dir ./models/

# For faster downloads
pip install hf_transfer
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download piotrmaciejbednarski/PLLuM-8x7B-chat-GGUF --include "PLLuM-8x7B-chat-gguf-q4_k_m.gguf" --local-dir ./models/
```
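
If you prefer to download from Python (for example, inside a notebook), the `huggingface_hub` library offers an equivalent. This sketch assumes the same repository and filename as above:

```python
from huggingface_hub import hf_hub_download

# Downloads the file into ./models/ and returns its local path
model_path = hf_hub_download(
    repo_id="piotrmaciejbednarski/PLLuM-8x7B-chat-GGUF",
    filename="PLLuM-8x7B-chat-gguf-q4_k_m.gguf",
    local_dir="./models",
)
print(model_path)
```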
- Clone this repository:

  ```bash
  git clone https://github.com/piotrmaciejbednarski/pllum-examples.git
  cd pllum-examples
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download a model using the instructions above

- Start Jupyter:

  ```bash
  jupyter notebook
  ```

- Open one of the example notebooks in the `notebooks/` directory
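
For the web UI example, `gradio_web_ui.ipynb` wraps the model in a chat interface. A minimal sketch of the general approach (not the notebook's actual code; it assumes Gradio's default tuple-style chat history, while newer Gradio versions may pass message dicts instead):

```python
import gradio as gr
from llama_cpp import Llama

llm = Llama(model_path="./models/PLLuM-8x7B-chat-gguf-q4_k_m.gguf", n_ctx=4096)

def respond(message, history):
    # Rebuild the conversation as chat messages for the model
    messages = [{"role": "system", "content": "Jesteś pomocnym polskim asystentem."}]
    for user_msg, bot_msg in history:  # tuple-style history (Gradio default)
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})
    messages.append({"role": "user", "content": message})
    out = llm.create_chat_completion(messages=messages, max_tokens=256)
    return out["choices"][0]["message"]["content"]

gr.ChatInterface(respond).launch()
```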
This repository is provided under the Apache License 2.0, the same license as the original PLLuM model.
- Original PLLuM model developed by CYFRAGOVPL
- GGUF quantization by Piotr Bednarski
- Examples in this repository by Piotr Bednarski