Releases: leafspark/AutoGGUF
v1.4.0 (pre-release)
AutoGGUF v1.4.0 prerelease
Changelog:
- LoRA Conversion Section: Added a new section for converting HuggingFace PEFT LoRA adapters to GGML or GGUF formats.
- Output Type Selection: Added a dropdown menu to choose between GGML and GGUF output formats for LoRA conversion.
- Base Model Selection: Added a folder selection dialog to choose the base model (in safetensors format) when GGUF output is selected.
- LoRA Adapter List: Added a list widget to add multiple LoRA adapters with individual scaling factors.
- Export LoRA Section: Added a section to merge LoRA adapters into a base model, supporting multiple adapters and scaling.
- Task Names: Updated the task names in the task list to be more informative, including the input and output file names.
- IMatrix Generation Check: Added a check to prevent IMatrix generation from starting if the model path is empty.
This build currently ships a src folder containing the conversion tools. In the future I might add a feature to download the conversion scripts from the llama.cpp GitHub repository, but for now they are bundled here.
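Under the hood, the Export LoRA section drives llama.cpp's llama-export-lora tool. As a rough sketch (flag names here match recent llama.cpp builds, where a plain adapter uses --lora and a scaled one uses --lora-scaled; your bundled version may differ), the merge command for multiple adapters with per-adapter scaling factors could be assembled like this:

```python
# Hedged sketch: build an argument list for llama.cpp's llama-export-lora
# tool. The --lora / --lora-scaled flag names are assumed from recent
# llama.cpp builds and may differ in the version bundled with AutoGGUF.

def build_export_lora_command(base_model: str, output: str,
                              adapters: list[tuple[str, float]]) -> list[str]:
    """Merge one or more LoRA adapters into a base model.

    adapters: list of (path, scale) pairs; a scale of 1.0 uses the plain
    --lora flag, anything else uses --lora-scaled.
    """
    cmd = ["llama-export-lora", "-m", base_model, "-o", output]
    for path, scale in adapters:
        if scale == 1.0:
            cmd += ["--lora", path]
        else:
            cmd += ["--lora-scaled", path, str(scale)]
    return cmd
```

The resulting list can be handed straight to subprocess.run, which avoids shell-quoting issues with paths that contain spaces.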
Planned features in next release:
- Application icon (favicon) for the exe build
v1.3.1
AutoGGUF v1.3.1
Changelog:
- Added AUTOGGUF_CHECK_BACKEND environment variable to disable the backend check on start (enabled/disabled)
- --onefile build with PyInstaller; the _internal directory is no longer required
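A minimal sketch of how such a toggle is typically read (the release note gives the accepted values as enabled/disabled; the default-when-unset behavior shown here is an assumption):

```python
import os

# Hedged sketch: read the AUTOGGUF_CHECK_BACKEND toggle. The release
# note documents "enabled"/"disabled" values; defaulting to "enabled"
# when the variable is unset is an assumption.

def backend_check_enabled() -> bool:
    return os.environ.get("AUTOGGUF_CHECK_BACKEND", "enabled").lower() != "disabled"
```

For example, launching the app with AUTOGGUF_CHECK_BACKEND=disabled would skip the startup backend check.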
Full Changelog: v1.3.0...v1.3.1
v1.3.0
AutoGGUF v1.3.0
Changelog:
- Added support for new llama-imatrix parameters:
- Implemented context size (--ctx-size) input
- Added threads (--threads) control
- Added new parameters to IMatrix section layout for better usability:
- Converted context size input to a QSpinBox for better control
- Implemented a slider-spinbox combination for thread count selection
- Changed output frequency input to a QSpinBox with a range of 1-100 and percentage suffix
- Removed duplicated functions
- Fixed error when loading presets containing KV overrides
- Updated generate_imatrix() method to use new UI element values
- Improved error handling in preset loading
- Enhanced localization support for new UI elements (fallback to English)
- Added new localization keys for:
- Context size
- Thread count
- Updated IMatrix generation-related strings
- Adjusted default values, ranges, and input types for the new numeric inputs
- Improved code readability and maintainability
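The new UI values feed into a llama-imatrix invocation. A hedged sketch of how generate_imatrix() might assemble the arguments (--ctx-size and --threads are the flags named above; --output-frequency is the assumed spelling of the output-frequency option, and the helper name is illustrative, not AutoGGUF's actual code):

```python
# Hedged sketch: assemble a llama-imatrix command from the new UI
# element values. --ctx-size and --threads are named in the changelog;
# --output-frequency is an assumed flag spelling.

def build_imatrix_command(model: str, data: str, output: str,
                          ctx_size: int = 512, threads: int = 8,
                          output_frequency: int = 10) -> list[str]:
    return [
        "llama-imatrix",
        "-m", model,            # input GGUF model
        "-f", data,             # calibration text file
        "-o", output,           # importance matrix output
        "--ctx-size", str(ctx_size),
        "--threads", str(threads),
        "--output-frequency", str(output_frequency),
    ]
```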
Full Changelog: v1.2.1...v1.3.0
v1.2.1
AutoGGUF v1.2.1
Changelog:
- Added Refresh Models button
- Fixed llama.cpp iostream issue; the quantized_models directory is now created on launch
- Added Linux build (built on Ubuntu 24.04 LTS)
Full Changelog: v1.2.0...v1.2.1
v1.2.0
AutoGGUF v1.2.0
Changelog:
- Added more robust logging; find logs at latest_<timestamp>.log in your logs folder
- Added localizations with support for 28 languages (machine translated using Gemini Experimental 0801)
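A minimal sketch of per-run logging that matches the latest_<timestamp>.log naming described above (the timestamp format and "logs" directory name are assumptions, not AutoGGUF's actual implementation):

```python
import logging
import time
from pathlib import Path

# Hedged sketch: create a per-run log file named latest_<timestamp>.log,
# matching the pattern in the release note. The timestamp format and the
# "logs" directory name are assumptions.

def setup_logging(log_dir: str = "logs") -> Path:
    Path(log_dir).mkdir(exist_ok=True)
    log_path = Path(log_dir) / f"latest_{time.strftime('%Y%m%d_%H%M%S')}.log"
    logging.basicConfig(
        filename=log_path,
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(message)s",
        force=True,  # replace any handlers configured earlier in the run
    )
    return log_path
```

Using a fresh timestamped file per run keeps each session's log self-contained instead of appending to one ever-growing file.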
Full Changelog: v1.1.0...v1.2.0
v1.1.0
AutoGGUF v1.1.0
Changelog:
New Features:
- Added support for multiple KV overrides for more flexible model metadata customization
- Improved CUDA checking ability and extraction to the backend folder
UI Improvements:
- Implemented a scrollable area for KV overrides with add/delete capabilities
- Enhanced the visibility and usability of the Output Tensor Type and Token Embedding Type options
Bug Fixes:
- Corrected behavior of Output Tensor Type and Token Embedding Type dropdown menus
- Resolved various minor UI inconsistencies and improved error handling
Other Changes:
- Refactored code for better modularity and reduced circular dependencies
- Improved error messages and user feedback
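The multiple KV overrides added above are ultimately passed to llama-quantize's --override-kv flag, which takes KEY=TYPE:VALUE strings (types int, float, bool, str), one flag per override. A hedged sketch of the formatting (the helper name is illustrative):

```python
# Hedged sketch: format KV overrides for llama-quantize's --override-kv
# flag, which expects KEY=TYPE:VALUE and may be repeated once per
# override. The helper name is illustrative, not AutoGGUF's actual code.

def format_kv_overrides(overrides: list[tuple[str, str, str]]) -> list[str]:
    """overrides: list of (key, type, value) triples."""
    args = []
    for key, typ, value in overrides:
        args += ["--override-kv", f"{key}={typ}:{value}"]
    return args
```

For example, ("tokenizer.ggml.add_bos_token", "bool", "false") becomes --override-kv tokenizer.ggml.add_bos_token=bool:false.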
Full Changelog: v1.0.1...v1.1.0
v1.0.1
Hotfix and binaries
Changelog:
- Added Windows binary (created using PyInstaller)
- Fixed issue where quantization errored with "AutoGGUF does not have x attribute"
Full Changelog: v1.0.0...v1.0.1
v1.0.0
AutoGGUF initial release
Changelog:
Added:
- GUI interface for automated GGUF model quantization
- System resource monitoring (RAM and CPU usage)
- Llama.cpp backend selection and management
- Automatic download of llama.cpp releases from GitHub
- Model selection from local directory
- Comprehensive quantization options:
- Quantization type selection
- Allow requantize option
- Leave output tensor option
- Pure quantization option
- IMatrix support
- Include/exclude weights options
- Output tensor type selection
- Token embedding type selection
- Keep split option
- Override KV option
- Task list for managing multiple quantization jobs
- Real-time log viewing for quantization tasks
- IMatrix generation feature with customizable settings
- GPU offload settings for IMatrix generation
- Context menu for task management (Properties, Cancel, Retry, Delete)
- Detailed model information dialog
- Error handling and user notifications
- Confirmation dialogs for task deletion and application exit
Full Changelog: https://github.com/leafspark/AutoGGUF/commits/v1.0.0