0.1.4
- Option to keep calibration states in VRAM while measuring
- Fix for Q4 cache for odd key/value sizes (MiniCPM specifically)
- Alternative `fasttensors` option on Windows to solve system memory issues
- Prefix filter with multiple prefixes
Full Changelog: v0.1.3...v0.1.4