Small library implementing LLMs for educational purposes
- Implement Transformer architecture
- Load models from Hugging Face and reproduce their results
- Generation with sampling and beam search
- Tokenizers
- Dataset (Shakespeare?)
- Basic fine-tuning
- Improved Transformer++ variant (the baseline from the Mamba paper)
- KV cache for inference
- Accelerate inference with a smaller predictor model (e.g. speculative decoding)
- Generation guided by grammar
- SSM / Mamba architecture
- Vision Transformers (ViT)
- Efficient models
- Triton?
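
To make the "generation with sampling" item concrete, here is a minimal sketch of temperature and top-k sampling over a list of raw logits. It uses only the Python standard library. The helper name `sample_next_token` and its parameters are hypothetical, chosen for illustration; they are not part of this library's API.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample a token index from raw logits (hypothetical helper, for illustration only)."""
    rng = rng or random.Random()
    if temperature <= 0:
        # Treat a non-positive temperature as greedy decoding (argmax).
        return max(range(len(logits)), key=lambda i: logits[i])
    # Rank token indices by score, highest first.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        # Keep only the top_k best candidates (at least one).
        candidates = candidates[:max(1, top_k)]
    # Softmax numerators over the temperature-scaled logits,
    # subtracting the max for numerical stability.
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [0.1, 2.0, -1.0, 0.5]
    print(sample_next_token(logits, temperature=0.8, top_k=2))
```

Lower temperatures concentrate probability on the highest-scoring tokens, and `top_k` restricts sampling to the k best candidates. A real implementation would likely operate on batched tensors rather than Python lists.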