# LLM Education

A small library implementing large language models (LLMs) for educational purposes.

## Objectives

- Implement the Transformer architecture
- Load models from Hugging Face and reproduce their results
- Generation with sampling and beam search
- Tokenizers
- Dataset (Shakespeare?)
- Basic fine-tuning
- Transformer++ (the improved baseline from the Mamba paper)
- KV cache for inference
- Accelerate inference with a smaller draft model (speculative decoding)
- Grammar-guided generation
- SSM / Mamba architecture
- Vision Transformers
- Efficient models
- Triton?
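
As a flavor of the first objective, the core of a Transformer is scaled dot-product attention. The sketch below is a minimal, dependency-light NumPy version (not code from this repository; function and variable names are illustrative) showing the attention computation that the library aims to implement:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Minimal attention: q, k, v are (seq_len, d) arrays."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (seq_len, seq_len) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                            # weighted sum of values

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

A real implementation adds learned projections for q, k, v, multiple heads, and a causal mask for autoregressive generation, but the softmax-weighted sum above is the piece everything else builds on.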