SOSP 2023

Meta Info

Homepage: https://sosp2023.mpi-sws.org/

Papers

Large Language Models (LLMs)

  • Efficient Memory Management for Large Language Model Serving with PagedAttention [Paper] [arXiv] [Code] [Homepage]
    • UC Berkeley & Stanford & UCSD
    • vLLM, PagedAttention
  • Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates [Paper] [arXiv] [Code]
    • UMich SymbioticLab & AWS & PKU
  • Gemini: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints [Paper]
    • Rice & AWS

Deep Learning Recommendation Models (DLRMs)

  • UGache: A Unified GPU Cache for Embedding-based Deep Learning [Personal Notes] [Paper]
    • SJTU
    • Multi-GPU embedding cache; exploits cross-GPU interconnects (NVLink, NVSwitch).
  • Bagpipe: Accelerating Deep Recommendation Model Training [Paper]
    • UW-Madison & UChicago