# Embodied-AI

This repository organizes resources related to embodied intelligence, covering data, models, hardware, and software infrastructure.

## Data

### 3D Grounding

- SCENEVERSE: Scaling 3D Vision-Language Learning for Grounded Scene Understanding [arxiv] [github] [2024]

- ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language [arxiv] [website] [2019]

### LLM-Robot

- AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents [paper] [website] [2024]

- CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks [arxiv] [github] [2021]

- Open X-Embodiment: Robotic Learning Datasets and RT-X Models [website] [github]

## Model

### 3D Grounding

- LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent [arxiv] [github] [2023]

- Multi-View Transformer for 3D Visual Grounding [paper] [github] [2022]

### LLM-Robot

- Generative Expressive Robot Behaviors Using Large Language Models [arxiv] [website] [2024]

- OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics [arxiv] [github] [2024]

- MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World [arxiv] [website] [2024]

- Vision-Language Foundation Models as Effective Robot Imitators [arxiv] [github] [2023]

### Task Planning

- Embodied Task Planning with Large Language Models [arxiv] [website] [2023]

## Hardware

## Software

## AI Infrastructure

- Large-Scale Training Framework (based on PyTorch)
- Edge Inference Engine