OS-Atlas

Overview

OS-Atlas is an open-source foundation action model for generalist GUI agents, developed by Shanghai AI Lab. The project has an official project page and is described in the accompanying research paper.

Key Features

  • Multiple model sizes (4B and 7B)
  • GPT-4o integration option
  • Cross-platform support (mobile, desktop, web)
  • Large-scale GUI grounding corpus (13M+ elements)
  • Developed by Shanghai AI Lab

Performance

OSWorld Results

  • OS-Atlas-Base-7B w/ GPT-4o: 14.63%
  • OS-Atlas-Base-4B w/ GPT-4o: 11.65%

ScreenSpot Results

  • OS-Atlas-7B: 82.5% accuracy
  • OS-Atlas-4B: 71.9% accuracy
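Grounding accuracy on ScreenSpot-style benchmarks is commonly scored by checking whether the predicted click point falls inside the target element's bounding box. The following is a minimal sketch of that metric, not the official evaluation script; the `Sample` record, its field names, and the pixel-coordinate convention are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical record: predicted click point in pixels.
    pred_x: float
    pred_y: float
    # Ground-truth bounding box as (left, top, right, bottom) in pixels.
    box: tuple[float, float, float, float]


def is_hit(sample: Sample) -> bool:
    """Return True if the predicted point lies inside the target box."""
    left, top, right, bottom = sample.box
    return left <= sample.pred_x <= right and top <= sample.pred_y <= bottom


def grounding_accuracy(samples: list[Sample]) -> float:
    """Fraction of samples whose predicted point hits the target box."""
    if not samples:
        return 0.0
    return sum(is_hit(s) for s in samples) / len(samples)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figures listed above.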

Technical Details

  • Model Sizes: 4B and 7B parameters
  • Optional GPT-4o integration
  • Focus: GUI grounding and interaction
  • Supports multiple operating systems
  • Execution-based evaluation
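Execution-based evaluation scores an agent by inspecting the environment after it has acted, rather than by comparing its action sequence to a reference. A minimal sketch of how a success rate could be computed under that scheme follows; the state dictionaries, the checker functions, and the task identifiers are hypothetical and do not reflect the actual OSWorld harness.

```python
from typing import Callable, Dict

# Hypothetical types: a final environment state and a per-task checker.
State = Dict[str, object]
Checker = Callable[[State], bool]


def success_rate(final_states: Dict[str, State],
                 checkers: Dict[str, Checker]) -> float:
    """Percentage of tasks whose checker accepts the final state."""
    if not checkers:
        return 0.0
    passed = 0
    for task_id, check in checkers.items():
        state = final_states.get(task_id)
        # A task with no recorded final state counts as a failure.
        if state is not None and check(state):
            passed += 1
    return 100.0 * passed / len(checkers)
```

Each checker only sees the resulting state, so two agents that reach the same outcome through different action sequences receive the same score.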

References