EgoNormia is a comprehensive benchmark evaluating agentic VLM capabilities in grounded reasoning scenarios.
- Comprehensive evaluation of grounded agentic abilities
- Support for onboarding and evaluation on custom dataset
- Support for both reasoning and vision-language models
- Integration with popular AI APIs
- Easy integration for custom agents
Use conda or venv for installation. (Or install it all locally if you're brave.)
conda create -n egonormia python=3.10 -y
conda activate egonormia
git clone https://github.com/Open-Social-World/EgoNormia
cd EgoNormia
pip install -e .
To run using a custom VLM, replace self.modelname in eval/custom_eval_api.py
with the model name of your openai API-compatible VLM,
and fill in any remaining fields as necessary.
Then, simply evaluate using API the evaluation API with --modelname custom
.
## Evaluate using API
To run only eval scripts, you can provide either an OpenAI API key or a Gemini API key (depending on the model you intend to run)
(To run all scripts related to EgoNormia, you need to populate *both* an OpenAI key and a Gemini API key)
This can be directly exported:
```bash
export OPENAI_API_KEY=<KEY>
export ANTHROPIC_API_KEY=<KEY>
Or you can modify the SECRETS.env
file, adding your api keys.
You can then run the evaluation from the egonormia/src
directory with the following command:
python3 evaluate.py --modelname gemini-1.5-flash-002 --jsonfile final_data.json (--blind) (--description)
Include the --blind
flag to run the evaluation without the ground truth, and the --description
flag to include the description in the evaluation.
--blind
and --description
flags are mutually exclusive.
This project is licensed under the Apache License - see the LICENSE file for details.
If you use EgoNormia in any of your work, please cite:
@misc{rezaei2025egonormiabenchmarkingphysicalsocial,
title={EgoNormia: Benchmarking Physical Social Norm Understanding},
author={MohammadHossein Rezaei and Yicheng Fu and Phil Cuvin and Caleb Ziems and Yanzhe Zhang and Hao Zhu and Diyi Yang},
year={2025},
eprint={2502.20490},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2502.20490},
}