On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability

This is the official implementation of the NeurIPS 2024 paper "On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability".

Dependencies

conda env create -f environment.yaml

Hyperparameters Configuration

The detailed hyperparameter configuration can be found in Appendix B of the paper.

Simulation Experiments

bash main_train_ar.sh  # with the hyperparameters from Appendix B

Visualization

python plot.py  # specify the output