Codebase accompanying the paper "Transformers, parallel computation, and logarithmic depth." This repository contains the code needed to train the models described in the paper, copies of the trained models used in the paper, and notebooks that generate the corresponding figures from locally trained models.
The code is written to run locally on Apple Silicon. It can be adapted to other hardware by changing the mps device to your preferred device type.
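For instance, a minimal device-selection sketch (assuming the training code uses PyTorch and sets a single device, which is an assumption about how src/train.py is structured) might look like:

```python
import torch

# Illustrative sketch only: pick an available accelerator instead of
# hard-coding "mps". Adapt to wherever src/train.py sets its device.
if torch.backends.mps.is_available():
    device = torch.device("mps")   # Apple Silicon GPU
elif torch.cuda.is_available():
    device = torch.device("cuda")  # NVIDIA GPU
else:
    device = torch.device("cpu")   # portable fallback

print(f"Using device: {device}")
```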
First, create a conda environment with the required dependencies by running the following command from this directory:

```
conda env create --name [NAME] --file=environment.yml
```
Models can be trained according to the experimental specifications in conf/runs/[CONFIG].yaml by running the following command:

```
python src/train.py conf/runs/[CONFIG].yaml
```
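As a rough illustration of how such a run config can drive training (the field names below are hypothetical placeholders, not the repository's actual schema), a script might read the YAML passed on the command line like this:

```python
import sys
import yaml  # PyYAML, assumed to be available in the conda environment

# Illustrative sketch only: load a run config given as the first argument.
# The keys shown ("depth", "width", "lr") are hypothetical, not the actual
# fields used by src/train.py.
with open(sys.argv[1]) as f:
    config = yaml.safe_load(f)

print(config.get("depth"), config.get("width"), config.get("lr"))
```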
Run the Jupyter notebooks src/error_plots.ipynb and src/interpretability.ipynb to reproduce the plots in Section 5 and Appendix G.