Code for Multidomain Language Models for Green NLP. The evaluation tasks covered are:
- ACL-ARC
- AG-News
- ChemProt
- Clothing Reviews
- HyperPartisan
- IMDB
- MultiNLI
- PubMed-RCT
- SARC
- SciCite
- TalkDown
The code is split into multiple evaluation files, one for each task. Models are not provided, but they can be pretrained separately using the run_language_modeling.py
script provided here (or the one provided by HuggingFace).
Each script is indicative of the code run on our machines. Train/dev/test splits are not provided, as they were randomly sampled. Nevertheless, the scripts were tested with multiple random samples, and performance stayed close to the reported results.
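
Since the splits have to be re-sampled, a seeded random split is one way to get repeatable train/dev/test sets. The sketch below is only an illustration, not the procedure used for the reported results: the function name split_dataset, the 10% dev and 10% test fractions, and the seed values are all assumptions.

```python
import random


def split_dataset(examples, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle examples with a fixed seed and cut them into train/dev/test.

    The fractions and the seed are hypothetical defaults, not the values
    used for the reported results.
    """
    rng = random.Random(seed)      # local RNG, so global random state is untouched
    shuffled = list(examples)      # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_frac)
    n_test = int(len(shuffled) * test_frac)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


if __name__ == "__main__":
    # Placeholder data; replace with the examples loaded for a given task.
    data = [f"example-{i}" for i in range(100)]
    for seed in (0, 1, 2):
        train, dev, test = split_dataset(data, seed=seed)
        print(seed, len(train), len(dev), len(test))
```

Running an evaluation script over several seeds in this way gives a rough sense of how much the scores vary across random samples.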