XLM-R (XLM-RoBERTa) is a generic cross-lingual sentence encoder that obtains state-of-the-art results on many cross-lingual understanding (XLU) benchmarks. It is trained on 2.5TB of filtered CommonCrawl data in 100 languages (list below).
Load XLM-R from torch.hub (PyTorch >= 1.1):

```python
import torch
xlmr = torch.hub.load('pytorch/fairseq', 'xlmr.large')
xlmr.eval()  # disable dropout (or leave in train mode to finetune)
```
Load XLM-R (for PyTorch 1.0 or custom models):

```bash
# Download xlmr.large model
wget https://dl.fbaipublicfiles.com/fairseq/models/xlmr.large.tar.gz
tar -xzvf xlmr.large.tar.gz
```

```python
# Load the model in fairseq
from fairseq.models.roberta import XLMRModel
xlmr = XLMRModel.from_pretrained('/path/to/xlmr.large', checkpoint_file='model.pt')
xlmr.eval()  # disable dropout (or leave in train mode to finetune)
```
Apply sentence-piece-model (SPM) encoding to input text:
```python
# Encode a sentence with the model's sentencepiece BPE
zh_tokens = xlmr.encode('你好，世界')
xlmr.decode(zh_tokens)  # '你好，世界'

# Extract the last layer's features
last_layer_features = xlmr.extract_features(zh_tokens)
assert last_layer_features.size() == torch.Size([1, 6, 1024])

# Extract all layer's features (layer 0 is the embedding layer)
all_layers = xlmr.extract_features(zh_tokens, return_all_hiddens=True)
assert len(all_layers) == 25
assert torch.all(all_layers[-1] == last_layer_features)
```
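The per-token features above can be pooled into a single fixed-size vector per sentence for downstream retrieval or classification. A minimal sketch, assuming only torch: `mean_pool` is a hypothetical helper (not part of the fairseq API) that averages the `[batch, seq_len, hidden]` features over the token dimension.

```python
import torch

def mean_pool(features: torch.Tensor) -> torch.Tensor:
    # features: [batch, seq_len, hidden], e.g. the [1, 6, 1024] tensor
    # returned by xlmr.extract_features above.
    # Averaging over dim=1 (the tokens) yields one [batch, hidden] vector.
    return features.mean(dim=1)

# Shape check with a dummy tensor shaped like the XLM-R output above
dummy = torch.ones(1, 6, 1024)
assert mean_pool(dummy).size() == torch.Size([1, 1024])
```

When batching sentences of different lengths, padding positions should be masked out of the average rather than included.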
Citation
@article{conneau2019unsupervised,
title={Unsupervised Cross-lingual Representation Learning at Scale},
author={Conneau, Alexis and Khandelwal, Kartikay and Goyal, Naman and Chaudhary, Vishrav and Wenzek, Guillaume and Guzm{\'a}n, Francisco and Grave, Edouard and Ott, Myle and Zettlemoyer, Luke and Stoyanov, Veselin},
journal={arXiv preprint arXiv:1911.02116},
year={2019}
}
@article{goyal2021larger,
title={Larger-Scale Transformers for Multilingual Masked Language Modeling},
author={Goyal, Naman and Du, Jingfei and Ott, Myle and Anantharaman, Giri and Conneau, Alexis},
journal={arXiv preprint arXiv:2105.00572},
year={2021}
}