This project is the first step in a long-term effort to build a Tibetan-to-English machine translation system, and it aims to contribute to the preservation of the Tibetan language and of Tibetan Buddhism.
We first used SentencePiece to tokenize the data and then, in parallel, trained models with three different approaches (minimal sketches of the tokenization step and of each approach follow the list):
- We imported a transformer model from FairSeq, an open-source sequence-modeling toolkit, and trained it on our own data with our own hyperparameters.
- We fine-tuned a pre-trained T5 transformer model from the Hugging Face Transformers library on our data (see the fine-tuning sketch below).
- We built a transformer from scratch with PyTorch (see the sketch below), following this tutorial: https://lionbridge.ai/articles/transformers-in-nlp-creating-a-translator-model-from-scratch/
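The tokenization step might look roughly like the following. This is a minimal sketch rather than the project's actual configuration: the corpus file names, vocabulary size, and the choice of a single shared Tibetan/English model are assumptions.

```python
import sentencepiece as spm

# Train a shared subword model on the parallel corpus.
# File names and vocab size are placeholders, not the project's real settings.
spm.SentencePieceTrainer.train(
    input="data/train.bo,data/train.en",  # hypothetical corpus files
    model_prefix="spm_bo_en",
    vocab_size=8000,
    character_coverage=1.0,               # keep full coverage of Tibetan script
    model_type="unigram",
)

# Load the trained model and split a sentence into subword pieces.
sp = spm.SentencePieceProcessor(model_file="spm_bo_en.model")
pieces = sp.encode("བཀྲ་ཤིས་བདེ་ལེགས།", out_type=str)
print(pieces)
```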
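For the T5 approach, a single fine-tuning step with Hugging Face Transformers might be sketched as below. The checkpoint name, task prefix, learning rate, and toy sentence pair are assumptions; in practice the tokenizer would also need vocabulary coverage for Tibetan script, and training would loop over batches from the corpus.

```python
import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

# "t5-small" and the task prefix are illustrative choices, not the project's settings.
tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One toy training step on a single (Tibetan, English) pair.
source = "translate Tibetan to English: བཀྲ་ཤིས་བདེ་ལེགས།"
target = "Greetings and good fortune."

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=labels)           # model computes cross-entropy loss internally
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```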
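The linked tutorial builds the attention layers by hand; as a condensed stand-in, the sketch below wires an encoder-decoder model around torch.nn.Transformer and checks the output shape on random token IDs. Vocabulary sizes and model dimensions are placeholders, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

class TranslationTransformer(nn.Module):
    """Minimal encoder-decoder transformer over token-ID sequences."""
    def __init__(self, src_vocab, tgt_vocab, d_model=256, nhead=8,
                 num_layers=3, dim_ff=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            dim_feedforward=dim_ff, batch_first=True)
        self.generator = nn.Linear(d_model, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Causal mask so each target position only attends to earlier positions.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        out = self.transformer(self.src_embed(src_ids), self.tgt_embed(tgt_ids),
                               tgt_mask=tgt_mask)
        return self.generator(out)

# Smoke test with random token IDs (batch of 2, sequence length 10).
model = TranslationTransformer(src_vocab=8000, tgt_vocab=8000)
src = torch.randint(0, 8000, (2, 10))
tgt = torch.randint(0, 8000, (2, 10))
logits = model(src, tgt)
print(logits.shape)  # torch.Size([2, 10, 8000])
```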