This model is a Thai text encoder that forms part of a CLIP model. It was trained with the Teacher Learning method, using OpenAI's CLIP as the teacher.
This repo contains only the text encoder, which is compatible with OpenAI's ViT-B/32 image encoder. To use this model as a full CLIP model, you need to clone this repo (the text encoder) and OpenAI's CLIP (the image encoder). See the tutorial notebook here, and the rough usage sketch below:
[TBA]
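Until the notebook is published, here is a minimal usage sketch, not the repo's exact API. It assumes the text encoder can be loaded with Hugging Face `transformers`; the repo id `<this-repo>` is a placeholder, the masked mean pooling is an assumption, and the checkpoint is assumed to already output vectors projected to CLIP's 512-d space. The image side uses OpenAI's `clip` package.

```python
# Minimal usage sketch, not the repo's exact API. Placeholders/assumptions:
# "<this-repo>" stands in for this repo's Hugging Face id, masked mean
# pooling is assumed, and the checkpoint is assumed to output vectors
# already projected to CLIP's 512-d space.
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Image encoder: OpenAI's CLIP ViT-B/32.
clip_model, preprocess = clip.load("ViT-B/32", device=device)

# Text encoder: this repo (placeholder id).
tokenizer = AutoTokenizer.from_pretrained("<this-repo>")
text_encoder = AutoModel.from_pretrained("<this-repo>").to(device).eval()

def encode_text(captions):
    batch = tokenizer(captions, padding=True, return_tensors="pt").to(device)
    hidden = text_encoder(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)  # masked mean pooling (assumed)

image = preprocess(Image.open("dog.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    img_emb = clip_model.encode_image(image).float()
    txt_emb = encode_text(["สุนัขวิ่งบนชายหาด"])  # "a dog running on a beach"

# Cosine similarity between the image and the Thai caption.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
print((img_emb @ txt_emb.T).item())
```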
This text encoder was trained on 2M Thai captions translated from English by AIResearch's machine translation model, using WangchanBERTa as the pretrained base with an additional linear layer on top; a sketch of this training objective appears after the table below.
| Name | Model Base | Vision Model | Vision Dimensions | #Parameters |
|---|---|---|---|---|
| WangchanBERTa ViT-B/32 | WangchanBERTa | OpenAI ViT-B/32 | 512 | 106M |
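For reference, below is a rough sketch of the Teacher Learning objective, following Multilingual-CLIP's approach: a frozen CLIP text encoder embeds the English source caption, and the student (WangchanBERTa plus a linear projection) is trained to produce the same vector for the Thai translation. The pooling, loss, optimizer, and learning rate here are assumptions, not the actual training configuration.

```python
# Sketch of the Teacher Learning objective, following Multilingual-CLIP:
# a frozen CLIP text encoder embeds the English source caption, and the
# student is trained to emit the same vector for the Thai translation.
# The pooling, loss, optimizer, and learning rate here are assumptions.
import torch
import torch.nn as nn
import clip
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen teacher: OpenAI CLIP's text encoder (ViT-B/32 checkpoint).
teacher, _ = clip.load("ViT-B/32", device=device)
teacher.eval()

# Student: WangchanBERTa plus a linear projection to CLIP's 512-d space.
name = "airesearch/wangchanberta-base-att-spm-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
backbone = AutoModel.from_pretrained(name).to(device)
projection = nn.Linear(backbone.config.hidden_size, 512).to(device)

def student_encode(thai_captions):
    batch = tokenizer(thai_captions, padding=True, return_tensors="pt").to(device)
    hidden = backbone(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)  # masked mean pooling (assumed)
    return projection(pooled)

params = list(backbone.parameters()) + list(projection.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-5)  # hyperparameters are assumptions
loss_fn = nn.MSELoss()

# One training step on a (Thai translation, English source) caption pair.
thai = ["สุนัขวิ่งบนชายหาด"]
english = ["a dog running on the beach"]
with torch.no_grad():
    target = teacher.encode_text(clip.tokenize(english).to(device)).float()

optimizer.zero_grad()
loss = loss_fn(student_encode(thai), target)
loss.backward()
optimizer.step()
```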
- VQGAN-ThCLIP: a text-to-image synthesis model for the Thai language
- AI Builders for providing knowledge and support along the way
- Multilingual-CLIP for the Teacher Learning method
- OpenAI's CLIP
- AIResearch's translation model