fix
ariG23498 committed Mar 4, 2025
1 parent 729bb0f commit 5d52e83
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions aya-vision.md
@@ -68,10 +68,11 @@ Model merging enhances the generative capabilities of our final model that leads

Multimodal model merging also enables our Aya Vision models to excel in text-only tasks as measured in mArenaHard datasets compared with the other leading vision-language models.

-![stages](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/aya-vision/image-11.png)
-Overview of the training pipeline for Aya Vision
+| ![stages](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/aya-vision/image-11.png) |
+| :--: |
+| Overview of the training pipeline for Aya Vision |

-Scaling up to 32B
+## Scaling up to 32B

Finally, we scale our recipe from 8B to 32B, resulting in the state-of-the-art open-weight multilingual vision-language model – Aya Vision 32B which shows significant improvements in win rates due to the stronger initialization of the text-backbone, and outperforms models more than 2x of its size, such as Llama-3.2 90B Vision, Molmo 72B, and Qwen2.5-VL 72B by win rates ranging from 49% to 63% on AyaVisionBench and 52% to 72% on mWildVision average across 23 languages.
