Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Solve Visual Understanding with Reinforced VLMs
[ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Official repo for the paper "HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation"
Official code of paper "GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis"
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.
[ECCV2024] I-MedSAM: Implicit Medical Image Segmentation with Segment Anything
The official repository for "One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts"
The official codes for "AutoRG-Brain: Grounded Report Generation for Brain MRI".
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation (ICLR 2025)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.
PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
[ICLR 2025] The official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine"
MedRegA: Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks
Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
[T-PAMI] A curated list of self-supervised multimodal learning resources.