MedBridgeRL | Kimia Shaban

Medical vision-language models (VLMs) are increasingly trained with reinforcement learning (RL) on top of supervised fine-tuning (SFT). But when does RL actually help, and how much of the apparent gain comes from the vision encoder, the SFT stage, or RL itself?

This project systematically disentangles these contributions across a range of medical imaging benchmarks, providing clearer guidance for practitioners on when RL is worth the extra complexity.

Links: Paper (arXiv) · Project Page

Status: Preprint (2026)

(Jeddi et al., 2026)

References

2026

Preprint
When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains

Ahmadreza Jeddi, Kimia Shaban, Negin Baghbanzadeh, Natasha Sharan, Abhishek Moturu, Elham Dolatabadi, and Babak Taati

arXiv preprint, 2026


We study when reinforcement learning provides meaningful gains over supervised fine-tuning for medical vision-language models, disentangling the contributions of vision encoders, SFT, and RL to model performance.
@article{shaban2026medbridgerl,
  title = {When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains},
  author = {Jeddi, Ahmadreza and Shaban, Kimia and Baghbanzadeh, Negin and Sharan, Natasha and Moturu, Abhishek and Dolatabadi, Elham and Taati, Babak},
  year = {2026},
  journal = {arXiv preprint},
}