MedBridgeRL

When does RL help medical vision-language models?

Medical vision-language models (VLMs) are increasingly trained with reinforcement learning (RL) on top of supervised fine-tuning (SFT). But when does RL actually help, and how much of the apparent gain is due to the vision encoder, the SFT stage, or RL itself?

This project systematically disentangles these contributions across a range of medical imaging benchmarks, giving practitioners clearer guidance on when RL is worth the extra complexity.

Links: Paper (arXiv) · Project Page

Status: Preprint (2026)

(others & Shaban, 2026)

References

  1. others and Kimia Shaban. When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains. arXiv preprint, 2026.