MedBridgeRL
When does RL help medical vision-language models?
Medical vision-language models (VLMs) are increasingly trained with reinforcement learning (RL) on top of supervised fine-tuning (SFT). But when does RL actually help, and how much of the apparent gain comes from the vision encoder, the SFT stage, or RL itself?
This project systematically disentangles these contributions across a range of medical imaging benchmarks, providing clearer guidance for practitioners on when RL is worth the extra complexity.
Links: Paper (arXiv) · Project Page
Status: Preprint (2026)