2nd MMFM-BIOMED Workshop

Multimodal Foundation Models for Biomedicine:
Challenges and Opportunities

MMFM-BIOMED @ CVPR 2026

Workshops: June 3, 2026
Denver Convention Center, Denver, USA
Time/Room: 8:45-12:30 @ 1CD

About

Biomedical data spans diverse modalities across biological scales - from molecular genomics and cellular microscopy to tissue pathology, organ-level radiology, and patient-level electronic health records. While each modality provides unique insights, integrating these heterogeneous data sources remains a significant challenge in creating comprehensive biomedical understanding.

The Multimodal Foundation Models for Biomedicine (MMFM-BIOMED) workshop brings together experts across disciplines to tackle this challenge. The workshop explores two critical questions:

Technical Challenges

What are the core limitations of existing multimodal learning techniques when applied to biomedical data? Challenges include cross-modal alignment between data with different spatial and temporal resolutions; handling extreme data imbalances between well-annotated and sparse modalities; and maintaining modality-specific contexts while enabling knowledge transfer across domains.

New Opportunities

What transformative opportunities do multimodal foundation models unlock in biomedicine? Potential breakthroughs include multi-scale disease diagnosis by combining radiology images with pathology slides; personalized treatment by integrating wearable sensor data with genomic profiles; and context-aware operations by synchronizing surgical videos with patient records.

Invited Speakers

Tentative Schedule

8:45-8:50 | Opening Remarks
8:50-9:00 | Spotlight Talk 1: Challenges in Retrieval and Context Engineering for Multimodal Patient Timelines (Ryan Nayebi)
9:00-9:10 | Spotlight Talk 2: Automated Report-Derived Oncology VQA Benchmark for Evaluating Vision-Language Models on 3D Medical Imaging (Bo Liu)
9:10-9:20 | Spotlight Talk 3: Reality vs. Priors: Atypical Anatomy Breaks Vision-Language Models (Leon Mayer)
9:20-9:30 | Spotlight Talk 4: Too Sure to Be Right: High-Confidence Wrong and Latent Failure Signals in Medical VQA (Haohao Zhu)
9:30-10:00 | Invited Talk 1: Maria Brbic
10:00-10:30 | Invited Talk 2: Emily Fox
10:30-11:30 | Poster & Discussion
11:30-12:00 | Invited Talk 3: James Zou
12:00-12:30 | Invited Talk 4: Mengdi Wang
12:30-12:40 | Closing Remarks

Call for Papers

We invite short, non-archival paper submissions (4 pages maximum, excluding references, using the CVPR 2026 template) that explore both challenges and opportunities in multimodal foundation models for biomedicine. We encourage two types of submissions:

Challenges in Current Techniques

Papers highlighting limitations in existing methods, especially simple yet intuitive approaches that unexpectedly lead to negative outcomes.

  • Multimodal foundation models - pre-training, post-training, alignment
  • Agentic framework - design, evaluation, control
  • Benchmark and evaluation - failure modes, reproducibility

Opportunities with MMFMs

Papers showcasing novel applications, especially in underexplored areas such as drug discovery and surgery.

Selected papers will be featured as posters, and four will receive spotlight oral presentations.

Submission platform: OpenReview (MMFM-BIOMED)

Submission deadline: May 1, 2026

Notification date: May 10, 2026

Workshop date: June 3, 2026

Reminder: Please bring your poster (following the CVPR 2026 poster requirements) to the workshop.

Accepted Papers

All accepted papers participate in the workshop poster session. Four papers are selected for oral presentations. We will update the poster floor location soon. Please design and print your poster and bring it with you to the workshop.

Poster guidelines: follow the official CVPR 2026 poster printing information for workshop dimensions (half size), templates, logistics, and optional onsite printing.

MMFM-BIOMED 2026 accepted oral and poster papers
Poster Board ID | Format | Title | Authors
TBD | Oral | Automated Report-Derived Oncology VQA Benchmark for Evaluating Vision-Language Models on 3D Medical Imaging | Bo Liu, Hanxue Gu, Xiangru Li, Zheren Zhu, Jacob Ellison, Kang Wang, Janine M. Lupo, Yang Yang, Hui Lin
TBD | Oral | Reality vs. Priors: Atypical Anatomy Breaks Vision-Language Models | Leon Mayer, Piotr Kalinowski, Caroline Maria Ebersbach, Marcel Knopp, Tim Rädsch, Evangelia Christodoulou, Annika Reinke, Fiona R. Kolbinger, Lena Maier-Hein
TBD | Oral | Challenges in Retrieval and Context Engineering for Multimodal Patient Timelines | Ryan D'Cunha, Ryan Nayebi, Jason Alan Fries
TBD | Oral | Too Sure to Be Right: High-Confidence Wrong and Latent Failure Signals in Medical VQA | Haohao Zhu, Jiayu Zhou
TBD | Poster | BSMAD: Bridging Semantic and Structural Manifolds for Robust Cross-Modality Medical Anomaly Detection | Shih-Chih Lin, Chia-Lin Lee, Yun-Tung Chu, Jia-Xian Jian, Wei-Chieh Sun, Fang-Yi Lin
TBD | Poster | PSF-Med: A Clinician-Audited Benchmark for Paraphrase Sensitivity in Medical Vision-Language Models | Binesh Sadanandan, Vahid Behzadan, Lekshmy Jayan, Arun Gopinatha Kurup
TBD | Poster | SCOPE: Self-Consistent Patch Reconstruction with Pathology-Aware Prototype Alignment for Anatomical Neglect in CXR Report Generation | Ngo Xuan Cuong
TBD | Poster | Fine-Tuning BiomedParse for Stroke Detection and Segmentation on CT: A Comparison with Gemini 2.5 Pro and GPT-5 | Artun Gunturkun, Halil Ibrahim Gulluk, Özgür Ilker Koska, Olivier Gevaert
TBD | Poster | Exploring Prompt Alignment with Clinical Factors in Zero-Shot Segmentation VLMs for NSCLC Tumor Segmentation | Suraj Pai, Thibault Heintz, Cosmin Ciausu, Marion Tonneau, Hugo Aerts, Raymond Mak
TBD | Poster | Position: Why Medical AI for the Global South Must Be Built on Different Technical Principles | Azmine Toushik Wasi, Mohsin Mahmud Topu, Mahfuz Ahmed Anik, Md. Manjurul Ahsan
TBD | Poster | Multimodal Predictors of Heterogeneity of Treatment Effects in Transcranial Direct Current Stimulation for Knee Osteoarthritis Pain: A Machine Learning Analysis | Seoyoung Kim, Yeri Kim, Chiyoung Lee, Juyoung Park, Allison Huff, Huanrui Yang, Heewon Kim
TBD | Poster | Does Language Shift Break Medical Vision-Language Models? Indonesian Radiology Visual Question Answering Case Study | Pieter Christy Yan Yudhistira, Dzaki Rafif Malik, Novanto Yudistira
TBD | Poster | Directional Pathology–Omics Discordance from Frozen Whole-Slide Foundation Embeddings Across Lung and Kidney Cancer | Kuan-Ting Wu
TBD | Poster | When Pretraining Fails to Transfer: Diagnosing Representation Failure in fMRI Foundation Models | Vijay Srinivas P., Amrutha Veluppal
TBD | Poster | Self-Supervised Vision Transformers for CBCT-Based Detection of Temporomandibular Joint Osteoarthritis | Shradhdha Trivedi, Vrundan Sojitra
TBD | Poster | Label-Efficient Multimodal Trust Mapping for MRI | Jacob Ellison, Amir Sadikov, Duan Xu, Janine M. Lupo
TBD | Poster | RadDiff: Describing Differences in Radiology Image Sets with Natural Language | Xiaoxian Shen, Yuhui Zhang, Sahithi Ankireddy, Xiaohan Wang, Maya Varma, Henry Guo, Curtis Langlotz, Serena Yeung-Levy
TBD | Poster | When Prompts Mislead: Textual Dominance and Diagnostic Bias in MLLMs | Inhyuk Park, Doohyun Park
TBD | Poster | JASPR: Joint Spatial Representation learning of histology and spatial genomics for improved virtual genomic screening and clinical prognostication | Marija Pizurica, Eric Zimmermann, Neil Tenenholtz, James Brian Hall, Olivier Gevaert, Ava P. Amini, Lorin Crawford, Kristen A. Severson
TBD | Poster | Foundation Model–Driven Interpretable Discovery of Prognostic Signals Beyond Clinical Records in Histology and Spatial Proteomics | Yutong Sun, Kyeong Joo Jung, James D. Brooks, Chrystal Chadwick, Sanghee Cho, Soumya Ghose, Elizabeth McDonough, Jianwei Qiu, Robert West, Fiona Ginty, Raghu Machiraju, Parag Mallick
TBD | Poster | Multimodal Alignment Improves Generalizability of Genomic Biomarker Prediction in Computational Pathology | Ekaterina Redekop, Eric Zimmermann, Ava P. Amini, Alex Xijie Lu, Neil Tenenholtz, James Brian Hall, Lorin Crawford, Kristen A. Severson
TBD | Poster | Predicting Brain Tumor Recurrence from Multimodal, Longitudinal Patient Data | Divyanshu Tak, Diya Sreedhar, Hugo Aerts, Benjamin H. Kann
TBD | Poster | On the Trade-offs of LLM-Augmented Multimodal Foundation Models for Clinical Image Classification | Zhaohui Liang, Niccolo Marini, Sivaramakrishnan Rajaraman, Zhiyun Xue, Sameer Antani
TBD | Poster | Objective-Aligned Direct Answer SFT for Robust Multi-Frame Medical VQA | Site Li, Jianyi Hao, Xiaofeng Liu
TBD | Poster | When Multimodal Models Stop Listening: Modality Imbalance Induces Dominant-Modality Bias in Biomedicine | Dineth Jayakody
TBD | Poster | A Landmark-Based Panoramic Radiograph Dataset for Periodontal Severity and Comprehensive Dental Screening | Nayoon Kwon, Jongwon Choi, Jung Seok Lee
TBD | Poster | An Open Multi-Center Whole-Body FDG PET/CT Foundation Model for Tumor Segmentation | Xiaofeng Liu, Qianru Zhang, Thibault Marin, Menghua Xia, Chi Liu, Georges El Fakhri, Jinsong Ouyang

Organizers

Previous Workshop

1st edition: MMFM-BIOMED @ CVPR 2025