Not Quite Anything: Overcoming SAM's Limitations for 3D Medical Imaging

15 pages • Published: April 19, 2026

Abstract

Foundation segmentation models (such as SAM and SAM-2) perform well on natural images but struggle with brain MRIs, where structures like the caudate and thalamus lack sharp boundaries and have poor contrast. Rather than fine-tuning these models (e.g., MedSAM), we propose a compositional alternative: we treat the foundation model's output as an additional input channel (like an extra color channel) and pass it alongside the MRI to highlight regions of interest.

We generate SAM-2 segmentation prompts (e.g., a bounding box or positive/negative points) using a lightweight 3D U-Net previously trained on MRI segmentation. Because the U-Net may have been trained on a different dataset, its prompt guesses are often inaccurate but usually fall in the right region. The edges of the resulting foundation segmentation "guesses" are then smoothed to allow better alignment with the MRI. We also test prompt-less segmentation using DINO attention maps within the same framework. This "has-a" architecture avoids modifying foundation weights and adapts to domain shift without retraining the foundation model. It achieves 96% volume accuracy on basal ganglia segmentation, which is sufficient for our study of longitudinal volume change. Our approach is faster, more label-efficient, and robust to out-of-distribution scans. We apply it to study inflammation-linked changes in sudden-onset pediatric OCD.

Keyphrases: adapting rather than retraining, foundation model, MRI segmentation, prompt-augmented segmentation with SAM-2, segmentation as a color channel, segmentation as an imaging modality, segmentation is attention, segmentation models, use of foundation models for medical imaging

In: Jernej Masnec, Hamid Reza Karimian, Parisa Kordjamshidi and Yan Li (editors).
Proceedings of AI for Accelerated Research Symposium, vol 3, pages 138-152.
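The abstract's core idea, stacking a smoothed foundation-model mask onto the MRI as an extra "color" channel, can be sketched roughly as follows. This is a minimal illustration for a single 2D slice, not the authors' implementation: the function names, the box-blur smoothing (standing in for whatever edge smoothing the paper uses), and the two-channel layout are all assumptions.

```python
import numpy as np

def smooth_mask(mask: np.ndarray, iterations: int = 3) -> np.ndarray:
    """Soften hard mask edges with a simple repeated mean filter
    (a stand-in for the paper's edge-smoothing step)."""
    m = mask.astype(np.float32)
    for _ in range(iterations):
        # average each pixel with its 4 neighbors (edge-padded)
        p = np.pad(m, 1, mode="edge")
        m = (p[:-2, 1:-1] + p[2:, 1:-1]
             + p[1:-1, :-2] + p[1:-1, 2:]
             + p[1:-1, 1:-1]) / 5.0
    return m

def stack_mask_as_channel(mri_slice: np.ndarray, sam_mask: np.ndarray) -> np.ndarray:
    """Treat the foundation model's segmentation output as an
    additional input channel alongside the MRI intensities."""
    mri = mri_slice.astype(np.float32)
    # normalize intensities to [0, 1]
    mri = (mri - mri.min()) / (mri.max() - mri.min() + 1e-8)
    soft = smooth_mask(sam_mask)
    # channel-first layout: (2, H, W) for a downstream network
    return np.stack([mri, soft], axis=0)

# toy usage: an 8x8 slice and a binary "guess" mask from SAM-2
slice_ = np.random.rand(8, 8)
mask = np.zeros((8, 8), dtype=np.float32)
mask[2:6, 2:6] = 1.0
x = stack_mask_as_channel(slice_, mask)
print(x.shape)  # (2, 8, 8)
```

A downstream model then consumes the two-channel input directly, so the foundation model's weights are never modified, matching the "has-a" composition the abstract describes.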

