Computer Vision
Featured
May 2026

ViT for Medical Imaging

A vision transformer that detects pathology in chest X-rays, reaching radiologist-level F1 on three of fourteen findings.

AUC (CheXpert): 0.912
F1 (Pneumonia): 0.86
GPU hours: 1,200

The problem

CNNs underperform on pathologies that depend on long-range spatial context in chest X-rays, and labeled medical data is scarce.

The approach

Self-supervised MAE pre-training on 500k unlabeled chest X-rays, followed by supervised fine-tuning with class-balanced focal loss.
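The two ingredients above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's training code: the function names, the 75% mask ratio, the effective-number weighting with `beta=0.999`, and `gamma=2.0` are all assumptions chosen for the example, and the loss treats each finding as an independent binary label.

```python
import math
import random


def mae_mask(num_patches, mask_ratio=0.75, seed=None):
    """Randomly split patch indices into (visible, masked), MAE-style.

    The encoder would see only the visible patches; the decoder
    reconstructs the masked ones. mask_ratio=0.75 is an assumed default.
    """
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(num_patches * mask_ratio)
    return sorted(order[num_masked:]), sorted(order[:num_masked])


def class_balanced_weights(pos_counts, beta=0.999):
    """Effective-number weights (1 - beta) / (1 - beta ** n), scaled to mean 1.

    pos_counts holds the (hypothetical) number of positive examples per
    finding; rarer findings receive larger weights.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in pos_counts]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def focal_loss(probs, labels, weights, gamma=2.0, eps=1e-7):
    """Mean class-weighted binary focal loss over the findings of one image."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        # p_t is the probability assigned to the true label.
        p_t = p if y == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0 - eps)
        # (1 - p_t) ** gamma down-weights examples that are already easy.
        total += -w * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    visible, masked = mae_mask(196, seed=0)  # 14 x 14 patches
    print(len(visible), "visible,", len(masked), "masked")
    weights = class_balanced_weights([10000, 100])
    print(focal_loss([0.9, 0.2], [1, 1], weights))
```

With `gamma=0` the loss reduces to weighted binary cross-entropy, which makes it easy to check the focal term in isolation.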

Results

Outperformed the ResNet-50 baseline by 6.2 AUC points on CheXpert and reached radiologist-level F1 on three of the fourteen findings.