Long-form writing on training, debugging and shipping deep learning systems.
A pragmatic look at when ViTs beat ConvNets, when they don't, and how to choose between them in production.