VGG — Very Deep Convolutional Networks for Large-Scale Image Recognition (2014) Deep stacks of small 3x3 convolutions with ReLU activations and max-pooling.
Inception — Going deeper with convolutions (2015) Introduced in the original GoogLeNet paper. Many variants have been developed by the same authors.
ResNet — Deep Residual Learning for Image Recognition (2015)
EfficientNet — EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (2019) [code] Finds a baseline network with neural architecture search, then scales depth, width, and input resolution together using a single compound coefficient to balance classification accuracy and speed. Follow-up work has improved its results with training methods such as Noisy Student and sharpness-aware minimization (SAM) (code).
BiT — Big Transfer (BiT): General Visual Representation Learning (2019)
ViT — An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020) Attention is all you need (image classification edition).
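
The "16x16 words" in the ViT title refers to cutting the image into fixed-size patches that play the role of tokens. The sketch below illustrates only that splitting step, assuming a single-channel 224x224 image stored as nested Python lists. The function name `patchify` is hypothetical, and the learned linear projection and position embeddings that follow in the real model are not shown.

```python
def patchify(image, patch_size=16):
    """Split an H x W image (list of rows) into flattened square patches.

    Patches are returned in row-major order: left to right, then top to bottom.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy image (an assumption for illustration): each pixel holds its flat index.
image = [[row * 224 + col for col in range(224)] for row in range(224)]
tokens = patchify(image)
print(len(tokens), len(tokens[0]))  # 196 256
```

A 224x224 input therefore yields 14 x 14 = 196 patch tokens, each with 256 values per channel, and it is this sequence that the Transformer attends over.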