Vision transformers — citation graph of key research

Explore the most influential research on vision transformers as an interactive citation graph on Constellation. The papers below are connected by direct citations and shared references. Open any one to center the graph on it and discover related work.

Top papers on vision transformers

  1. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
    Ze Liu, Yutong Lin, Yue Cao et al. — 2021 · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 29,712 citations
  2. Emerging Properties in Self-Supervised Vision Transformers
    Mathilde Caron, Hugo Touvron, Ishan Misra et al. — 2021 · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 4,948 citations
  3. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
    Wenhai Wang, Enze Xie, Xiang Li et al. — 2021 · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 4,656 citations
  4. PVT v2: Improved Baselines with Pyramid Vision Transformer
    Wenhai Wang, Enze Xie, Xiang Li et al. — 2022 · Computational Visual Media · 2,153 citations
  5. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
    Li Yuan, Yunpeng Chen, Tao Wang et al. — 2021 · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 2,239 citations
  6. CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
    Chun-Fu Richard Chen, Quanfu Fan, Rameswar Panda — 2021 · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 1,927 citations
  7. An Empirical Study of Training Self-Supervised Vision Transformers
    Xinlei Chen, Saining Xie, Kaiming He — 2021 · 2021 IEEE/CVF International Conference on Computer Vision (ICCV) · 1,439 citations
  8. Scaling Vision Transformers
    Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby et al. — 2022 · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · 778 citations
  9. Vision Transformers for Single Image Dehazing
    Yuda Song, Zhuqing He, Hui Qian et al. — 2023 · IEEE Transactions on Image Processing · 974 citations
  10. BiFormer: Vision Transformer with Bi-Level Routing Attention
    Lei Zhu, Xinjiang Wang, Zhanghan Ke et al. — 2023 · 1,037 citations
  11. Vision Transformer with Deformable Attention
    Zhuofan Xia, Xuran Pan, Shiji Song et al. — 2022 · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · 848 citations
  12. CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
    Xiaoyi Dong, Jianmin Bao, Dongdong Chen et al. — 2022 · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · 1,219 citations