Swin Transformer

The Swin Transformer is a hierarchical vision transformer introduced by Liu et al. in 2021. It restricts self-attention to non-overlapping local windows, so the attention cost grows linearly with the number of image tokens rather than quadratically, while maintaining strong performance on computer vision tasks. Unlike the original Vision Transformer, which applies global self-attention across all tokens, Swin shifts the window partition between consecutive layers so that information can flow across window boundaries, balancing expressiveness and efficiency.
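
The window and shift idea described above can be illustrated with a minimal pure-Python sketch on a grid of token indices. This is not the authors' implementation: the function names (`window_partition`, `cyclic_shift`, `attention_pairs`) and the toy 4 x 4 grid are assumptions made for illustration, and the sketch leaves out the attention computation itself as well as the masking that a full implementation needs for tokens that wrap around the border after the shift.

```python
def window_partition(grid, m):
    """Split an H x W grid into non-overlapping m x m windows, row-major."""
    h, w = len(grid), len(grid[0])
    if h % m or w % m:
        raise ValueError("grid size must be divisible by the window size")
    return [
        [row[c0:c0 + m] for row in grid[r0:r0 + m]]
        for r0 in range(0, h, m)
        for c0 in range(0, w, m)
    ]


def cyclic_shift(grid, s):
    """Roll the grid up and left by s positions, wrapping at the borders."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + s) % h][(c + s) % w] for c in range(w)] for r in range(h)]


def attention_pairs(h, w, m=None):
    """Count query-key pairs: global attention if m is None, else windowed."""
    n = h * w
    return n * n if m is None else n * m * m


# Toy example (hypothetical sizes): 4 x 4 tokens, 2 x 2 windows, shift of 1.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
regular = window_partition(grid, 2)
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # [[0, 1], [4, 5]]
print(shifted[0])  # [[5, 6], [9, 10]]

# Pair counts for a 56 x 56 token grid, globally and with 7 x 7 windows.
print(attention_pairs(56, 56), attention_pairs(56, 56, 7))
```

In the toy example, the first shifted window holds tokens 5, 6, 9, and 10, which belong to four different regular windows. This is how alternating regular and shifted layers lets local attention exchange information across window boundaries. The pair counts show why the windowed form scales linearly: for a fixed window size, the count is proportional to the number of tokens.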

Sources

  1. Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B. (2021). Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 10012-10022). DOI: 10.1109/ICCV48922.2021.00986

ScholarGate. Swin Transformer (Shifted Window Transformer for Vision). Retrieved 2026-06-04 from https://scholargate.app/en/deep-learning/swin-transformer