Swin Transformer (Shifted Window Transformer)

Swin Transformer is a hierarchical vision transformer architecture introduced by Liu et al. (2021). It uses shifted window attention to achieve good computational efficiency while retaining strong performance on computer vision tasks. Unlike the original Vision Transformer, which applies global self-attention across the whole image, Swin computes self-attention locally within non-overlapping windows (local window-based attention) and shifts the window grid between successive layers, so that information can cross window boundaries. This balances expressiveness against efficiency.
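The window mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference implementation: the function names (`window_partition`, `cyclic_shift`, `window_self_attention`), the toy 8x8 feature map, and the window size of 4 are all chosen here for illustration. The sketch uses identity query/key/value projections and omits the relative position bias and the attention mask that the published model applies to shifted windows.

```python
import numpy as np

def window_partition(x, window_size):
    """Split a (H, W, C) feature map into non-overlapping windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    Assumes H and W are divisible by window_size.
    """
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window-grid axes together
    return x.reshape(-1, window_size * window_size, C)

def cyclic_shift(x, window_size):
    """Roll the feature map by half a window so the next partition straddles
    the previous window boundaries (the shifted configuration)."""
    shift = window_size // 2
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

def window_self_attention(windows):
    """Single-head self-attention computed independently inside each window.

    Illustrative assumption: Q = K = V = the input tokens (no learned
    projections, no relative position bias, no mask).
    """
    d = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ windows

rng = np.random.default_rng(0)
H, W, C, M = 8, 8, 4, 4                      # toy sizes, chosen for illustration
feature_map = rng.standard_normal((H, W, C))

regular = window_partition(feature_map, M)                    # regular windows
shifted = window_partition(cyclic_shift(feature_map, M), M)   # shifted windows
out = window_self_attention(regular)

# Number of query-key pairs scored by global vs. window attention.
global_pairs = (H * W) ** 2
window_pairs = regular.shape[0] * (M * M) ** 2
print(regular.shape, shifted.shape, out.shape, global_pairs, window_pairs)
```

In this toy setting, global attention scores 4096 token pairs while window attention scores 1024. For a fixed window size, the window count grows linearly with the number of tokens, whereas the global count grows quadratically.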

References

Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B. (2021). Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 10012-10022). DOI: 10.1109/ICCV48922.2021.00986

How to cite this page

ScholarGate. (2026, June 3). Shifted Window Transformer for Vision. ScholarGate. https://scholargate.app/th/deep-learning/swin-transformer

Related methods (Deep Learning)

DETR (Detection Transformer)
Masked Autoencoders
Vision Mamba
Vision Transformer

Cited by

DETR (Detection Transformer)
Few-Shot Object Detection
Masked Autoencoders
Segment Anything Model
SimCLR
Spatio-Temporal Graph Convolutional Network
Vision Mamba
