Longformer / BigBird

Long-sequence Transformers such as Longformer (Beltagy, Peters & Cohan, 2020) and BigBird (Zaheer et al., 2020) replace the standard Transformer attention mechanism, whose cost grows as O(n²) in the sequence length n, with sparse attention patterns that scale linearly, O(n). This lets a single model process thousands of tokens at once, such as full documents, legal texts, or genomic sequences, which is impractical with a conventional Transformer.
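To make the O(n²) versus O(n) contrast concrete, the sketch below counts how many (query, key) pairs are scored under dense attention and under a simplified sparse pattern: a sliding window plus a few global tokens. The function names and the `window` and `n_global` parameters are illustrative assumptions, not the API of either paper's code, and BigBird's additional random attention is omitted for brevity.

```python
def full_attention_pairs(n: int) -> int:
    """Number of (query, key) pairs scored by dense self-attention."""
    return n * n


def sparse_attention_pairs(n: int, window: int, n_global: int = 0) -> int:
    """Count (query, key) pairs under a sliding window plus global tokens.

    Assumptions (illustrative only): each token attends to keys within
    `window` positions on either side, and the first `n_global` tokens
    attend to, and are attended by, every position.
    """
    pairs = 0
    for q in range(n):
        if q < n_global:
            # A global query attends to every key.
            pairs += n
            continue
        lo = max(0, q - window)
        hi = min(n - 1, q + window)
        local = hi - lo + 1
        # Global keys already inside the local window must not be counted twice.
        overlap = max(0, min(hi, n_global - 1) - lo + 1)
        pairs += local + n_global - overlap
    return pairs


if __name__ == "__main__":
    for n in (1024, 2048, 4096):
        dense = full_attention_pairs(n)
        sparse = sparse_attention_pairs(n, window=256, n_global=2)
        print(f"n={n}: dense={dense}, sparse={sparse}")
```

With the window and the number of global tokens held fixed, doubling n roughly doubles the sparse count while the dense count quadruples, which is the linear versus quadratic scaling described above.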

References

Beltagy, I., Peters, M. E. & Cohan, A. (2020). Longformer: The Long-Document Transformer. arXiv.
Zaheer, M. et al. (2020). Big Bird: Transformers for Longer Sequences. NeurIPS.

How to cite this page

ScholarGate. (2026, June 1). Long-Sequence Transformers with Sparse Attention (Longformer / BigBird). ScholarGate. https://scholargate.app/th/deep-learning/longformer-bigbird
