Attention Mechanism

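Introduced by Bahdanau, Cho, and Bengio in 2015 and refined the same year by Luong, Pham, and Manning, the attention mechanism lets a sequence decoder learn dynamically which encoder outputs to focus on at each decoding step. Before the Transformer, it markedly improved machine translation quality by freeing the model from having to compress the entire input into a single fixed-length vector.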
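The idea of attending to encoder outputs at each decoding step can be illustrated with a minimal sketch in plain Python. It assumes the simple dot-product score described by Luong et al.; Bahdanau et al. instead score each pair with a small learned feed-forward network, which could be passed in through the `score` argument. The function names (`softmax`, `dot_score`, `attend`) and the toy vectors are illustrative assumptions, not taken from either paper.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(decoder_state, encoder_output):
    # Dot-product score: inner product of the two vectors (assumed score function).
    return sum(d * e for d, e in zip(decoder_state, encoder_output))


def attend(decoder_state, encoder_outputs, score=dot_score):
    # One score per encoder position, normalized into attention weights.
    weights = softmax([score(decoder_state, h) for h in encoder_outputs])
    # Context vector: attention-weighted sum of the encoder outputs.
    dim = len(encoder_outputs[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_outputs))
        for i in range(dim)
    ]
    return weights, context


# Toy values, chosen only for illustration.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_outputs)
```

Because the context vector is recomputed for every decoder state, the decoder reads a different mixture of encoder outputs at each step instead of relying on one fixed summary of the input.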

Sources

Bahdanau, D., Cho, K. & Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR.
Luong, M.T., Pham, H. & Manning, C.D. (2015). Effective Approaches to Attention-based Neural Machine Translation. EMNLP, 1412–1421. DOI: 10.18653/v1/D15-1166

How to cite this page

ScholarGate. (2026, June 1). Attention Mechanism (Bahdanau / Luong Attention). ScholarGate. https://scholargate.app/zh/deep-learning/attention-mechanism
