Attention Mechanism

The attention mechanism, introduced by Bahdanau, Cho and Bengio in 2015 and refined by Luong, Pham and Manning the same year, lets a sequence decoder learn how much weight to give each of the encoder's outputs at every decoding step. Before the Transformer, it substantially improved machine-translation quality by freeing models from having to compress an entire input sequence into a single fixed-length vector.
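To make the idea concrete, here is a minimal NumPy sketch of one decoder step of additive (Bahdanau-style) attention. It is an illustration under stated assumptions, not the authors' implementation: the function name `additive_attention`, the parameter names `W_dec`, `W_enc` and `v`, the dimensions, and the random inputs are all hypothetical, and the weights are untrained.

```python
import numpy as np


def softmax(x):
    # Subtract the max before exponentiating for numerical stability.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def additive_attention(decoder_state, encoder_outputs, W_dec, W_enc, v):
    """One decoder step of additive attention (illustrative sketch).

    decoder_state:   shape (d_dec,)    current decoder hidden state
    encoder_outputs: shape (T, d_enc)  one vector per source position
    W_dec: (d_att, d_dec), W_enc: (d_att, d_enc), v: (d_att,)
    """
    # Score each source position t: v . tanh(W_dec s + W_enc h_t)
    hidden = np.tanh(W_dec @ decoder_state + encoder_outputs @ W_enc.T)  # (T, d_att)
    scores = hidden @ v                                                  # (T,)
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted average of the encoder outputs.
    context = weights @ encoder_outputs                                  # (d_enc,)
    return context, weights


# Hypothetical sizes and random (untrained) parameters, for illustration only.
rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 5, 4, 3, 6
encoder_outputs = rng.normal(size=(T, d_enc))
decoder_state = rng.normal(size=d_dec)
W_dec = rng.normal(size=(d_att, d_dec))
W_enc = rng.normal(size=(d_att, d_enc))
v = rng.normal(size=d_att)

context, weights = additive_attention(decoder_state, encoder_outputs, W_dec, W_enc, v)
```

The context vector is recomputed at every decoder step, so the decoder is not limited to one fixed summary of the input. A multiplicative (Luong-style) variant would swap the tanh-based score for a dot product between the decoder state and each encoder output, leaving the softmax and weighted sum unchanged.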

Sources

  1. Bahdanau, D., Cho, K. & Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR.
  2. Luong, M.T., Pham, H. & Manning, C.D. (2015). Effective Approaches to Attention-based Neural Machine Translation. EMNLP, 1412–1421. DOI: 10.18653/v1/D15-1166

ScholarGate. Attention Mechanism (Bahdanau / Luong Attention). Retrieved 2026-06-04 from https://scholargate.app/en/deep-learning/attention-mechanism