Reformer: The Efficient Transformer for Long Sequences
The Reformer is an efficient variant of the Transformer architecture introduced by Kitaev, Kaiser, and Levskaya at ICLR 2020. It addresses the O(L²) memory and compute cost of standard self-attention, which becomes prohibitive for long sequences of length L. Its two key innovations are locality-sensitive hashing (LSH) attention, which approximates full attention in O(L log L) time, and reversible residual layers, which let activations be stored once for the whole model rather than once per layer during training.
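The two ideas can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`make_rotations`, `lsh_bucket`, `rev_forward`, `rev_inverse`) are hypothetical, vectors are plain lists, and `f` and `g` are stand-ins for the attention and feed-forward sublayers. The first part assigns a vector to a bucket with an angular hash of the form argmax([xR; -xR]), so that vectors pointing in similar directions tend to share a bucket and attention can be restricted to within buckets. The second part shows why a reversible block does not need stored activations: its inputs can be recomputed exactly from its outputs.

```python
import math
import random


def make_rotations(dim, n_buckets, seed=0):
    """Draw n_buckets // 2 random Gaussian directions (assumes n_buckets is even)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(n_buckets // 2)]


def lsh_bucket(x, rotations):
    """Angular LSH: project x on each direction, then take argmax over [xR; -xR]."""
    proj = [sum(xi * ri for xi, ri in zip(x, r)) for r in rotations]
    scores = proj + [-p for p in proj]
    return max(range(len(scores)), key=scores.__getitem__)


def rev_forward(x1, x2, f, g):
    """Reversible residual block: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def rev_inverse(y1, y2, f, g):
    """Recover the block inputs from its outputs, so they need not be stored."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


if __name__ == "__main__":
    rotations = make_rotations(dim=4, n_buckets=8, seed=1)
    query = [0.3, -1.2, 0.7, 2.0]
    print("bucket:", lsh_bucket(query, rotations))

    # Stand-ins for the attention (f) and feed-forward (g) sublayers.
    f = lambda v: [math.tanh(a) for a in v]
    g = lambda v: [0.5 * a for a in v]
    y1, y2 = rev_forward([1.0, -2.0], [0.5, 3.0], f, g)
    print("recovered inputs:", rev_inverse(y1, y2, f, g))
```

In a full model, tokens would be sorted by bucket and attention computed within chunks of the sorted sequence, and the backward pass would call the inverse layer by layer instead of caching each layer's inputs. Both steps are omitted here for brevity.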
Sources
- Kitaev, N., Kaiser, Ł., & Levskaya, A. (2020). Reformer: The efficient transformer. ICLR.