Method Comparison

View your selected methods side by side; rows that differ are highlighted (marked ≠).

	Pyraformer: Pyramidal Attention Transformer for Long-Range Time-Series Forecasting	Informer	Reformer: The Efficient Transformer for Long Sequences
Field	Deep learning	Deep learning	Deep learning
Method family	Machine learning	Machine learning	Machine learning
Year of origin≠	2022	2021	2020
Proposed by≠	Shizhan Liu et al.	Zhou, H. et al.	Nikita Kitaev, Łukasz Kaiser & Anselm Levskaya
Type≠	Pyramidal self-attention transformer for time-series forecasting	Transformer (ProbSparse self-attention)	Memory-efficient attention-based sequence model
Seminal work≠	Liu, S., Yu, H., Liao, C., Li, J., Lin, W., Liu, A. X., & Dustdar, S. (2022). Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting. ICLR.	Zhou, H. et al. (2021). Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. AAAI.	Kitaev, N., Kaiser, Ł., & Levskaya, A. (2020). Reformer: The efficient transformer. ICLR.
Aliases≠	Pyramidal Attention Transformer, Pyraformer Transformer, Piramit Dikkat Dönüştürücüsü, Low-Complexity Transformer	Informer — Uzun Dizi Transformer Tahmini, Informer transformer, ProbSparse attention forecaster	Efficient Transformer, LSH Transformer, Locality-Sensitive Hashing Transformer, Verimli Dönüştürücü
Related≠	3	5	2
Summary≠	Pyraformer is a Transformer-based model for long-range time-series forecasting introduced by Liu et al. at ICLR 2022. Its central innovation is a Pyramidal Attention Module (PAM) that organizes tokens into a multi-resolution hierarchy, enabling the model to capture temporal dependencies across multiple scales while keeping time and memory complexity linear in the sequence length, O(L), rather than the quadratic cost of vanilla self-attention.	Informer is a Transformer-based model introduced by Zhou et al. in 2021 for long-sequence time-series forecasting. Its ProbSparse self-attention mechanism lowers the time and memory cost of self-attention from the O(L²) of the standard Transformer to O(L log L). It targets problems that require forecasts over long horizons.	The Reformer is an efficient variant of the Transformer architecture introduced by Kitaev, Kaiser, and Levskaya at ICLR 2020. It addresses the prohibitive O(L²) memory and computational cost of standard self-attention for long sequences. The key innovations are locality-sensitive hashing (LSH) attention, which approximates full attention in O(L log L) time, and reversible residual layers, which remove the need to store per-layer activations during training because each layer's inputs can be recomputed from its outputs.
ScholarGate dataset	v1, 1 source, PUBLISHED	v1, 2 sources, PUBLISHED	v1, 1 source, PUBLISHED
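
The Reformer summary in the table mentions reversible residual layers. As a rough illustration of why such a layer saves activation memory, the sketch below shows that the inputs of a reversible block can be recomputed exactly (up to floating-point rounding) from its outputs. The functions `f` and `g` are hypothetical stand-ins for the attention and feed-forward sub-layers, chosen only to keep the example short; this is not the Reformer implementation.

```python
import math

def f(v):
    # Hypothetical stand-in for the attention sub-layer F (assumption).
    return [math.tanh(x) for x in v]

def g(v):
    # Hypothetical stand-in for the feed-forward sub-layer G (assumption).
    return [0.5 * x * x for x in v]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def reversible_forward(x1, x2):
    # y1 = x1 + F(x2); y2 = x2 + G(y1)
    y1 = add(x1, f(x2))
    y2 = add(x2, g(y1))
    return y1, y2

def reversible_inverse(y1, y2):
    # Undo the block: x2 = y2 - G(y1); x1 = y1 - F(x2)
    x2 = sub(y2, g(y1))
    x1 = sub(y1, f(x2))
    return x1, x2

x1, x2 = [0.1, -0.4, 2.0], [1.5, 0.0, -0.7]
y1, y2 = reversible_forward(x1, x2)
r1, r2 = reversible_inverse(y1, y2)

# Largest absolute difference between the original and recovered inputs.
max_error = max(abs(a - b) for a, b in zip(x1 + x2, r1 + r2))
print(f"max reconstruction error: {max_error:.2e}")
```

Because the inverse only needs the block's outputs, a training loop built on this idea can discard intermediate activations and rebuild them during the backward pass, trading extra computation for memory.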
