ScholarGate

Method comparison

View the selected methods side by side; rows that differ are highlighted.

| | Multimodal GRU | Multimodal LSTM |
| --- | --- | --- |
| Field | Deep learning | Deep learning |
| Family | Machine learning | Machine learning |
| Year introduced | 2014–2017 | 2016 |
| Originators | Cho, K. et al. (GRU); adapted to multimodal settings by multiple research groups | Rajagopalan et al. and various concurrent works (2016–2018) |
| Type | Recurrent neural network (multimodal variant) | Recurrent neural network architecture |
| Original work | Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Proceedings of EMNLP 2014, 1724–1734. | Rajagopalan, S. S., Morency, L.-P., Baltrušaitis, T., & Goecke, R. (2016). Extending Long Short-Term Memory for Multi-View Structured Learning. In Proceedings of ECCV 2016. Springer. |
| Also known as | MM-GRU, Multimodal Gated Recurrent Unit, Cross-modal GRU, Multi-input GRU | MM-LSTM, multimodal recurrent network, multi-input LSTM, multimodal sequence model |
| Related | 6 | 4 |
| Summary | Multimodal GRU extends the Gated Recurrent Unit architecture to jointly process sequential data from multiple input modalities (such as text, audio, and video frames) within a single recurrent framework. By fusing modality-specific encodings at the input or hidden-state level, it captures temporal dependencies across heterogeneous data streams. It is used in multimodal sentiment analysis, video understanding, and audio-visual speech recognition. | Multimodal LSTM extends the standard Long Short-Term Memory network to jointly process sequential data from multiple input modalities (such as text, audio, and video) within a unified recurrent architecture. By fusing representations from different sources before or within the LSTM cells, it captures temporal dependencies both within and across modalities, making it a foundational approach for tasks like sentiment analysis, video captioning, and affective computing. |
ScholarGate dataset
Multimodal GRU: v1, 2 sources, PUBLISHED
Multimodal LSTM: v1, 2 sources, PUBLISHED
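
Both summaries describe fusing modality-specific encodings before they reach the recurrent cell. As a rough illustration of the input-level variant only, the sketch below concatenates time-aligned feature vectors from each modality and feeds the result to a plain GRU cell. Everything here is an assumption made for illustration: the names `GRUCell` and `early_fusion_encode`, the random untrained weights, the omitted bias terms, and the toy `text` and `audio` features. It is not taken from either cited paper, and the update uses one common gate convention (some formulations swap the roles of `z` and `1 - z`).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Minimal GRU cell (hypothetical helper; biases omitted for brevity)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: init_matrix(hidden_size, input_size, rng) for g in "zrh"}
        self.U = {g: init_matrix(hidden_size, hidden_size, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], gated_h))]
        # Interpolate between the previous state and the candidate state.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def early_fusion_encode(streams, hidden_size, seed=0):
    """Concatenate per-step features across modalities, then run one GRU.

    `streams` is a list of modality sequences; each sequence is a list of
    per-step feature vectors, and all sequences must be time-aligned.
    """
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all modality streams must have the same length")
    # Input-level fusion: one concatenated vector per time step.
    fused = [[v for s in streams for v in s[t]] for t in range(length)]
    cell = GRUCell(len(fused[0]), hidden_size, seed)
    h = [0.0] * hidden_size
    for x in fused:
        h = cell.step(x, h)
    return h  # final hidden state summarising all modalities


# Toy, made-up features: 2 time steps, 3 text dims and 2 audio dims.
text = [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1]]
audio = [[0.7, 0.1], [0.2, 0.9]]
state = early_fusion_encode([text, audio], hidden_size=4)
```

Swapping `GRUCell` for an LSTM cell would give the corresponding input-level Multimodal LSTM sketch. Hidden-state-level fusion, which both summaries also mention, would instead run one recurrent cell per modality and combine their states.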


ScholarGate. Method comparison: Multimodal GRU · Multimodal LSTM. Accessed 2026-06-18 from https://scholargate.app/vi/compare