
Multimodal Recurrent Neural Network

A Multimodal Recurrent Neural Network combines input from two or more data modalities, such as images, text, and audio, within a recurrent sequence-processing framework. It encodes each modality separately, fuses the resulting representations, and then passes the combined signal through recurrent units (RNN, LSTM, or GRU) to generate or classify sequential outputs. This design made it a foundational approach in image captioning, video description, and audio-visual speech recognition.
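The encode, fuse, and recur pipeline described above can be illustrated with a minimal sketch. It is not the architecture of any particular paper: the class name `MultimodalRNN`, the dimensions, the plain tanh (Elman-style) cell, and fusion by element-wise addition are all assumptions chosen for brevity. An LSTM or GRU cell and a different fusion scheme (for example concatenation) would fit the same outline. The weights are random and untrained, so the output only demonstrates the data flow.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

class MultimodalRNN:
    """Toy tanh RNN that fuses one image vector with a token sequence (hypothetical)."""

    def __init__(self, vocab_size, image_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.W_img = rand_matrix(hidden_dim, image_dim, rng)   # image encoder (linear)
        self.E = rand_matrix(vocab_size, hidden_dim, rng)      # text encoder (embeddings)
        self.W_hh = rand_matrix(hidden_dim, hidden_dim, rng)   # recurrent weights
        self.W_out = rand_matrix(vocab_size, hidden_dim, rng)  # output projection

    def step(self, token_id, h_prev, img_code):
        # Fusion by addition (an assumption): token embedding + recurrent term + image code.
        rec = matvec(self.W_hh, h_prev)
        emb = self.E[token_id]
        h = [math.tanh(e + r + i) for e, r, i in zip(emb, rec, img_code)]
        return h, matvec(self.W_out, h)

    def forward(self, image_features, token_ids):
        img_code = matvec(self.W_img, image_features)  # encode the image once
        h = [0.0] * self.hidden_dim
        all_logits = []
        for token_id in token_ids:
            h, logits = self.step(token_id, h, img_code)
            all_logits.append(logits)
        return all_logits

model = MultimodalRNN(vocab_size=5, image_dim=4, hidden_dim=3)
logits = model.forward([0.2, -0.1, 0.5, 0.3], [0, 2, 1])
probs = [softmax(step_logits) for step_logits in logits]  # one distribution per time step
```

Each entry of `probs` is a distribution over the toy vocabulary for one time step. In a captioning setting the highest-probability token would typically be fed back as the next input, while a classifier would instead read a label off the final hidden state.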

Sources

Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and Tell: A Neural Image Caption Generator. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3156–3164. DOI: 10.1109/CVPR.2015.7298935
Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., & Ng, A. Y. (2011). Multimodal Deep Learning. Proceedings of the 28th International Conference on Machine Learning (ICML), pp. 689–696.

How to cite this page

ScholarGate. (2026, June 3). Multimodal Recurrent Neural Network (MM-RNN). ScholarGate. https://scholargate.app/sv/deep-learning/multimodal-recurrent-neural-network
