Multimodal Graph Neural Network
A Multimodal Graph Neural Network (MM-GNN) combines data from multiple modalities — such as text, images, and structured features — into a unified graph structure and applies graph-based message passing to learn joint representations. It enables relational reasoning across heterogeneous data sources, going beyond what unimodal or simple concatenation approaches can capture.
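As a rough illustration of the idea above, the following sketch projects toy text, image, and structured features into a shared hidden space, treats each item as a node in one graph, and runs a single round of message passing. Everything here is an assumption made for illustration: the feature values, the hidden size, the edge list, the random (untrained) projection weights, and the choice of plain mean aggregation with self-loops, which is a simplification rather than the normalised propagation rule of Kipf and Welling or the architecture of any particular MM-GNN paper.

```python
import random


def random_weight(d_in, d_out, rng):
    """Hypothetical untrained projection matrix of shape (d_in x d_out)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(d_out)] for _ in range(d_in)]


def project(features, weight):
    """Linear map: features (n x d_in) times weight (d_in x d_out)."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*weight)]
            for row in features]


def message_pass(h, edges):
    """One round of mean aggregation over neighbours plus a self-loop, then ReLU."""
    n = len(h)
    neighbours = [{i} for i in range(n)]  # every node also keeps its own state
    for u, v in edges:                    # edges are treated as undirected
        neighbours[u].add(v)
        neighbours[v].add(u)
    out = []
    for i in range(n):
        nbrs = sorted(neighbours[i])
        mean = [sum(h[j][k] for j in nbrs) / len(nbrs) for k in range(len(h[i]))]
        out.append([max(0.0, m) for m in mean])
    return out


rng = random.Random(0)
HIDDEN = 4  # assumed shared embedding size

# Toy inputs: each modality has its own feature dimensionality.
text_feats = [[0.1, 0.3, 0.5], [0.2, 0.1, 0.0]]  # nodes 0 and 1
image_feats = [[0.9, 0.4, 0.2, 0.7, 0.1]]        # node 2
struct_feats = [[1.0, 0.0]]                      # node 3

# Modality-specific projections place all nodes in the same space.
h0 = (project(text_feats, random_weight(3, HIDDEN, rng))
      + project(image_feats, random_weight(5, HIDDEN, rng))
      + project(struct_feats, random_weight(2, HIDDEN, rng)))

# Assumed cross-modal edges: both text nodes link to the image node,
# and the image node links to the structured-feature node.
edges = [(0, 2), (1, 2), (2, 3)]

h1 = message_pass(h0, edges)  # each node now mixes in its neighbours' features
```

After one round, the image node's representation depends on both text nodes and the structured node, which is the kind of cross-modal interaction that concatenating per-item features does not express. A trained model would learn the projection weights and typically stack several such layers.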
Sources
- Kipf, T. N., & Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. International Conference on Learning Representations (ICLR).
- Zhang, Z., Lin, H., & Zhao, X. (2020). Multimodal Graph Neural Network for Knowledge-Based Visual Question Answering. Information Processing & Management, 57(6), 102382. DOI: 10.1016/j.ipm.2020.102382