ScholarGate

Multimodal Natural Language Processing: Vision-Language Understanding

Multimodal Natural Language Processing (Multimodal NLP) is a family of natural language processing systems that combine text with one or more additional data types (typically images, but also audio and video) to perform understanding and generation tasks such as visual question answering, image captioning, and multimodal sentiment recognition. The field took its modern form with CLIP (Radford et al., 2021) and has since advanced through architectures such as BLIP-2 (Li et al., 2023), which connect frozen image encoders to large language models.
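To make the idea of joining images and text in one embedding space concrete, the following is a minimal sketch of a CLIP-style symmetric contrastive objective in NumPy. It is not the implementation from Radford et al. (2021): the function names (`l2_normalize`, `log_softmax`, `clip_style_loss`), the random toy embeddings standing in for real image and text encoders, and the fixed temperature of 0.07 are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each vector to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norms, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an image-by-text similarity matrix.

    Row i of image_emb and row i of text_emb are assumed to be a matched pair.
    The temperature is a fixed illustrative value (hypothetical choice here).
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature          # shape: (n_images, n_texts)
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i), logits


# Toy data: each "image" embedding is a slightly perturbed copy of its caption
# embedding, standing in for the outputs of real (possibly frozen) encoders.
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(4, 8))
image_emb = text_emb + 0.01 * rng.normal(size=(4, 8))

loss, logits = clip_style_loss(image_emb, text_emb)
retrieved = logits.argmax(axis=1)  # best-matching caption index for each image
print("loss:", float(loss))
print("retrieved caption per image:", retrieved.tolist())
```

Each image is matched to the caption with the highest cosine similarity, which is the same mechanism that lets a jointly trained model perform zero-shot retrieval or classification. Architectures that keep the image encoder frozen, as described for BLIP-2, would reuse such precomputed image features and train only the component that connects them to the language model.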

Sources

  1. Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning (ICML), 8748–8763.
  2. Li, J., Li, D., Savarese, S., & Hoi, S. (2023). BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. Proceedings of the 40th International Conference on Machine Learning (ICML), 19730–19742.

How to cite this page

ScholarGate. (2026, June 1). Multimodal Natural Language Processing. ScholarGate. https://scholargate.app/sw/text-mining/multimodal-nlp

ScholarGate. Multimodal NLP (Multimodal Natural Language Processing). Retrieved 2026-06-15 from https://scholargate.app/sw/text-mining/multimodal-nlp · Dataset: https://doi.org/10.5281/zenodo.20539026