Multimodal Instance Segmentation

Multimodal instance segmentation extends classical instance segmentation — which assigns a per-pixel mask and a class label to every individual object in an image — by incorporating complementary sensor streams such as depth maps, LiDAR point clouds, or infrared frames. Fusing these modalities helps the model handle ambiguous appearances, low light, and occlusion that trip up RGB-only systems.
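The page does not say how the modalities are fused, so the following is only a minimal sketch of one common option, late fusion. It assumes an RGB branch and a depth branch have each already produced a per-pixel probability map for the same candidate instance, and it combines them by weighted average before thresholding. The names `fuse_mask_probs`, `binarize`, and `depth_weight` are hypothetical, and the probability values are made-up toy numbers.

```python
from typing import List

Grid = List[List[float]]


def fuse_mask_probs(rgb_probs: Grid, depth_probs: Grid, depth_weight: float = 0.5) -> Grid:
    """Late fusion (assumed scheme): weighted average of two per-pixel
    probability maps predicted for the same instance."""
    if not 0.0 <= depth_weight <= 1.0:
        raise ValueError("depth_weight must lie in [0, 1]")
    if len(rgb_probs) != len(depth_probs) or any(
        len(r) != len(d) for r, d in zip(rgb_probs, depth_probs)
    ):
        raise ValueError("probability maps must have the same shape")
    return [
        [(1.0 - depth_weight) * r + depth_weight * d for r, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_probs, depth_probs)
    ]


def binarize(probs: Grid, threshold: float = 0.5) -> List[List[int]]:
    """Turn a probability map into a binary instance mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


# Toy 2x2 example: the top-right pixel is poorly lit, so the RGB branch is
# unsure about it (0.4), while the depth branch still sees the object (0.9).
rgb_probs = [[0.9, 0.4], [0.2, 0.1]]
depth_probs = [[0.8, 0.9], [0.3, 0.1]]

rgb_only_mask = binarize(rgb_probs)
fused_mask = binarize(fuse_mask_probs(rgb_probs, depth_probs))
```

In this toy case the RGB-only mask misses the dim pixel, while the fused mask recovers it. Other designs, such as concatenating depth as an extra input channel (early fusion) or merging intermediate feature maps, are equally plausible; nothing here should be read as the method described by the source.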

Sources

  1. He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2961–2969. DOI: 10.1109/ICCV.2017.322
  2. Instance segmentation. Wikipedia.

ScholarGate. Multimodal Instance Segmentation (Multi-sensor Deep Mask Prediction). Retrieved 2026-06-04 from https://scholargate.app/tr/deep-learning/multimodal-instance-segmentation