Explainable Sentence Embeddings
Explainable sentence embeddings combine dense sentence representation learning with post-hoc or intrinsic interpretability tools — such as probing classifiers, LIME, SHAP, or attention attribution — to reveal what linguistic and semantic information is encoded in a sentence vector and why a downstream model makes a given prediction. The goal is to retain the representational power of modern encoders while making their behavior auditable.
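As a rough illustration of the probing-classifier idea mentioned above, the sketch below trains a small logistic-regression probe on sentence vectors and compares it with a shuffled-label control. Everything in it is an assumption made for illustration: the vectors are synthetic stand-ins rather than the output of a real encoder, the binary "property" is deliberately written into dimension 0, and the function names (`make_synthetic_embeddings`, `train_probe`, `run_probe`) are hypothetical. In practice the vectors would come from a trained sentence encoder and the labels from an annotated linguistic property.

```python
import math
import random


def make_synthetic_embeddings(n, dim, rng):
    """Synthetic stand-in for encoder output (assumption, not real embeddings).

    A binary property is encoded by shifting dimension 0; all other
    dimensions are pure Gaussian noise.
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += 3.0 if label == 1 else -3.0
        data.append((vec, label))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, vec):
    return sum(wi * xi for wi, xi in zip(w, vec)) + b


def train_probe(data, dim, epochs=100, lr=0.1):
    """Fit a linear probe (logistic regression) with full-batch gradient descent."""
    w, b = [0.0] * dim, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for vec, label in data:
            err = sigmoid(score(w, b, vec)) - label
            for i, xi in enumerate(vec):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, data):
    correct = sum((score(w, b, vec) > 0) == (label == 1) for vec, label in data)
    return correct / len(data)


def run_probe(seed=0, n=400, dim=8, n_train=300):
    rng = random.Random(seed)
    data = make_synthetic_embeddings(n, dim, rng)

    # Probe: can a linear classifier read the property off held-out vectors?
    w, b = train_probe(data[:n_train], dim)
    probe_acc = accuracy(w, b, data[n_train:])

    # Control: same vectors, shuffled labels. A probe that still scored well
    # here would be memorizing rather than reading encoded information.
    labels = [label for _, label in data]
    rng.shuffle(labels)
    control = [(vec, lab) for (vec, _), lab in zip(data, labels)]
    cw, cb = train_probe(control[:n_train], dim)
    control_acc = accuracy(cw, cb, control[n_train:])

    return probe_acc, control_acc, w


if __name__ == "__main__":
    probe_acc, control_acc, weights = run_probe()
    print(f"probe accuracy:   {probe_acc:.2f}")
    print(f"control accuracy: {control_acc:.2f}")
    print("largest-weight dimension:", max(range(len(weights)), key=lambda i: abs(weights[i])))
```

A large gap between probe and control accuracy suggests the property is linearly recoverable from the vectors. Inspecting the probe weights gives a coarse view of which dimensions carry it, which is one simple route to the auditability the summary describes.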
Sources
- Conneau, A., Kruszewski, G., Lample, G., Barrault, L., & Baroni, M. (2018). What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties. In Proceedings of ACL 2018, pp. 2126–2136.
- Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why Should I Trust You?": Explaining the predictions of any classifier. In Proceedings of KDD 2016, pp. 1135–1144. DOI: 10.1145/2939672.2939778