
Language Identification (LID)

Language identification is a natural language processing task that automatically detects the language in which a text is written. Building on off-the-shelf tools such as langid.py (Lui & Baldwin, 2012) and the efficient classifiers of Joulin et al. (2017), it is widely used to preprocess and filter multilingual datasets.
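To make the task concrete, here is a minimal sketch of a language identifier and a dataset filter built on it. It is not the langid.py or fastText implementation: it swaps in a small character-trigram naive Bayes model with add-one smoothing. The training sentences and the names `TRAIN`, `char_ngrams`, `train`, `identify`, and `filter_by_language` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy training data; a real system would train on far larger corpora.
TRAIN = {
    "en": "the quick brown fox jumps over the lazy dog and this is a simple "
          "english sentence about the weather",
    "de": "der schnelle braune fuchs springt über den faulen hund und das ist "
          "ein einfacher deutscher satz über das wetter",
    "fr": "le renard brun rapide saute par-dessus le chien paresseux et ceci "
          "est une phrase française simple sur le temps",
}


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples, n=3):
    """Count n-grams per language; also return the size of the shared vocabulary."""
    models = {}
    for lang, text in samples.items():
        counts = Counter(char_ngrams(text, n))
        models[lang] = (counts, sum(counts.values()))
    vocab = set()
    for counts, _ in models.values():
        vocab.update(counts)
    return models, len(vocab)


def identify(text, models, vocab_size, n=3):
    """Pick the language with the highest smoothed log-likelihood for the text."""
    scores = {}
    for lang, (counts, total) in models.items():
        scores[lang] = sum(
            math.log((counts[gram] + 1) / (total + vocab_size))
            for gram in char_ngrams(text, n)
        )
    return max(scores, key=scores.get)


def filter_by_language(lines, target, models, vocab_size):
    """Keep only the lines identified as the target language."""
    return [line for line in lines if identify(line, models, vocab_size) == target]


MODELS, VOCAB_SIZE = train(TRAIN)

corpus = ["das ist ein hund", "the dog is lazy", "le chien est paresseux"]
german_only = filter_by_language(corpus, "de", MODELS, VOCAB_SIZE)
```

With so little training text the model is only reliable on inputs that resemble its training sentences. The filtering step shows the preprocessing use described above: each line is labelled, and only lines in the target language are kept.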



Sources

Lui, M. & Baldwin, T. (2012). langid.py: An Off-the-shelf Language Identification Tool. Proceedings of the ACL 2012 System Demonstrations.
Joulin, A., Grave, E., Bojanowski, P. & Mikolov, T. (2017). Bag of Tricks for Efficient Text Classification. Proceedings of EACL 2017.

How to cite this page

ScholarGate. (2026, June 1). Language Identification (LID). ScholarGate. https://scholargate.app/de/text-mining/language-identification

Related methods (Text Mining)

N-gram Language Model
Sentiment Analysis
Spelling and Grammar Checking
Text Classification

Referenced by

Morphological Analysis, Text Segmentation
