Multilingual Text Summarization

Also known as: cross-lingual summarization, multilingual abstractive summarization, multilingual extractive summarization, multilingual seq2seq summarization

Multilingual text summarization applies pre-trained multilingual encoder-decoder models — such as mT5 or mBART — to generate concise summaries of documents written in many languages, either within the same language (monolingual) or across languages (cross-lingual). Fine-tuning these models on multilingual summarization benchmarks like XL-Sum enables coverage of dozens of languages with a single model.

When to use it

Use multilingual text summarization when you need to condense documents in multiple languages into shorter summaries (for example, in news aggregation, multilingual literature review, legal document processing, or cross-lingual information retrieval) and you have access to a pre-trained multilingual model and sufficient fine-tuning data for the target languages. It is most valuable when a single system must cover many languages instead of separate monolingual models. Do not use it in place of extraction-based baselines if you lack fine-tuning data for the target language. Avoid it also where factual faithfulness is mission-critical unless you add a hallucination-detection layer, because abstractive models can generate plausible but factually incorrect summaries.
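To make the idea of a hallucination-detection layer concrete, here is a minimal sketch of one very narrow check: flagging numbers that appear in a summary but nowhere in the source. The function name and the regular expression are illustrative assumptions, not part of any library, and a real faithfulness layer would need far more than this (entities, negation, paraphrase).

```python
import re

# Assumption: a "number" is a run of digits, optionally with . or , separators.
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*")


def unsupported_numbers(source: str, summary: str) -> list[str]:
    """Return numbers found in the summary that never occur in the source.

    Hypothetical helper for illustration only. It compares digit strings
    literally, so "1,000" and "1000" are treated as different numbers.
    """
    source_numbers = set(NUMBER_PATTERN.findall(source))
    flagged = []
    for number in NUMBER_PATTERN.findall(summary):
        if number not in source_numbers and number not in flagged:
            flagged.append(number)
    return flagged


source = "The company hired 120 people in 2021 and opened 3 offices."
summary = "The company hired 150 people in 2021."
print(unsupported_numbers(source, summary))
```

A non-empty result is only a signal to route the summary to review. An empty result does not show that the summary is faithful.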

Strengths & limitations

Strengths

A single model covers many languages, reducing engineering overhead compared to maintaining per-language pipelines.
Transfer learning from high-resource to low-resource languages lets the model perform reasonably well even for languages with limited labeled summarization data.
Abstractive generation produces fluent, coherent summaries rather than mere sentence extraction.
Cross-lingual summarization (different source and target languages) is possible within the same framework.
Large multilingual pre-trained checkpoints (mT5, mBART) are publicly available, reducing training cost.

Limitations

Performance degrades substantially for very low-resource languages that were under-represented in the pre-training corpus.
Abstractive models can hallucinate — generating plausible-sounding facts not present in the source document.
Fine-tuning on multilingual summarization data requires language-balanced sampling strategies that are non-trivial to implement well.
ROUGE measures surface n-gram overlap, so its scores can correlate poorly with human judgments of summary quality, especially for morphologically rich languages.
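One common way to approach the language-balanced sampling mentioned above is exponent-smoothed (temperature-based) sampling, where language i with n_i training examples is drawn with probability proportional to n_i^alpha. The sketch below assumes this scheme. The function name, the example counts, and the default alpha of 0.5 are illustrative assumptions and not values taken from this page.

```python
def sampling_probabilities(example_counts: dict[str, int], alpha: float = 0.5) -> dict[str, float]:
    """Map per-language example counts to sampling probabilities.

    alpha = 1.0 samples proportionally to dataset size,
    alpha = 0.0 samples every language uniformly,
    values in between up-weight low-resource languages.
    """
    if not example_counts:
        raise ValueError("example_counts must not be empty")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if any(count <= 0 for count in example_counts.values()):
        raise ValueError("every language needs at least one example")

    weights = {lang: count ** alpha for lang, count in example_counts.items()}
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}


# Hypothetical counts: one high-resource and one low-resource language.
counts = {"en": 10000, "sw": 100}
for alpha in (1.0, 0.5, 0.0):
    print(alpha, sampling_probabilities(counts, alpha))
```

With these counts, lowering alpha moves probability mass from the larger language to the smaller one. How far to go is a tuning decision: too much up-weighting repeats the small dataset many times per epoch and risks overfitting.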

Frequently asked

Which pre-trained model should I start with for multilingual summarization?

mT5-base or mT5-large are strong general starting points because mT5 was pre-trained on 101 languages. mBART-50, which covers 50 languages, is also well suited to seq2seq generation tasks. For languages well covered by XLM-R, pairing the XLM-R encoder with a multilingual decoder is another option.
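As a rough sketch of what running such a checkpoint looks like, the function below loads a sequence-to-sequence model through the Hugging Face transformers library and generates one summary. It assumes transformers, PyTorch, and sentencepiece are installed, and that model_name points to a checkpoint already fine-tuned for summarization, since a pre-trained-only mT5 checkpoint is not expected to summarize out of the box. The generation settings and length limits are illustrative assumptions, not recommendations.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse newlines and repeated spaces before tokenization."""
    return " ".join(text.split())


def summarize(text: str, model_name: str,
              max_input_tokens: int = 512, max_summary_tokens: int = 84) -> str:
    """Generate one summary with a fine-tuned multilingual seq2seq checkpoint.

    model_name is supplied by the caller: a local path or hub identifier of a
    summarization checkpoint. The numeric defaults are assumptions.
    """
    # Imported lazily so the helper above can be used without these packages.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(
        normalize_whitespace(text),
        return_tensors="pt",
        truncation=True,            # drop tokens beyond max_input_tokens
        max_length=max_input_tokens,
    )
    output_ids = model.generate(
        **inputs,
        max_length=max_summary_tokens,
        num_beams=4,                # beam search instead of greedy decoding
        no_repeat_ngram_size=2,     # discourage repeated bigrams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (downloads or loads the checkpoint, so it is not executed here):
# print(summarize(article_text, "path/to/your-finetuned-checkpoint"))
```

Loading the model inside the function keeps the sketch short. In a service you would load the tokenizer and model once and reuse them across requests.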

How much fine-tuning data do I need per language?

Thousands to tens of thousands of document-summary pairs per language are ideal. For low-resource languages, cross-lingual transfer from related high-resource languages can compensate for limited labeled data, but expect a performance gap compared to high-resource settings.

How do I evaluate summaries in languages I do not speak?

ROUGE scores provide an automatic proxy, but they can be misleading for morphologically rich languages. Machine translation of both generated and reference summaries into English before ROUGE scoring is a common workaround. Hiring native-speaker annotators for a sample of outputs is the most reliable approach.
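To see why overlap-based scores can mislead, it helps to look at what a unigram overlap metric computes. The following is a simplified ROUGE-1 F1 sketch with whitespace tokenization. It is an illustration under that assumption and not the official ROUGE toolkit, which adds options such as stemming.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
# Finnish "talo" (house) and "talossa" (in the house) share no whole token.
print(rouge1_f1("talossa", "talo"))
```

The second call scores zero even though both words refer to the same thing, because inflected forms are distinct tokens. This is the kind of penalty that makes raw overlap scores harder to trust for morphologically rich languages.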

Can multilingual summarization models be used zero-shot on unseen languages?

Partially. If the language uses a script and vocabulary covered by the model's tokenizer and pre-training corpus, zero-shot transfer is possible, but quality is typically much lower than for languages seen during fine-tuning. Languages with very different scripts usually require at least some in-language fine-tuning data.

What is the difference between multilingual and cross-lingual summarization?

Multilingual summarization uses one model for many languages and produces each summary in the same language as its input document. Cross-lingual summarization produces a summary in a different language from the input, for example summarizing a French article in English. Cross-lingual summarization is technically harder because the model must translate and compress at the same time.

Sources

Xue, L., Constant, N., Roberts, A., Kale, M., Al-Rfou, R., Siddhant, A., Barua, A., & Raffel, C. (2021). mT5: A Massively Multilingual Pre-Trained Text-to-Text Transformer. Proceedings of NAACL-HLT 2021, pp. 483–498. Association for Computational Linguistics.
Hasan, T., Bhattacharjee, A., Islam, M. S., Mubasshir, K., Li, Y.-F., Kang, Y.-B., Rahman, M. S., & Shahriyar, R. (2021). XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages. Findings of ACL-IJCNLP 2021, pp. 4693–4703. Association for Computational Linguistics.

How to cite this page

ScholarGate. (2026, June 3). Multilingual Text Summarization. ScholarGate. https://scholargate.app/en/deep-learning/multilingual-text-summarization
