Automatic Text Evaluation: BLEU, ROUGE, BERTScore
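Automatic text evaluation is a family of reference-based metrics that measure the quality of machine-generated text (such as translations, summaries, or natural language generation (NLG) output) by comparing it with one or more human-written reference texts. The field was pioneered by Papineni et al. in 2002 with BLEU and has since grown to include n-gram overlap metrics (BLEU, ROUGE) and semantics-aware metrics (BERTScore, MoverScore); the latter capture meaning beyond surface word matching.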
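To make the idea of n-gram overlap concrete, the following is a minimal sketch in plain Python, not a reference implementation of any metric. It assumes whitespace tokenization, and the helper names (`ngrams`, `clipped_precision`, `ngram_recall`) are hypothetical. It shows only the core counts behind the two metric families: a BLEU-style clipped n-gram precision and a ROUGE-N-style n-gram recall. Full BLEU additionally combines several n-gram orders with a geometric mean and applies a brevity penalty, both of which are omitted here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, references, n=1):
    """BLEU-style modified precision: each candidate n-gram count is
    clipped to the largest count seen in any single reference."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


def ngram_recall(candidate, reference, n=1):
    """ROUGE-N-style recall: share of reference n-grams that also
    appear in the candidate (counts clipped to the candidate's)."""
    ref = ngrams(reference, n)
    if not ref:
        return 0.0
    cand = ngrams(candidate, n)
    matched = sum(min(count, cand[gram]) for gram, count in ref.items())
    return matched / sum(ref.values())


# Illustrative data (made up for this sketch, whitespace-tokenized).
candidate = "the the the cat".split()
reference = "the cat sat on the mat".split()

precision = clipped_precision(candidate, [reference], n=1)  # 3 of 4 candidate unigrams count
recall = ngram_recall(candidate, reference, n=1)            # 3 of 6 reference unigrams are covered
print(precision, recall)
```

Clipping is what stops the repeated "the" from inflating precision: the candidate uses it three times, but the reference supports only two. Neither count rewards a paraphrase that uses different words, which is the gap that embedding-based metrics such as BERTScore aim to address.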
Sources
- Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2002). BLEU: A Method for Automatic Evaluation of Machine Translation. Proceedings of ACL 2002.
- Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., & Artzi, Y. (2020). BERTScore: Evaluating Text Generation with BERT. Proceedings of ICLR 2020.
How to cite this page
ScholarGate. (2026, June 1). Automatic Text Evaluation (BLEU, ROUGE, BERTScore). ScholarGate. https://scholargate.app/zh/text-mining/automatic-text-evaluation