Knowledge Distillation

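Knowledge distillation is a model compression technique, introduced by Geoffrey Hinton and colleagues in 2015, that trains a small student model on the soft-label outputs (temperature-softened class probabilities) of a large teacher model. Distilled models such as DistilBERT and TinyBERT reach roughly 97% of the larger model's performance while running considerably faster; DistilBERT, for instance, is reported to retain 97% of BERT's language-understanding performance while being 40% smaller and 60% faster (Sanh et al., 2019).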
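
To make the soft-label idea concrete, the following is a minimal sketch of a per-example distillation loss, using only the Python standard library. It assumes a common formulation: a KL-divergence term between temperature-softened teacher and student distributions (scaled by T², as suggested in Hinton et al., 2015) combined with an ordinary cross-entropy term on the true label. The function names, the `alpha` weighting, and the default values are illustrative assumptions, not taken from this page or from any particular library.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical helper: weighted sum of a soft-target term and a hard-label term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: KL(teacher || student) on the softened distributions.
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft *= temperature ** 2  # keeps gradient magnitudes comparable across temperatures
    # Hard term: cross-entropy of the student (at temperature 1) with the true label.
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * soft + (1 - alpha) * hard

# Example logits are made up for illustration.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.2, 0.3]
print(distillation_loss(student, teacher, true_label=0))
```

The KL term differs from the soft-target cross-entropy only by the teacher's entropy, which is constant with respect to the student, so either choice yields the same gradients. In practice this loss would be averaged over a batch and minimized with an automatic-differentiation framework; the sketch only shows how the quantity is computed.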

Sources

  1. Hinton, G., Vinyals, O. & Dean, J. (2015). Distilling the Knowledge in a Neural Network. NeurIPS Deep Learning Workshop. arXiv:1503.02531.
  2. Sanh, V., Debut, L., Chaumond, J. & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv:1910.01108.

How to cite this page

ScholarGate. (2026, June 1). Knowledge Distillation (Teacher–Student Model Compression). ScholarGate. https://scholargate.app/zh/deep-learning/knowledge-distillation

Cited as

ScholarGate. Knowledge Distillation (Teacher–Student Model Compression). Retrieved 2026-06-15 from https://scholargate.app/zh/deep-learning/knowledge-distillation · Dataset: https://doi.org/10.5281/zenodo.20539026