Machine learning

Knowledge Distillation

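Knowledge distillation is a model compression technique introduced by Geoffrey Hinton and colleagues in 2015. It trains a small student model on the soft-label outputs (the full predicted class probabilities) of a large teacher model. Distilled models such as DistilBERT and TinyBERT reach roughly 97% of the larger model's performance while running considerably faster.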
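To make the soft-label idea concrete, the following is a minimal sketch of a distillation loss for a single example, in plain Python. It assumes the common formulation in which the teacher and student logits are softened with a shared temperature, the soft-target term is scaled by the squared temperature, and the result is blended with ordinary cross-entropy on the true label. The function names, the `alpha` weighting, and the example logits are illustrative assumptions and do not come from this page.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target term with hard-label cross-entropy.

    `alpha` (an assumed hyperparameter) weights the soft-target term;
    `1 - alpha` weights the hard-label term.
    """
    # Soft targets: both models are softened with the same temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # Scaling by T^2 keeps the soft-term gradients comparable across temperatures.
    soft_term = kl_divergence(p_teacher, p_student_soft) * temperature ** 2

    # Hard targets: standard cross-entropy at temperature 1.
    p_student = softmax(student_logits, 1.0)
    hard_term = -math.log(p_student[true_label])

    return alpha * soft_term + (1.0 - alpha) * hard_term


if __name__ == "__main__":
    # Hypothetical logits for a three-class problem.
    teacher = [4.0, 1.5, 0.2]
    student = [2.0, 1.0, 0.5]
    print(distillation_loss(student, teacher, true_label=0))
```

A higher temperature flattens the teacher distribution, which exposes more of the relative similarity between the incorrect classes; that extra signal is what the student learns from beyond the hard label.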

Related methods

Longformer / BigBird, Mixture of Experts, Random Forest, Visual Contrastive Learning, XGBoost, Capsule Networks, Ensembles, Self-Supervised Learning, Federated Learning, MobileNet: Efficient Convolutional Neural Networks for Mobile Vision, Multi-Task Learning

Sources

Hinton, G., Vinyals, O. & Dean, J. (2015). Distilling the Knowledge in a Neural Network. NeurIPS Deep Learning Workshop. arXiv:1503.02531.
Sanh, V., Debut, L., Chaumond, J. & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv:1910.01108.

How to cite this page

ScholarGate. (2026, June 1). Knowledge Distillation (Teacher–Student Model Compression). ScholarGate. https://scholargate.app/zh/deep-learning/knowledge-distillation

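Cited in

Capsule Networks, Ensembles, Self-Supervised Learning, Federated Learning, MobileNet: Efficient Convolutional Neural Networks for Mobile Vision, Multi-Task Learning, Neural Architecture Search, Self-Supervision, Image Classification, Weak Supervision, Vision Transformer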
