Method Comparison
View the methods you selected side by side; rows that differ are highlighted.
| | Multitask Learning | Knowledge Distillation | Transfer Learning |
|---|---|---|---|
| Field ≠ | Deep learning | Deep learning | Machine learning |
| Method family | Machine learning | Machine learning | Machine learning |
| Year of origin ≠ | 1997 | 2015 | 2010 (formalized); 1990s (early roots) |
| Proposed by ≠ | Rich Caruana | Hinton, G., Vinyals, O. & Dean, J. | Pan, S. J. & Yang, Q. (survey); Bengio, Y. (deep learning framing) |
| Type ≠ | Inductive transfer method | Neural network compression (teacher–student) | Learning paradigm |
| Seminal work ≠ | Caruana, R. (1997). Multitask learning. Machine Learning, 28(1), 41–75. | Hinton, G., Vinyals, O. & Dean, J. (2015). Distilling the Knowledge in a Neural Network. NeurIPS Deep Learning Workshop. | Pan, S. J., & Yang, Q. (2010). A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359. |
| Aliases | MTL, Joint Learning, Shared Representation Learning, Çok Görevli Öğrenme (Turkish) | Bilgi Damıtma (Turkish), teacher-student distillation, model distillation | TL, domain adaptation, fine-tuning, pre-trained model adaptation |
| Related entries ≠ | 3 | 5 | 3 |
| Summary ≠ | Multitask Learning (MTL) is a machine learning paradigm in which a model is trained simultaneously on multiple related tasks, sharing representations across them to improve generalization. Introduced formally by Rich Caruana in 1997, MTL draws on the intuition that auxiliary tasks act as inductive bias, providing extra supervision signals that help the shared layers learn richer, more robust feature representations than single-task training would yield. | Knowledge Distillation is a model-compression technique, introduced by Geoffrey Hinton and colleagues in 2015, that trains a small student model using the soft-label outputs of a large teacher model. Distilled models such as DistilBERT and TinyBERT reach roughly 97% of the larger model's performance while running far faster. | Transfer learning is a machine learning paradigm in which knowledge gained from training a model on a source task or domain is reused to improve learning on a different but related target task or domain. It is especially powerful when labeled data for the target task is scarce, and it underlies most modern deep learning applications in computer vision, natural language processing, and beyond. |
| ScholarGate dataset ↗ |
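
The Knowledge Distillation summary in the table describes training a small student on the soft-label outputs of a large teacher. A minimal sketch of that objective is shown below. It assumes the common formulation of a temperature-softened cross-entropy against the teacher, scaled by the squared temperature, mixed with the ordinary hard-label loss. The function names, the `temperature` and `alpha` values, and the example logits are all illustrative assumptions and do not come from the table.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-label loss (teacher vs. student) and a hard-label loss.

    alpha and temperature are hypothetical hyperparameters chosen for illustration.
    """
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # Cross-entropy of the student's softened output against the teacher's soft labels,
    # multiplied by T^2 so its gradient scale stays comparable to the hard-label term.
    soft_loss = -sum(t * math.log(s) for t, s in zip(teacher_soft, student_soft))
    soft_loss *= temperature ** 2
    # Standard cross-entropy against the ground-truth class at temperature 1.
    hard_loss = -math.log(softmax(student_logits)[true_label])
    return alpha * soft_loss + (1 - alpha) * hard_loss


# Illustrative logits for a 3-class problem (made-up numbers).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # disagrees with the teacher

print(distillation_loss(close_student, teacher, true_label=0))
print(distillation_loss(far_student, teacher, true_label=0))
```

A student whose logits track the teacher's receives the lower loss. In a real training loop the same quantity would be computed per batch with an automatic-differentiation framework and minimized with respect to the student's parameters only.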