Method Comparison
Review your selected methods side by side; rows whose values differ between methods are highlighted (marked ≠).
| | VGGNet (Very Deep Convolutional Networks) | AlexNet | MobileNet: Efficient Convolutional Neural Networks for Mobile Vision | ResNet (Residual Network) |
|---|---|---|---|---|
| Domain | Deep learning | Deep learning | Deep learning | Deep learning |
| Family | Machine learning | Machine learning | Machine learning | Machine learning |
| Year introduced ≠ | 2014 | 2012 | 2017 | 2016 |
| Authors ≠ | Simonyan, K. & Zisserman, A. (Visual Geometry Group, Oxford) | Krizhevsky, A.; Sutskever, I.; Hinton, G. E. | Andrew Howard et al. (Google) | He, K.; Zhang, X.; Ren, S.; Sun, J. |
| Type ≠ | Deep Convolutional Neural Network (image classification) | Deep Convolutional Neural Network (CNN) | Lightweight CNN architecture | Deep Convolutional Neural Network with skip connections |
| Foundational source ≠ | Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556 [cs.CV]. Published at ICLR 2015. | Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems, 25, 1097–1105. (Republished: Communications of the ACM, 60(6), 84–90, 2017.) | Howard, A. G., et al. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint. | He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. |
| Other names ≠ | VGG, VGG-16, VGG-19, Very Deep ConvNet | AlexNet, Krizhevsky net, SuperVision CNN, ImageNet CNN 2012 | MobileNets, Depthwise Separable CNN, Efficient Mobile Vision Network, Mobil Evrişimli Sinir Ağı | ResNet, Residual Network, Deep Residual Learning, ResNet-50 |
| Related ≠ | 4 | 3 | 2 | 4 |
| Summary ≠ | VGGNet is a deep convolutional neural network architecture introduced by Karen Simonyan and Andrew Zisserman of the Visual Geometry Group, Oxford, in 2014 (published at ICLR 2015). It showed that increasing network depth, achieved by stacking small 3×3 convolutional filters, substantially improves image-classification accuracy, and its two canonical variants (VGG-16 and VGG-19) became widely used reference architectures for CNN design throughout the mid-2010s. | AlexNet is a deep convolutional neural network (CNN) introduced by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012. It won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012) with a top-5 error rate of 15.3%, beating the runner-up by more than 10 percentage points and reigniting broad interest in deep learning. The architecture introduced or popularised several techniques (ReLU activations, dropout regularisation, and multi-GPU training) that became standard practice across the field. | MobileNet is a family of lightweight convolutional neural network architectures introduced by Howard et al. at Google in 2017. It is designed to run image classification, object detection, and other vision tasks directly on mobile devices and embedded systems with limited computational budgets. By replacing standard convolutions with depthwise separable convolutions and exposing two global hyperparameters (a width multiplier and a resolution multiplier), MobileNet substantially reduces multiply-add operations and model size while retaining competitive accuracy. | ResNet (Residual Network) is a deep convolutional neural network architecture introduced by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun and published at CVPR 2016. By inserting shortcut (skip) connections that carry the input of a block directly to its output, so that the block learns a residual correction rather than a full mapping, ResNet made it possible to train networks with hundreds or even more than a thousand layers, mitigating the degradation problem that had previously made very deep networks impractical to train. It won the ILSVRC 2015 image classification competition with a top-5 error of 3.57% and remains one of the most widely used backbone architectures in computer vision. |
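
The design ideas named in the summaries above (stacked 3×3 filters in VGGNet, depthwise separable convolutions in MobileNet, and shortcut connections in ResNet) can be illustrated with a little arithmetic. The following is a minimal sketch, not code from any of the cited papers: the function names, the channel count of 256, and the toy input vector are all hypothetical, and biases are ignored when counting weights.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in one standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """One kernel x kernel filter per input channel, then a 1x1 pointwise conv."""
    return kernel * kernel * c_in + c_in * c_out


def residual_block(x, f):
    """Return f(x) + x element-wise: f learns a correction, the shortcut carries x."""
    return [fx + xi for fx, xi in zip(f(x), x)]


channels = 256  # hypothetical channel count, chosen only for illustration

# VGG-style: three stacked 3x3 layers cover the same 7x7 receptive field
# as a single 7x7 layer, with fewer weights.
stacked_3x3 = 3 * conv_params(3, channels, channels)
single_7x7 = conv_params(7, channels, channels)

# MobileNet-style: depthwise separable versus standard 3x3 convolution.
standard = conv_params(3, channels, channels)
separable = depthwise_separable_params(3, channels, channels)

# ResNet-style: when the learned correction is zero, the block is the identity.
identity_out = residual_block([1.0, -2.0, 3.0], lambda v: [0.0 for _ in v])

print(stacked_3x3, single_7x7)        # 1769472 3211264
print(standard, separable)            # 589824 67840
print(identity_out)                   # [1.0, -2.0, 3.0]
```

With these assumed sizes, the separable layer uses roughly 8.7 times fewer weights than the standard 3×3 layer, and the zero-correction case shows why a residual block can fall back to passing its input through unchanged.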