PyTorch Implementation Series: ResNet

5 min read · Dec 4, 2020

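ResNet was proposed by He et al. (2015, Microsoft) in Deep Residual Learning for Image Recognition and has since become one of the most widely used convolutional backbones, a sign of its considerable influence. The breakthrough comes from the residual structure: compared with a plain convolutional network, residual connections let gradients flow smoothly through deeper networks, reducing the numerical instability caused by vanishing or exploding gradients.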

Dataset

Animals-10, a dataset of more than 28,000 photos covering ten kinds of animals.

Network concept

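The paper's starting point is that a neural network can asymptotically approximate any function H, but learning an unknown function directly may be too hard. It is easier to learn the residual function F, where F(x) = H(x) - x, which gives the residual structure used today: H(x) = F(x) + x.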
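The relation H(x) = F(x) + x can be sketched in a few lines of PyTorch. This is only an illustration, not the code from the linked repository: the class name `Residual` and the choice of a single 3x3 convolution as F are assumptions made for the example.

```python
import torch
from torch import nn


class Residual(nn.Module):
    """Wrap a branch F so that the module computes H(x) = F(x) + x."""

    def __init__(self, branch: nn.Module):
        super().__init__()
        self.branch = branch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.branch(x) + x


# Hypothetical branch F: a 3x3 convolution that preserves the input shape.
branch = nn.Conv2d(8, 8, kernel_size=3, padding=1)
# With F initialised to zero, H starts out as the identity mapping.
nn.init.zeros_(branch.weight)
nn.init.zeros_(branch.bias)

block = Residual(branch)
x = torch.randn(2, 8, 16, 16)
y = block(x)
```

When F outputs zero the block simply passes its input through, which is one way to see why stacking more residual blocks should not make the function harder to represent.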

Network architecture

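On the left is the original convolutional block; the figure shows the bottleneck structure. On the right is the residual connection, which is the key to keeping gradients flowing: even if the gradient through the convolutional branch degrades as depth increases, a gradient of reasonable magnitude can still propagate backward through the shortcut. The downsampling step on the shortcut exists to match changes in feature map size; when the size does not change, the shortcut is the identity. Many ResNet descendants, such as ResNeXt, SEResNet, SEResNeXt, and ResNeSt, mostly modify the convolutional branch on the left, improving feature extraction without altering the shortcut that protects gradient flow.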
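A minimal sketch of a bottleneck block with both kinds of shortcut might look like the following. The class name, argument names, and channel sizes are assumptions for illustration; the stride is placed on the 3x3 convolution here, whereas the original paper places it on the first 1x1 convolution.

```python
import torch
from torch import nn


class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 expand, added to a shortcut branch."""

    def __init__(self, in_ch: int, mid_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Convolutional branch (the left side of the figure).
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            # Assumption: stride sits on the 3x3 convolution.
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Shortcut branch (the right side): identity when shapes already
        # match, otherwise a strided 1x1 convolution to downsample/project.
        if stride == 1 and in_ch == out_ch:
            self.shortcut = nn.Identity()
        else:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.body(x) + self.shortcut(x))


x = torch.randn(2, 256, 56, 56)
same_size = Bottleneck(256, 64, 256)(x)            # identity shortcut
downsampled = Bottleneck(256, 128, 512, stride=2)(x)  # projection shortcut
```

Variants such as ResNeXt or SE blocks would only change `self.body` in a sketch like this, leaving the shortcut untouched.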

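Pushing to more extreme depths, He et al. (2016, Microsoft) studied identity mappings further. They showed that an identity shortcut loses the least information during learning and that the pre-activation structure helps scale depth, reaching ResNet-1001 and beyond. Some in the community unofficially call it resnet2.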

Pre-activation places both the BN and ReLU operations before the convolution.
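As a rough sketch of that ordering, a basic (non-bottleneck) pre-activation block could be written as below. The class name `PreActBlock` and the two-convolution layout are assumptions for the example; the point is only that each convolution is preceded by BN and ReLU, and that nothing is applied after the addition.

```python
import torch
from torch import nn


class PreActBlock(nn.Module):
    """Pre-activation residual block: (BN -> ReLU -> conv) twice, then add."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No ReLU after the addition, so the shortcut stays a pure identity.
        return x + self.body(x)


block = PreActBlock(16)
x = torch.randn(2, 16, 8, 8)
y = block(x)
layer_order = [type(m).__name__ for m in block.body]
```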

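ResNet-B, ResNet-C, and ResNet-D are later variants that appear in He et al. (2018, AWS), a paper collecting many training tricks for image classification. Of these, ResNet-B, which moves the stride onto the 3x3 convolution, is what torchvision's official ResNet implements. In my own implementation of ResNet-D, I changed the average pooling kernel size to 3 to handle cases where the feature maps on the two branches differ in size.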
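The size mismatch is easiest to see with an odd-sized feature map. The sketch below compares a ResNet-D style shortcut using the paper's 2x2 average pooling against one with kernel size 3. The helper name `resnet_d_shortcut` and the use of `padding=1` with the kernel size 3 pooling are assumptions; the linked repository may handle this differently.

```python
import torch
from torch import nn


def resnet_d_shortcut(in_ch: int, out_ch: int,
                      kernel_size: int = 2, padding: int = 0) -> nn.Sequential:
    """ResNet-D style shortcut: average pool with stride 2, then a 1x1 conv."""
    return nn.Sequential(
        nn.AvgPool2d(kernel_size, stride=2, padding=padding),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch),
    )


# Stand-in for the strided 3x3 convolution on the main branch.
main_branch = nn.Conv2d(64, 128, 3, stride=2, padding=1, bias=False)

x = torch.randn(2, 64, 7, 7)  # odd spatial size
main_out = main_branch(x)                                    # 7 -> 4
pool2_out = resnet_d_shortcut(64, 128)(x)                    # 7 -> 3
pool3_out = resnet_d_shortcut(64, 128, kernel_size=3, padding=1)(x)  # 7 -> 4
```

With kernel size 3, stride 2, and padding 1, the pooling follows the same output-size formula as the strided 3x3 convolution, so the two branches can be added for both even and odd inputs.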

ResNet-C replaces stage 1 of ResNet, that is, the opening convolution with kernel size 7, with three 3x3 convolutions. It does not change the residual blocks.
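For comparison, here is a sketch of the original stem next to a ResNet-C style stem. The channel widths (32, 32, 64) follow the description in the Bag of Tricks paper; the helper name `conv_bn_relu` and the variable names are assumptions for the example.

```python
import torch
from torch import nn


def conv_bn_relu(in_ch: int, out_ch: int, stride: int = 1) -> nn.Sequential:
    """3x3 convolution followed by BN and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


# Original stem: a single 7x7 convolution with stride 2, then max pooling.
original_stem = nn.Sequential(
    nn.Conv2d(3, 64, 7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(3, stride=2, padding=1),
)

# ResNet-C style stem: three 3x3 convolutions, stride 2 on the first one.
resnet_c_stem = nn.Sequential(
    conv_bn_relu(3, 32, stride=2),
    conv_bn_relu(32, 32),
    conv_bn_relu(32, 64),
    nn.MaxPool2d(3, stride=2, padding=1),
)

x = torch.randn(2, 3, 224, 224)
original_out = original_stem(x)
resnet_c_out = resnet_c_stem(x)
```

Both stems produce the same output shape, so the residual stages that follow do not need to change.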

Leftmost is ResNet50, followed to the right by ResNet-B, ResNet-C, and ResNet-D.

Evaluation

Training accuracy reached 82%. After adding Batch Normalization and ReLU, training was faster and the model size dropped by more than half.

Conclusion

This article implements a simple ResNet-50. The code is available at the link below.

gitE0Z9/classical-network-series (github.com)

References

Deep Residual Learning for Image Recognition (arxiv.org)

Identity Mappings in Deep Residual Networks (arxiv.org)

Bag of Tricks for Image Classification with Convolutional Neural Networks (arxiv.org)

Written by mz bai