PyTorch Implementation Series: EfficientNet v1 & v2

Dec 1, 2024

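EfficientNet was proposed by Tan et al. (2019, Google) in "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks". It uses compound scaling to adjust network depth, width, and input resolution together, aiming for the highest accuracy under a limited compute budget, which makes it more efficient than other models of comparable accuracy.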

Network concept

CNNs are usually scaled up for better accuracy along three dimensions: depth, width, and resolution.

compound scaling illustration

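By varying each of the three, the authors made three observations:

Depth gives diminishing returns
Width saturates quickly
Resolution gives diminishing returns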

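They therefore propose compound scaling as a way to resize the architecture. Φ is the coefficient that controls the overall scale, and the cost constraint reflects how FLOPS grow: linearly with depth and quadratically with width and resolution.

Constraining the product α·β²·γ² to about 2 means that FLOPS grow by roughly 2^Φ, which keeps the cost easy to estimate.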

compound scaling formula
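
For readers without the figure, the compound scaling rule from the EfficientNet paper can be written out as below. Here d, w, and r denote the depth, width, and resolution multipliers applied to the baseline network, and φ is the same coefficient as Φ in the text.

```latex
\begin{aligned}
\text{depth:}\quad      & d = \alpha^{\phi} \\
\text{width:}\quad      & w = \beta^{\phi} \\
\text{resolution:}\quad & r = \gamma^{\phi} \\
\text{s.t.}\quad        & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
                          \qquad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
\end{aligned}
```

Because FLOPS scale with d · w² · r², the total cost grows by (α·β²·γ²)^φ ≈ 2^φ.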

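EfficientNetV2 goes further and improves training efficiency, starting from three observations:

Training with large images is slow
Depthwise convolutions are slow in the early layers
Scaling every stage up by the same factor is sub-optimal

The authors propose training-aware NAS, which searches over model design choices to optimize the product below, where A is accuracy, S is the normalized training time, P is the number of parameters, and w and v are hyperparameters.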

training aware NAS formula
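
As a rough sketch of that objective, the search reward has the form A · S^w · P^v. The function name `nas_reward` is hypothetical, and the default exponents w = -0.07 and v = -0.05 follow the values reported in the EfficientNetV2 paper; they are not stated in this post, so treat them as assumptions.

```python
def nas_reward(accuracy: float, step_time: float, params: float,
               w: float = -0.07, v: float = -0.05) -> float:
    """Combined search reward A * S^w * P^v.

    accuracy  -- model accuracy A (e.g. 0.80)
    step_time -- normalized training time S (1.0 = reference model)
    params    -- normalized parameter count P (1.0 = reference model)
    w, v      -- negative exponents, so slower or larger models score lower
    """
    if step_time <= 0 or params <= 0:
        raise ValueError("step_time and params must be positive")
    return accuracy * step_time ** w * params ** v


# Two hypothetical candidates with equal accuracy: the slower one is penalized.
fast_candidate = nas_reward(0.80, step_time=1.0, params=1.0)
slow_candidate = nas_reward(0.80, step_time=2.0, params=1.0)
```

With small negative exponents the penalty is mild, so accuracy still dominates the ranking unless two candidates are close.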

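The other proposal is progressive learning: regularization strength should be matched to the training image size, so both are increased step by step during training.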
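
A minimal sketch of such a schedule, assuming linear interpolation between a starting and a final setting over a fixed number of training stages. The function name and the default ranges (image size 128 to 300, dropout 0.1 to 0.3) are illustrative assumptions, not values taken from this post.

```python
from typing import Tuple


def progressive_schedule(stage: int, num_stages: int,
                         size_range: Tuple[int, int] = (128, 300),
                         dropout_range: Tuple[float, float] = (0.1, 0.3)
                         ) -> Tuple[int, float]:
    """Return (image_size, dropout_rate) for one training stage.

    Both values are interpolated linearly, so early stages train on small
    images with weak regularization and later stages on large images with
    strong regularization.
    """
    if num_stages < 2:
        raise ValueError("num_stages must be at least 2")
    if not 0 <= stage < num_stages:
        raise ValueError("stage must be in [0, num_stages)")
    t = stage / (num_stages - 1)  # 0.0 at the first stage, 1.0 at the last
    size = int(round(size_range[0] + t * (size_range[1] - size_range[0])))
    dropout = dropout_range[0] + t * (dropout_range[1] - dropout_range[0])
    return size, dropout


# One (image_size, dropout) pair per stage for a four-stage run.
schedule = [progressive_schedule(i, 4) for i in range(4)]
```

Other regularizers, such as RandAugment magnitude, could be interpolated the same way.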

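Network architecture

The figure below shows the EfficientNet-B0 architecture. Overall it resembles MobileNetV2 and uses the same inverted residual bottleneck.

The coefficients are α=1.2, β=1.1, γ=1.15, which the paper finds with a small grid search and which you can also search for yourself. The B0 baseline itself comes from a neural architecture search that uses the MnasNet objective, with accuracy as the target and FLOPS as the cost.

Increasing Φ then yields B1 through B7.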
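
To make the arithmetic concrete, here is a small sketch that applies α, β, γ for a given Φ. The function name `compound_scale` is hypothetical, and the 224-pixel base resolution is assumed for B0. The released B1 to B7 models use hand-rounded multipliers, so these theoretical numbers will not match them exactly.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases from the text


def compound_scale(phi: float, base_resolution: int = 224) -> dict:
    """Theoretical multipliers for a given compound coefficient phi."""
    return {
        "depth_mult": ALPHA ** phi,                               # number of layers
        "width_mult": BETA ** phi,                                # number of channels
        "resolution": int(round(base_resolution * GAMMA ** phi)),  # input size in pixels
        # FLOPS grow linearly with depth, quadratically with width and resolution.
        "flops_mult": (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi,
    }


for phi in range(4):
    print(phi, compound_scale(phi))
```

Since α·β²·γ² is about 1.92 for these coefficients, each unit increase in Φ roughly doubles the FLOPS.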

EfficientNet-B0 (v1) architecture
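
The inverted residual bottleneck (MBConv) can be sketched in PyTorch roughly as follows. This is a simplified illustration rather than the code in the linked repository: the class names, the SE reduction of one quarter of the block's input channels, and the omission of stochastic depth are assumptions.

```python
import torch
from torch import nn


class SqueezeExcite(nn.Module):
    """Channel attention: global pool -> reduce -> expand -> sigmoid gate."""

    def __init__(self, channels: int, reduced: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Conv2d(channels, reduced, kernel_size=1)
        self.act = nn.SiLU()
        self.fc2 = nn.Conv2d(reduced, channels, kernel_size=1)
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = self.gate(self.fc2(self.act(self.fc1(self.pool(x)))))
        return x * scale


class MBConv(nn.Module):
    """1x1 expand -> depthwise kxk -> optional SE -> 1x1 project."""

    def __init__(self, in_ch: int, out_ch: int, expand_ratio: int = 6,
                 kernel_size: int = 3, stride: int = 1, use_se: bool = True):
        super().__init__()
        hidden = in_ch * expand_ratio
        # The skip connection only applies when input and output shapes match.
        self.use_residual = stride == 1 and in_ch == out_ch
        layers = []
        if expand_ratio != 1:
            layers += [nn.Conv2d(in_ch, hidden, 1, bias=False),
                       nn.BatchNorm2d(hidden), nn.SiLU()]
        # groups=hidden makes this a depthwise convolution.
        layers += [nn.Conv2d(hidden, hidden, kernel_size, stride,
                             padding=kernel_size // 2, groups=hidden, bias=False),
                   nn.BatchNorm2d(hidden), nn.SiLU()]
        if use_se:
            layers.append(SqueezeExcite(hidden, max(1, in_ch // 4)))
        # Linear projection: no activation after the last 1x1 convolution.
        layers += [nn.Conv2d(hidden, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch)]
        self.block = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.block(x)
        return x + out if self.use_residual else out
```

Setting `use_se=False` gives the variant without squeeze-and-excitation.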

In EfficientNetV2-S, the early stages use Fused-MBConv, the middle stages avoid large kernels, and each stage has its own scaling parameters.

EfficientNetV2-S architecture
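
Fused-MBConv replaces the 1x1 expansion and the depthwise convolution of MBConv with a single regular convolution. A simplified PyTorch sketch is shown below; the class name and defaults are assumptions, and SE and stochastic depth are left out for brevity.

```python
import torch
from torch import nn


class FusedMBConv(nn.Module):
    """Regular kxk conv (expand) -> 1x1 project, with an optional skip."""

    def __init__(self, in_ch: int, out_ch: int, expand_ratio: int = 4,
                 kernel_size: int = 3, stride: int = 1):
        super().__init__()
        hidden = in_ch * expand_ratio
        self.use_residual = stride == 1 and in_ch == out_ch
        pad = kernel_size // 2
        if expand_ratio == 1:
            # No expansion: a single convolution does all the work.
            layers = [nn.Conv2d(in_ch, out_ch, kernel_size, stride, pad, bias=False),
                      nn.BatchNorm2d(out_ch), nn.SiLU()]
        else:
            layers = [
                # Fused step: one dense conv instead of 1x1 expand + depthwise.
                nn.Conv2d(in_ch, hidden, kernel_size, stride, pad, bias=False),
                nn.BatchNorm2d(hidden), nn.SiLU(),
                # Linear 1x1 projection back down to out_ch.
                nn.Conv2d(hidden, out_ch, 1, bias=False),
                nn.BatchNorm2d(out_ch),
            ]
        self.block = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.block(x)
        return x + out if self.use_residual else out
```

The dense convolution costs more FLOPS than a depthwise one, but it tends to run faster on accelerators in the early, high-resolution stages, which is why the paper uses it only there.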

Techniques that appear in the papers include AutoAugment, stochastic depth, SiLU, and RandAugment.
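
Of these, stochastic depth is the easiest to show in a few lines. The sketch below randomly drops the residual branch per sample during training; the function name and signature are my own. torchvision also provides `torchvision.ops.StochasticDepth`, which could be used instead.

```python
import torch


def stochastic_depth(x: torch.Tensor, drop_prob: float, training: bool) -> torch.Tensor:
    """Zero the whole tensor for a random subset of samples in the batch.

    Surviving samples are rescaled by 1 / keep_prob so the expected value
    is unchanged. At evaluation time the input passes through untouched.
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    if not training or drop_prob == 0.0:
        return x
    keep_prob = 1.0 - drop_prob
    if keep_prob == 0.0:
        return torch.zeros_like(x)
    # One Bernoulli draw per sample, broadcast over all remaining dimensions.
    mask_shape = (x.shape[0],) + (1,) * (x.dim() - 1)
    mask = torch.bernoulli(
        torch.full(mask_shape, keep_prob, device=x.device, dtype=x.dtype))
    return x * mask / keep_prob
```

Inside a residual block it would typically be applied to the branch output before the skip addition, for example `x + stochastic_depth(branch(x), 0.2, self.training)`.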

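Dataset

GTSRB, the German Traffic Sign Recognition Benchmark, is a German traffic sign dataset with more than 26,000 images across 43 classes.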

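Evaluation

Without SE, EfficientNet-B0 reaches a test accuracy of 88.85% with a model size of 13.35 MiB. Its accuracy is on par with the comparable MobileNetV2 and MobileNetV3-Large, while its size is 1.5 times that of the former and about 60% of the latter.

With SE, test accuracy is 88% and the model size is 27.76 MiB.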
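
The post does not say how the model sizes were measured. One common approach, sketched here as an assumption, is to sum the bytes of all parameters and buffers and convert to MiB.

```python
import torch
from torch import nn


def model_size_mib(model: nn.Module) -> float:
    """In-memory size of a model's parameters and buffers, in MiB."""
    total_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    # Buffers include BatchNorm running statistics, which are saved too.
    total_bytes += sum(b.numel() * b.element_size() for b in model.buffers())
    return total_bytes / 2 ** 20


# Example: a float32 linear layer with 1024 * 256 weights is exactly 1 MiB.
example_size = model_size_mib(nn.Linear(1024, 256, bias=False))
```

A saved checkpoint file may differ slightly from this figure because of serialization overhead.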

efficientnet-B0 confusion matrix

EfficientNetV2-S reaches a test accuracy of 86% with a model size of 80.75 MiB. Progressive learning was not used.

EfficientNetV2-S confusion matrix
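
For reference, a confusion matrix like the ones shown can be computed from predicted and true class indices with a few tensor operations. This is a generic sketch, not necessarily how the figures in this post were produced.

```python
import torch


def confusion_matrix(preds: torch.Tensor, targets: torch.Tensor,
                     num_classes: int) -> torch.Tensor:
    """Rows are true classes, columns are predicted classes.

    preds and targets are 1-D integer tensors of class indices.
    """
    # Encode each (target, pred) pair as one flat index, then count them.
    flat_index = targets.long() * num_classes + preds.long()
    counts = torch.bincount(flat_index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def accuracy_from_confusion(matrix: torch.Tensor) -> float:
    """Overall accuracy: correct predictions sit on the diagonal."""
    return (matrix.diag().sum() / matrix.sum()).item()
```

For GTSRB, `num_classes` would be 43.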

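Over the past several years Google has kept investing in lightweight CNNs such as MobileNet and MnasNet, mainly targeting mobile hardware.

When it came out, EfficientNet was better known than NASNet. NASNet was regarded as an expensive Google experiment that ordinary practitioners could not reproduce, but it was also a necessary step on the road to AutoML.

Implementation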

gitE0Z9/pytorch-implementations: Deep learning models implemented in PyTorch (github.com)

References

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (arxiv.org)

EfficientNetV2: Smaller Models and Faster Training (arxiv.org)

tpu/models/official/efficientnet at master · tensorflow/tpu (github.com)

automl/efficientnetv2 at master · google/automl (github.com)

Written by mz bai
