PyTorch Implementation Series: FCN

Dec 17, 2020

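The Fully Convolutional Network (FCN) was proposed by Long et al. (2014, UC Berkeley) in Fully Convolutional Networks for Semantic Segmentation. It targets pixel-level tasks such as semantic segmentation. The core idea is to upsample the downsampled feature maps back toward the input resolution and fuse feature maps from different levels, which also strengthens the network's feature extraction.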

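Network concept

Can an image task be completed with convolutional layers alone, using fully connected, pooling, and padding layers as little as possible?
Are architectures built for image classification less than fully optimized for dense prediction tasks such as semantic segmentation?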

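Network architecture

The backbone is VGG16. Downsampled and upsampled feature maps are merged by element-wise addition, which gives multi-scale feature fusion. Depending on the stride of the final upsampling step, the variants are called FCN-32s, FCN-16s, and FCN-8s. Once fusion reaches the last scale, a deconvolution (also called fractionally-strided convolution or transposed convolution) with stride 32, 16, or 8 respectively restores the prediction to the size of the input image.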
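To see how a stride-32 deconvolution gets back to the input size, the arithmetic below traces the spatial size along one axis through an FCN-32s-style network. It is a rough sketch under assumptions that this post does not state: five 2x2 stride-2 poolings with ceil rounding, a 7x7 unpadded convolution as the first layer of the fully convolutional head, and a transposed-convolution kernel of twice the stride. The padding of 100 and the center crop come from the notes further down. All function names are made up for illustration.

```python
import math


def conv_out(n, kernel, stride=1, pad=0):
    """Output length of a convolution along one axis."""
    return (n + 2 * pad - kernel) // stride + 1


def pool_out(n, kernel=2, stride=2):
    """Output length of pooling with ceil rounding (an assumption here)."""
    return math.ceil((n - kernel) / stride) + 1


def deconv_out(n, kernel, stride, pad=0):
    """Output length of a transposed convolution (dilation 1, no output padding)."""
    return (n - 1) * stride - 2 * pad + kernel


def fcn32s_trace(size, first_pad=100):
    """Return (coarse score size, upsampled size, center-crop offset or None)."""
    n = conv_out(size, kernel=3, pad=first_pad)  # first conv, heavily padded
    for _ in range(5):
        # The 3x3 convs with pad 1 keep the size; only pooling shrinks it.
        n = pool_out(n)
    n = conv_out(n, kernel=7)  # assumed 7x7 head conv without padding
    up = deconv_out(n, kernel=64, stride=32)  # assumed kernel = 2 * stride
    # None means the upsampled map is smaller than the input and cannot be cropped.
    offset = (up - size) // 2 if up >= size else None
    return n, up, offset


for size in (224, 500):
    print(size, fcn32s_trace(size), fcn32s_trace(size, first_pad=1))
```

Under these assumptions a 500-pixel side yields a 544-pixel upsampled map, which a center crop with offset 22 trims back to 500. With a padding of only 1, the upsampled map ends up smaller than the input, so there is nothing to crop from.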

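The second version of the paper adds VGG's fully convolutional head and a prediction head at each scale, so fusion operates on the per-scale prediction maps rather than on the pooled features.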

Dataset

PASCAL VOC 2012, a well-known image dataset that covers tasks including object detection and semantic segmentation.

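Training

Semantic segmentation resembles multi-class image classification, except that every pixel is classified into one of several classes. The second version of the paper mentions staged training, FCN32S -> FCN16S -> FCN8S, where each stage builds on the previous one, as a way to get a more stable training process. In practice, FCN8S trained this way was indeed more stable than FCN8S trained directly.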
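The "classification per pixel" view can be written down directly. The following is a minimal pure-Python sketch of a softmax cross-entropy loss averaged over all pixels. It is an illustration only, not the training code of this post; the function name `pixel_cross_entropy` and the nested-list layout are assumptions chosen for readability.

```python
import math


def pixel_cross_entropy(logits, target):
    """Mean softmax cross-entropy over all pixels.

    logits: nested list indexed as [class][row][col] with raw scores.
    target: nested list indexed as [row][col] with integer class labels.
    """
    num_classes = len(logits)
    height, width = len(target), len(target[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            scores = [logits[c][y][x] for c in range(num_classes)]
            # Log-sum-exp with the maximum subtracted for numerical stability.
            m = max(scores)
            log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
            total += log_sum_exp - scores[target[y][x]]
    return total / (height * width)


# 21 classes and a 2x2 image with identical scores for every class.
uniform = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(21)]
labels = [[0, 5], [20, 3]]
print(pixel_cross_entropy(uniform, labels))  # log(21), about 3.04
```

In PyTorch, `torch.nn.CrossEntropyLoss` accepts logits of shape (N, C, H, W) together with integer targets of shape (N, H, W), so the loop above is only meant to show what that loss computes.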

Evaluation

With staged training, FCN8S reaches a Jaccard index of 0.59 on the VOC 2007 dataset. The model size is 513 MiB.
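For reference, the Jaccard index (intersection over union) of a class is TP / (TP + FP + FN). The sketch below computes it per class from a confusion matrix and then averages over the classes that occur. It assumes the reported score is such a class-wise mean, which this post does not state; the function names are hypothetical.

```python
import math


def confusion_matrix(pred, target, num_classes):
    """Count pixels by (true class, predicted class) from flat label lists."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def jaccard_per_class(cm):
    """Per-class IoU; NaN for classes absent from both prediction and target."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, labeled otherwise
        fn = sum(cm[c]) - tp  # labeled c, predicted otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_jaccard(ious):
    """Average over the classes that actually occur."""
    valid = [v for v in ious if not math.isnan(v)]
    return sum(valid) / len(valid)


target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = jaccard_per_class(confusion_matrix(pred, target, num_classes=3))
print(ious, mean_jaccard(ious))
```

In this toy example the per-class values are 1/3, 2/3, and 1/2, so the mean is 0.5.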

When each variant is trained directly, the ranking is FCN16S > FCN32S > FCN8S; FCN8S seems hard to keep under control.

Demo

Image: woman from https://www.pexels.com/photo/woman-holding-disposable-cup-712513/

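Notes

After upsampling, use a center crop to remove the excess produced by the deconvolution.
The first convolutional layer can use a zero padding of 100 so that the later downsampling and upsampling steps work out.
Staged training can use a different learning rate for each stage.
The paper combines a bilinear interpolation kernel with the deconvolution, but I could not find related resources.
If you are interested, you can further fine-tune the fully convolutional prediction heads; the results may improve.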
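The note on bilinear interpolation kernels can be made concrete. Below is a minimal sketch of one common way to build bilinear weights for a transposed-convolution kernel. It is not taken from this post's repository, and the function name `bilinear_kernel` is hypothetical.

```python
def bilinear_kernel(size):
    """Return a size x size list of bilinear interpolation weights.

    The 1-D profile is a triangle centered on the kernel; the 2-D kernel
    is the outer product of that profile with itself.
    """
    factor = (size + 1) // 2
    # Odd kernels are centered on a tap, even kernels between two taps.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    profile = [1 - abs(i - center) / factor for i in range(size)]
    return [[a * b for b in profile] for a in profile]


# Kernel for 2x upsampling (kernel size 4, stride 2).
kernel = bilinear_kernel(4)
for row in kernel:
    print(row)
```

In PyTorch, such a kernel would typically be copied into the weight of an `nn.ConvTranspose2d` so that each class channel is upsampled independently. Whether those weights are then frozen or trained further is a design choice.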

Code

gitE0Z9/classical-network-series (github.com)

References

Fully Convolutional Networks for Semantic Segmentation (arxiv.org)

This one is old and does not match the GitHub repo:

Fully Convolutional Networks for Semantic Segmentation (arxiv.org)

fcn.berkeleyvision.org/voc-fcn8s/net.py at master · shelhamer/fcn.berkeleyvision.org (github.com)

Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, and Trevor Darrell. CVPR…

This one is a considerably simpler implementation and works well, with far fewer parameters and comparable performance:

pochih/FCN-pytorch (github.com)

Written by mz bai
