MyDLNote - Enhancement : [NLA Series] Image Restoration via Residual Non-local Attention Networks

[2019ICLR] Image Restoration via Residual Non-local Attention Networks

[paper] Image Restoration via Residual Non-local Attention Networks

[PyTorch] https://github.com/yulunzhang/RNAN

The writing in this paper is very good and worth reading carefully.

The overall idea is to learn local and non-local attention at the same time. This is a very important idea.

[Non-Local Attention Series]

Non-local neural networks

 

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond [my CSDN]

Asymmetric Non-local Neural Networks for Semantic Segmentation [my CSDN]

Efficient Attention: Attention with Linear Complexities [my CSDN]

CCNet: Criss-Cross Attention for Semantic Segmentation [my CSDN]

Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining [my CSDN]

Image Restoration via Residual Non-local Attention Networks [my CSDN]


Contents

[2019ICLR] Image Restoration via Residual Non-local Attention Networks

Abstract

Residual Non-Local Attention Network For Image Restoration

Framework

Residual (Non-)Local Attention Block



Abstract

In this paper, we propose a residual non-local attention network for high-quality image restoration. 

What this paper does.

Without considering the uneven distribution of information in the corrupted images, previous methods are restricted by local convolutional operation and equal treatment of spatial- and channel-wise features. 

Problem statement: without considering the uneven distribution of information in corrupted images, previous methods are limited by local convolutional operations and by treating spatial and channel-wise features equally.

To address this issue, we design local and non-local attention blocks to extract features that capture the long-range dependencies between pixels and pay more attention to the challenging parts. 

A one-sentence summary of the research content and what it achieves.

Specifically, we design trunk branch and (non-)local mask branch in each (non-)local attention block. The trunk branch is used to extract hierarchical features. Local and non-local mask branches aim to adaptively rescale these hierarchical features with mixed attentions. The local mask branch concentrates on more local structures with convolutional operations, while non-local attention considers more about long-range dependencies in the whole feature map. Furthermore, we propose residual local and non-local attention learning to train the very deep network, which further enhances the representation ability of the network.

Details:

1) Trunk branch: extracts hierarchical features;

2) Local and non-local mask branches: adaptively rescale these hierarchical features with mixed attention.

Local mask branch: focuses on local structures via convolutional operations

Non-local mask branch: focuses on long-range dependencies

3) Residual local and non-local attention learning: used to train a very deep network

Our proposed method can be generalized for various image restoration applications, such as image denoising, demosaicing, compression artifacts reduction, and super-resolution. 

States the applications of the network. (A plus.)

Experiments demonstrate that our method obtains comparable or better results compared with recently leading methods quantitatively and visually.

Conclusion.


Introduction

Related Work

(To be updated...)

 


Residual Non-Local Attention Network For Image Restoration

Framework

The first and last convolutional layers are shallow feature extractor and reconstruction layer respectively. We propose residual local and non-local attention blocks to extract hierarchical attention-aware features. In addition to making the main network learn degradation components, we further concentrate on more challenging areas by using local and non-local attention. We only incorporate residual non-local attention block in low-level and high-level feature space. This is mainly because a few non-local modules can well offer non-local ability to the network for image restoration.

In the overall framework, the first and last convolutional layers are the shallow feature extractor and the reconstruction layer, respectively.

Between them sits a series of local or non-local attention blocks.

Non-local attention blocks appear only in the low-level and high-level feature spaces. The stated reason: a few non-local modules are enough to give the network the non-local ability it needs for image restoration. (In fact, another reason is that non-local computation is very expensive. The authors are being clever here: it is not that they are unaware of this, but that it is better left unsaid. First, it is a drawback; second, it is not the core problem this paper sets out to solve. The authors certainly know how to make it cheaper, but it is better not to bring it up, otherwise the paper's logic would be cluttered by too many side issues.)
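To make this concrete, here is a minimal, runnable PyTorch sketch of the pipeline just described (not the official implementation; see the linked repo for that). `AttentionBlockStub`, the channel width, and the block count are illustrative assumptions standing in for the residual (non-)local attention blocks detailed below.

```python
import torch
import torch.nn as nn

# Stand-in for the residual (non-)local attention blocks detailed later.
class AttentionBlockStub(nn.Module):
    def __init__(self, n_feats):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)


class RNANSketch(nn.Module):
    def __init__(self, in_ch=3, n_feats=64, n_mid_blocks=8):
        super().__init__()
        # first conv: shallow feature extractor
        self.head = nn.Conv2d(in_ch, n_feats, 3, padding=1)
        # non-local attention blocks only at the low-level and high-level ends;
        # the middle blocks are the (cheaper) local attention blocks
        blocks = [AttentionBlockStub(n_feats)]                               # low-level non-local
        blocks += [AttentionBlockStub(n_feats) for _ in range(n_mid_blocks)] # local blocks
        blocks += [AttentionBlockStub(n_feats)]                              # high-level non-local
        self.body = nn.Sequential(*blocks)
        # last conv: reconstruction layer
        self.tail = nn.Conv2d(n_feats, in_ch, 3, padding=1)

    def forward(self, x):
        res = self.tail(self.body(self.head(x)))
        return x + res  # the main network learns the degradation (residual) component


if __name__ == "__main__":
    y = RNANSketch()(torch.randn(1, 3, 64, 64))
    print(y.shape)  # torch.Size([1, 3, 64, 64])
```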

To show the effectiveness of our RNAN, we choose to optimize the same loss function (e.g., L2 loss function) as previous works.

This shows how confident the authors are in their model: only a single L2 loss is used.
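In PyTorch, that objective is just the standard MSE criterion; a minimal sketch (the variable names `restored` and `target` are hypothetical):

```python
import torch.nn as nn

# Plain L2 (MSE) loss between the restored image and the ground truth,
# with no extra perceptual or adversarial terms.
criterion = nn.MSELoss()
# loss = criterion(restored, target)
```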

 

The overall framework is that simple. Next, each component is described in detail.

Residual (Non-)Local Attention Block

Our residual non-local attention network is constructed by stacking several residual local and non-local attention blocks shown in Figure 2. Each attention block is divided into two parts: q residual blocks (RBs) in the beginning and end of attention block. Two branches in the middle part: trunk branch and mask branch. For non-local attention block, we incorporate non-local block (NLB) in the mask branch, resulting in non-local attention. Then we give more details about those components.

This residual (non-)local attention block consists of: several RBs + two branches + several RBs.

The two branches are the trunk branch and the mask branch.

Whether the block is local or non-local depends on whether the mask branch contains a non-local block.
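A rough structural sketch of one such block is given below; the plain conv stacks stand in for the q and t residual blocks, the mask branch is collapsed to a conv + sigmoid here, and the exact residual form is an assumption for illustration rather than the paper's definitive formulation.

```python
import torch
import torch.nn as nn

def conv_stack(n_feats, depth=2):
    # Stand-in for q (or t) simplified residual blocks; see the trunk-branch
    # section below for the actual RB structure.
    layers = []
    for _ in range(depth):
        layers += [nn.Conv2d(n_feats, n_feats, 3, padding=1), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)


class AttentionBlockStructure(nn.Module):
    def __init__(self, n_feats=64):
        super().__init__()
        self.head = conv_stack(n_feats)     # q RBs at the beginning
        self.trunk = conv_stack(n_feats)    # trunk branch: t RBs
        self.mask = nn.Sequential(          # mask branch collapsed to conv + sigmoid here
            nn.Conv2d(n_feats, n_feats, 1), nn.Sigmoid()
        )
        self.tail = conv_stack(n_feats)     # q RBs at the end

    def forward(self, x):
        x = self.head(x)
        attn = self.mask(x)                 # attention map in (0, 1)
        x = x + self.trunk(x) * attn        # rescale trunk features, keep identity path
        return self.tail(x)


if __name__ == "__main__":
    print(AttentionBlockStructure()(torch.randn(1, 64, 32, 32)).shape)
```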

Trunk branch:

As shown in Figure 2, the trunk branch includes t residual blocks (RBs). Different from the original residual block in ResNet (He et al., 2016), we adopt the simplified RB from (Lim et al., 2017). The simplified RB (labelled with blue dashed) only consists of two convolutional layers and one ReLU (Nair & Hinton, 2010), omitting unnecessary components, such as maxpooling and batch normalization (Ioffe & Szegedy, 2015) layers. We find that such simplified RB not only contributes to image super-resolution (Lim et al., 2017), but also helps to construct very deep network for other image restoration tasks.

The trunk branch is simply several simplified residual blocks.

"Simplified" means BN is removed, leaving only convolutions and ReLU. This not only helps image super-resolution, but also helps build very deep networks for other image restoration tasks.
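A sketch of such a simplified residual block, just two convolutions, one ReLU, and a skip connection, with no BN or pooling (the channel count is an assumed default):

```python
import torch
import torch.nn as nn

# Simplified residual block: conv -> ReLU -> conv plus a skip connection,
# with no batch normalization or maxpooling, as described above.
class SimplifiedRB(nn.Module):
    def __init__(self, n_feats=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(SimplifiedRB()(x).shape)  # torch.Size([1, 64, 32, 32])
```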

Figure 2: Residual (non-)local attention block. It mainly consists of trunk branch (labelled with gray dashed) and mask branch (labelled with red dashed). The trunk branch consists of t RBs. The mask branch is used to learn mixed attention maps in channel- and spatial-wise manners simultaneously.
 

Mask branch:

The key point in mask branch is how to grasp information of larger scope, namely larger receptive field size, so that it is possible to obtain more sophisticated attention map. One possible solution is to perform maxpooling several times, as used in (Wang et al., 2017) for image classification. However, more pixel-level accurate results are desired in image restoration. Maxpooling would lose lots of details of the image, resulting in bad performance. To alleviate such drawbacks, we choose to use large-stride convolution and deconvolution to enlarge receptive field size. Another way is considering non-local information across the whole inputs, which will be discussed in the next subsection.

The key point of the mask branch is how to capture information over a larger scope, i.e., a larger receptive field, so as to obtain a more sophisticated attention map.

In image restoration, accurate pixel-level results are needed, so maxpooling would discard many image details and lead to poor performance.

Mask branch structure: (NLB) + several RBs + large-stride convolution + several RBs + large-stride deconvolution + several RBs + 1x1 convolution + sigmoid.
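Below is a sketch of a mask branch following that structure; the stride/kernel sizes, RB counts, and the optional `nonlocal_block` hook are illustrative assumptions, not the official configuration.

```python
import torch
import torch.nn as nn

class SimplifiedRB(nn.Module):
    """Conv-ReLU-Conv residual block without BN (see the trunk-branch sketch)."""
    def __init__(self, n_feats):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)


class MaskBranch(nn.Module):
    # (optional NLB) -> RBs -> large-stride conv (downscale) -> RBs
    # -> large-stride deconv (upscale) -> RBs -> 1x1 conv -> sigmoid
    def __init__(self, n_feats=64, n_rbs=1, nonlocal_block=None):
        super().__init__()
        self.nlb = nonlocal_block if nonlocal_block is not None else nn.Identity()
        self.down = nn.Sequential(
            *[SimplifiedRB(n_feats) for _ in range(n_rbs)],
            nn.Conv2d(n_feats, n_feats, 3, stride=2, padding=1),            # large-stride conv
        )
        self.mid = nn.Sequential(*[SimplifiedRB(n_feats) for _ in range(n_rbs)])
        self.up = nn.ConvTranspose2d(n_feats, n_feats, 4, stride=2, padding=1)  # deconvolution
        self.tail = nn.Sequential(
            *[SimplifiedRB(n_feats) for _ in range(n_rbs)],
            nn.Conv2d(n_feats, n_feats, 1),
            nn.Sigmoid(),                                                    # attention map in (0, 1)
        )

    def forward(self, x):
        x = self.nlb(x)
        x = self.down(x)
        x = self.mid(x)
        x = self.up(x)
        return self.tail(x)


if __name__ == "__main__":
    m = MaskBranch()
    print(m(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```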

Non-Local Mixed Attention: 
With non-local and local attention computation, feature maps in the mask branch are finally mapped by a sigmoid function.

What this part is really saying is that the mask branch can be understood as: non-local (NLB) + local (large-stride convolution and deconvolution) + sigmoid. In this way, non-local and local attention are mixed together.

Personally, I think this is the most important idea of the paper. I used a similar structure in my image inpainting experiments and found that it really works!
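For reference, here is a sketch of a standard embedded-Gaussian non-local block in the style of Wang et al.'s non-local neural networks, which is the kind of NLB that would sit at the head of the non-local mask branch; the exact NLB configuration used in RNAN may differ.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block: every position attends to all others."""
    def __init__(self, n_feats=64, reduction=2):
        super().__init__()
        inter = n_feats // reduction
        self.theta = nn.Conv2d(n_feats, inter, 1)   # query
        self.phi = nn.Conv2d(n_feats, inter, 1)     # key
        self.g = nn.Conv2d(n_feats, inter, 1)       # value
        self.out = nn.Conv2d(inter, n_feats, 1)     # restore channel count

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)    # (b, hw, c')
        k = self.phi(x).flatten(2)                      # (b, c', hw)
        v = self.g(x).flatten(2).transpose(1, 2)        # (b, hw, c')
        attn = torch.softmax(q @ k, dim=-1)             # (b, hw, hw) pairwise weights
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                          # residual connection


if __name__ == "__main__":
    print(NonLocalBlock()(torch.randn(1, 64, 16, 16)).shape)  # torch.Size([1, 64, 16, 16])
```

The (hw x hw) attention matrix is what makes the NLB expensive, which is consistent with using only a few of them at the low-level and high-level ends of the network.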

 


The remaining content is not the core idea of this paper, so I will update it when I have time...

That's it for now...

(To be continued)
