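Median Pooling
Median pooling is a pooling method derived from median filtering in image processing. It is extremely rare in current CNN architectures. I found only one paper on it, "Face Recognition Based on Convolutional Neural Networks and Median Pooling", and I cannot vouch for its quality.
In the forward and backward passes, median pooling behaves much like max pooling: the forward pass selects a single element from each window (the median rather than the maximum), and the backward pass routes the gradient to that element only. I will not go into further detail here.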
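To make the comparison with max pooling concrete, here is a minimal sketch on a single 3x3 window. The tensor values are illustrative assumptions, not data from any paper. It relies on `Tensor.median(dim=...)` and `Tensor.max(dim=...)`, both of which return a value together with the index of the selected element.

```python
import torch

# One 3x3 pooling window with distinct values (illustrative data).
window = torch.tensor([[4., 9., 2.],
                       [7., 5., 1.],
                       [8., 3., 6.]], requires_grad=True)
flat = window.view(1, -1)

# Median pooling: the forward pass selects one element of the window.
med_value, med_index = flat.median(dim=-1)
med_value.sum().backward()
median_grad = window.grad.clone()

# Max pooling over the same window, for comparison.
window.grad.zero_()
max_value, max_index = flat.max(dim=-1)
max_value.sum().backward()
max_grad = window.grad.clone()

print(med_value.item(), median_grad)  # gradient lands on the median element
print(max_value.item(), max_grad)     # gradient lands on the maximum element
```

In both cases the gradient is non-zero at exactly one position of the window; only the selection rule differs.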
Like max pooling, median pooling can learn edge and texture structure, and it is additionally robust to noise. Reference code:
# Code taken from the open-source project pytorch-image-models
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.modules.utils import _pair, _quadruple


class MedianPool2d(nn.Module):
    """ Median pool (usable as median filter when stride=1) module.
    Args:
        kernel_size: size of pooling kernel, int or 2-tuple
        stride: pool stride, int or 2-tuple
        padding: pool padding, int or 4-tuple (l, r, t, b) as in pytorch F.pad
        same: override padding and enforce same padding, boolean
    """
    def __init__(self, kernel_size=3, stride=1, padding=0, same=False):
        super(MedianPool2d, self).__init__()
        self.k = _pair(kernel_size)
        self.stride = _pair(stride)
        self.padding = _quadruple(padding)  # convert to l, r, t, b
        self.same = same

    def _padding(self, x):
        if self.same:
            # Compute the total padding needed per axis, then split it
            # between the two sides.
            ih, iw = x.size()[2:]
            if ih % self.stride[0] == 0:
                ph = max(self.k[0] - self.stride[0], 0)
            else:
                ph = max(self.k[0] - (ih % self.stride[0]), 0)
            if iw % self.stride[1] == 0:
                pw = max(self.k[1] - self.stride[1], 0)
            else:
                pw = max(self.k[1] - (iw % self.stride[1]), 0)
            pl = pw // 2
            pr = pw - pl
            pt = ph // 2
            pb = ph - pt
            padding = (pl, pr, pt, pb)
        else:
            padding = self.padding
        return padding

    def forward(self, x):
        x = F.pad(x, self._padding(x), mode='reflect')
        # Extract sliding windows: (N, C, H_out, W_out, k_h, k_w)
        x = x.unfold(2, self.k[0], self.stride[0]).unfold(3, self.k[1], self.stride[1])
        # Flatten each window and take its median
        x = x.contiguous().view(x.size()[:4] + (-1,)).median(dim=-1)[0]
        return x
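The noise-robustness claim can be checked with a small experiment. The sketch below is not part of pytorch-image-models: `median_pool2d` is a hypothetical helper that reuses the same unfold-then-median idea without padding, and the 5x5 test image with a single "salt" pixel is an assumed toy input.

```python
import torch
import torch.nn.functional as F


def median_pool2d(x, kernel_size=3, stride=1):
    """Hypothetical helper: median over each k x k window, no padding."""
    patches = x.unfold(2, kernel_size, stride).unfold(3, kernel_size, stride)
    # patches has shape (N, C, H_out, W_out, k, k); flatten the last two dims
    patches = patches.contiguous().view(*patches.shape[:4], -1)
    return patches.median(dim=-1)[0]


# A flat image corrupted by one impulse ("salt") pixel in the centre.
image = torch.zeros(1, 1, 5, 5)
image[0, 0, 2, 2] = 1.0

med = median_pool2d(image, 3, 1)   # the impulse is rejected
avg = F.avg_pool2d(image, 3, 1)    # the impulse is smeared over every window
mx = F.max_pool2d(image, 3, 1)     # the impulse dominates every window

print(med.squeeze())
print(avg.squeeze())
print(mx.squeeze())
```

With a 3x3 window and stride 1, every window of this image contains the corrupted pixel, so the three operators show their behaviour at its most extreme: the median output stays at zero while average and max pooling both propagate the outlier.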
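Fractional Max Pooling
Fractional max pooling was published on arXiv. I came across it while reading the PyTorch code for this article; I had not known about it before and have never used it. Interested readers can consult the original paper or try it out with PyTorch. Here is a small PyTorch demo: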
import torch
import torch.nn as nn
inputs = torch.rand(20, 16, 50, 32)
fmp = nn.FractionalMaxPool2d(3, output_ratio=(0.8, 0.8))
output = fmp(inputs)
print(output.size())
# The output size is now 20 x 16 x 40 x 25.
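As a further experiment, the layer can also be configured with an explicit target size instead of a ratio, and it can return the indices of the selected elements. The sketch below assumes the `output_size` and `return_indices` arguments of `nn.FractionalMaxPool2d`; the input shape is an arbitrary choice that mirrors the demo.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
inputs = torch.rand(2, 3, 50, 32)

# Ratio-based: each spatial size is input_size * ratio, truncated to an integer
# (50 * 0.8 -> 40, 32 * 0.8 = 25.6 -> 25).
by_ratio = nn.FractionalMaxPool2d(kernel_size=3, output_ratio=(0.8, 0.8))

# Size-based: request the target height and width directly.
by_size = nn.FractionalMaxPool2d(kernel_size=3, output_size=(40, 25),
                                 return_indices=True)

out_ratio = by_ratio(inputs)
out_size, indices = by_size(inputs)

print(out_ratio.shape)
print(out_size.shape, indices.shape)
```

Because the pooling regions are chosen pseudo-randomly, two forward passes over the same input are not guaranteed to produce identical outputs; only the output shape is fixed.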
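Combined Pooling
Combined pooling is a strategy that draws on the strengths of both max pooling and average pooling. There are two common ways to combine them: Cat (concatenate the two results along the channel dimension) and Add (average the two results). The code is as follows: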
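import torch
import torch.nn.functional as F


def add_avgmax_pool2d(x, output_size=1):
    # Add: average of the avg-pooled and max-pooled features (channels unchanged)
    x_avg = F.adaptive_avg_pool2d(x, output_size)
    x_max = F.adaptive_max_pool2d(x, output_size)
    return 0.5 * (x_avg + x_max)


def cat_avgmax_pool2d(x, output_size=1):
    # Cat: concatenate along the channel dimension (channels doubled)
    x_avg = F.adaptive_avg_pool2d(x, output_size)
    x_max = F.adaptive_max_pool2d(x, output_size)
    return torch.cat([x_avg, x_max], 1)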
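The practical difference between the two strategies is the channel count that the next layer sees. The sketch below illustrates this with a hypothetical classifier head, `CatAvgMaxHead`, which is an assumption for illustration and not part of any library. The feature-map shape and class count are likewise arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CatAvgMaxHead(nn.Module):
    """Hypothetical head: global cat(avg, max) pooling followed by a linear layer."""

    def __init__(self, in_channels, num_classes):
        super().__init__()
        # Concatenation doubles the channel count, so the linear layer
        # must accept 2 * in_channels features.
        self.fc = nn.Linear(2 * in_channels, num_classes)

    def forward(self, x):
        x_avg = F.adaptive_avg_pool2d(x, 1)
        x_max = F.adaptive_max_pool2d(x, 1)
        pooled = torch.cat([x_avg, x_max], dim=1).flatten(1)
        return self.fc(pooled)


features = torch.rand(4, 16, 7, 7)   # (N, C, H, W) feature map
head = CatAvgMaxHead(in_channels=16, num_classes=10)
logits = head(features)

# The Add variant keeps the channel count unchanged.
added = 0.5 * (F.adaptive_avg_pool2d(features, 1) + F.adaptive_max_pool2d(features, 1))

print(logits.shape)
print(added.shape)
```

With Add, an existing head that expects `in_channels` features can be reused as is, whereas Cat keeps both statistics separate at the cost of a wider following layer.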