The code is adapted from a 2D SPP-Net implementation on GitHub.
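1. Where SPP is inserted
The FC layers before classification require an input feature of a fixed size, so a conventional FC layer fails when the inputs contain images of different sizes. Kaiming He et al. proposed SPP-Net to solve this problem: an SPP layer is placed between the last convolutional layer and the first FC layer.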
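2. SPP module details
Given a feature map of arbitrary size with c channels, max pooling or average pooling is applied to produce three fixed-size features: 4x4xc, 2x2xc, and 1x1xc. In the 3D version, each level n has n x n x n bins per channel instead.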
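As a quick check of the sizes above, the following sketch computes the length of the vector obtained when the pyramid levels are flattened and concatenated. The function name `spp_output_length` and its parameters are made up for illustration, and it assumes every level pools to exactly n bins per spatial dimension.

```python
def spp_output_length(channels, levels=(4, 2, 1), spatial_dims=2):
    """Length of the concatenated SPP vector for one sample.

    Each level n contributes n ** spatial_dims bins per channel,
    regardless of the size of the input feature map.
    """
    return channels * sum(n ** spatial_dims for n in levels)


# 2D case from the text: 4x4, 2x2 and 1x1 bins per channel
print(spp_output_length(channels=256))                  # 256 * (16 + 4 + 1) = 5376
# 3D case: 4x4x4, 2x2x2 and 1x1x1 bins per channel
print(spp_output_length(channels=256, spatial_dims=3))  # 256 * (64 + 8 + 1) = 18688
```

The result depends only on the channel count and the pyramid levels, which is why the FC layer that follows can have a fixed input size.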
3. Code implementation
import math

import torch
import torch.nn as nn


class SpatialPyramidPool3D(nn.Module):
    """
    Args:
        out_side (tuple): Side length of the pooled output at each pyramid level.
    Inputs:
        - `input`: the input Tensor to pool ([batch, channel, depth, width, height])
    """

    def __init__(self, out_side):
        super(SpatialPyramidPool3D, self).__init__()
        self.out_side = out_side

    def forward(self, x):
        out = None
        for n in self.out_side:
            # Pooling window size along each spatial dimension
            d_r, w_r, h_r = map(lambda s: math.ceil(s / n), x.size()[2:])
            # Stride along each spatial dimension (0 if a dimension is smaller than n)
            s_d, s_w, s_h = map(lambda s: math.floor(s / n), x.size()[2:])
            # Note: the pooled side is n when each dimension is divisible by n;
            # for other sizes it can differ (e.g. size 7 with n = 4 gives 6).
            max_pool = nn.MaxPool3d(kernel_size=(d_r, w_r, h_r), stride=(s_d, s_w, s_h))
            y = max_pool(x)
            # Flatten this level and append it to the levels collected so far
            if out is None:
                out = y.view(y.size()[0], -1)
            else:
                out = torch.cat((out, y.view(y.size()[0], -1)), 1)
        return out
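To see which input sizes give the intended pooled size with this window and stride rule, the sketch below applies the standard no-padding pooling formula to one dimension. The helper `pooled_side` is a hypothetical name used only for this check; it is not part of the original code.

```python
import math


def pooled_side(s, n):
    """Output length along one dimension for window ceil(s/n), stride floor(s/n).

    Uses the no-padding pooling formula: floor((s - window) / stride) + 1.
    """
    window = math.ceil(s / n)
    stride = math.floor(s / n)
    if stride == 0:
        raise ValueError(f"dimension {s} is smaller than pyramid level {n}")
    return (s - window) // stride + 1


for s in (8, 12, 13, 7):
    print(s, [pooled_side(s, n) for n in (1, 2, 4)])
# 8  [1, 2, 4]
# 12 [1, 2, 4]
# 13 [1, 2, 4]
# 7  [1, 2, 6]   <- level 4 yields 6 bins instead of 4
```

Sizes divisible by every level behave as intended, while a size such as 7 produces more bins than requested at level 4, so the concatenated vector is no longer a fixed length across inputs.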
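Use it as shown here. If the feature map itself is smaller than 4 along any spatial dimension, the level-4 pooling cannot work (its stride would be 0), so just remove 4 from out_side.
self.spp = SpatialPyramidPool3D(out_side=(1, 2, 4))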
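If mixed input sizes, or features smaller than the largest level, are expected, one alternative is PyTorch's adaptive pooling, which takes the target output size directly. The class below is a sketch of that swapped-in technique rather than the original code; the name `AdaptiveSpatialPyramidPool3D` is hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveSpatialPyramidPool3D(nn.Module):
    """Hypothetical SPP variant built on adaptive max pooling.

    Input:  [batch, channel, depth, width, height], any spatial size.
    Output: [batch, channel * sum(n ** 3 for n in out_side)].
    """

    def __init__(self, out_side=(1, 2, 4)):
        super().__init__()
        self.out_side = out_side

    def forward(self, x):
        # Each level pools to exactly n x n x n bins, then is flattened
        levels = [F.adaptive_max_pool3d(x, n).flatten(1) for n in self.out_side]
        return torch.cat(levels, dim=1)


spp = AdaptiveSpatialPyramidPool3D(out_side=(1, 2, 4))
for size in [(8, 8, 8), (7, 9, 11), (3, 5, 6)]:
    x = torch.randn(2, 16, *size)
    # The second dimension is 16 * (1 + 8 + 64) = 1168 for every input size
    print(size, tuple(spp(x).shape))
```

With this variant the FC layer after the SPP module can be sized from the channel count and the pyramid levels alone, for example an input width of 16 * 73 here.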