Related functions
1. torch.clamp()
torch.clamp(input, min, max, out=None) --> Tensor
Clamps every element of the input Tensor into the range [min, max] and returns the truncated result as a new Tensor.
Example:
import torch

input = torch.Tensor([2, 1, 0.5, -0.5])
output = torch.clamp(input, 0, 1)
print(output)
'''
tensor([1.0000, 1.0000, 0.5000, 0.0000])
'''
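The truncation rule torch.clamp applies is simple enough to restate in plain Python; this dependency-free sketch mirrors the element-wise behavior shown above:

```python
# Element-wise truncation, mirroring torch.clamp: out_i = max(lo, min(hi, in_i)).
def clamp(values, lo, hi):
    return [max(lo, min(hi, v)) for v in values]

print(clamp([2.0, 1.0, 0.5, -0.5], 0.0, 1.0))  # [1.0, 1.0, 0.5, 0.0]
```

In PyTorch either bound may also be omitted, e.g. torch.clamp(x, min=0) behaves like ReLU.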
2. itertools.product()
product(*iterables, repeat=1) --> product object
iterables are the input iterable objects; repeat specifies how many times to repeat them, so product(A, repeat=3) is equivalent to product(A, A, A).
This function computes the Cartesian product of the input iterables, which is equivalent to nested for loops. product returns an iterator.
from itertools import product
list1 = ['a', 'b']
list2 = ['c', 'd']
for item in product(list1, list2):
    print(item)
'''
('a', 'c')
('a', 'd')
('b', 'c')
('b', 'd')
'''
list1 = [1, 2, 3, 4]
for item in product(list1, repeat=2):
    print(item)
'''
Cartesian product of list1 with itself:
(1, 1)
(1, 2)
(1, 3)
(1, 4)
...
(4, 1)
(4, 2)
(4, 3)
(4, 4)
'''
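The repeat equivalence mentioned above can be checked directly:

```python
from itertools import product

A = [1, 2, 3]

# product(A, repeat=3) yields exactly the same tuples as product(A, A, A),
# len(A) ** 3 = 27 of them.
assert list(product(A, repeat=3)) == list(product(A, A, A))
print(len(list(product(A, repeat=3))))  # 27
```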
Related parameters
# RefineDet CONFIGS
voc_refinedet = {
'320': {
'num_classes': 21,
'lr_steps': (80000, 100000, 120000),
'max_iter': 120000,
'feature_maps': [40, 20, 10, 5],
'min_dim': 320,
'steps': [8, 16, 32, 64],
'min_sizes': [32, 64, 128, 256],
'max_sizes': [],
'aspect_ratios': [[2], [2], [2], [2]],
'variance': [0.1, 0.2],
'clip': True,
'name': 'RefineDet_VOC_320',
},
'512': {
'num_classes': 21,
'lr_steps': (80000, 100000, 120000),
'max_iter': 120000,
'feature_maps': [64, 32, 16, 8],
'min_dim': 512,
'steps': [8, 16, 32, 64],
'min_sizes': [32, 64, 128, 256],
'max_sizes': [],
'aspect_ratios': [[2], [2], [2], [2]],
'variance': [0.1, 0.2],
'clip': True,
'name': 'RefineDet_VOC_512',
}
}
num_classes is the number of classes in the dataset plus one for background (VOC: 20 + 1).
feature_maps are the spatial sizes of the FPN feature maps.
steps are the downsampling rates of each feature map relative to the input image; they are used to map the center coordinates of the default boxes at every feature-map location back to input-image coordinates.
min_sizes and max_sizes are the default box sizes for each feature layer; my reading is that spelling both out means a layer's anchor size falls between these two values.
aspect_ratios are the width-to-height ratios of the anchors.
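As a worked example of how steps and min_sizes interact, here is the cx = (j + 0.5) * step / min_dim computation that PriorBox performs, evaluated by hand for the first feature map of the '320' config:

```python
# Anchor geometry for the first feature map of the '320' config:
# f = 40 cells, step = 8, min_size = 32, min_dim = 320.
min_dim, step, min_size = 320, 8, 32

j, i = 0, 0                 # the top-left cell of the 40x40 map
f_k = min_dim / step        # 40.0: the feature-map size, recovered from the step
cx = (j + 0.5) / f_k        # 0.0125, i.e. pixel 4 on the 320 px input
cy = (i + 0.5) / f_k
s_k = min_size / min_dim    # 0.1: a 32 px square anchor in normalized units

print(cx, cy, s_k)  # 0.0125 0.0125 0.1
```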
Code walkthrough
import torch
from math import sqrt
from itertools import product

class PriorBox(object):
"""Compute priorbox coordinates in center-offset form for each source
feature map.
"""
def __init__(self, cfg):
super(PriorBox, self).__init__()
self.image_size = cfg['min_dim']
        # len(aspect_ratios) is the number of feature maps here (4), not the
        # number of priors per location; this comment is inherited from SSD
        self.num_priors = len(cfg['aspect_ratios'])
self.variance = cfg['variance'] or [0.1] # [0.1, 0.2]
self.feature_maps = cfg['feature_maps'] #[64, 32, 16, 8]
self.min_sizes = cfg['min_sizes'] #[32, 64, 128, 256]
self.max_sizes = cfg['max_sizes'] #[]
self.steps = cfg['steps'] #[8, 16, 32, 64]
self.aspect_ratios = cfg['aspect_ratios'] #[[2],[2],[2],[2]]
self.clip = cfg['clip'] # True
self.version = cfg['name'] # RefineDet_VOC_512
for v in self.variance:
if v <= 0:
raise ValueError('Variances must be greater than 0')
def forward(self):
mean = []
        for k, f in enumerate(self.feature_maps):  # one pass per feature map; each generates its own anchors
            for i, j in product(range(f), repeat=2):  # visit every (row, col) cell of the feature map
                '''
                Map the feature-map cell back to input-image coordinates, then
                scale into [0, 1] relative coordinates. The underlying formula is
                cx = (j + 0.5) * step / min_dim, split into two steps here.
                '''
f_k = self.image_size / self.steps[k]
# unit center x,y
cx = (j + 0.5) / f_k
cy = (i + 0.5) / f_k
# aspect_ratio: 1
# rel size: min_size
s_k = self.min_sizes[k]/self.image_size
mean += [cx, cy, s_k, s_k]
# aspect_ratio: 1
# rel size: sqrt(s_k * s_(k+1))
if self.max_sizes:
s_k_prime = sqrt(s_k * (self.max_sizes[k]/self.image_size))
mean += [cx, cy, s_k_prime, s_k_prime]
# rest of aspect ratios
for ar in self.aspect_ratios[k]:
mean += [cx, cy, s_k*sqrt(ar), s_k/sqrt(ar)]
mean += [cx, cy, s_k/sqrt(ar), s_k*sqrt(ar)]
        # back to torch land: reshape the flat list into the standard n-by-4 anchor layout
        output = torch.Tensor(mean).view(-1, 4)
if self.clip:
output.clamp_(max=1, min=0)
return output
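A quick sanity check on forward(): per cell the loop emits one ar = 1 box, one extra box only when max_sizes is non-empty, and two boxes per additional aspect ratio. Counting those for the '320' config needs no torch at all:

```python
# Prior count for the '320' config: 1 (ar=1) + 0 (max_sizes is empty)
# + 2 per extra aspect ratio = 3 anchors per feature-map cell.
feature_maps = [40, 20, 10, 5]
aspect_ratios = [[2], [2], [2], [2]]
max_sizes = []

total = sum(
    f * f * (1 + (1 if max_sizes else 0) + 2 * len(ars))
    for f, ars in zip(feature_maps, aspect_ratios)
)
print(total)  # 6375, so forward() returns a (6375, 4) tensor
```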