Related functions
1. torch.clamp()
torch.clamp(input, min, max, out=None) --> Tensor
Clamps every element of the input Tensor into the range [min, max] and returns the truncated result as a new Tensor.
Example:
import torch

input = torch.Tensor([2, 1, 0.5, -0.5])
output = torch.clamp(input, 0, 1)
print(output)
'''
tensor([1.0000, 1.0000, 0.5000, 0.0000])
'''
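The truncation rule torch.clamp applies is simple enough to restate in plain Python; this dependency-free sketch mirrors the element-wise behavior shown above:

```python
# Element-wise truncation, mirroring torch.clamp: out_i = max(lo, min(hi, in_i)).
def clamp(values, lo, hi):
    return [max(lo, min(hi, v)) for v in values]

print(clamp([2.0, 1.0, 0.5, -0.5], 0.0, 1.0))  # [1.0, 1.0, 0.5, 0.0]
```

In PyTorch either bound may also be omitted, e.g. torch.clamp(x, min=0) behaves like ReLU.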
2. itertools.product()
product(*iterables, repeat=1) --> product object
iterables are the input iterable objects; repeat specifies how many times to repeat them, so product(A, repeat=3) is equivalent to product(A, A, A).
This function computes the Cartesian product of the input iterables, which is equivalent to nested for loops. product returns an iterator.
from itertools import product
list1 = ['a', 'b']
list2 = ['c', 'd']
for item in product(list1, list2):
    print(item)
'''
('a', 'c')
('a', 'd')
('b', 'c')
('b', 'd')
'''
list1 = [1, 2, 3, 4]
for item in product(list1, repeat=2):
    print(item)
'''
Cartesian product of list1 with itself:
(1, 1)
(1, 2)
(1, 3)
(1, 4)
...
(4, 1)
(4, 2)
(4, 3)
(4, 4)
'''
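The repeat equivalence mentioned above can be checked directly:

```python
from itertools import product

A = [1, 2, 3]

# product(A, repeat=3) yields exactly the same tuples as product(A, A, A),
# len(A) ** 3 = 27 of them.
assert list(product(A, repeat=3)) == list(product(A, A, A))
print(len(list(product(A, repeat=3))))  # 27
```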
Related parameters
# RefineDet CONFIGS
voc_refinedet = {
'320': {
'num_classes': 21,
'lr_steps': (80000, 100000, 120000),
'max_iter': 120000,
'feature_maps': [40, 20, 10, 5],
'min_dim': 320,
'steps': [8, 16, 32, 64],
'min_sizes': [32, 64, 128, 256],
'max_sizes': [],
'aspect_ratios': [[2], [2], [2], [2]],
'variance': [0.1, 0.2],
'clip': True,
'name': 'RefineDet_VOC_320',
},
'512': {
'num_classes': 21,
'lr_steps': (80000, 100000, 120000),
'max_iter': 120000,
'feature_maps': [64, 32, 16, 8],
'min_dim': 512,
'steps': [8, 16, 32, 64],
'min_sizes': [32, 64, 128, 256],
'max_sizes': [],
'aspect_ratios': [[2], [2], [2], [2]],
'variance': [0.1, 0.2],
'clip': True,
'name': 'RefineDet_VOC_512',
}
}
num_classes is the number of classes in the dataset plus one for background (VOC: 20 + 1).
feature_maps are the spatial sizes of the FPN feature maps.
steps are the downsampling rates of each feature map relative to the input image; they are used to map the center coordinates of the default boxes at every feature-map location back to input-image coordinates.
min_sizes and max_sizes are the default box sizes for each feature layer; my reading is that spelling both out means a layer's anchor size falls between these two values.
aspect_ratios are the width-to-height ratios of the anchors.
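As a worked example of how steps and min_sizes interact, here is the cx = (j + 0.5) * step / min_dim computation that PriorBox performs, evaluated by hand for the first feature map of the '320' config:

```python
# Anchor geometry for the first feature map of the '320' config:
# f = 40 cells, step = 8, min_size = 32, min_dim = 320.
min_dim, step, min_size = 320, 8, 32

j, i = 0, 0                 # the top-left cell of the 40x40 map
f_k = min_dim / step        # 40.0: the feature-map size, recovered from the step
cx = (j + 0.5) / f_k        # 0.0125, i.e. pixel 4 on the 320 px input
cy = (i + 0.5) / f_k
s_k = min_size / min_dim    # 0.1: a 32 px square anchor in normalized units

print(cx, cy, s_k)  # 0.0125 0.0125 0.1
```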
Code walkthrough
import torch
from math import sqrt
from itertools import product

class PriorBox(object):
"""Compute priorbox coordinates in center-offset form for each source
feature map.
"""
def __init__(self, cfg):
super(PriorBox, self).__init__()
self.image_size = cfg['min_dim']
        # len(aspect_ratios) is the number of feature maps here (4), not the
        # number of priors per location; this comment is inherited from SSD
        self.num_priors = len(cfg['aspect_ratios'])
self.variance = cfg['variance'] or [0.1] # [0.1, 0.2]
self.feature_maps = cfg['feature_maps'] #[64, 32, 16, 8]
self.min_sizes = cfg['min_sizes'] #[32, 64, 128, 256]
self.max_sizes = cfg['max_sizes'] #[]
self.steps = cfg['steps'] #[8, 16, 32, 64]
self.aspect_ratios = cfg['aspect_ratios'] #[[2],[2],[2],[2]]
self.clip = cfg['clip'] # True
self.version = cfg['name'] # RefineDet_VOC_512
for v in self.variance:
if v <= 0:
raise ValueError('Variances must be greater than 0')
def forward(self):
mean = []
        for k, f in enumerate(self.feature_maps):  # one pass per feature map; each generates its own anchors
            for i, j in product(range(f), repeat=2):  # visit every (row, col) cell of the feature map
                '''
                Map the feature-map cell back to input-image coordinates, then
                scale into [0, 1] relative coordinates. The underlying formula is
                cx = (j + 0.5) * step / min_dim, split into two steps here.
                '''
f_k = self.image_size / self.steps[k]
# unit center x,y
cx = (j + 0.5) / f_k
cy = (i + 0.5) / f_k
# aspect_ratio: 1
# rel size: min_size
s_k = self.min_sizes[k]/self.image_size
mean += [cx, cy, s_k, s_k]
# aspect_ratio: 1
# rel size: sqrt(s_k * s_(k+1))
if self.max_sizes:
s_k_prime = sqrt(s_k * (self.max_sizes[k]/self.image_size))
mean += [cx, cy, s_k_prime, s_k_prime]
# rest of aspect ratios
for ar in self.aspect_ratios[k]:
mean += [cx, cy, s_k*sqrt(ar), s_k/sqrt(ar)]
mean += [cx, cy, s_k/sqrt(ar), s_k*sqrt(ar)]
        # back to torch land: reshape the flat list into the standard n-by-4 anchor layout
        output = torch.Tensor(mean).view(-1, 4)
if self.clip:
output.clamp_(max=1, min=0)
return output
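A quick sanity check on forward(): per cell the loop emits one ar = 1 box, one extra box only when max_sizes is non-empty, and two boxes per additional aspect ratio. Counting those for the '320' config needs no torch at all:

```python
# Prior count for the '320' config: 1 (ar=1) + 0 (max_sizes is empty)
# + 2 per extra aspect ratio = 3 anchors per feature-map cell.
feature_maps = [40, 20, 10, 5]
aspect_ratios = [[2], [2], [2], [2]]
max_sizes = []

total = sum(
    f * f * (1 + (1 if max_sizes else 0) + 2 * len(ars))
    for f, ars in zip(feature_maps, aspect_ratios)
)
print(total)  # 6375, so forward() returns a (6375, 4) tensor
```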