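There are plenty of similar articles and code snippets online, but none of them are very good, so here is the one I use in competitions.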
import os
import json
import random
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['font.family']='sans-serif'
plt.rcParams['figure.figsize'] = (10.0, 5.0)
# Open your training-set annotations here; the file must be in COCO format
with open('annotation.json') as f:
    ann=json.load(f)
# Map category id -> name, then count the annotations per category name
category_dic=dict([(i['id'],i['name']) for i in ann['categories']])
counts_label=dict([(i['name'],0) for i in ann['categories']])
for i in ann['annotations']:
    counts_label[category_dic[i['category_id']]]+=1
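box_w = []
box_h = []
box_wh = []
# One list of H/W ratios per category, indexed by category_id - 1
categorys_wh = [[] for j in range(max(c['id'] for c in ann['categories']))]
for a in ann['annotations']:
    # COCO bbox is [x, y, width, height]; skip id 0 and zero-width boxes
    if a['category_id'] != 0 and a['bbox'][2] > 0:
        box_w.append(round(a['bbox'][2],2))
        box_h.append(round(a['bbox'][3],2))
        # anchor_ratio means H/W, so H/W is what we count here
        hw = a['bbox'][3]/a['bbox'][2]
        # Bucket the ratio: whole numbers above 1, one decimal place otherwise
        if hw > 1:
            hw = round(hw, 0)
        else:
            hw = round(hw, 1)
        box_wh.append(hw)
        categorys_wh[a['category_id']-1].append(hw)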
# Distinct H/W ratios across all annotations, and how often each one occurs
box_wh_unique = list(set(box_wh))
box_wh_unique.sort()
box_wh_count=[box_wh.count(i) for i in box_wh_unique]
# Plot the distribution
wh_df = pd.DataFrame(box_wh_count,index=box_wh_unique,columns=['H/W ratio count'])
wh_df.plot(kind='bar',color="#55aacc")
plt.show()
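Once you have the plot, you can pick a rough set of ratios: choose the ones that occur most often. In this example I picked the ones marked with red arrows. Train with them and check the results, then adjust the ratios slightly and compare to see which set works best.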
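If you would rather not read the candidates off the chart by eye, the same "pick the most frequent ratios" step can be sketched in code. This is only a minimal sketch and not part of my original script: the helper name `top_ratios` and its parameters `k` and `min_share` are made up for illustration, and the toy list stands in for `box_wh`.

```python
from collections import Counter

def top_ratios(ratios, k=5, min_share=0.0):
    """Return up to k of the most frequent bucketed H/W ratios, sorted ascending.

    ratios    -- list of bucketed H/W values (e.g. box_wh from the script above)
    k         -- hypothetical cap on how many ratios to keep
    min_share -- hypothetical minimum fraction of all boxes a ratio must cover
    """
    counts = Counter(ratios)
    total = sum(counts.values())
    # most_common(k) yields (ratio, count) pairs, most frequent first
    picked = [r for r, c in counts.most_common(k) if c / total >= min_share]
    return sorted(picked)

# Toy data standing in for box_wh (assumed values, not from a real dataset)
example = [0.5] * 40 + [1.0] * 30 + [2.0] * 20 + [0.3] * 7 + [5.0] * 3
print(top_ratios(example, k=3))                  # [0.5, 1.0, 2.0]
print(top_ratios(example, k=5, min_share=0.05))  # [0.3, 0.5, 1.0, 2.0]
```

Treat the output as a starting point only; the final choice should still come from training runs and comparing results.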