Multi-class, arbitrarily oriented object detection in remote sensing images with the DOTA dataset

I am currently trying to use the DOTA dataset for multi-class, arbitrarily oriented object detection in remote sensing images.

I. The DOTA dataset

Paper and GitHub: OBB-Faster-RCNN, Deformable Model

1. The annotations have already been converted to XML according to the formula.

2. The images are too large to be fed into the network directly, so they need to be cropped.
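
For step 1, here is a minimal sketch of what such a conversion could look like. It assumes the standard DOTA label line (`x1 y1 x2 y2 x3 y3 x4 y4 category difficult`) and writes a VOC-like XML tree. The tag layout (`bndbox` with `x1` to `y4`) and the helper names `dota_line_to_object` and `build_xml` are my own assumptions for illustration; they do not reproduce the formula used in this post, and your training code may expect a different schema.

```python
import xml.etree.ElementTree as ET


def dota_line_to_object(line):
    """Parse one DOTA label line: x1 y1 x2 y2 x3 y3 x4 y4 category difficult."""
    parts = line.split()
    # Assumption: coordinates are rounded to integers for the XML file.
    coords = [int(round(float(v))) for v in parts[:8]]
    return {"coords": coords, "name": parts[8], "difficult": int(parts[9])}


def build_xml(filename, width, height, objects):
    """Build a VOC-like annotation tree (the tag layout is hypothetical)."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    tags = ("x1", "y1", "x2", "y2", "x3", "y3", "x4", "y4")
    for obj in objects:
        node = ET.SubElement(root, "object")
        ET.SubElement(node, "name").text = obj["name"]
        ET.SubElement(node, "difficult").text = str(obj["difficult"])
        box = ET.SubElement(node, "bndbox")
        for tag, value in zip(tags, obj["coords"]):
            ET.SubElement(box, tag).text = str(value)
    return root


# Example with a made-up label line and a made-up file name.
objects = [dota_line_to_object("10.0 20.0 110.0 20.0 110.0 60.0 10.0 60.0 plane 0")]
xml_text = ET.tostring(build_xml("P0001.png", 1024, 1024, objects), encoding="unicode")
```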

II. Training phase

Use DOTA_devkit to split the data into patches, merge the results, visualize the data, etc.

Cropping cuts some complete objects into two parts.
The patches are a series of 1024×1024 crops taken from the original images with a stride of 512.
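
To make the cropping concrete, here is a small sketch of how the patch windows could be enumerated for a 1024×1024 patch and a stride of 512. The function names and the border handling (adding one extra patch flush with the image edge) are assumptions of mine; DOTA_devkit may treat the borders differently.

```python
def patch_origins(length, patch=1024, stride=512):
    """Left (or top) offsets of the patches covering one image dimension."""
    if length <= patch:
        return [0]
    origins = list(range(0, length - patch + 1, stride))
    if origins[-1] + patch < length:
        # Assumption: add one last patch aligned with the image border.
        origins.append(length - patch)
    return origins


def patch_boxes(width, height, patch=1024, stride=512):
    """All patch windows as (left, top, right, bottom) tuples."""
    return [(x, y, min(x + patch, width), min(y + patch, height))
            for y in patch_origins(height, patch, stride)
            for x in patch_origins(width, patch, stride)]
```

With these defaults, neighbouring patches overlap by half, so an object cut by one patch border usually appears whole in a neighbouring patch.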


If $U_i > 0.7$, keep the original annotation;

otherwise, label the instance as difficult.
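
The rule above can be sketched in code. I am assuming here that $U_i$ is the fraction of the object's area that remains inside the patch, and that each object is a quadrilateral given as four (x, y) points. The clipping uses the Sutherland-Hodgman algorithm against the patch rectangle; all function names are hypothetical, and this is not the DOTA_devkit implementation.

```python
def polygon_area(points):
    """Shoelace formula; points is a list of (x, y) tuples."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_half_plane(points, axis, bound, keep_greater):
    """Keep the part of the polygon with coordinate >= bound (or <= bound)."""
    def inside(p):
        return p[axis] >= bound if keep_greater else p[axis] <= bound

    result = []
    for i in range(len(points)):
        cur, nxt = points[i], points[(i + 1) % len(points)]
        if inside(cur):
            result.append(cur)
        if inside(cur) != inside(nxt):
            # The edge crosses the boundary: add the intersection point.
            t = (bound - cur[axis]) / (nxt[axis] - cur[axis])
            result.append((cur[0] + t * (nxt[0] - cur[0]),
                           cur[1] + t * (nxt[1] - cur[1])))
    return result


def inside_ratio(quad, patch):
    """Assumed U_i: area of the object inside the patch / its full area."""
    left, top, right, bottom = patch
    clipped = list(quad)
    for axis, bound, keep_greater in ((0, left, True), (0, right, False),
                                      (1, top, True), (1, bottom, False)):
        clipped = clip_half_plane(clipped, axis, bound, keep_greater)
    full = polygon_area(quad)
    return polygon_area(clipped) / full if full > 0 else 0.0


def difficult_flag(quad, patch, original_difficult=0, threshold=0.7):
    """Keep the original flag if U_i > threshold, otherwise mark as difficult."""
    if inside_ratio(quad, patch) > threshold:
        return original_difficult
    return 1
```

This sketch only decides the difficult flag; whether objects that fall entirely outside a patch are dropped is a separate step that it does not cover.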

III. Code modifications

Modify the LMDB generation process according to the classes (and other properties) of your own dataset.

Modify the prototxt files.

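import os

# Input size used to build the job name, defined as in SSD's ssd_pascal.py.
# 300 is only an example value; set it to your network's input size.
resize_width = 300
resize_height = 300
resize = "{}x{}".format(resize_width, resize_height)

# Modify the job name if you want.
job_name = "SSD_{}".format(resize)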
# The name of the model. Modify it if you want.
model_name = "VGG_VOC0712_{}".format(job_name)

# Directory which stores the model .prototxt file.
save_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the snapshot of models.
snapshot_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the job script and log file.
job_dir = "jobs/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/data/VOCdevkit/results/VOC2007/{}/Main".format(os.environ['HOME'], job_name)

# model definition files.
train_net_file = "{}/train.prototxt".format(save_dir)
test_net_file = "{}/test.prototxt".format(save_dir)
deploy_net_file = "{}/deploy.prototxt".format(save_dir)
solver_file = "{}/solver.prototxt".format(save_dir)
# snapshot prefix.
snapshot_prefix = "{}/{}".format(snapshot_dir, model_name)
# job script path.
job_file = "{}/{}.sh".format(job_dir, model_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/VOC0712/test_name_size.txt"
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"
# Stores LabelMapItem.
label_map_file = "data/VOC0712/labelmap_voc.prototxt"

Testing phase

Detection is run on the cropped patches to get temporary results, which are then merged to obtain the final detection results.

In the testing phase, first we send the cropped image patches to obtain temporary results and then
we combine the results together to restore the detecting results on the original image.
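
As a rough sketch of the merging step: each patch's detections are in patch-local coordinates, so they can be shifted by the patch's top-left offset to land back on the original image. The detection layout (a dict with a flat `coords` list of x, y pairs) and the function names are assumptions for illustration, not the DOTA_devkit API.

```python
def to_original_coords(detections, patch_origin):
    """Shift patch-local polygons by the patch's top-left offset."""
    ox, oy = patch_origin
    shifted = []
    for det in detections:
        # Even indices are x values, odd indices are y values.
        moved = [c + (ox if i % 2 == 0 else oy)
                 for i, c in enumerate(det["coords"])]
        shifted.append({**det, "coords": moved})
    return shifted


def merge_patches(per_patch):
    """per_patch maps (x_offset, y_offset) to that patch's detections."""
    merged = []
    for origin, detections in per_patch.items():
        merged.extend(to_original_coords(detections, origin))
    return merged
```

Because the patches overlap, the same object can be detected in more than one patch. Duplicates would normally be suppressed afterwards (for example with non-maximum suppression); that step is left out of this sketch.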

Related problems

Error caused by changing the number of classes

 net.cpp:774] Cannot copy param 0 weights from layer 'conv4_3_norm_mbox_conf'; shape mismatch.  Source param shape is 40 512 3 5 (307200); target param shape is 320 512 3 5 (2457600). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.


The model originally had two classes; now it has sixteen.

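This happens because the parameters of the pretrained network do not match the current model architecture. Renaming the layers that raise the error fixes it. The pretrained model recognized two classes (background and text), and the source code uses 20 prior boxes per location, so num_output was 40. I now want to recognize 16 classes, so the output should be 320.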
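
The arithmetic can be checked against the shapes in the error message. This is only a sanity-check sketch; the function names are mine, and the 512 input channels and 3×5 kernel are read off the log line above.

```python
def conf_num_output(priors_per_location, num_classes):
    """Confidence outputs per location: one score per prior box and class."""
    return priors_per_location * num_classes


def conf_weight_count(priors_per_location, num_classes,
                      in_channels=512, kernel=(3, 5)):
    """Number of weights in the confidence convolution layer."""
    out_channels = conf_num_output(priors_per_location, num_classes)
    return out_channels * in_channels * kernel[0] * kernel[1]


old_outputs = conf_num_output(20, 2)    # pretrained model: background + text
new_outputs = conf_num_output(20, 16)   # current model: 16 classes
```

The two weight counts come out as 307200 and 2457600, matching the source and target shapes that Caffe reports.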

I solved this problem by deleting files such as "VGG_text_text_polygon_precise_fix_order_384x384_iter_120000.solverstate".

Deleting all the solverstate files solved the problem.

Later, I realized that the real cause was that I had switched to a different caffemodel to train on my data, which is why I could solve the problem this way!

The latest solution is to rename the layers you added, or to change the layer names in "model_libs.py". It works!

But the train val is very low, so I suspect this is a bad solution. I don't know what else I can do!
-----------------

For TextBoxes++:
In the CreateMultiBoxHead_multitask function in model_libs.py, change

name = "{}_mbox_conf{}".format(from_layer, conf_postfix)

to

name = "{}_mbox_conf1{}".format(from_layer, conf_postfix)

Problem 2

Train net output #0: mbox_loss = 0 (* 1 = 0 loss)

Cause: all coordinates in the label files must be integers.
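
A minimal sketch of one way to fix the labels, assuming each label line starts with eight whitespace-separated coordinates followed by other fields (as in the DOTA format). The function name and the line layout are assumptions; adapt them to your own label files.

```python
def round_label_line(line, num_coords=8):
    """Round the leading coordinates of a label line to integers."""
    parts = line.split()
    # Note: Python's round() sends exact halves to the nearest even integer.
    coords = [str(int(round(float(v)))) for v in parts[:num_coords]]
    return " ".join(coords + parts[num_coords:])


def round_label_lines(lines, num_coords=8):
    """Apply the rounding to every non-empty line."""
    return [round_label_line(line, num_coords) for line in lines if line.strip()]
```

For example, `round_label_line("10.4 20.6 110.0 20.0 110.2 59.8 10.0 60.0 plane 0")` gives `"10 21 110 20 110 60 10 60 plane 0"`.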

Other
