Open-Source DL Framework Caffe | Object Detection with Faster R-CNN: Common Questions Explained

Original source: http://blog.csdn.net/u010402786 https://blog.csdn.net/u010402786/article/details/72675831

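1. Project Directory

After cloning the code from GitHub, the root directory contains the folders below. The output folder only appears once training has finished.

caffe-fast-rcnn: the Caffe framework itself;
data: holds the pretrained models (for example, the ImageNet ones) and the cache of parsed dataset files;
experiments: holds the configuration files and the run logs; its scripts subfolder contains scripts for training with either the end2end or the alt_opt method;
lib: holds the Python interface code; for example, datasets handles dataset loading and config holds the training options for the network;
models: holds three model definitions: the small ZF network, the large VGG16 network, and the mid-sized VGG_CNN_M_1024 network. VGG16 is recommended; with end-to-end approximate joint training and cuDNN enabled, about 3 GB of GPU memory is enough;
output: the output directory written after training completes; by default results go into the faster_rcnn_end2end subfolder;
tools: holds the Python scripts for training and testing.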

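2. Training Methods

Alternative training (alt-opt)
Approximate joint training (end-to-end)

The second method is recommended: it uses less GPU memory and trains faster, while accuracy is about the same. The two methods require different code changes. Faster R-CNN ships three models: the small ZF model, the mid-sized VGG_CNN_M_1024, and the large VGG16. According to the paper, VGG16 gives better results than the other two, but it also needs more GPU memory (~11 GB).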

3. Training Code

cd py-faster-rcnn
./experiments/scripts/faster_rcnn_alt_opt.sh 0 VGG16 pascal_voc
# First GPU (0), model VGG16, dataset pascal_voc
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]

python ./tools/train_net.py --gpu 1 --solver models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_end2end/solver.prototxt --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb voc_2012_trainval --iters 70000 --cfg experiments/cfgs/faster_rcnn_end2end.yml

Question 1: How do I draw the boxes of different classes in different colors on the same image?

Modify the code in demo.py as follows:

`# Visualize detections for each class
CONF_THRESH = 0.7
NMS_THRESH = 0.3
for cls_ind, cls in enumerate(CLASSES[1:]):
    cls_ind += 1  # because we skipped background
    cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
    cls_scores = scores[:, cls_ind]
    dets = np.hstack((cls_boxes,
                      cls_scores[:, np.newaxis])).astype(np.float32)
    keep = nms(dets, NMS_THRESH)
    dets = dets[keep, :]

    # Draw with OpenCV instead of calling
    # vis_detections(im, cls, dets, thresh=CONF_THRESH)
    font = cv2.FONT_HERSHEY_SIMPLEX
    # Box color per class, in BGR order
    if cls_ind == 1:  # motorbike
        color = (0, 0, 255)
    elif cls_ind == 2:  # car
        color = (0, 255, 0)
    elif cls_ind == 3:  # bus
        color = (255, 0, 0)
    else:  # truck
        color = (255, 255, 255)
    # Keep only detections at or above the confidence threshold
    inds = np.where(dets[:, -1] >= CONF_THRESH)[0]
    for i in inds:
        # OpenCV drawing functions expect integer pixel coordinates
        x1, y1, x2, y2 = [int(v) for v in dets[i, :4]]
        score = dets[i, -1]
        cv2.rectangle(im, (x1, y1), (x2, y2), color, 2)
        cv2.putText(im, '{:s} {:.3f}'.format(cls, score), (x1, y1 - 2), font, 0.5, (0, 255, 0), 1)

# Display the resulting frame
cv2.imshow('{:s}'.format(image_name), im)`
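The if/elif chain above grows with every new class. One possible alternative, shown here as a minimal sketch, is a lookup table keyed by class name. The names `CLASS_COLORS`, `DEFAULT_COLOR`, and `color_for_class` are hypothetical and not part of demo.py; the colors are in BGR order because that is what OpenCV drawing calls expect.

```python
# Hypothetical class-to-color table (BGR order, as OpenCV expects).
CLASS_COLORS = {
    'motorbike': (0, 0, 255),   # red
    'car': (0, 255, 0),         # green
    'bus': (255, 0, 0),         # blue
}
DEFAULT_COLOR = (255, 255, 255)  # white for any class not listed


def color_for_class(class_name):
    """Return the BGR color for a class name, falling back to white."""
    return CLASS_COLORS.get(class_name, DEFAULT_COLOR)


print(color_for_class('car'))
print(color_for_class('truck'))
```

Inside the detection loop, `color = color_for_class(cls)` would then replace the whole chain.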

4. Application Scenarios

Question 1: What should I do if I want to detect small objects?
Answer: Change the parameters of the anchor_target_layer and proposal_layer layers, [link here]

scales: decrease these values to account for smaller boxes
ratios: adjust them depending on the shape of your ground-truth boxes
feat_stride: supposedly this can be modified to improve accuracy of the generated anchors

Question 2: How do I run detection on video in real time? (#578)
Answer: Modify the original demo.py as follows:

def demo_video(net, videoFile):
    """Detect objects in the next frame of an open cv2.VideoCapture."""
    global frameRate
    # Read the next frame; report failure once the video has ended
    ret, im = videoFile.read()
    if not ret:
        return False
    # Detect all object classes and regress object bounds
    timer = Timer()
    timer.tic()
    scores, boxes = im_detect(net, im)
    timer.toc()
    print('Detection took {:.3f}s for '
          '{:d} object proposals'.format(timer.total_time, boxes.shape[0]))
    frameRate = 1.0 / timer.total_time
    print("fps: " + str(frameRate))
    # Visualize detections for each class
    CONF_THRESH = 0.65
    NMS_THRESH = 0.2
    for cls_ind, cls in enumerate(CLASSES[1:]):
        cls_ind += 1  # because we skipped background
        cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        keep = nms(dets, NMS_THRESH)
        dets = dets[keep, :]
        # vis_detections_video is a helper you write yourself (it is not in
        # demo.py): it draws the boxes on im and returns the frame
        im = vis_detections_video(im, cls, dets, thresh=CONF_THRESH)
    # Overlay the frame rate; (1750, 50) assumes a frame at least that wide
    cv2.putText(im, '{:s} {:.2f}'.format("FPS:", frameRate), (1750, 50),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255))
    cv2.imshow(videoFilePath.split('/')[-1], im)
    cv2.waitKey(20)
    return True

# Open the video once, then process it frame by frame until it ends
video = cv2.VideoCapture(videoFilePath)
while demo_video(net, video):
    pass
video.release()
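The video code calls `vis_detections_video`, which the stock demo.py does not provide. The following is a minimal sketch of what such a helper could look like, assuming each row of `dets` is `[x1, y1, x2, y2, score]` as in the loop above. Both function names and the chosen colors are hypothetical; the threshold filtering is kept in a separate function so it can be checked without OpenCV installed.

```python
def select_detections(dets, thresh):
    """Keep rows [x1, y1, x2, y2, score] whose score is at least thresh."""
    kept = []
    for row in dets:
        score = float(row[4])
        if score >= thresh:
            # Truncate to integers because OpenCV draws at pixel coordinates.
            x1, y1, x2, y2 = [int(v) for v in row[:4]]
            kept.append((x1, y1, x2, y2, score))
    return kept


def vis_detections_video(im, class_name, dets, thresh=0.5):
    """Draw the kept boxes on the frame and return it (hypothetical helper)."""
    import cv2  # imported here so select_detections works without OpenCV
    for x1, y1, x2, y2, score in select_detections(dets, thresh):
        cv2.rectangle(im, (x1, y1), (x2, y2), (0, 0, 255), 2)
        cv2.putText(im, '{:s} {:.3f}'.format(class_name, score),
                    (x1, max(y1 - 2, 0)), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, (0, 255, 0), 1)
    return im
```

Returning the frame keeps the call site `im = vis_detections_video(...)` unchanged, so boxes from every class accumulate on the same image before it is shown.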

Question 3: How do I detect small targets? (#443)

To detect small targets inside a large image, change the anchor parameters in generate_anchors.py.
From this:
def generate_anchors(base_size=16, ratios=[0.5, 1, 2], scales=2**np.arange(3, 6)):
To this:
def generate_anchors(base_size=16, ratios=[0.3, 0.75, 1], scales=2**np.arange(3, 6)):

References: [link 1], [link 2]
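To judge what a given set of ratios and scales does, it helps to print the resulting anchor sizes. The sketch below follows the arithmetic of generate_anchors.py as I understand it (roughly constant area per ratio, then multiplication by each scale), written in plain Python rather than NumPy; `anchor_shapes` is a hypothetical name, and the result should be compared against the real function before relying on it. Note that `ratios` only changes the shape of the anchors, while `scales` controls their size.

```python
import math


def anchor_shapes(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Return (width, height) for every ratio/scale combination."""
    area = base_size * base_size
    shapes = []
    for ratio in ratios:
        # Keep the area roughly constant while changing height/width.
        w = round(math.sqrt(area / float(ratio)))
        h = round(w * ratio)
        for scale in scales:
            shapes.append((int(w * scale), int(h * scale)))
    return shapes


print(anchor_shapes())                             # default anchors
print(anchor_shapes(ratios=(0.3, 0.75, 1)))        # the ratios suggested above
print(anchor_shapes(scales=(2, 4, 8)))             # smaller scales, smaller boxes
```

With the defaults, the smallest square anchor is 128 x 128 pixels, so targets far below that size may be better served by lowering `scales` as well as adjusting `ratios`.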

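5. Training Problems

Question 1: Training finished, but the model detects nothing on the original images. Why?

Cause: most likely some of the annotated boxes extend beyond the image boundary. Two ways to validate the annotations are recommended: [check the boxes] and the latest version of LabelImg.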
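A quick scripted check can also flag out-of-bounds boxes before training. The sketch below assumes Pascal VOC style XML annotations with a `size` element and 1-based, inclusive pixel coordinates; `find_bad_boxes` and the sample annotation are made up for illustration.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text):
    """Return (class name, box) pairs that fall outside the image."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    bad = []
    for obj in root.findall('object'):
        name = obj.find('name').text
        bb = obj.find('bndbox')
        box = tuple(int(float(bb.find(tag).text))
                    for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        xmin, ymin, xmax, ymax = box
        # Assumes 1-based, inclusive pixel coordinates.
        if not (1 <= xmin < xmax <= width and 1 <= ymin < ymax <= height):
            bad.append((name, box))
    return bad


# Made-up annotation: the second box starts at 0 and runs past the width.
SAMPLE = """<annotation>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object><name>car</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object><name>bus</name>
    <bndbox><xmin>0</xmin><ymin>10</ymin><xmax>520</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>"""

print(find_bad_boxes(SAMPLE))  # [('bus', (0, 10, 520, 200))]
```

To scan a whole dataset, read each annotation file and pass its contents to `find_bad_boxes`, then fix or drop the reported boxes.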

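Question 2: How do I train only an RPN model? (#364)

First, you need to know how alt_opt works:

1. Train the RPN
2. Write out the RPN proposals
3. Train Fast R-CNN using the generated RPN proposals
4. Repeat steps 1-3 to optimise the weights for the RPN and Fast R-CNN

So steps 1-2 alone are enough to generate proposals. To visualize these proposals, set visualisation to 1 in lib/rpn/generate.py.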

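Question 3: How do I train Faster R-CNN on multiple GPUs?

The short answer is that you cannot: this Python implementation does not support multi-GPU training. There are workarounds, though:
1. https://github.com/315386775/py-R-FCN-multiGPU is a fork that supports multiple GPUs
2. MXNet supports multi-GPU training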

Update (05-26)
Question 4: bbox_loss is 0 during training

The related issue is linked here: [bbox_loss is 0 issue]

6. Training Logs

The script faster_rcnn_end2end.sh under experiments/scripts in $FRCN_ROOT shows where the training log is written:

LOG="experiments/logs/faster_rcnn_end2end_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
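Once a log file exists, the loss values can be pulled out with a small script. This is a sketch that assumes the log contains solver lines of the form `Iteration N, loss = X`, which is what Caffe typically prints; the sample lines below are invented, so adjust the pattern if your log looks different.

```python
import re

# Assumed line shape: "... Iteration 20, loss = 1.2345"
LOSS_RE = re.compile(r'Iteration (\d+), loss = ([0-9.eE+-]+)')


def parse_losses(lines):
    """Return (iteration, loss) pairs found in an iterable of log lines."""
    pairs = []
    for line in lines:
        match = LOSS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


# Invented sample lines, for illustration only.
sample = [
    'I0526 10:00:00.000000  1234 solver.cpp:229] Iteration 0, loss = 3.5',
    'I0526 10:00:01.000000  1234 solver.cpp:245]     Train net output #0: bbox_loss = 0.25',
    'I0526 10:00:09.000000  1234 solver.cpp:229] Iteration 20, loss = 1.75',
]
print(parse_losses(sample))  # [(0, 3.5), (20, 1.75)]
```

For a real run, open the file under experiments/logs and pass the file object to `parse_losses`; the pairs can then be plotted to see whether the loss is still falling.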
