Original article. Please notify the author before reposting; unauthorized copies will be pursued.

mmdetection 2.0 (v2.0.0): environment setup, training on your own dataset, testing, and common errors

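Contents:

1 Setting up the mmdetection environment

2 Preparing your own dataset

3 Training on your own dataset

3.1 Modifying the configuration files

3.2 Starting training

3.3 Testing with the model you trained

3.4 Running your trained model on images and video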

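My environment:

OS: Ubuntu 18.04.1
CUDA: 10.2.89
cuDNN: 7.6.5
torch: 1.5.0
torchvision: 0.6.0
mmcv: 0.5.5
Project code: mmdetection v2.0.0, officially released on 2020-05-06

The environment was set up and the code downloaded on 26 May 2020, when v2.0.0 was the latest release.

Note:
This release changes the file layout considerably, along with a number of details; I cover them as they come up.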

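The main refactoring and changes in mmdetection v2.0.0 (details in the linked reference):
1. Faster: training and inference are optimized for common models, with training about 30% faster and inference about 25% faster. See the model_zoo for per-model details.

2. Higher accuracy: some default hyperparameters were changed at no extra cost, which improves most models. See the compatibility documentation for details.

3. More documentation and tutorials, added to help new users get started.

4. PyTorch 1.5 support: PyTorch 1.1 and 1.2 are no longer supported, and the code switches to some newer APIs.

5. A better configuration system: configs support inheritance, which reduces redundancy.

6. A more modular design: to keep things simple and flexible, encapsulation was simplified and more configurable modules were added, such as BBoxCoder, IoUCalculator, OptimizerConstructor and RoIHead. Target computation now lives in the heads, and the call hierarchy is simpler.

7. New methods: FSAF and PAFPN (part of PAFPN).

Breaking change: models trained with MMDetection 1.x are not fully compatible with 2.0. See the compatibility documentation for details and for how to migrate.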

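About the data:
I use the VOC2007 dataset as training data, mainly to run a quick proof of concept and to avoid spending time on annotation. The article also explains how to set up a custom dataset and which dataset-related settings to change.

VOC2007 download:

1 Setting up the mmdetection environment

Check the CUDA and cuDNN versions first.
Checking the CUDA version on Linux:

cat /usr/local/cuda/version.txt

Example: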

(mmdetection) zpp@estar-cvip:/HDD/zpp/shl/mmdetection$ cat /usr/local/cuda/version.txt
CUDA Version 10.2.89
(mmdetection) zpp@estar-cvip:/HDD/zpp/shl/mmdetection$

Checking the cuDNN version on Linux:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

Example:

(mmdetection) zpp@estar-cvip:/HDD/zpp/shl/mmdetection$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
(mmdetection) zpp@estar-cvip:/HDD/zpp/shl/mmdetection$

As shown, my cuDNN version is 7.6.5.
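The three #define lines in the grep output are combined by hand into 7.6.5. As a minimal sketch, the same reading can be done in Python; parse_cudnn_version is a hypothetical helper written for this article, and the sample text is copied from the output above.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h-style #define lines."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match a whole line such as "#define CUDNN_MAJOR 7"
        match = re.search(r"^#define\s+" + key + r"\s+(\d+)\s*$",
                          header_text, re.MULTILINE)
        if match is None:
            raise ValueError("missing " + key)
        parts.append(int(match.group(1)))
    return tuple(parts)


# Sample text copied from the grep output shown earlier
sample = """#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

major, minor, patch = parse_cudnn_version(sample)
version_string = "%d.%d.%d" % (major, minor, patch)
# Same formula as the CUDNN_VERSION macro in the header
cudnn_version_macro = major * 1000 + minor * 100 + patch
print(version_string, cudnn_version_macro)
```

To run it against a real installation, pass the contents of /usr/local/cuda/include/cudnn.h instead of the sample string.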

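1.1 Creating a virtual environment

1. Create the virtual environment:

conda create -n mmdetection python=3.7

2. Activate it:

conda activate mmdetection    # or: source activate mmdetection

To leave the virtual environment:

conda deactivate    # or: source deactivate

Note:

Stay inside the virtual environment while installing everything below; do not deactivate it.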

1.2 Installing the required packages

Install torch, torchvision and mmcv:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch==1.5.0 torchvision==0.6.0 mmcv==0.5.5

Versions I installed:

torch==1.5.0
torchvision==0.6.0
mmcv==0.5.5
Cython==0.29.16

Note:
If the download keeps getting interrupted, download the torch .whl file first and then install it with pip install torch_xxxx.whl

torch 1.5.0 wheel: https://pypi.tuna.tsinghua.edu.cn/packages/76/58/668ffb25215b3f8231a550a227be7f905f514859c70a65ca59d28f9b7f60/torch-1.5.0-cp37-cp37m-manylinux1_x86_64.whl

1.3 Downloading mmdetection

1. Clone mmdetection locally:

git clone https://github.com/open-mmlab/mmdetection.git

If git clone is too slow, a GitHub mirror can be used instead:

git clone https://github.com.cnpmjs.org/open-mmlab/mmdetection.git

2. Enter the project directory:

cd mmdetection

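1.4 Installing mmdetection's dependencies and building mmdetection

1.4.1 Installing the dependencies

Install the build dependencies; pip also checks whether the versions of packages that are already installed meet the requirements:

pip install -r requirements/build.txt

mmdetection/requirements.txt contains:

-r requirements/build.txt
-r requirements/optional.txt
-r requirements/runtime.txt
-r requirements/tests.txt

As shown, there are four dependency files: build.txt, optional.txt, runtime.txt and tests.txt. Their contents are: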

# build.txt
# These must be installed before building mmdetection
numpy
torch>=1.3

####################################################
# optional.txt
albumentations>=0.3.2
cityscapesscripts
imagecorruptions

##################################################
# runtime.txt
matplotlib
mmcv>=0.5.1
numpy
# need older pillow until torchvision is fixed
Pillow<=6.2.2
six
terminaltables
torch>=1.3
torchvision

##################################################
# tests.txt
asynctest
codecov
flake8
isort
# Note: used for kwarray.group_items, this may be ported to mmcv in the future.
kwarray
pytest
pytest-cov
pytest-runner
ubelt
xdoctest >= 0.10.0
yapf

Missing dependencies are installed, and the versions of those already present are checked against the requirements.
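To make the version check concrete, here is a rough sketch of how a line such as mmcv>=0.5.1 or Pillow<=6.2.2 can be tested against an installed version. It is a simplified illustration, not pip's actual resolver: satisfies and version_tuple are hypothetical helpers, only the >=, <= and == operators are handled, and the Pillow and numpy versions in the example are made up.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


def version_tuple(version, width=4):
    """Turn '1.5.0' (or '1.5.0+cu101') into a zero-padded tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("not a version: " + version)
    numbers = [int(n) for n in match.group(0).split(".")]
    return tuple(numbers + [0] * (width - len(numbers)))


def satisfies(installed_version, requirement):
    """Check one requirement line, e.g. 'Pillow<=6.2.2' or plain 'numpy'."""
    match = re.match(
        r"^\s*[A-Za-z0-9_.-]+\s*(?:(>=|<=|==)\s*([0-9][0-9.]*))?\s*$",
        requirement)
    if match is None:
        raise ValueError("unsupported requirement: " + requirement)
    op, wanted = match.groups()
    if op is None:  # no version constraint, any installed version is fine
        return True
    return OPS[op](version_tuple(installed_version), version_tuple(wanted))


# torch and mmcv versions are the ones listed in this article;
# the Pillow and numpy versions are hypothetical.
print(satisfies("1.5.0", "torch>=1.3"))
print(satisfies("0.5.5", "mmcv>=0.5.1"))
print(satisfies("7.1.2", "Pillow<=6.2.2"))
print(satisfies("1.18.4", "numpy"))
```

With these inputs only the Pillow line fails, which is the kind of mismatch the requirements files are meant to catch.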

1.4.2 Installing cocoapi

Install the cocoapi package:

pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"

If the download fails or is very slow, use the GitHub mirror as above:

pip install "git+https://github.com.cnpmjs.org/cocodataset/cocoapi.git#subdirectory=PythonAPI"

Note:

If cloning prints Couldn't find host github.com in the .netrc file; using defaults and then hangs, use the GitHub mirror instead.

1.4.3 Building mmdetection

Before running the build command below, add cuda-10.2 to the environment variables in .bashrc.
1. Add the environment variables

vi ~/.bashrc

Add the following lines:

export PATH=/usr/local/cuda-10.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

2. Apply the changes

source ~/.bashrc

3. Build mmdetection

python setup.py develop    # or: pip install -v -e .

[Figure: output of a successful build]

Note 1:

If the build fails with error: command 'usr/bin/nvcc' failed with exit status 1, add CUDA to the environment variables as described above.

Note 2:

If a later training or test command fails with ModuleNotFoundError: No module named 'mmdet', mmdetection was not built correctly.

Note 3:
Prefer python setup.py develop. When I built with pip install -v -e . I got an mmcv version error.
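The ${PATH:+:${PATH}} form in the two export lines of step 1 may look odd. It appends ":" plus the old value only when the variable is already set and non-empty, so no stray trailing colon is produced. A small Python sketch of the same rule follows; prepend_to_path is a hypothetical helper used only for illustration.

```python
def prepend_to_path(new_dir, current_value):
    """Python equivalent of the shell expression NEW_DIR${VAR:+:${VAR}}."""
    if current_value:
        # VAR is set and non-empty: keep the old entries after a colon
        return new_dir + ":" + current_value
    # VAR is unset or empty: no trailing colon is added
    return new_dir


print(prepend_to_path("/usr/local/cuda-10.2/bin", "/usr/bin:/bin"))
print(prepend_to_path("/usr/local/cuda-10.2/lib64", ""))
```

Avoiding the empty entry matters mostly for LD_LIBRARY_PATH, where an empty element is generally treated as the current directory.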

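2 Preparing your own dataset

2.1 Annotating the data

Annotate with LabelImg and save all annotations in VOC format. For how to use LabelImg, see my blog post on the LabelImg tool.

All images go in the JPEGImages folder.
All annotation files go in the Annotations folder.

2.2 Splitting and storing the data

2.2.1 About the VOC2007 dataset

VOC2007 contains a training set (5011 images) and a test set (4952 images), 9963 images in total, covering 20 classes.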

Class	Training images	Test images
aeroplane	238	204
bicycle	243	239
bird	330	282
boat	181	172
bottle	244	212
bus	186	174
car	713	721
cat	337	322
chair	445	417
cow	141	127
diningtable	200	190
dog	421	418
horse	287	274
motorbike	245	222
person	2008	2007
pottedplant	245	224
sheep	96	97
sofa	229	223
train	261	259
tvmonitor	256	229

For more about the VOC dataset, see the linked reference.

2.2.2 Splitting and storing VOC2007

The dataset is stored in the following structure:

All annotation files go in ./data/VOCdevkit/VOC2007/Annotations
All images go in ./data/VOCdevkit/VOC2007/JPEGImages

./data
└── VOCdevkit
    └── VOC2007
        ├── Annotations  # VOC-format xml annotation files
        ├── JPEGImages   # dataset images
        ├── ImageSets
        │   └── Main
        │       ├── test.txt      # test split
        │       ├── train.txt     # training split
        │       ├── trainval.txt  # training + validation split
        │       └── val.txt       # validation split
        ├── cal_txt_data_num.py  # counts the entries in test.txt, train.txt, etc.
        └── split_dataset.py     # dataset splitting script

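1. Split the dataset with the split_dataset.py script

Script contents:

import os
import random

# Run this from data/VOCdevkit/VOC2007 so the relative paths resolve.
trainval_percent = 0.8  # fraction of all samples that go to trainval
train_percent = 0.8     # fraction of trainval that goes to train
xmlfilepath = 'Annotations'
txtsavepath = os.path.join('ImageSets', 'Main')
os.makedirs(txtsavepath, exist_ok=True)

# Sample names are the annotation file names without the .xml extension
names = sorted(
    os.path.splitext(f)[0] for f in os.listdir(xmlfilepath)
    if f.endswith('.xml'))

num = len(names)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval_ids = random.sample(range(num), tv)
train_ids = set(random.sample(trainval_ids, tr))
trainval_ids = set(trainval_ids)  # sets make the membership tests below fast

with open(os.path.join(txtsavepath, 'trainval.txt'), 'w') as ftrainval, \
        open(os.path.join(txtsavepath, 'test.txt'), 'w') as ftest, \
        open(os.path.join(txtsavepath, 'train.txt'), 'w') as ftrain, \
        open(os.path.join(txtsavepath, 'val.txt'), 'w') as fval:
    for i, name in enumerate(names):
        line = name + '\n'
        if i in trainval_ids:
            ftrainval.write(line)
            if i in train_ids:
                ftrain.write(line)
            else:
                fval.write(line)
        else:
            ftest.write(line)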

After the script runs, four txt files appear in ./data/VOCdevkit/VOC2007/ImageSets/Main:

train.txt
trainval.txt
test.txt
val.txt

Each txt file lists image names without the .jpg extension, one per line.

[Figure: contents of train.txt]

You can also keep the data in another directory and symlink it into ./mmdetection/data:

ln -s /HDD/VOCdevkit ./data # links the real VOCdevkit directory into ./data

2. Count the entries in each split with the cal_txt_data_num.py script

Script contents:

import os

# Count the entries in each txt file, i.e. the number of lines

split_dir = './ImageSets/Main'
for name_txt in os.listdir(split_dir):
    with open(os.path.join(split_dir, name_txt)) as f:
        lines = f.readlines()
    print(('File %s' % name_txt).ljust(35) + 'entries: %d' % len(lines))

Output (this is how my dataset was split):

File test.txt                      entries: 1003
File val.txt                       entries: 802
File train.txt                     entries: 3206
File trainval.txt                  entries: 4008
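The four counts above follow directly from the two 0.8 ratios in split_dataset.py. They are consistent with 5011 annotation files (1003 + 4008), which is an inference from the numbers rather than something the script prints. A short sketch of the arithmetic, with split_sizes as a hypothetical helper:

```python
def split_sizes(num_samples, trainval_percent=0.8, train_percent=0.8):
    """Reproduce the integer arithmetic used by split_dataset.py."""
    trainval = int(num_samples * trainval_percent)
    train = int(trainval * train_percent)
    return {
        "train": train,
        "val": trainval - train,
        "trainval": trainval,
        "test": num_samples - trainval,
    }


sizes = split_sizes(5011)
print(sizes)
```

Because both ratios are truncated with int(), the test split receives whatever remains after rounding down.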

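You can also use a COCO-format dataset, but the xml annotations produced by labelImg must be converted first (see the linked format-conversion references).

The dataset is now ready.

3 Training on your own dataset

A few configuration files need to be changed before training.

3.1 Modifying the configuration files

3.1.1 Modifying the model config file

Edit: ./mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py

Original contents of faster_rcnn_r50_fpn_1x_coco.py: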

_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',
    '../_base_/datasets/coco_detection.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

faster_rcnn_r50_fpn_1x_coco.py after the change (using the VOC data format):

_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',
    '../_base_/datasets/voc0712.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

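Previously, both the settings and the model structure were defined in a single script such as faster_rcnn_r50_fpn_1x_coco.py. The recent mmdetection update repackages this: the file is split into parts stored under mmdetection/configs/_base_, so some settings now have to be changed in the files under _base_. The _base_ directory looks like this: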

_base_/
├── datasets  # data paths and related settings
│   ├── cityscapes_detection.py
│   ├── cityscapes_instance.py
│   ├── coco_detection.py
│   ├── coco_instance.py
│   ├── coco_instance_semantic.py
│   ├── voc0712.py
│   └── wider_face.py
├── default_runtime.py
├── models  # model configuration
│   ├── cascade_mask_rcnn_r50_fpn.py
│   ├── cascade_rcnn_r50_fpn.py
│   ├── faster_rcnn_r50_caffe_c4.py
│   ├── faster_rcnn_r50_fpn.py
│   ├── fast_rcnn_r50_fpn.py
│   ├── mask_rcnn_r50_caffe_c4.py
│   ├── mask_rcnn_r50_fpn.py
│   ├── retinanet_r50_fpn.py
│   ├── rpn_r50_caffe_c4.py
│   ├── rpn_r50_fpn.py
│   └── ssd300.py
└── schedules  # training schedules
    ├── schedule_1x.py
    ├── schedule_20e.py
    └── schedule_2x.py

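../_base_/models/faster_rcnn_r50_fpn.py: defines the model
../_base_/datasets/coco_detection.py: defines the training data paths and related settings
../_base_/schedules/schedule_1x.py: defines the training schedule, e.g. learning rate and number of epochs
../_base_/default_runtime.py: defines logging and other runtime settings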

3.1.2 Modifying the training data config file

Edit: ./mmdetection/configs/_base_/datasets/voc0712.py

Since we only use VOC2007, comment out the entries that reference VOC2012. The file after the change:

# dataset settings
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=3,
        # dataset=dict(
        #     type=dataset_type,
        #     ann_file=[
        #         data_root + 'VOC2007/ImageSets/Main/trainval.txt',
        #         data_root + 'VOC2012/ImageSets/Main/trainval.txt'
        #     ],
        #     img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
        #     pipeline=train_pipeline)),
        # the VOC2012 paths are removed here
        dataset=dict(
            type=dataset_type,
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt',
            ],
            img_prefix=[data_root + 'VOC2007/'],
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')

3.1.3 Changing the number of classes in the model file

Edit: ./mmdetection/configs/_base_/models/faster_rcnn_r50_fpn.py

VOC2007 has 20 classes, so set num_classes on line 46 of faster_rcnn_r50_fpn.py to 20. For your own data, use however many classes you have. The file after the change:

model = dict(
    type='FasterRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=20,     # set to the number of classes in your dataset (20 for VOC2007)
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            match_low_quality=False,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05, nms=dict(type='nms', iou_thr=0.5), max_per_img=100)
    # soft-nms is also supported for rcnn testing
    # e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)

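Note:

In earlier versions num_classes was the number of classes plus 1, counting the background as a class. In mmdetection v2.0.0 the background is no longer counted, so do not add 1; use the actual number of classes.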
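This section edits the shared _base_ file directly. Because the v2.0.0 config system supports inheritance, the same change can in principle be written as a small override in the top-level config instead. The sketch below is a simplified stand-in for that kind of recursive merge, not mmcv's actual implementation. merge_config is a hypothetical helper, and the starting value of 80 is an assumed placeholder.

```python
import copy


def merge_config(base, override):
    """Recursively merge override into a copy of base (nested dicts only)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend and merge key by key
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the override simply replaces the base value
            merged[key] = copy.deepcopy(value)
    return merged


# A trimmed-down base model; num_classes=80 is an assumed starting value
base_model = dict(
    type='FasterRCNN',
    roi_head=dict(
        type='StandardRoIHead',
        bbox_head=dict(
            type='Shared2FCBBoxHead', num_classes=80, roi_feat_size=7)))

# Only the key that differs needs to be spelled out in the child config
child_override = dict(roi_head=dict(bbox_head=dict(num_classes=20)))

model = merge_config(base_model, child_override)
print(model['roi_head']['bbox_head'])
```

Under this scheme, a line such as model = dict(roi_head=dict(bbox_head=dict(num_classes=20))) placed after the _base_ list should have the same effect as editing the base file, while leaving the base file usable for other datasets.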

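3.1.4 Changing the class names used for testing

Edit: ./mmdetection/mmdet/core/evaluation/class_names.py

In mmdetection/mmdet/core/evaluation/class_names.py, change voc_classes to return the class names of the dataset you are training on. Otherwise the names shown in test results are still 'aeroplane', 'bicycle', 'bird', 'boat', and so on. I use VOC2007, so I leave it unchanged; adjust it to your own classes: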

def voc_classes():
    return [
        'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
        'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
        'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'
    ]

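Note:
If there is only one class, keep the trailing comma, as in this single-class example (the comma is optional in a list, but required in the tuple used in voc.py):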

def voc_classes():
    return ['aeroplane', ]

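3.1.5 Modifying voc.py

Edit: mmdetection/mmdet/datasets/voc.py

In mmdetection/mmdet/datasets/voc.py, change CLASSES to the class names of the dataset you are training on. Otherwise the names shown in test results are still 'aeroplane', 'bicycle', 'bird', 'boat', and so on. I use VOC2007, so I leave it unchanged; adjust it to your own classes: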

class VOCDataset(XMLDataset):

    CLASSES = ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
               'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train',
               'tvmonitor')

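Note:
If there is only one class, the trailing comma is required. Without it, ('aeroplane') is a plain string rather than a one-element tuple, and errors follow. For example, with a single class: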

class VOCDataset(XMLDataset):

    CLASSES = ('aeroplane', )
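Why the comma matters for a tuple but not for a list can be checked in a few lines of plain Python. The class name hard_hat is borrowed from the two-class example later in this article.

```python
classes_without_comma = ('hard_hat')   # just a parenthesised string
classes_with_comma = ('hard_hat', )    # a one-element tuple

print(type(classes_without_comma).__name__, len(classes_without_comma))
print(type(classes_with_comma).__name__, len(classes_with_comma))

# Iterating over the string yields single characters, not class names
first_items = list(classes_without_comma)[:3]
print(first_items)

# For a list the trailing comma makes no difference
lists_are_equal = ['hard_hat'] == ['hard_hat', ]
print(lists_are_equal)
```

Code that loops over CLASSES would therefore see eight one-letter "classes" if the comma were missing from the tuple.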

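Errors reported:
1. IndentationError: unexpected indent
2. FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp32p_rtz7/tmp6f4slg8x.py'

Two errors are reported here. I focused on the second one at first and could not find the cause. Then I noticed the first one, IndentationError: unexpected indent, which usually comes from mixing spaces and tabs. The dataset=dict( line printed just above the error pinpoints the location, which can then be fixed.

So I recommend indenting only with spaces in vi, or better, editing the file in an editor such as PyCharm and then replacing it; that generally avoids this error.

At this point the environment, the data and the configuration files are all ready, so let's start training.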
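Mixed indentation is hard to spot by eye, so a quick scan of the config file can help. The following is a minimal sketch; find_tab_indented_lines is a hypothetical helper, and the sample text is a made-up fragment with one tab-indented line.

```python
def find_tab_indented_lines(source_text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        # Leading whitespace is everything stripped from the left side
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


# Made-up config fragment: line 3 is indented with a tab
sample = (
    "data = dict(\n"
    "    train=dict(\n"
    "\tdataset=dict(\n"
    "        type='VOCDataset')))\n"
)
flagged_lines = find_tab_indented_lines(sample)
print(flagged_lines)
```

For a real check, read configs/_base_/datasets/voc0712.py into a string and pass it to the function; any reported line should be re-indented with spaces.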

3.2 Starting training

3.2.1 Quick start

1. Training command:

python tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py

I trained for 12 epochs on VOC2007 with a 2080Ti (11 GB of memory, of which about 4 GB was actually used); training took 3 hours.

Results:

+-------------+-----+------+--------+-------+
| class       | gts | dets | recall | ap    |
+-------------+-----+------+--------+-------+
| aeroplane   | 48  | 113  | 0.708  | 0.646 |
| bicycle     | 74  | 145  | 0.824  | 0.773 |
| bird        | 89  | 205  | 0.640  | 0.524 |
| boat        | 75  | 175  | 0.587  | 0.464 |
| bottle      | 99  | 171  | 0.545  | 0.418 |
| bus         | 41  | 111  | 0.732  | 0.573 |
| car         | 247 | 370  | 0.854  | 0.799 |
| cat         | 84  | 198  | 0.869  | 0.735 |
| chair       | 181 | 355  | 0.470  | 0.359 |
| cow         | 53  | 174  | 0.849  | 0.577 |
| diningtable | 33  | 158  | 0.818  | 0.547 |
| dog         | 99  | 352  | 0.919  | 0.739 |
| horse       | 76  | 213  | 0.868  | 0.762 |
| motorbike   | 74  | 182  | 0.905  | 0.813 |
| person      | 881 | 1477 | 0.829  | 0.757 |
| pottedplant | 127 | 187  | 0.425  | 0.323 |
| sheep       | 76  | 153  | 0.579  | 0.491 |
| sofa        | 48  | 214  | 0.833  | 0.547 |
| train       | 54  | 140  | 0.759  | 0.673 |
| tvmonitor   | 58  | 117  | 0.690  | 0.582 |
+-------------+-----+------+--------+-------+
| mAP         |     |      |        | 0.605 |
+-------------+-----+------+--------+-------+
2020-05-27 14:09:53,057 - mmdet - INFO - Epoch [12][6013/6013]  lr: 0.00020, mAP: 0.6050
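The mAP row in the table is the unweighted mean of the 20 per-class AP values. A short check using the numbers copied from the table above (they are rounded to three decimals, so the recomputed mean is only expected to match at that precision):

```python
# Per-class AP values copied from the results table
ap_per_class = {
    'aeroplane': 0.646, 'bicycle': 0.773, 'bird': 0.524, 'boat': 0.464,
    'bottle': 0.418, 'bus': 0.573, 'car': 0.799, 'cat': 0.735,
    'chair': 0.359, 'cow': 0.577, 'diningtable': 0.547, 'dog': 0.739,
    'horse': 0.762, 'motorbike': 0.813, 'person': 0.757,
    'pottedplant': 0.323, 'sheep': 0.491, 'sofa': 0.547, 'train': 0.673,
    'tvmonitor': 0.582,
}

# Every class counts equally, regardless of how many ground truths it has
mean_ap = sum(ap_per_class.values()) / len(ap_per_class)
print(round(mean_ap, 3))
```

Since every class carries the same weight, rare classes such as pottedplant pull the mean down as much as frequent ones such as person.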

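2. After training, model and log files are written to the work directory

When training finishes, the model files, log files and so on are saved under ./mmdetection/work_dirs (the directory is created automatically if you do not specify one; you can also set it with --work-dir):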

work_dirs/
└── faster_rcnn_r50_fpn_1x_coco
    ├── 20200527_105051.log
    ├── 20200527_105051.log.json
    ├── epoch_10.pth
    ├── epoch_11.pth
    ├── epoch_12.pth
    ├── epoch_1.pth
    ├── epoch_2.pth
    ├── epoch_3.pth
    ├── epoch_4.pth
    ├── epoch_5.pth
    ├── epoch_6.pth
    ├── epoch_7.pth
    ├── epoch_8.pth
    ├── epoch_9.pth
    ├── latest.pth -> epoch_12.pth
    └── faster_rcnn_r50_fpn_1x_coco.py  # the full config, merged from the files listed in _base_

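As the listing shows, an epoch_x.pth model is saved after every epoch, and latest.pth points to the newest, final model. Two log files are also produced:

20200527_105051.log contains the messages printed during training.
20200527_105051.log.json records the loss, learning rate, accuracy and similar values during training, mainly for visualization and debugging.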
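As a rough sketch of how the .log.json file can be used, the snippet below averages the training loss per epoch. It assumes the file holds one JSON object per line and that training records carry the keys mode, epoch and loss; those key names and the sample values are assumptions for illustration, so check them against your own log first.

```python
import json
from collections import defaultdict


def mean_loss_per_epoch(lines):
    """Average the 'loss' field of training records, grouped by epoch."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Skip validation records and lines without a loss (e.g. env info)
        if record.get("mode") != "train" or "loss" not in record:
            continue
        sums[record["epoch"]] += record["loss"]
        counts[record["epoch"]] += 1
    return {epoch: sums[epoch] / counts[epoch] for epoch in sorted(sums)}


# Synthetic records, not taken from a real run
sample_lines = [
    '{"mode": "train", "epoch": 1, "iter": 50, "lr": 0.02, "loss": 1.0}',
    '{"mode": "train", "epoch": 1, "iter": 100, "lr": 0.02, "loss": 0.5}',
    '{"mode": "val", "epoch": 1, "mAP": 0.4}',
    '{"mode": "train", "epoch": 2, "iter": 50, "lr": 0.02, "loss": 0.25}',
]
epoch_losses = mean_loss_per_epoch(sample_lines)
print(epoch_losses)
```

An open file object can be passed in place of sample_lines, since the function only iterates over lines.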

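Note 1:
If training stops with IndexError: list index out of range, the number of classes (num_classes) was not updated. Remember that the way it is counted has changed: it no longer includes the background class.

Note 2:
This time I trained on my own dataset, which has two classes:

hard_hat
other

Training ran without raising an error, but the behaviour was abnormal: every training loss became 0, and the labels reported each epoch were the VOC labels rather than my own. In my case the problem went away once I added a missing comma, and I do not know why a single comma caused so much trouble. In /mmdetection/configs/_base_/datasets/voc0712.py, make sure the comma after 'VOC2007/ImageSets/Main/trainval.txt' is present, otherwise the same problem appears.

[Figure: the abnormal training output]

Fix: add the trailing comma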

dataset=dict(
            type=dataset_type,
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt', ],
            img_prefix=[data_root + 'VOC2007/'],

3.2.2 Command-line options for training

Training command:

python tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py

--work-dir: directory where models and logs are saved
--resume-from: checkpoint file to resume training from
--no-validate: do not evaluate checkpoints during training
--gpus: number of GPUs to use (non-distributed training only)
--gpu-ids: IDs of the GPUs to use (non-distributed training only)
--seed: random seed
--deterministic: whether to set deterministic options for the cuDNN backend
--options: arguments in dict
--launcher: {none,pytorch,slurm,mpi} job launcher
--local_rank: LOCAL_RANK
--autoscale-lr: automatically scale lr with the number of gpus

Training commands with extra options:

1. Choose where the model is saved

python tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py --work-dir my_faster

2. Set the number of GPUs

python tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py --gpus 1 --no-validate --work-dir my_faster

3.3 Testing with the model you trained

3.3.1 Test command 1: the `test.py` script

Test command:

python tools/test.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py ./work_dirs/my_faster_rcnn_r50_fpn_1x_coco/latest.pth --out ./result.pkl

configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py: the config file
./work_dirs/my_faster_rcnn_r50_fpn_1x_coco/latest.pth: the model we trained and saved
./result.pkl: the output file (1.2 MB here); it stores the detection results for each class, which are used to compute AP

My test run (test.txt contains 1003 images):

(mmdetection) zpp@estar-cvip:/HDD/zpp/shl/mmdetection$ python tools/test.py configs/faster_rcnn/my_faster_rcnn_r50_fpn_1x_coco.py ./work_dirs/my_faster_rcnn_r50_fpn_1x_coco/latest.pth --out ./result.pkl
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1003/1003, 7.5 task/s, elapsed: 134s, ETA:     0s
writing results to ./result.pkl
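To get a feel for what result.pkl contains, the sketch below counts detections per class above a score threshold. It assumes the file holds one entry per image, each a list with one array per class whose rows are [x1, y1, x2, y2, score]; that layout is an assumption to verify against your own file. count_detections is a hypothetical helper and the data here is synthetic.

```python
def count_detections(results, score_thr=0.3):
    """Count boxes per class whose score (fifth column) reaches score_thr."""
    num_classes = len(results[0]) if results else 0
    counts = [0] * num_classes
    for per_image in results:
        for class_index, boxes in enumerate(per_image):
            counts[class_index] += sum(
                1 for row in boxes if row[4] >= score_thr)
    return counts


# Synthetic results for 2 images and 2 classes (plain lists stand in for arrays)
fake_results = [
    [[[10, 10, 50, 50, 0.9]], []],
    [[[5, 5, 20, 20, 0.2]], [[0, 0, 8, 8, 0.6], [1, 1, 9, 9, 0.31]]],
]
detection_counts = count_detections(fake_results)
print(detection_counts)
```

A real file could be loaded with pickle.load(open('result.pkl', 'rb')) and passed in directly. The default of 0.3 mirrors the --show-score-thr default listed in the test.py help text.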

Other test options:

MMDet test (and eval) a model

positional arguments:
  config                test config file path
  checkpoint            checkpoint file

optional arguments:
  -h, --help            show this help message and exit
  --out OUT             output result file in pickle format
  --fuse-conv-bn        Whether to fuse conv and bn, this will slightly
                        increasethe inference speed
  --format-only         Format the output results without perform evaluation.
                        It isuseful when you want to format the result to a
                        specific format and submit it to the test server
  --eval EVAL [EVAL ...]
                        evaluation metrics, which depends on the dataset,
                        e.g., "bbox", "segm", "proposal" for COCO, and "mAP",
                        "recall" for PASCAL VOC
  --show                show results
  --show-dir SHOW_DIR   directory where painted images will be saved
  --show-score-thr SHOW_SCORE_THR
                        score threshold (default: 0.3)
  --gpu-collect         whether to use gpu to collect results.
  --tmpdir TMPDIR       tmp directory used for collecting results from
                        multiple workers, available when gpu-collect is not
                        specified
  --options OPTIONS [OPTIONS ...]
                        arguments in dict
  --launcher {none,pytorch,slurm,mpi}
                        job launcher
  --local_rank LOCAL_RANK

With --show-dir, the test images with the detections drawn on them are saved to the given folder:

python tools/test.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py ./work_dirs_hat_faster_rcnn/latest.pth --out ./result.pkl --show-dir test_hat_result

The result images are saved under test_hat_result/JPEGImages.

[Figure: sample test results]

3.3.2 Test command 2: the `test_robustness.py` script

Test command:

python tools/test_robustness.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py ./work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth --out ./result2.pkl

My test run (test.txt contains 1003 images):

(mmdetection) zpp@estar-cvip:/HDD/zpp/shl/mmdetection$ python tools/test_robustness.py ./configs/faster_rcnn/my_faster_rcnn_r50_fpn_1x_coco.py ./work_dirs/my_faster_rcnn_r50_fpn_1x_coco/latest.pth --out ./result2.pkl

Testing gaussian_noise at severity 0
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1003/1003, 7.6 task/s, elapsed: 132s, ETA:     0s
Testing gaussian_noise at severity 1
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1003/1003, 7.6 task/s, elapsed: 132s, ETA:     0s

test_robustness.py has many other options:

positional arguments:
  config                test config file path
  checkpoint            checkpoint file

optional arguments:
  -h, --help            show this help message and exit
  --out OUT             output result file
  --corruptions {all,benchmark,noise,blur,weather,digital,holdout,None,gaussian_noise,shot_noise,impulse_noise,defocus_blur,glass_blur,motion_blur,zoom_blur,snow,frost,fog,brightness,contrast,elastic_transform,pixelate,jpeg_compression,speckle_noise,gaussian_blur,spatter,saturate} [{all,benchmark,noise,blur,weather,digital,holdout,None,gaussian_noise,shot_noise,impulse_noise,defocus_blur,glass_blur,motion_blur,zoom_blur,snow,frost,fog,brightness,contrast,elastic_transform,pixelate,jpeg_compression,speckle_noise,gaussian_blur,spatter,saturate} ...]
                        corruptions
  --severities SEVERITIES [SEVERITIES ...]
                        corruption severity levels
  --eval {proposal,proposal_fast,bbox,segm,keypoints} [{proposal,proposal_fast,bbox,segm,keypoints} ...]
                        eval types
  --iou-thr IOU_THR     IoU threshold for pascal voc evaluation
  --summaries SUMMARIES
                        Print summaries for every corruption and severity
  --workers WORKERS     workers per gpu
  --show                show results
  --tmpdir TMPDIR       tmp dir for writing some results
  --seed SEED           random seed
  --launcher {none,pytorch,slurm,mpi}
                        job launcher
  --local_rank LOCAL_RANK
  --final-prints {P,mPC,rPC} [{P,mPC,rPC} ...]
                        corruption benchmark metric to print at the end
  --final-prints-aggregate {all,benchmark}
                        aggregate all results or only those for benchmark
                        corruptions

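3.3.3 Computing per-class AP

The previous command for computing AP was: python tools/voc_eval.py results.pkl ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py. mmdetection v1.0.0 still shipped the voc_eval.py script, but v2.0.0 removed it and folded it into robustness_eval.py. Note that the tools/test.py help text in section 3.3.1 also lists an --eval option that accepts "mAP" for PASCAL VOC.

1. The new AP script
The command for computing AP with robustness_eval.py is therefore:

python tools/robustness_eval.py ./result.pkl --dataset voc --metric AP

2. Keep using the old script
Alternatively, create a my_voc_eval.py script under ./tools with the following contents: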

'''
Project: compute AP
Created: 20200528
'''

__Author__ = "Shliang"


from argparse import ArgumentParser

import mmcv

from mmdet import datasets
from mmdet.core import eval_map


def voc_eval(result_file, dataset, iou_thr=0.5, nproc=4):
    det_results = mmcv.load(result_file)
    annotations = [dataset.get_ann_info(i) for i in range(len(dataset))]
    if hasattr(dataset, 'year') and dataset.year == 2007:
        dataset_name = 'voc07'
    else:
        dataset_name = dataset.CLASSES
    eval_map(
        det_results,
        annotations,
        scale_ranges=None,
        iou_thr=iou_thr,
        dataset=dataset_name,
        logger='print',
        nproc=nproc)


def main():
    parser = ArgumentParser(description='VOC Evaluation')
    parser.add_argument('result', help='result file path')
    parser.add_argument('config', help='config file path')
    parser.add_argument(
        '--iou-thr',
        type=float,
        default=0.5,
        help='IoU threshold for evaluation')
    parser.add_argument(
        '--nproc',
        type=int,
        default=4,
        help='Processes to be used for computing mAP')
    args = parser.parse_args()
    cfg = mmcv.Config.fromfile(args.config)
    test_dataset = mmcv.runner.obj_from_dict(cfg.data.test, datasets)
    voc_eval(args.result, test_dataset, args.iou_thr, args.nproc)


if __name__ == '__main__':
    main()

Command for computing AP:

python tools/my_voc_eval.py ./result.pkl ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py

[Figure: output of the AP computation]
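The script above passes dataset='voc07' to eval_map when the dataset year is 2007. The VOC2007 convention for AP is 11-point interpolation: precision is sampled at recall levels 0, 0.1, ..., 1.0 and averaged. The sketch below illustrates that convention only; it is not mmdet's implementation, and voc07_average_precision is a hypothetical helper.

```python
def voc07_average_precision(recalls, precisions):
    """11-point interpolated AP from matching recall/precision lists."""
    ap = 0.0
    for step in range(11):
        threshold = step / 10.0
        # Best precision among operating points with recall >= threshold
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0


# A detector that finds everything with perfect precision
perfect_ap = voc07_average_precision([0.5, 1.0], [1.0, 1.0])
# A detector that stops at 50% recall with perfect precision
half_recall_ap = voc07_average_precision([0.5], [1.0])
print(perfect_ap, round(half_recall_ap, 4))
```

In the second case only the six recall levels from 0 to 0.5 contribute, so the AP is 6/11 even though every returned detection is correct.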

3.4 Running your trained model on images and video

3.4.1 Running your trained model on an image

The test script is ./mmdetection/demo/image_demo.py:

from argparse import ArgumentParser

from mmdet.apis import inference_detector, init_detector, show_result_pyplot


def main():
    parser = ArgumentParser()
    parser.add_argument('img', help='Image file')
    parser.add_argument('config', help='Config file')
    parser.add_argument('checkpoint', help='Checkpoint file')
    parser.add_argument(
        '--device', default='cuda:0', help='Device used for inference')
    parser.add_argument(
        '--score-thr', type=float, default=0.3, help='bbox score threshold')
    args = parser.parse_args()

    # build the model from a config file and a checkpoint file
    model = init_detector(args.config, args.checkpoint, device=args.device)
    # test a single image
    result = inference_detector(model, args.img)
    # show the results
    show_result_pyplot(model, args.img, result, score_thr=args.score_thr)


if __name__ == '__main__':
    main()

Then run the test from the command line:

python image_demo.py ../test.jpg ../configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py ../work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth

[Figure: original image]

[Figure: detection result]

3.4.2 Running your trained model on video

Reference 1: # fairly detailed
Reference 2: https://blog.csdn.net/syysyf99/article/details/96574325
Reference 3: https://blog.csdn.net/weicao1990/article/details/93484603
