Win10 + Faster-RCNN-TensorFlow-Python3: Training on Your Own Dataset and Visualizing the Loss and P-R Curves

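1. Download the source code from https://github.com/dBeker/Faster-RCNN-TensorFlow-Python3. Then download the pre-trained VGG16 network, plus any other networks you are interested in, from https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models.
Create an imagenet_weights folder under the data folder and put the downloaded network into it. Note that vgg_16.ckpt must be renamed to vgg16.ckpt.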
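
The rename-and-place step can also be scripted. The following is a minimal sketch, assuming the downloaded archive has already been extracted; the function name install_pretrained_weights and its arguments are hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def install_pretrained_weights(downloaded_ckpt, project_root):
    """Copy a downloaded checkpoint to data/imagenet_weights/vgg16.ckpt.

    Both arguments are illustrative; point them at your own paths.
    """
    target_dir = Path(project_root) / "data" / "imagenet_weights"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = target_dir / "vgg16.ckpt"             # file name the tutorial asks for
    shutil.copyfile(downloaded_ckpt, target)
    return target
```

For example, install_pretrained_weights('vgg_16.ckpt', '.') run from the project root would leave the original download untouched and place a renamed copy in data/imagenet_weights.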
 

2. Build a VOC-format dataset and place it under Faster-RCNN-TensorFlow-Python3-master\data\VOCdevkit2007.
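
A VOC-style dataset conventionally keeps the XML annotations in VOC2007/Annotations, the images in VOC2007/JPEGImages, and the split lists in VOC2007/ImageSets/Main. As a rough sketch, the split lists could be generated from the annotation file names as shown here; the helper name write_voc_splits, the 80/20 ratio, and the fixed seed are assumptions made for illustration.

```python
import random
from pathlib import Path


def write_voc_splits(voc_root, trainval_ratio=0.8, seed=0):
    """Write trainval.txt and test.txt under ImageSets/Main.

    voc_root is the VOC2007 folder; the ratio and seed are illustrative.
    """
    voc_root = Path(voc_root)
    # Image ids are the annotation file names without the .xml extension
    ids = sorted(p.stem for p in (voc_root / "Annotations").glob("*.xml"))
    random.Random(seed).shuffle(ids)
    n_trainval = int(len(ids) * trainval_ratio)
    splits = {
        "trainval": sorted(ids[:n_trainval]),
        "test": sorted(ids[n_trainval:]),
    }
    main_dir = voc_root / "ImageSets" / "Main"
    main_dir.mkdir(parents=True, exist_ok=True)
    for name, split_ids in splits.items():
        # One image id per line, as in the VOC split files
        (main_dir / f"{name}.txt").write_text("".join(i + "\n" for i in split_ids))
    return splits
```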

3. Modify the parameters and train

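3.1 Before training, edit lib/config/config.py. At line 30, change max_iters from 40000 to 10000 to save some time. Change the snapshot interval from 5000 to 1000, so that the entry reads ('snapshot_iterations', 1000, "Iteration to take snapshot"). Adjust batch_size to suit your GPU.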

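3.2 The file pascal_voc.py in ...\lib\datasets must be changed. In the code, self._classes specifies the categories to detect; at line 33, replace the categories with those of your own dataset. Do not change '__background__'! For example, if the category to detect is card1, pick one of the original categories and replace it with 'card1'.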
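
For a dataset with a single object class, the edited tuple might look like the sketch below. The variable names are standalone illustrations rather than a copy of pascal_voc.py; the point is that the class count passed to the network in test_net.py is the number of object classes plus one for the background.

```python
# Illustrative class tuple for a dataset with one object class;
# '__background__' must stay in place as the first entry.
classes = ('__background__', 'card1')

# Map each class name to an integer index (background gets index 0)
class_to_ind = dict(zip(classes, range(len(classes))))

# Number of classes the network predicts: object classes + background
num_classes = len(classes)
```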

3.3 After modifying the parameters, run Faster-RCNN-TensorFlow-Python3.5-master\train.py to start training.

The training results are saved to Faster-RCNN-TensorFlow-Python3.5-master\default\voc_2007_trainval\default.

Note: before training again, delete the models produced by the previous run under Faster-RCNN-TensorFlow-Python3.5-master\default\voc_2007_trainval\default and Faster-RCNN-TensorFlow-Python3.5-master\output\vgg16\voc_2007_trainval\default, as well as the cache under data/cache.
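
The cleanup before a new run can be scripted. This is a minimal sketch that empties the three folders named in the note while keeping the folders themselves; the function name clear_previous_run is hypothetical, and the deletion is permanent, so check the paths before using anything like it.

```python
import shutil
from pathlib import Path

# Folders named in the note, relative to the project root
STALE_DIRS = [
    Path("default") / "voc_2007_trainval" / "default",
    Path("output") / "vgg16" / "voc_2007_trainval" / "default",
    Path("data") / "cache",
]


def clear_previous_run(project_root):
    """Empty the folders left by a previous run; returns the paths deleted."""
    removed = []
    for rel in STALE_DIRS:
        folder = Path(project_root) / rel
        if not folder.is_dir():
            continue  # nothing to clean if the folder does not exist yet
        for child in folder.iterdir():
            if child.is_dir():
                shutil.rmtree(child)
            else:
                child.unlink()
            removed.append(child)
    return removed
```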
 

3.4 Error during training: No module named 'lib.utils.cython_bbox'

Solution
3.4.1 Edit Faster-RCNN-TensorFlow-Python3\data/coco/PythonAPI/setup.py and add the following at line 15:
,
    Extension( 'lib.utils.cython_bbox',
               sources=['../../../lib/utils/bbox.c','../../../lib/utils/bbox.pyx'],
               include_dirs = [np.get_include(), '/lib/utils'], 
               extra_compile_args=[], )

3.4.2 The files bbox.c and bbox.pyx do not exist yet, so first run the following in Faster-RCNN-TensorFlow-Python3\lib\utils:

python setup.py build_ext --inplace
This generates cython_bbox.c and cython_bbox.pyx; rename these two files to bbox.c and bbox.pyx.


3.4.3 Then run the following in ./data/coco/PythonAPI:
 python setup.py build_ext --inplace
 python setup.py build_ext install
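
After the build, a quick way to check whether the extension can now be found is to ask Python's import machinery for it. The helper below is a small sketch using the standard importlib.util.find_spec; the name module_available is made up for this example.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located on the current import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `lib`) cannot be found
        return False
```

Calling module_available('lib.utils.cython_bbox') from the project root should return True once the build has succeeded. Note that find_spec imports the parent packages while resolving a dotted name.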
 

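4. Test and visualize the training loss

4.1 First open demo.py and change a few places:
1) At line 39, change vgg16_faster_rcnn_iter_70000.ckpt to vgg16_faster_rcnn_iter_20000.ckpt.
2) Since this post reproduces the VGG16 model, change the default at line 104 of demo.py from res101 to vgg16, the network used here.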


3) At line 106, change default='pascal_voc_0712' to 'pascal_voc'.

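4) Create output/vgg16/voc_2007_trainval/default under the project root and put the trained model from iteration 20000 into this folder.
[Figure: folder structure of output/vgg16/voc_2007_trainval/default]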

4.2 Visualize the loss

4.2.1 Add the code below to the def train(self): function in train.py, and create a write_loss.txt file in the same directory as train.py; the total loss is written to this file.

        filename = './write_loss.txt'  # added code: path of the loss log
        while iter < cfg.FLAGS.max_iters + 1:
            # Learning rate
            if iter == cfg.FLAGS.step_size + 1:
                # Add snapshot here before reducing the learning rate
                # self.snapshot(sess, iter)
                sess.run(tf.assign(lr, cfg.FLAGS.learning_rate * cfg.FLAGS.gamma))

            timer.tic()
            # Get training data, one batch at a time
            blobs = self.data_layer.forward()

            # Compute the graph without summary
            try:
                rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss = self.net.train_step(sess, blobs, train_op)
            except Exception:
                # if some errors were encountered image is skipped without increasing iterations
                print('image invalid, skipping')
                continue

            timer.toc()
            iter += 1
            
            
            if iter % (cfg.FLAGS.snapshot_iterations) == 0:
                self.snapshot(sess, iter )
             
            # Display training information
            if iter % (cfg.FLAGS.display) == 0:
                # Added code: append "iteration total_loss" to the log file
                with open(filename, 'a') as fw:
                    fw.write('%d %.4f\n' % (iter, total_loss))
                # End of added code
                print('iter: %d / %d, total loss: %.6f\n >>> rpn_loss_cls: %.6f\n '
                      '>>> rpn_loss_box: %.6f\n >>> loss_cls: %.6f\n >>> loss_box: %.6f\n ' % \
                      (iter, cfg.FLAGS.max_iters, total_loss, rpn_loss_cls, rpn_loss_box, loss_cls, loss_box))
                print('speed: {:.3f}s / iter'.format(timer.average_time))

4.2.2 Plot the logged loss with the following script:

import numpy as np
import matplotlib.pyplot as plt


y_ticks = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0]  # y-axis tick values; set them as you like
data_path = 'E:\\yolodaima\\Faster-RCNN-TensorFlow-Python3-master-vgg16\\write_loss-20000.txt'  # path to the loss log
result_path = 'E:\\yolodaima\\Faster-RCNN-TensorFlow-Python3-master-vgg16\\total_loss'  # path for saving the result

data1_loss = np.loadtxt(data_path)
x = data1_loss[:, 0]   # before the comma: rows; after it: column. First column = iteration
y = data1_loss[:, 1]   # second column = total loss


################ start plotting
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(x, y, label='total_loss')
plt.yticks(y_ticks)  # comment this out if you do not want to set the y-axis ticks yourself
#plt.grid()
ax.legend(loc='best')
ax.set_title('The loss curves')
ax.set_xlabel('batches')
fig.savefig(result_path)
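
The raw total loss is usually noisy, so it can help to overlay a smoothed curve. The following optional sketch applies a simple moving average with NumPy; the function name moving_average and the default window of 10 are assumptions, not part of the original script.

```python
import numpy as np


def moving_average(values, window=10):
    """Smooth a 1-D sequence with a simple moving average.

    The output has len(values) - window + 1 points ('valid' mode).
    """
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window  # equal weight for each point in the window
    return np.convolve(values, kernel, mode="valid")
```

To draw it on the same axes, plot x[window - 1:] against moving_average(y, window) so that both arrays have the same length.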

5. Plot the P-R curve and compute AP

5.1 lib/datasets/pascal_voc.py:

Add these lines at the top of pascal_voc.py:

import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve
from itertools import cycle
import pylab as pl
 

   def _do_python_eval(self, output_dir='output'):
        annopath = self._devkit_path + '\\VOC' + self._year + '\\Annotations\\' + '{:s}.xml'
        imagesetfile = os.path.join(
            self._devkit_path,
            'VOC' + self._year,
            'ImageSets',
            'Main',
            self._image_set + '.txt')
        cachedir = os.path.join(self._devkit_path, 'annotations_cache')
        aps = []
        # added
        recs=[]
        precs=[]
        # end of added code
        # The PASCAL VOC metric changed in 2010
        use_07_metric = True if int(self._year) < 2010 else False
        print('VOC07 metric? ' + ('Yes' if use_07_metric else 'No'))
        if not os.path.isdir(output_dir):
            os.mkdir(output_dir)
        for i, cls in enumerate(self._classes):
            if cls == '__background__':
                continue
            filename = self._get_voc_results_file_template().format(cls)
            rec, prec, ap = voc_eval(
                filename, annopath, imagesetfile, cls, cachedir, ovthresh=0.5,
                use_07_metric=use_07_metric)
            aps += [ap]
            # added code: rec and prec here are returned by voc_eval.py
            pl.plot(rec, prec, lw=2,
                    label='Precision-recall curve of class {} (area = {:.4f})'
                          ''.format(cls, ap))
            print(('AP for {} = {:.4f}'.format(cls, ap)))
            with open(os.path.join(output_dir, cls + '_pr.pkl'), 'wb') as f:
                pickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)
        # plot the P-R curves
        pl.xlabel('Recall')
        pl.ylabel('Precision')
        plt.grid(True)
        pl.ylim([0.0, 1.2])
        pl.xlim([0.0, 1.0])
        pl.title('Precision-Recall')
        pl.legend(loc="upper right")
        plt.show()
        print(('Mean AP = {:.4f}'.format(np.mean(aps))))
        print('~~~~~~~~')
        print('Results:')
        for ap in aps:
            print(('{:.3f}'.format(ap)))
        print(('{:.3f}'.format(np.mean(aps))))
        print('~~~~~~~~')
        print('')
        print('--------------------------------------------------------------')
        print('Results computed with the **unofficial** Python eval code.')
        print('Results should be very close to the official MATLAB eval code.')
        print('Recompute with `./tools/reval.py --matlab ...` for your paper.')
        print('-- Thanks, The Management')
        print('--------------------------------------------------------------')
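
When use_07_metric is true, the AP reported for each class is the 11-point interpolated average precision of the VOC 2007 protocol: for each recall threshold 0.0, 0.1, ..., 1.0, take the highest precision among points whose recall reaches that threshold, then average the 11 values. The sketch below is an independent re-implementation for illustration, not the repository's voc_eval code; the function name voc07_ap is made up.

```python
import numpy as np


def voc07_ap(rec, prec):
    """11-point interpolated average precision (VOC 2007 style)."""
    rec = np.asarray(rec, dtype=float)
    prec = np.asarray(prec, dtype=float)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):  # recall thresholds 0.0, 0.1, ..., 1.0
        mask = rec >= t
        # Best precision among points whose recall reaches the threshold
        p = prec[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap
```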

 

 

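Comment out part of the following function so that the per-class detection result text files are kept. Each file lists the image name, the confidence, and the four box coordinates: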

   def evaluate_detections(self, all_boxes, output_dir):
        self._write_voc_results_file(all_boxes)

        self._do_python_eval(output_dir)
        if self.config['matlab_eval']:
            self._do_matlab_eval(output_dir)
        #if self.config['cleanup']:
        #    for cls in self._classes:
        #        if cls == '__background__':
        #            continue
        #        filename = self._get_voc_results_file_template().format(cls)
        #       os.remove(filename)
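
Once the result files are kept, they can be inspected or post-processed. The parser below is a small sketch that assumes each line holds the image name, the confidence, and then the four coordinates in the order xmin, ymin, xmax, ymax, separated by whitespace; the function name parse_detection_line is hypothetical.

```python
def parse_detection_line(line):
    """Split one results line into image id, confidence, and box."""
    parts = line.split()
    image_id = parts[0]
    score = float(parts[1])
    box = [float(v) for v in parts[2:6]]  # assumed order: xmin, ymin, xmax, ymax
    return image_id, score, box
```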

 

5.2 Make the following change in voc_eval.py:

def parse_rec(filename):  # read an annotation XML file
    """ Parse a PASCAL VOC xml file """
    tree = ET.parse(''+filename)
    objects = []#./data/VOCdevkit2007/VOC2007/Annotations/
    for obj in tree.findall('object'):
        obj_struct = {}
        obj_struct['name'] = obj.find('name').text
        obj_struct['pose'] = obj.find('pose').text
        obj_struct['truncated'] = int(obj.find('truncated').text)
        obj_struct['difficult'] = int(obj.find('difficult').text)
        bbox = obj.find('bndbox')
        obj_struct['bbox'] = [int(bbox.find('xmin').text),
                              int(bbox.find('ymin').text),
                              int(bbox.find('xmax').text),
                              int(bbox.find('ymax').text)]
        objects.append(obj_struct)

    return objects
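
To see what this parsing expects from an annotation, here is a self-contained sketch that applies the same ElementTree lookups to an in-memory XML string. The sample annotation and the helper name parse_objects are made up for illustration. Because obj.find(...) returns None for a missing tag, an annotation without pose, truncated, or difficult would make parse_rec fail with an AttributeError.

```python
import xml.etree.ElementTree as ET

# Made-up annotation with a single object, for illustration only
SAMPLE_XML = """<annotation>
  <object>
    <name>card1</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_objects(xml_text):
    """Extract name, difficult flag, and box of every object element."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall('object'):
        bbox = obj.find('bndbox')
        objects.append({
            'name': obj.find('name').text,
            'difficult': int(obj.find('difficult').text),
            'bbox': [int(bbox.find(tag).text)
                     for tag in ('xmin', 'ymin', 'xmax', 'ymax')],
        })
    return objects
```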

 

5.3 Create test_net.py in the faster-rcnn-tensorflow-python3.5-master folder:

#!/usr/bin/env python

# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen, based on code from Ross Girshick
# --------------------------------------------------------

"""
Demo script showing detections in sample images.
See README.md for installation instructions before running.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os

import tensorflow as tf
from lib.nets.vgg16 import vgg16
from lib.datasets.factory import get_imdb
from lib.utils.test import test_net


NETS = {'vgg16': ('vgg16_faster_rcnn_iter_20000.ckpt',)}  # change as needed: the trained model file
DATASETS = {'pascal_voc': ('voc_2007_trainval',), 'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',)}
def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN test')
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
                        choices=NETS.keys(), default='vgg16')
    parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712]',
                        choices=DATASETS.keys(), default='pascal_voc')
    args = parser.parse_args()
    return args
if __name__ == '__main__':
    args = parse_args()
    # model path
    demonet = args.demo_net
    dataset = args.dataset
    tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default', NETS[demonet][0])  # model path
    # get the model file name without its extension (works with either path separator)
    filename = os.path.basename(os.path.splitext(tfmodel)[0])
    filename = 'default' + '/' + filename
    imdb = get_imdb("voc_2007_test")  # load the test-set imdb
    imdb.competition_mode('competition mode')
    if not os.path.isfile(tfmodel + '.meta'):
        print(tfmodel)
        raise IOError(('{:s} not found.\nDid you download the proper networks from '
                       'our server and place them properly?').format(tfmodel + '.meta'))
    # set config
    tfconfig = tf.ConfigProto(allow_soft_placement=True)
    tfconfig.gpu_options.allow_growth = True
    # init session
    sess = tf.Session(config=tfconfig)
    # load network
    if demonet == 'vgg16':
        net = vgg16(batch_size=1)
    # elif demonet == 'res101':
    # net = resnetv1(batch_size=1, num_layers=101)
    else:
        raise NotImplementedError
    net.create_architecture(sess, "TEST", 2,  # change as needed: number of classes + 1 (background)
                            tag='default', anchor_scales=[8, 16, 32])
    saver = tf.train.Saver()
    saver.restore(sess, tfmodel)
    print('Loaded network {:s}'.format(tfmodel))
    print(filename)
    test_net(sess, net, imdb, filename, max_per_image=100)
    sess.close()

 

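Finally, run python test_net.py in a terminal to plot the P-R curves and compute the AP of each class.

Remember to delete the annots.pkl file whenever you change the test images used to evaluate the model.

By default it is located under .\data\VOCdevkit2007\annotations_cache.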
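
Deleting the stale cache can be folded into the evaluation workflow. A minimal sketch, assuming the cache folder given above; the helper name remove_annotation_cache is hypothetical.

```python
from pathlib import Path


def remove_annotation_cache(cache_dir):
    """Delete annots.pkl from cache_dir; return True if a file was removed."""
    cache_file = Path(cache_dir) / "annots.pkl"
    if cache_file.is_file():
        cache_file.unlink()
        return True
    return False
```

For example, calling remove_annotation_cache('data/VOCdevkit2007/annotations_cache') from the project root before running test_net.py lets the annotations be re-read from the XML files.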
