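I read many expert blogs online. Many of them claim to train on their own dataset, but they are really using the official API to train on the public PASCAL VOC2012 or PASCAL VOC2007 datasets.

For our project we defined a custom labeled dataset and built it in the Pascal VOC format. This greatly reduces the workload and lets us use the official scripts.

Prerequisites: install TensorFlow and the Object Detection API (there are many tutorials online; I will add one later).

Environment: training on Ubuntu 16.04 + GTX 1080 Ti + Python 3.5 + TensorFlow 1.4.0

Labeling environment: Windows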

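I. Prepare the dataset

1. The PASCAL VOC dataset layout needed for object detection:

1) JPEGImages folder
This folder contains the training and test images, mixed together.

2) Annotations folder

This folder holds the label files in XML format. Each XML file corresponds to one image in the JPEGImages folder.
3) ImageSets folder
Its Main subfolder holds the image lists for object detection, including per-class files such as XX_train.txt and XX_val.txt. With the API we do not need those XX_train.txt files, only test.txt, train.txt, val.txt and trainval.txt. We generate these four files later.

So we only need to create a VOC folder, create the three folders above inside it, and create Main inside ImageSets.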
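
The folder layout described above can be created in one step. This is a minimal sketch; the helper name make_voc_skeleton and the root folder name 'VOC' are assumptions for illustration, not part of the official API.

```python
import os

def make_voc_skeleton(root):
    """Create JPEGImages, Annotations and ImageSets/Main under root."""
    subdirs = ['JPEGImages', 'Annotations', os.path.join('ImageSets', 'Main')]
    for sub in subdirs:
        # exist_ok=True makes the call safe to repeat on an existing tree
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    return [os.path.join(root, sub) for sub in subdirs]
```

Calling make_voc_skeleton('VOC') in the working directory produces the three folders plus ImageSets/Main.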

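2. Fill the JPEGImages folder

Put the images into JPEGImages. In VOC the image files have names like 2007_000001.jpg. We also use one uniform format and rename our images in the same spirit. We read the video with cv2, grab frames from it, and save them according to this rule.

Frame extraction code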

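import os
import cv2

cap = cv2.VideoCapture("video.mp4")
os.makedirs('out', exist_ok=True)   # cv2.imwrite does not create the folder
c = 1
timeF = 100   # save one image every 100 frames
tot = 1
while True:
    rval, frame = cap.read()
    if not rval:   # no more frames: stop at the end of the video
        break
    if c % timeF == 0:
        print('tot=', tot)
        cv2.imwrite('out/' + str(tot).zfill(6) + '.jpg', frame)
        tot = tot + 1
    c += 1
cap.release()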

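3. Fill the Annotations folder

There are many tutorials online, but I find them all cumbersome. We label by hand with a tool that automatically generates the XML file with the image information.
1) Download it here: https://tzutalin.github.io/labelImg/. How to use it should be obvious once you open it.
2) Set the save path to our Annotations folder. Do not save anywhere else.
3) Draw the boxes slowly, one image at a time.

Note: when labeling on Windows, I found that the [filename] field in the generated XML files differs from the standard VOC dataset: it lacks the .jpg extension. Fix: either batch-edit the XML files, or change the code later in create_pascal_tf_record.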
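
For the batch-edit option, a minimal sketch using the standard library follows. The function name add_jpg_suffix is hypothetical, and it assumes every annotation has a top-level filename element, as in the VOC layout. If you fix the files this way, do not also append '.jpg' in create_pascal_tf_record.py, or the extension ends up doubled.

```python
import glob
import os
import xml.etree.ElementTree as ET

def add_jpg_suffix(annotations_dir, suffix='.jpg'):
    """Append suffix to every <filename> that lacks it; return how many files changed."""
    changed = 0
    for xml_path in glob.glob(os.path.join(annotations_dir, '*.xml')):
        tree = ET.parse(xml_path)
        node = tree.getroot().find('filename')
        if node is None or not node.text:
            continue  # no filename field, leave the file untouched
        name = node.text.strip()
        if not name.lower().endswith(suffix):
            node.text = name + suffix
            # write back as UTF-8 without an XML declaration
            tree.write(xml_path, encoding='utf-8', xml_declaration=False)
            changed += 1
    return changed
```

Running add_jpg_suffix('Annotations') a second time changes nothing, so the script is safe to repeat.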

4. Generate the four files in the Main folder under ImageSets

import os
import random

trainval_percent = 0.66   # fraction of all samples that go to trainval
train_percent = 0.5       # fraction of trainval that goes to train
xmlfilepath = 'Annotations'
txtsavepath = os.path.join('ImageSets', 'Main')
total_xml = [f for f in os.listdir(xmlfilepath) if f.endswith('.xml')]
print(total_xml)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))
trainval = set(trainval)   # sets make the membership tests below fast

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = os.path.splitext(total_xml[i])[0] + '\n'   # file name without .xml
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

At this point our own dataset, built to match the standard VOC layout, is complete.
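
Before moving on, it may be worth checking that every name in a split file has both an image and an annotation. The sketch below is one way to do that; check_split is a hypothetical helper, and it assumes the images use the .jpg extension.

```python
import os

def check_split(voc_root, split='train'):
    """Return names listed in ImageSets/Main/<split>.txt that lack an image or an XML file."""
    list_path = os.path.join(voc_root, 'ImageSets', 'Main', split + '.txt')
    with open(list_path) as f:
        names = [line.strip() for line in f if line.strip()]
    missing = []
    for name in names:
        has_jpg = os.path.isfile(os.path.join(voc_root, 'JPEGImages', name + '.jpg'))
        has_xml = os.path.isfile(os.path.join(voc_root, 'Annotations', name + '.xml'))
        if not (has_jpg and has_xml):
            missing.append(name)
    return missing
```

An empty list from check_split('my_VOC', 'train') means the train split is consistent; any names returned point at files to fix before generating the TFRecord.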

The finished dataset works in any environment. We now move to Ubuntu for training.

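II. Prepare the training environment

1. File structure of the training environment

Create a my_train folder in the root directory with the following layout:
1) dataset: holds the training dataset
my_VOC: the custom dataset I just built; copy it into the dataset directory.
create_pascal_tf_record.py: the script provided by the official API for converting a standard VOC dataset to TFRecord. We need to modify it.
Copy: cp {...path to models}/models/research/object_detection/create_pascal_tf_record.py ./
pascal_label_map.pbtxt: the labels of the objects to detect. We need to customize it.
Copy: cp {...path to models}/models/research/object_detection/data/pascal_label_map.pbtxt ./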

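2) Copy a set of files

The point of copying the files is to keep the original API intact.

ssd_inception_v2_coco.config: the model's config file
train.py: the script that starts model training

eval.py: the evaluation script

export_inference_graph.py: exports the trained graph as a pb file

utils: related utility scripts

PS: the models repository on GitHub has since changed its directory structure; object_detection now lives under research.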

3) Download and copy the ssd_inception_v2 pre-trained model
Create a models folder under my_train and extract ssd_inception into it.

4) Create a record folder to hold the TFRecord data.

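2. Modify the files for our specific needs

1) Edit create_pascal_tf_record.py in dataset
This determines whether the Pascal VOC data can be converted to TFRecord.
The changed places are marked with comments in the code.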

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import hashlib
import io
import logging
import os

from lxml import etree
import PIL.Image
import tensorflow as tf

from object_detection.utils import dataset_util
from object_detection.utils import label_map_util

# Run-time flags: change the defaults here or pass them on the command line (passing them is recommended)
flags = tf.app.flags
flags.DEFINE_string('data_dir', '', 'Root directory to raw PASCAL VOC dataset.')
flags.DEFINE_string('set', 'train', 'Convert training set, validation set or '
                    'merged set.')
flags.DEFINE_string('annotations_dir', 'Annotations',
                    '(Relative) path to annotations directory.')
flags.DEFINE_string('year', 'VOC2007', 'Desired challenge year.')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
flags.DEFINE_string('label_map_path', 'data/pascal_label_map.pbtxt',
                    'Path to label map proto')
flags.DEFINE_boolean('ignore_difficult_instances', False, 'Whether to ignore '
                     'difficult instances')
FLAGS = flags.FLAGS

SETS = ['train', 'val', 'trainval', 'test']
# Add our own dataset, my_VOC
YEARS = ['VOC2007', 'VOC2012', 'my_VOC']


def dict_to_tf_example(data,
                       dataset_directory,
                       label_map_dict,
                       ignore_difficult_instances=False,
                       image_subdirectory='JPEGImages'):
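  # Build the image path. I spent a long time debugging "file not found" errors here, so print the path and check it.
  # Pitfall: in the official XML annotations the filename field includes the file extension, but files labeled with labelImg lack it,
  # so we append '.jpg' by hand in img_path.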
  img_path = os.path.join('my_VOC',data['folder'], data['filename']+'.jpg')
  full_path = os.path.join(dataset_directory, img_path)
  # Print the path to check that it is correct
  print('full_path',full_path)
  with tf.gfile.GFile(full_path, 'rb') as fid:
    encoded_jpg = fid.read()
  encoded_jpg_io = io.BytesIO(encoded_jpg)
  image = PIL.Image.open(encoded_jpg_io)
  if image.format != 'JPEG':
    raise ValueError('Image format not JPEG')
  key = hashlib.sha256(encoded_jpg).hexdigest()

  width = int(data['size']['width'])
  height = int(data['size']['height'])

  xmin = []
  ymin = []
  xmax = []
  ymax = []
  classes = []
  classes_text = []
  truncated = []
  poses = []
  difficult_obj = []
  for obj in data['object']:
    difficult = bool(int(obj['difficult']))
    if ignore_difficult_instances and difficult:
      continue

    difficult_obj.append(int(difficult))

    xmin.append(float(obj['bndbox']['xmin']) / width)
    ymin.append(float(obj['bndbox']['ymin']) / height)
    xmax.append(float(obj['bndbox']['xmax']) / width)
    ymax.append(float(obj['bndbox']['ymax']) / height)
    classes_text.append(obj['name'].encode('utf8'))
    classes.append(label_map_dict[obj['name']])
    truncated.append(int(obj['truncated']))
    poses.append(obj['pose'].encode('utf8'))

  example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(
          data['filename'].encode('utf8')),
      'image/source_id': dataset_util.bytes_feature(
          data['filename'].encode('utf8')),
      'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')),
      'image/encoded': dataset_util.bytes_feature(encoded_jpg),
      'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
      'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
      'image/object/truncated': dataset_util.int64_list_feature(truncated),
      'image/object/view': dataset_util.bytes_list_feature(poses),
  }))
  return example


def main(_):
  if FLAGS.set not in SETS:
    raise ValueError('set must be in : {}'.format(SETS))
  if FLAGS.year not in YEARS:
    raise ValueError('year must be in : {}'.format(YEARS))

  data_dir = FLAGS.data_dir
  # Add our dataset to years
  years = ['VOC2007','VOC2012','my_VOC']
  if FLAGS.year != 'merged':
    years = [FLAGS.year]
  print('data_dir=',data_dir)
  print('years=',years)
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  label_map_dict = label_map_util.get_label_map_dict(FLAGS.label_map_path)

  for year in years:
    logging.info('Reading from PASCAL %s dataset.', year)
    # Changed as follows: we only need the four files under Main (train.txt, val.txt, etc.).
    # The original code used the official per-class XX_train.txt files.
    examples_path = os.path.join(data_dir, year, 'ImageSets', 'Main/'
                                  + FLAGS.set + '.txt')
    annotations_dir = os.path.join(data_dir, year, FLAGS.annotations_dir)
    examples_list = dataset_util.read_examples_list(examples_path)
    for idx, example in enumerate(examples_list):
      if idx % 100 == 0:
        logging.info('On image %d of %d', idx, len(examples_list))
      path = os.path.join(annotations_dir, example + '.xml')
      with tf.gfile.GFile(path, 'r') as fid:
        xml_str = fid.read()
      xml = etree.fromstring(xml_str)
      data = dataset_util.recursive_parse_xml_to_dict(xml)['annotation']

      tf_example = dict_to_tf_example(data, FLAGS.data_dir, label_map_dict,
                                      FLAGS.ignore_difficult_instances)
      writer.write(tf_example.SerializeToString())

  writer.close()


if __name__ == '__main__':
  tf.app.run()

2) Edit pascal_label_map.pbtxt
Change this file to match the classes we labeled earlier. A sample:

item {
  id: 1
  name: 'car'
}

item {
  id: 2
  name: 'suv'
}
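
For a longer class list, the label map text can be generated instead of typed by hand. A minimal sketch, assuming ids simply start at 1 and follow the order of the list; build_label_map is a hypothetical helper, not part of the Object Detection API.

```python
def build_label_map(class_names):
    """Return label map text with ids starting at 1, in the order given."""
    blocks = []
    for idx, name in enumerate(class_names, start=1):
        blocks.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    # one blank line between item blocks, as in the sample above
    return '\n'.join(blocks)
```

Writing build_label_map(['car', 'suv']) to dataset/pascal_label_map.pbtxt reproduces the sample above. The names must match the class names used while labeling.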

3) Run the script to generate the TFRecord files pascal_train.record and pascal_val.record

dell@dell-PowerEdge-T630:~/my_train$ python3 dataset/create_pascal_tf_record.py \
--data_dir=/home/dell/my_train/dataset \
--year=my_VOC \
--set=train \
--output_path=/home/dell/my_train/record/pascal_train.record \
--label_map_path=/home/dell/my_train/dataset/pascal_label_map.pbtxt

dell@dell-PowerEdge-T630:~/my_train$ python3 dataset/create_pascal_tf_record.py \
--data_dir=/home/dell/my_train/dataset \
--year=my_VOC \
--set=val \
--output_path=/home/dell/my_train/record/pascal_val.record \
--label_map_path=/home/dell/my_train/dataset/pascal_label_map.pbtxt

4) Edit the model config file ssd_inception_v2_coco.config

The places to change are marked with comments

model {
  ssd {
    num_classes: 12    # set this to the number of items in your pascal_label_map
    …………
train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # Path to the pre-trained checkpoint
  fine_tune_checkpoint: "models/ssd_inception_v2_coco_11_06_2017/model.ckpt" 
  from_detection_checkpoint: true
  # Number of training steps
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    # Training input file
    input_path: "record/pascal_train.record"
  }
  # Custom label map
  label_map_path: "dataset/pascal_label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    # Validation input file
    input_path: "record/pascal_val.record"
  }
  # Custom label map
  label_map_path: "dataset/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}
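
Since num_classes has to match the number of items in the label map, a quick check before training can catch a mismatch. The sketch below uses plain regular expressions on the file text rather than the API's protobuf parsers; both helper names are hypothetical, and a commented-out num_classes line would also be matched.

```python
import re

def count_label_items(label_map_text):
    """Count the 'item {' blocks in a label map."""
    return len(re.findall(r'\bitem\s*\{', label_map_text))

def read_num_classes(config_text):
    """Return the first num_classes value in a pipeline config, or None if absent."""
    match = re.search(r'\bnum_classes\s*:\s*(\d+)', config_text)
    return int(match.group(1)) if match else None
```

Read dataset/pascal_label_map.pbtxt and ssd_inception_v2_coco.config as text, pass them to the two functions, and compare the results.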

5) Edit train.py
This mainly concerns the run parameters. Either pass them on the command line, or change the defaults to our values.

flags.DEFINE_string('train_dir', 'train',
                    'Directory to save the checkpoints and training summaries.')

flags.DEFINE_string('pipeline_config_path', 'ssd_inception_v2_coco.config',
                    'Path to a pipeline_pb2.TrainEvalPipelineConfig config '
                    'file. If provided, other configs are ignored')

flags.DEFINE_string('train_config_path', '',
                    'Path to a train_pb2.TrainConfig config file.')

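III. Run the training

Run this from the my_train directory.

nohup python3 train.py --logtostderr &

This keeps the training running in the background, so closing Xshell does not interrupt it.

Training details: nohup writes the output log to nohup.out. Follow it with tail -f nohup.out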
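
To pull the loss values out of the log, a small parser can help. This sketch assumes the log contains lines of the form "global step N: loss = X", which is what the training loop used by train.py usually prints. Check your own nohup.out first; parse_losses is a hypothetical helper.

```python
import re

# Assumed log format: "... global step 100: loss = 3.1234 (0.456 sec/step)"
STEP_RE = re.compile(r'global step (\d+): loss = (\S+)')

def parse_losses(lines):
    """Return (step, loss) pairs found in an iterable of log lines."""
    pairs = []
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs
```

Opening nohup.out and passing the file object to parse_losses gives the list of pairs, which can then be printed or plotted.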

View the training curves with TensorBoard

To be added

