TensorFlow Object Detection API Tutorial

For installation, follow the official tutorial.

Note that protoc must be upgraded to version 3.* during installation; otherwise the protos will not compile. You may see errors such as:

cannot import name 'preprocessor_pb2'
cannot import name string_int_label_map_pb2
Import "object_detection/protos/ssd.proto" was not found or had errors.

Be sure to compile the object_detection/protos folder first; otherwise these errors are raised.
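
The official installation guide compiles the protos by running `protoc object_detection/protos/*.proto --python_out=.` from the tensorflow/models/research/ directory. Before doing so, it can help to check which protoc is installed. The sketch below is one way to do that; the helper names are hypothetical, and it assumes that `protoc --version` prints a string such as `libprotoc 3.5.1`.

```python
import re
import subprocess


def protoc_major_version(version_output):
    """Extract the major version from the output of `protoc --version`."""
    match = re.search(r"libprotoc\s+(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized protoc version: %r" % version_output)
    return int(match.group(1))


def protoc_is_new_enough(version_output, minimum_major=3):
    """Return True when protoc is at least version `minimum_major`."""
    return protoc_major_version(version_output) >= minimum_major


def installed_protoc_is_new_enough():
    """Query the protoc found on PATH (assumes protoc is installed)."""
    output = subprocess.check_output(["protoc", "--version"]).decode("utf8")
    return protoc_is_new_enough(output)
```

Calling `installed_protoc_is_new_enough()` before compiling gives an early warning instead of the import errors listed above.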

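1. Training

1.1 Creating the label_map.pbtxt file

This is adapted from the official code; the intermediate steps need to be modified for your own data.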

import pandas as pd


def create_labelmap(word_count_file="../data/sub_obj_word_count.txt",
                    labelmap_outfile="../data/labelmap.pbtxt"):
    """Write a label map file and return the class name -> id mapping.

    :param word_count_file: headerless CSV with columns (object name, count)
    :param labelmap_outfile: path of the label map (.pbtxt) file to write
    :return: dict mapping each class name to its integer id
    """
    df = pd.read_csv(word_count_file, header=None,
                     names=["obj_name", "obj_cnt"])
    objects = df.obj_name.tolist()
    class_map = {}
    # Open the file once in write mode so that re-running the function
    # does not append duplicate items to an existing label map.
    with open(labelmap_outfile, "w") as f:
        for idx, name in enumerate(objects):
            # Ids start at 1; id 0 is reserved for the background class.
            f.write("item {\n")
            f.write("  id: %d\n" % (idx + 1))
            f.write("  name: '%s'\n" % name)
            f.write("}\n\n")
            class_map[name] = idx + 1
    return class_map
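
To check a generated label map, it can be read back and validated. The following is a minimal sketch that assumes the simple `item { id: ... name: '...' }` layout written above; `parse_labelmap` and `ids_are_contiguous` are hypothetical helpers, not part of the Object Detection API.

```python
import re

# Matches blocks of the form: item { id: 1 name: 'person' }
ITEM_PATTERN = re.compile(r"item\s*\{\s*id:\s*(\d+)\s*name:\s*'([^']*)'\s*\}")


def parse_labelmap(text):
    """Return {name: id} for every item block found in a label map string."""
    return {name: int(item_id) for item_id, name in ITEM_PATTERN.findall(text)}


def ids_are_contiguous(class_map):
    """Check that the ids are exactly 1..N with no gaps or duplicates."""
    return sorted(class_map.values()) == list(range(1, len(class_map) + 1))


sample = "item {\n  id: 1\n  name: 'person'\n}\n\nitem {\n  id: 2\n  name: 'sky'\n}\n\n"
class_map = parse_labelmap(sample)
print(class_map)  # {'person': 1, 'sky': 2}
```

The number of parsed items is also the value to use for num_classes in the pipeline config.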

1.2 Creating the TFRecord files

import tensorflow as tf

from object_detection.utils import dataset_util


flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


def create_tf_example(example):
  # TODO(user): Populate the following variables from your example.
  height = None # Image height
  width = None # Image width
  filename = None # Filename of the image. Empty if image is not from file
  encoded_image_data = None # Encoded image bytes
  image_format = None # b'jpeg' or b'png'

  xmins = [] # List of normalized left x coordinates in bounding box (1 per box)
  xmaxs = [] # List of normalized right x coordinates in bounding box
             # (1 per box)
  ymins = [] # List of normalized top y coordinates in bounding box (1 per box)
  ymaxs = [] # List of normalized bottom y coordinates in bounding box
             # (1 per box)
  classes_text = [] # List of string class name of bounding box (1 per box)
  classes = [] # List of integer class id of bounding box (1 per box)

  tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_example


def main(_):
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  # TODO(user): Write code to read in your dataset to examples variable

  for example in examples:
    tf_example = create_tf_example(example)
    writer.write(tf_example.SerializeToString())

  writer.close()


if __name__ == '__main__':
  tf.app.run()
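
The template expects box coordinates normalized to the range [0, 1]. A minimal sketch of that conversion, assuming the annotations store boxes as xmin, ymin, xmax, ymax in pixels (`normalize_box` is a hypothetical helper):

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Convert a pixel-coordinate box to coordinates relative to image size."""
    if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
        raise ValueError("box lies outside the image or is empty")
    return (xmin / float(width), ymin / float(height),
            xmax / float(width), ymax / float(height))


# A box covering the right half of a 480x270 image, below the top 27 rows.
print(normalize_box(240, 27, 480, 270, 480, 270))  # (0.5, 0.1, 1.0, 1.0)
```

Rejecting out-of-range boxes early is a design choice of this sketch; clipping them to the image would be an alternative.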

You can also store your labels in a CSV file with the following format:

filename,width,height,class,xmin,ymin,xmax,ymax
cam_image1.jpg,480,270,queen,173,24,260,137
cam_image1.jpg,480,270,queen,165,135,253,251
cam_image1.jpg,480,270,ten,255,96,337,208
cam_image10.jpg,960,540,ten,501,116,700,353
cam_image10.jpg,960,540,queen,261,124,453,370
cam_image11.jpg,960,540,nine,225,96,490,396
cam_image12.jpg,960,540,king,362,149,560,389
cam_image13.jpg,960,540,jack,349,142,550,388
cam_image14.jpg,960,540,jack,297,167,512,420
cam_image15.jpg,960,540,ace,367,181,589,457
cam_image16.jpg,960,540,ace,303,155,525,456
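
Training and evaluation use separate label files, and one image can span several rows, so a split should keep all rows of an image together. The sketch below is one possible way to do this, not the method used here; `split_by_filename`, the 20% test fraction, and the seed are all assumptions.

```python
import random


def split_by_filename(rows, test_fraction=0.2, seed=0):
    """Split label rows into (train, test) so each image lands in one split.

    `rows` is a list of dicts that each have a 'filename' key, for example
    the rows produced by csv.DictReader.
    """
    filenames = sorted({row["filename"] for row in rows})
    random.Random(seed).shuffle(filenames)  # deterministic for a given seed
    n_test = max(1, int(len(filenames) * test_fraction))
    test_files = set(filenames[:n_test])
    train = [row for row in rows if row["filename"] not in test_files]
    test = [row for row in rows if row["filename"] in test_files]
    return train, test
```

The two lists can then be written out with csv.DictWriter, using the header shown above.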

At this point you need three files: the label map, train.csv, and test.csv. Then use the following program to generate the TFRecord files:

"""
Usage:
  # From tensorflow/models/
  # Create train data:
  python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train_img --output_path=train.record

  # Create test data:
  python generate_tfrecord.py --csv_input=images/test_labels.csv  --image_dir=images/test_img --output_path=test.record
"""
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('image_dir', '', 'Path to the image directory')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


# TO-DO replace this with label map
# Load the class list once; ids start at 1 to match the label map from section 1.1.
WORDS = pd.read_csv("/home/jamesben/relationship_vrd/data/sub_obj_word_count.txt",
                    header=None, names=["name", "freq"]).name.tolist()
WORD2IX = {name: ix for ix, name in enumerate(WORDS, start=1)}


def class_text_to_int(row_label):
    return WORD2IX[row_label]

def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()

Then run the commands from the docstring above to generate train.record and test.record. This script is the recommended way to generate them.
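
A quick sanity check on the generated files is to count the records they contain; for the evaluation file, that count is what num_examples in eval_config (section 1.3) refers to. The sketch below relies only on the TFRecord framing (an 8-byte little-endian length, a 4-byte CRC, the payload, then another 4-byte CRC) and does not verify the CRCs, so treat it as a rough check rather than a validator. `count_tfrecord_records` is a hypothetical helper.

```python
import struct


def count_tfrecord_records(path):
    """Count the records in a TFRecord file without importing TensorFlow."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte CRC of the length
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            body = f.read(length + 4)  # payload + 4-byte CRC of the payload
            if len(body) < length + 4:
                raise ValueError("truncated record body")
            count += 1
    return count
```

For example, `count_tfrecord_records("test.record")` should equal the number of distinct filenames in the test CSV, since the script writes one record per image.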

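1.3 Modifying the samples/configs/*.config file

This file configures the model, the training process, and the input/output parameters. The key fields to change are num_classes in model, fine_tune_checkpoint in train_config, and the train_input_reader, eval_config, and eval_input_reader sections.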

model {
  faster_rcnn {
    num_classes: 100
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "test_ckpt/faster_rcnn_resnet101_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "object_detection/vrd_tfrecord/vrd_train.record"
  }
  label_map_path: "object_detection/data/vrd_labelmap.pbtxt"
}

eval_config: {
  num_examples: 955  # Note: this is the number of images in the test set
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "object_detection/vrd_tfrecord/vrd_val.record"
  }
  label_map_path: "object_detection/data/vrd_labelmap.pbtxt"
  shuffle: false
  num_readers: 1
}
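
When the same sample config is reused for several datasets, the two scalar fields can be patched with a small script rather than by hand. This is only a sketch that treats the config as plain text; `patch_pipeline_config` is a hypothetical helper, and it assumes each field appears exactly once, as in the config above.

```python
import re


def patch_pipeline_config(config_text, num_classes, fine_tune_checkpoint):
    """Return config_text with num_classes and fine_tune_checkpoint replaced."""
    config_text, class_hits = re.subn(
        r"num_classes:\s*\d+",
        lambda m: "num_classes: %d" % num_classes,
        config_text)
    # A function is used as the replacement so that backslashes in the
    # checkpoint path are not interpreted as regex escapes.
    config_text, ckpt_hits = re.subn(
        r'fine_tune_checkpoint:\s*"[^"]*"',
        lambda m: 'fine_tune_checkpoint: "%s"' % fine_tune_checkpoint,
        config_text)
    if class_hits != 1 or ckpt_hits != 1:
        raise ValueError("expected exactly one match for each field")
    return config_text
```

The input paths and label_map_path entries could be handled the same way, but they occur in both the train and eval readers, so they need more specific patterns.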

1.4 Setting the command-line arguments for training

Set the following arguments:

--train_dir=train_dir \
--pipeline_config_path=pipeline_config_path

2. Evaluating the trained model

2.1 Exporting the trained ckpt model to a pb file

After training finishes, you get the following three files:

  • model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001
  • model.ckpt-${CHECKPOINT_NUMBER}.index
  • model.ckpt-${CHECKPOINT_NUMBER}.meta
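
The training directory usually holds several checkpoints, and the export step needs the prefix of one of them. A small sketch for picking the most recent one from a directory listing, assuming the file naming shown above (`latest_checkpoint_prefix` is a hypothetical helper):

```python
import re

CKPT_INDEX = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_prefix(filenames):
    """Return 'model.ckpt-N' for the largest N that has an .index file."""
    steps = [int(m.group(1)) for m in map(CKPT_INDEX.match, filenames) if m]
    if not steps:
        return None  # no checkpoint found
    return "model.ckpt-%d" % max(steps)


files = ["model.ckpt-369.index", "model.ckpt-369.meta",
         "model.ckpt-1000.index", "checkpoint"]
print(latest_checkpoint_prefix(files))  # model.ckpt-1000
```

The step numbers are compared as integers, so model.ckpt-1000 correctly ranks after model.ckpt-369. In practice the list would come from `os.listdir(train_dir)`.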

Run the export_inference_graph.py script:

# From tensorflow/models/research/
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/ssd_inception_v2.config \
    --trained_checkpoint_prefix path/to/model.ckpt-369 \
    --output_directory path/to/exported_model_directory

This produces a frozen_inference_graph.pb file in the output_directory.

2.2 Running inference

Run the infer_detections script:

# From tensorflow/models/research/oid
SPLIT=validation  # or test
TF_RECORD_FILES=$(ls -1 ${SPLIT}_tfrecords/* | tr '\n' ',')  # collect all TFRecord files

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection.inference.infer_detections \
  --input_tfrecord_paths=$TF_RECORD_FILES \
  --output_tfrecord_path=${SPLIT}_detections.tfrecord \
  --inference_graph=faster_rcnn_inception_resnet_v2_atrous_oid/frozen_inference_graph.pb \
  --discard_image_pixels  # the detections are only used to compute mAP, so the image contents need not be saved

When the run finishes, you get a validation_detections.tfrecord file, which is then used to compute mAP.
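
Note that the `ls ... | tr '\n' ','` pipeline above leaves a trailing comma at the end of the list. If that is a concern, the comma-separated list can be built in Python instead. A sketch, assuming the same ${SPLIT}_tfrecords directory (`tfrecord_path_list` is a hypothetical helper):

```python
import glob
import os


def tfrecord_path_list(directory):
    """Return the files in `directory` as a sorted, comma-separated string."""
    paths = sorted(p for p in glob.glob(os.path.join(directory, "*"))
                   if os.path.isfile(p))
    return ",".join(paths)
```

The result of `tfrecord_path_list("validation_tfrecords")` can be passed as the value of --input_tfrecord_paths.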

2.3 Generating the metrics configuration files

# From tensorflow/models/research/oid
SPLIT=validation  # or test
NUM_SHARDS=1  # Set to NUM_GPUS if using the parallel evaluation script above

mkdir -p ${SPLIT}_eval_metrics

echo "
label_map_path: '../object_detection/data/oid_bbox_trainable_label_map.pbtxt'
tf_record_input_reader: { input_path: '${SPLIT}_detections.tfrecord@${NUM_SHARDS}' }
" > ${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt

echo "
metrics_set: 'coco_detection_metrics'
" > ${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt

metrics_set accepts the following options:

  • pascal_voc_detection_metrics
  • weighted_pascal_voc_detection_metrics
  • pascal_voc_instance_segmentation_metrics
  • open_images_detection_metrics
  • coco_detection_metrics
  • coco_mask_metrics

After the script finishes, two configuration files are generated:

  • validation_eval_config.pbtxt
  • validation_input_config.pbtxt

Both configuration files are used when producing the evaluation results.
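
The same two files can be written from Python, which makes it easier to switch metrics_set or the label map. This sketch mirrors the shell snippet above; `write_eval_configs` and its `base_dir` parameter are hypothetical.

```python
import os


def write_eval_configs(split, label_map_path, num_shards=1,
                       metrics_set="coco_detection_metrics", base_dir="."):
    """Write <split>_input_config.pbtxt and <split>_eval_config.pbtxt."""
    out_dir = os.path.join(base_dir, "%s_eval_metrics" % split)
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    input_config = (
        "label_map_path: '%s'\n"
        "tf_record_input_reader: { input_path: '%s_detections.tfrecord@%d' }\n"
        % (label_map_path, split, num_shards))
    eval_config = "metrics_set: '%s'\n" % metrics_set
    input_path = os.path.join(out_dir, "%s_input_config.pbtxt" % split)
    eval_path = os.path.join(out_dir, "%s_eval_config.pbtxt" % split)
    with open(input_path, "w") as f:
        f.write(input_config)
    with open(eval_path, "w") as f:
        f.write(eval_config)
    return input_path, eval_path
```

For the example above, the call would be `write_eval_configs("validation", "../object_detection/data/oid_bbox_trainable_label_map.pbtxt")`.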

2.4 Computing the evaluation metrics

Run the following script:

# From tensorflow/models/research/oid
SPLIT=validation  # or test

PYTHONPATH=$PYTHONPATH:$(readlink -f ..) \
python -m object_detection.metrics.offline_eval_map_corloc \
  --eval_dir=${SPLIT}_eval_metrics \
  --eval_config_path=${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt \
  --input_config_path=${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt

When the run finishes, the evaluation results are printed and also written to the file metrics.csv.
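
To compare runs, metrics.csv can be loaded into a dictionary. The sketch below assumes that each row has the form `metric_name,value`; that layout is an assumption, so check it against your own file first. `read_metrics` is a hypothetical helper.

```python
import csv


def read_metrics(path):
    """Read rows of the assumed form `metric_name,value` into a dict."""
    metrics = {}
    with open(path) as f:
        for row in csv.reader(f):
            if len(row) != 2:
                continue  # skip blank or unexpected rows
            metrics[row[0]] = float(row[1])
    return metrics
```

With the commands above, the file would be read with `read_metrics("validation_eval_metrics/metrics.csv")`.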

3. Monitoring training and overfitting in TensorBoard

To view results in TensorBoard, organize the data as the official documentation recommends:

+data(folder)
  -label_map file
  -train TFRecord file
  -eval TFRecord file
+models(folder)
  + model(folder)
    -pipeline config file
    +train(folder)
    +eval(folder)
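
The empty directory skeleton can be created up front so that the label map, TFRecord files, and config only need to be copied in. A minimal sketch; `make_layout` and the default folder name `model` are assumptions taken from the tree above.

```python
import os


def make_layout(root, model_name="model"):
    """Create data/ and models/<model_name>/{train,eval} under `root`."""
    dirs = [
        os.path.join(root, "data"),
        os.path.join(root, "models", model_name, "train"),
        os.path.join(root, "models", model_name, "eval"),
    ]
    for d in dirs:
        if not os.path.isdir(d):
            os.makedirs(d)
    return dirs
```

The train and eval folders created here are the ones later passed as the training and evaluation output directories.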

Then start training with the following command:

# From the tensorflow/models/research/ directory
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --train_dir=${PATH_TO_TRAIN_DIR}

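Here ${PATH_TO_YOUR_PIPELINE_CONFIG} is the path to the config file described earlier, and ${PATH_TO_TRAIN_DIR} is the directory where checkpoints and events are written during training, i.e. the train directory above.
While training is running, start the evaluation job: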

# From the tensorflow/models/research/ directory
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
    --eval_dir=${PATH_TO_EVAL_DIR}

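The evaluation job periodically picks up the latest checkpoint in the train directory and evaluates it on the test data. Here ${PATH_TO_YOUR_PIPELINE_CONFIG} is the path to the config file, ${PATH_TO_TRAIN_DIR} is the directory holding the training checkpoints, and ${PATH_TO_EVAL_DIR} is the directory where the evaluation event files are written.

Once both jobs are running, you can follow the model's performance in TensorBoard. Go to the models directory above and run: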

tensorboard --logdir=${PATH_TO_MODEL_DIRECTORY}

Here ${PATH_TO_MODEL_DIRECTORY} is the parent directory of the train and eval directories, i.e. the model directory above.

TensorBoard then shows the loss and mAP for both train and eval.
