Nvidia Jetson Nano / TX2: TensorFlow-GPU SSD setup, training your own dataset, and testing

Table of Contents

Installing Python 3

Installing TensorFlow

Training your own dataset with SSD.TensorFlow

Creating a VOC dataset

    Overview

    Generating the training data files

Training

Testing


 

Installing Python 3

First, install Python 3:

sudo apt-get install python3

Next install pip3; it can be installed with apt-get:

sudo apt-get install python3-pip

This may fail with the following error:

The following packages have unmet dependencies:
 python3-pip : Depends: python-pip-whl (= 9.0.1-2) but 9.0.1-2.3~ubuntu1.18.04.1 is to be installed

The problem is that the python-pip-whl dependency (shared with Python 2's pip) is already installed at a newer version than python3-pip expects, which blocks the install.

First remove the Python 2 pip packages:

sudo apt-get remove python-pip
sudo apt-get remove python-pip-whl

Then run the install again:

sudo apt-get install python3-pip

Afterwards you can simply reinstall the Python 2 pip:

sudo apt-get install python-pip

 

Alternatively, you can install pip from the official bootstrap script at this link:

https://bootstrap.pypa.io/get-pip.py

Copy the content into a file named get-pip.py.

Install it with:

sudo python3 get-pip.py

After installing, you can upgrade pip:

python3 -m pip install --upgrade pip

When done, check the version:

pip3 -V

Installing TensorFlow

Here we install tensorflow-gpu with pip3; you could also build from source from GitHub, but that is not covered here.

Installing directly with a plain pip command, as you would on a PC, does not work: the package cannot be found for this platform. Download the wheel from NVIDIA's site instead:

https://developer.nvidia.com/embedded/downloads#?search=tensorflow&tx=$product,jetson_nano

Open the page and pick the version to download.

I chose the 1.13.1+nv19.5 build.

Downloads through pip can be slow; switching to a mirror inside China helps.

Edit ~/.pip/pip.conf (create the directory and the file if they don't exist):

cd ~
mkdir .pip
cd .pip
touch pip.conf

Write the following into pip.conf and save it:

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple

[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
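
If you prefer not to edit pip.conf, the mirror can also be selected per command with pip's -i option:

pip3 install <package-name> -i https://pypi.tuna.tsinghua.edu.cn/simple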

 

Once the wheel has downloaded, change into the download directory and install it:

pip3 install tensorflow_gpu-1.13.1+nv19.5-cp36-cp36m-linux_aarch64.whl

Wait for the installation to finish.

Install the dependencies:

pip3 install matplotlib --user
pip3 install scipy --user

Building scipy from source here may fail; installing it with apt works around that:

sudo apt-get install python3-scipy

 

Test whether the installation succeeded.

Create a file tensorflowTest.py:

touch tensorflowTest.py

Write the following into it:

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Create a TensorFlow constant holding the string 'Hello Google Tensorflow!', named greeting
greeting = tf.constant('Hello Google Tensorflow!')
# Start a session
sess = tf.Session()
# Evaluate the greeting op in the session
result = sess.run(greeting)
print(result)
sess.close()

Run it:

python3 tensorflowTest.py

If the terminal prints

Hello Google Tensorflow!

then it succeeded.

Possible errors

Errors when running the test:

ImportError: numpy.core._multiarray_umath failed to import
ImportError: numpy.core.umath failed to import

This is a numpy version problem: I saw this error with numpy 1.14.0 and also with 1.13.3, but 1.17.4 works fine, so it is worth checking which version you have.
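
To check which version is installed and, if necessary, pin a known-good one:

pip3 show numpy
pip3 install numpy==1.17.4 --user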

Errors during pip installs:

ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/cycler.py'
Consider using the `--user` option or check the permissions.

This is a permissions issue; use sudo, or append --user.

For example:

pip3 install scipy --user
sudo python3 -m pip install scipy

Training your own dataset with SSD.TensorFlow

References:

https://blog.csdn.net/czksnk/article/details/83010533

https://tensorflow.google.cn/install/source

https://blog.csdn.net/weixin_39881922/article/details/80569803

https://blog.csdn.net/w5688414/article/details/78395177

https://blog.csdn.net/liuyan20062010/article/details/78905517


    Following the sites above, after setup I could only run the author's pretrained model for prediction; training on my own data never converged, and prediction produced boxes all over the image. I don't know whether I did something wrong, but even though that GitHub project has thousands of stars, others have reported convergence problems too. I later switched to a different project, which worked, and is even a bit simpler to set up than the ones above.

GitHub page: https://github.com/HiKapok/SSD.TensorFlow/tree/AbsoluteCoord

There are other branches, but the dataset I labeled uses absolute coordinates, as does VOC, so download the AbsoluteCoord branch.

You can also download my modified version here, which includes a labeled dataset and test images:

Link: https://download.csdn.net/download/ourkix/12038648

Creating a VOC dataset

    Overview

The directory structure looks like this:

|--VOCseahorse
   |--VOC2007
      |--Annotations
      |--ImageSets
         |--Layout
         |--Main
         |--Segmentation
      |--JPEGImages

    · Annotations holds the XML files, i.e. the finished annotations (object positions and classes), produced with the labelImg tool.

    · ImageSets/Main holds text files. There are four of them: test.txt, val.txt, train.txt, and trainval.txt, listing the file names of the test, validation, training, and train+val images respectively. Layout and Segmentation are not needed for now; you can copy the same files from Main into them.

    · JPEGImages holds the images.

If you downloaded my upload, this data is already in the dataset directory.
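
For reference, a labelImg annotation in Annotations looks roughly like this (a minimal sketch; the filename, image size, and box coordinates are made-up examples). Note that the bndbox values are absolute pixel coordinates, which is why the AbsoluteCoord branch is used:

<annotation>
    <folder>VOC2007</folder>
    <filename>000001.jpg</filename>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>seahorse</name>
        <difficult>0</difficult>
        <bndbox>
            <xmin>48</xmin>
            <ymin>240</ymin>
            <xmax>195</xmax>
            <ymax>371</ymax>
        </bndbox>
    </object>
</annotation>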

    Generating the training data files

    First change into the SSD.TensorFlow-AbsoluteCoord directory, then into the dataset directory.

    Create a file splitImage.py; it splits the images into several parts, some for training and some for testing.

import os
import random

xmlfilepath = r'./VOCseahorse/VOC2007/Annotations'
saveBasePath = r'./VOCseahorse/'   # joined with 'VOC2007/ImageSets/...' below

trainval_percent = 0.8   # fraction of images used for train+val (the rest is test)
train_percent = 0.7      # fraction of trainval used for train (the rest is val)
total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

print("train and val size", tv)
print("train size", tr)
ftrainval = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

Change into ImageSets/Main and create four files:

touch train.txt
touch test.txt
touch val.txt
touch trainval.txt

Go back to the dataset directory and run the script (this assumes you have the data and the labeled XML files):

python3 splitImage.py

That completes the split.
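
To sanity-check the split, count the lines in the generated files; each line is one image name:

wc -l VOCseahorse/VOC2007/ImageSets/Main/*.txt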

 

    Next, modify dataset_common.py to set the number of classes and the numbers of training and validation images. If you are using my dataset, comment out the original labels and change the file as shown below. For the train and val counts, open train.txt and val.txt in the Main folder and count the images, or use the sketch after the code block below.

#VOC_LABELS = {
#    'none': (0, 'Background'),
#    'aeroplane': (1, 'Vehicle'),
#    'bicycle': (2, 'Vehicle'),
#    'bird': (3, 'Animal'),
#    'boat': (4, 'Vehicle'),
#    'bottle': (5, 'Indoor'),
#    'bus': (6, 'Vehicle'),
#    'car': (7, 'Vehicle'),
#    'cat': (8, 'Animal'),
#    'chair': (9, 'Indoor'),
#    'cow': (10, 'Animal'),
#    'diningtable': (11, 'Indoor'),
#    'dog': (12, 'Animal'),
#    'horse': (13, 'Animal'),
#    'motorbike': (14, 'Vehicle'),
#    'person': (15, 'Person'),
#    'pottedplant': (16, 'Indoor'),
#    'sheep': (17, 'Animal'),
#    'sofa': (18, 'Indoor'),
#    'train': (19, 'Vehicle'),
#    'tvmonitor': (20, 'Indoor'),
#}

#COCO_LABELS = {
#    "bench":  (14, 'outdoor') ,
#    "skateboard":  (37, 'sports') ,
#    "toothbrush":  (80, 'indoor') ,
#    "person":  (1, 'person') ,
#    "donut":  (55, 'food') ,
#    "none":  (0, 'background') ,
#    "refrigerator":  (73, 'appliance') ,
#    "horse":  (18, 'animal') ,
#    "elephant":  (21, 'animal') ,
#    "book":  (74, 'indoor') ,
#    "car":  (3, 'vehicle') ,
#    "keyboard":  (67, 'electronic') ,
#    "cow":  (20, 'animal') ,
#    "microwave":  (69, 'appliance') ,
#    "traffic light":  (10, 'outdoor') ,
#    "tie":  (28, 'accessory') ,
#    "dining table":  (61, 'furniture') ,
#    "toaster":  (71, 'appliance') ,
#    "baseball glove":  (36, 'sports') ,
#    "giraffe":  (24, 'animal') ,
#    "cake":  (56, 'food') ,
#    "handbag":  (27, 'accessory') ,
#    "scissors":  (77, 'indoor') ,
#    "bowl":  (46, 'kitchen') ,
#    "couch":  (58, 'furniture') ,
#    "chair":  (57, 'furniture') ,
#    "boat":  (9, 'vehicle') ,
#    "hair drier":  (79, 'indoor') ,
#    "airplane":  (5, 'vehicle') ,
#    "pizza":  (54, 'food') ,
#    "backpack":  (25, 'accessory') ,
#    "kite":  (34, 'sports') ,
#    "sheep":  (19, 'animal') ,
#    "umbrella":  (26, 'accessory') ,
#    "stop sign":  (12, 'outdoor') ,
#    "truck":  (8, 'vehicle') ,
#    "skis":  (31, 'sports') ,
#    "sandwich":  (49, 'food') ,
#    "broccoli":  (51, 'food') ,
#    "wine glass":  (41, 'kitchen') ,
#    "surfboard":  (38, 'sports') ,
#    "sports ball":  (33, 'sports') ,
#    "cell phone":  (68, 'electronic') ,
#    "dog":  (17, 'animal') ,
#    "bed":  (60, 'furniture') ,
#    "toilet":  (62, 'furniture') ,
#    "fire hydrant":  (11, 'outdoor') ,
#    "oven":  (70, 'appliance') ,
#    "zebra":  (23, 'animal') ,
#    "tv":  (63, 'electronic') ,
#    "potted plant":  (59, 'furniture') ,
#    "parking meter":  (13, 'outdoor') ,
#    "spoon":  (45, 'kitchen') ,
#    "bus":  (6, 'vehicle') ,
#    "laptop":  (64, 'electronic') ,
#    "cup":  (42, 'kitchen') ,
#    "bird":  (15, 'animal') ,
#    "sink":  (72, 'appliance') ,
#    "remote":  (66, 'electronic') ,
#    "bicycle":  (2, 'vehicle') ,
#    "tennis racket":  (39, 'sports') ,
#    "baseball bat":  (35, 'sports') ,
#    "cat":  (16, 'animal') ,
#    "fork":  (43, 'kitchen') ,
#    "suitcase":  (29, 'accessory') ,
#    "snowboard":  (32, 'sports') ,
#    "clock":  (75, 'indoor') ,
#    "apple":  (48, 'food') ,
#    "mouse":  (65, 'electronic') ,
#    "bottle":  (40, 'kitchen') ,
#    "frisbee":  (30, 'sports') ,
#    "carrot":  (52, 'food') ,
#    "bear":  (22, 'animal') ,
#    "hot dog":  (53, 'food') ,
#    "teddy bear":  (78, 'indoor') ,
#    "knife":  (44, 'kitchen') ,
#    "train":  (7, 'vehicle') ,
#    "vase":  (76, 'indoor') ,
#    "banana":  (47, 'food') ,
#    "motorcycle":  (4, 'vehicle') ,
#    "orange":  (50, 'food')
#  }

VOC_LABELS = {
    'none': (0, 'Background'),
    'seahorse': (1, 'animal')
}


# use dataset_inspect.py to get these summary
data_splits_num = {
    'train': 5,
    'val': 3,
}
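
If you don't want to count by hand, here is a minimal sketch that reads the counts for data_splits_num from the split files (it assumes it is run from the dataset directory, like splitImage.py above):

# countSplits.py - print the number of images in each split
base = './VOCseahorse/VOC2007/ImageSets/Main/'
for split in ('train', 'val'):
    with open(base + split + '.txt') as f:
        print(split, sum(1 for line in f if line.strip()))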

 

 

    Modify convert_tfrecords.py. This only changes the default values; you can skip it and pass the paths as arguments at run time instead.

tf.app.flags.DEFINE_string('dataset_directory', './dataset/VOC',
                           'All datas directory')
tf.app.flags.DEFINE_string('train_splits', 'VOC2007',
                           'Comma-separated list of the training data sub-directory')
tf.app.flags.DEFINE_string('validation_splits', 'VOC2007TEST',
                           'Comma-separated list of the validation data sub-directory')
tf.app.flags.DEFINE_string('output_directory', './dataset/tfrecords',
                           'Output data directory')

Let's wrap this in a shell script so the data paths are easy to change. Switch to the SSD.TensorFlow-AbsoluteCoord directory.

Create tf_converdata.sh:

touch tf_converdata.sh
chmod +x tf_converdata.sh

 Write the following content: dataset_directory is the data directory and output_directory is where the converted training data goes. Create any directories that don't exist yet.

python3 dataset/convert_tfrecords.py --dataset_directory=dataset/VOCseahorse/ --output_directory=./dataset/tfrecords/

Run it:

./tf_converdata.sh

With that, the dataset is ready.
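
You can check that the tfrecord shards were written (the exact file names may differ):

ls ./dataset/tfrecords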

Training

    Download the pretrained model. If you are using my uploaded project, it is already in the model folder.

Download link: Baidu Netdisk, extraction code: tg64

A TX2 has enough memory to train without problems; the Nano struggles a bit.

On the Nano, turn off the graphical interface; do this whenever you train:


# Ubuntu: disable the graphical user interface
sudo systemctl set-default multi-user.target
sudo reboot
 
# Ubuntu: re-enable the graphical user interface
sudo systemctl set-default graphical.target
sudo reboot
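
On the Nano it can also help to add a swap file before training. This is a common Jetson workaround, not something specific to this project:

# create and enable a 4 GB swap file
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile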

Switch to the SSD.TensorFlow-AbsoluteCoord directory and create a train.sh script:

touch train.sh
chmod +x train.sh

Write the following and save:

python3 train_ssd.py \
	--num_classes=2 \
	--batch_size=4 \
	--max_number_of_steps=5000 \
	--data_dir=./dataset/tfrecords \
	--model_dir=./logs \
	--save_summary_steps=2500 \
	--save_checkpoints_steps=5000 \
	--learning_rate=1e-3 \
	--checkpoint_path=./model

 Start training:

./train.sh

    This may error out, usually from running out of memory; just rerun train.sh, and retry a few times if needed.
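
Since train.sh writes summaries to ./logs, you can watch the loss curve with TensorBoard, if you have it installed:

tensorboard --logdir=./logs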

Testing

     I trained for 5000 steps here.

Modify simple_ssd_demo.py; again, if you downloaded my upload, no changes are needed. There is an images directory containing the test images.

# Copyright 2018 Changan Wang

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =============================================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys
import cv2

import tensorflow as tf
from scipy.misc import imread, imsave, imshow, imresize
import numpy as np

from net import ssd_net

from dataset import dataset_common
from preprocessing import ssd_preprocessing
from utility import anchor_manipulator
from utility import draw_toolbox
from utility import bbox_util

# scaffold related configuration
tf.app.flags.DEFINE_integer(
    'num_classes', 21, 'Number of classes to use in the dataset.')
# model related configuration
tf.app.flags.DEFINE_integer(
    'train_image_size', 300,
    'The size of the input image for the model to use.')
tf.app.flags.DEFINE_string(
    'data_format', 'channels_last', # 'channels_first' or 'channels_last'
    'A flag to override the data format used in the model. channels_first '
    'provides a performance boost on GPU but is not always compatible '
    'with CPU. If left unspecified, the data format will be chosen '
    'automatically based on whether TensorFlow was built for CPU or GPU.')
tf.app.flags.DEFINE_float(
    'select_threshold', 0.2, 'Class-specific confidence score threshold for selecting a box.')
tf.app.flags.DEFINE_float(
    'min_size', 4., 'The min size of bboxes to keep.')
tf.app.flags.DEFINE_float(
    'nms_threshold', 0.45, 'Matching threshold in NMS algorithm.')
tf.app.flags.DEFINE_integer(
    'nms_topk', 20, 'Number of total object to keep after NMS.')
tf.app.flags.DEFINE_integer(
    'keep_topk', 200, 'Number of total object to keep for each image before nms.')
# checkpoint related configuration
tf.app.flags.DEFINE_string(
    'checkpoint_path', './logs',
    'The path to a checkpoint from which to fine-tune.')
tf.app.flags.DEFINE_string(
    'model_scope', 'ssd300',
    'Model scope name used to replace the name_scope in checkpoint.')

FLAGS = tf.app.flags.FLAGS
#CUDA_VISIBLE_DEVICES

def get_checkpoint():
    if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
        checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
    else:
        checkpoint_path = FLAGS.checkpoint_path

    return checkpoint_path

def main(_):
    with tf.Graph().as_default():
        out_shape = [FLAGS.train_image_size] * 2

        image_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
        shape_input = tf.placeholder(tf.int32, shape=(2,))

        features, output_shape = ssd_preprocessing.preprocess_for_eval(image_input, out_shape, data_format=FLAGS.data_format, output_rgb=False)
        features = tf.expand_dims(features, axis=0)
        output_shape = tf.expand_dims(output_shape, axis=0)

        all_anchor_scales = [(30.,), (60.,), (112.5,), (165.,), (217.5,), (270.,)]
        all_extra_scales = [(42.43,), (82.17,), (136.23,), (189.45,), (242.34,), (295.08,)]
        all_anchor_ratios = [(1., 2., .5), (1., 2., 3., .5, 0.3333), (1., 2., 3., .5, 0.3333), (1., 2., 3., .5, 0.3333), (1., 2., .5), (1., 2., .5)]
        # all_anchor_ratios = [(2., .5), (2., 3., .5, 0.3333), (2., 3., .5, 0.3333), (2., 3., .5, 0.3333), (2., .5), (2., .5)]

        with tf.variable_scope(FLAGS.model_scope, default_name=None, values=[features], reuse=tf.AUTO_REUSE):
            backbone = ssd_net.VGG16Backbone(FLAGS.data_format)
            feature_layers = backbone.forward(features, training=False)
            with tf.device('/cpu:0'):
                anchor_encoder_decoder = anchor_manipulator.AnchorEncoder(positive_threshold=None, ignore_threshold=None, prior_scaling=[0.1, 0.1, 0.2, 0.2])

                if FLAGS.data_format == 'channels_first':
                    all_layer_shapes = [tf.shape(feat)[2:] for feat in feature_layers]
                else:
                    all_layer_shapes = [tf.shape(feat)[1:3] for feat in feature_layers]
                all_layer_strides = [8, 16, 32, 64, 100, 300]
                total_layers = len(all_layer_shapes)
                anchors_height = list()
                anchors_width = list()
                anchors_depth = list()
                for ind in range(total_layers):
                    _anchors_height, _anchors_width, _anchor_depth = anchor_encoder_decoder.get_anchors_width_height(all_anchor_scales[ind], all_extra_scales[ind], all_anchor_ratios[ind], name='get_anchors_width_height{}'.format(ind))
                    anchors_height.append(_anchors_height)
                    anchors_width.append(_anchors_width)
                    anchors_depth.append(_anchor_depth)
                anchors_ymin, anchors_xmin, anchors_ymax, anchors_xmax, _ = anchor_encoder_decoder.get_all_anchors(tf.squeeze(output_shape, axis=0),
                                                                                anchors_height, anchors_width, anchors_depth,
                                                                                [0.5] * total_layers, all_layer_shapes, all_layer_strides,
                                                                                [0.] * total_layers, [False] * total_layers)
            location_pred, cls_pred = ssd_net.multibox_head(feature_layers, FLAGS.num_classes, anchors_depth, data_format=FLAGS.data_format)
            if FLAGS.data_format == 'channels_first':
                cls_pred = [tf.transpose(pred, [0, 2, 3, 1]) for pred in cls_pred]
                location_pred = [tf.transpose(pred, [0, 2, 3, 1]) for pred in location_pred]

            cls_pred = [tf.reshape(pred, [-1, FLAGS.num_classes]) for pred in cls_pred]
            location_pred = [tf.reshape(pred, [-1, 4]) for pred in location_pred]

            cls_pred = tf.concat(cls_pred, axis=0)
            location_pred = tf.concat(location_pred, axis=0)

        with tf.device('/cpu:0'):
            bboxes_pred = anchor_encoder_decoder.decode_anchors(location_pred, anchors_ymin, anchors_xmin, anchors_ymax, anchors_xmax)
            selected_bboxes, selected_scores = bbox_util.parse_by_class(tf.squeeze(output_shape, axis=0), cls_pred, bboxes_pred,
                                                            FLAGS.num_classes, FLAGS.select_threshold, FLAGS.min_size,
                                                            FLAGS.keep_topk, FLAGS.nms_topk, FLAGS.nms_threshold)

            labels_list = []
            scores_list = []
            bboxes_list = []
            for k, v in selected_scores.items():
                labels_list.append(tf.ones_like(v, tf.int32) * k)
                scores_list.append(v)
                bboxes_list.append(selected_bboxes[k])
            all_labels = tf.concat(labels_list, axis=0)
            all_scores = tf.concat(scores_list, axis=0)
            all_bboxes = tf.concat(bboxes_list, axis=0)

        saver = tf.train.Saver()
        with tf.Session() as sess:
            init = tf.global_variables_initializer()
            sess.run(init)

            saver.restore(sess, get_checkpoint())

            path = './images/'
            image_names = sorted(os.listdir(path))

            #def detecte(image_names):
            for f in image_names:
                np_image = imread(path + "/" +  f)
                labels_, scores_, bboxes_, output_shape_ = sess.run([all_labels, all_scores, all_bboxes, output_shape], feed_dict = {image_input : np_image, shape_input : np_image.shape[:-1]})
                bboxes_[:, 0] = bboxes_[:, 0] * np_image.shape[0] / output_shape_[0, 0]
                bboxes_[:, 1] = bboxes_[:, 1] * np_image.shape[1] / output_shape_[0, 1]
                bboxes_[:, 2] = bboxes_[:, 2] * np_image.shape[0] / output_shape_[0, 0]
                bboxes_[:, 3] = bboxes_[:, 3] * np_image.shape[1] / output_shape_[0, 1]

                img_to_draw = draw_toolbox.bboxes_draw_on_img(np_image, labels_, scores_, bboxes_, thickness=2)
                #print(bboxes_)
                #imshow(img_to_draw)
                cv2.imshow("SSD", img_to_draw)
                k = cv2.waitKey(0) & 0xff
                    #Exit if ESC pressed
                if k == 27 : break


#            np_image = imread('./demo/test.jpg')
#            labels_, scores_, bboxes_, output_shape_ = sess.run([all_labels, all_scores, all_bboxes, output_shape], feed_dict = {image_input : np_image, shape_input : np_image.shape[:-1]})
#            bboxes_[:, 0] = bboxes_[:, 0] * np_image.shape[0] / output_shape_[0, 0]
#            bboxes_[:, 1] = bboxes_[:, 1] * np_image.shape[1] / output_shape_[0, 1]
#            bboxes_[:, 2] = bboxes_[:, 2] * np_image.shape[0] / output_shape_[0, 0]
#            bboxes_[:, 3] = bboxes_[:, 3] * np_image.shape[1] / output_shape_[0, 1]

#            img_to_draw = draw_toolbox.bboxes_draw_on_img(np_image, labels_, scores_, bboxes_, thickness=2)
            #print(bboxes_)
#            imshow(img_to_draw)
            #imsave('./demo/test_out.jpg', img_to_draw)







if __name__ == '__main__':
  tf.logging.set_verbosity(tf.logging.INFO)
  tf.app.run()

Create a script ssd.sh:

touch ssd.sh
chmod +x ssd.sh

Write the following:

python3 simple_ssd_demo.py \
    --num_classes=2

Run the test:

./ssd.sh

Press any key to advance to the next image; press Esc to exit.
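
By default the demo loads the latest checkpoint from ./logs (see get_checkpoint in the code above). To test a specific checkpoint instead, pass --checkpoint_path explicitly; the file name below assumes TensorFlow's usual checkpoint naming for a 5000-step run:

python3 simple_ssd_demo.py --num_classes=2 --checkpoint_path=./logs/model.ckpt-5000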

