Install python3
First, install python3:
sudo apt-get install python3
Next, install pip3. It can be installed with apt-get:
sudo apt-get install python3-pip
This may fail with an error like:
The following packages have unmet dependencies:
python3-pip : Depends: python-pip-whl (= 9.0.1-2) but 9.0.1-2.3~ubuntu1.18.04.1 is to be installed
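Here the python-pip-whl package installed for Python 2's pip is newer than the version python3-pip depends on, so the installation fails.
First uninstall the Python 2 pip: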
sudo apt-get remove python-pip
sudo apt-get remove python-pip-whl
Then run the installation again:
sudo apt-get install python3-pip
After that, simply reinstall the Python 2 pip:
sudo apt-get install python-pip
Alternatively, download the installer script from this link:
https://bootstrap.pypa.io/get-pip.py
Copy the contents and save them to a file named get-pip.py.
Install command:
sudo python3 get-pip.py
After installation you can upgrade pip:
python3 -m pip install --upgrade pip
Then check the version:
pip3 -V
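Download tensorflow
Here we use pip3 to install tensorflow-gpu. Building from the source on GitHub is also possible, but that is not covered here.
Installing directly with pip as you would on a PC does not work, because no matching package is found, so we download the package from the NVIDIA website instead.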
https://developer.nvidia.com/embedded/downloads#?search=tensorflow&tx=$product,jetson_nano
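Open the page and choose the version to download.
Version 1.13.1 nv19.5 is chosen here.
pip downloads can be slow, so switch to a mirror in China.
Edit ~/.pip/pip.conf (create the directory and the file if they do not exist):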
cd ~
mkdir .pip
cd .pip
touch pip.conf
Write the following into pip.conf and save:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
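The same configuration can be generated from Python. This is a minimal sketch, assuming that trusted-host should name the same host that index-url points at; the helper name write_pip_conf is made up for illustration, and the demo writes to a throwaway directory rather than the real ~/.pip.

```python
import configparser
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def write_pip_conf(conf_path, index_url):
    """Write a minimal pip.conf whose trusted-host matches index_url (hypothetical helper)."""
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    # Assumption: the trusted host is the host part of the index URL.
    config["install"] = {"trusted-host": urlparse(index_url).hostname}
    conf_path = Path(conf_path)
    conf_path.parent.mkdir(parents=True, exist_ok=True)
    with conf_path.open("w") as f:
        config.write(f)
    return conf_path


# Demo in a temporary directory; pass Path.home() / ".pip" / "pip.conf"
# instead to write the real file.
demo_dir = Path(tempfile.mkdtemp())
demo_conf = write_pip_conf(demo_dir / ".pip" / "pip.conf",
                           "https://pypi.tuna.tsinghua.edu.cn/simple")
print(demo_conf.read_text())
```

Deriving the host from the URL keeps the two settings from drifting apart when the mirror is changed later.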
Once the download has finished, change to the download directory and install:
pip3 install tensorflow_gpu-1.13.1+nv19.5-cp36-cp36m-linux_aarch64.whl
Wait for the installation to finish.
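If pip rejects the wheel as unsupported, it can help to compare the tags in the file name with the interpreter in use. The sketch below splits the wheel name used above into its fields and prints the running interpreter's version and machine type; the helper names are made up for illustration, and the comparison assumes CPython.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel file name into name, version, and compatibility tags."""
    stem = filename[:-len(".whl")]
    parts = stem.split("-")
    # The last three fields are the python tag, ABI tag, and platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def interpreter_summary():
    """Describe the running interpreter in a form comparable to wheel tags."""
    return {
        "python": "cp{}{}".format(sys.version_info.major, sys.version_info.minor),
        "machine": platform.machine(),
    }


wheel = "tensorflow_gpu-1.13.1+nv19.5-cp36-cp36m-linux_aarch64.whl"
tags = parse_wheel_tags(wheel)
here = interpreter_summary()
print(tags)
print(here)
print("python tag matches:", tags["python"] == here["python"])
print("platform matches:", tags["platform"].endswith(here["machine"]))
```

On a Jetson with Python 3.6 both checks would be expected to print True; on any other machine a mismatch simply means this wheel is not meant for it.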
Install the dependencies:
pip3 install matplotlib --user
pip3 install scipy --user
Building scipy during this install may fail; installing it with apt solves that:
sudo apt-get install python3-scipy
Test whether the installation succeeded
Create a new file tensorflowTest.py:
touch tensorflowTest.py
Write the following into it:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
# Create a TensorFlow string constant 'Hello Google Tensorflow!' named greeting, used as a graph node
greeting = tf.constant('Hello Google Tensorflow!')
# Start a session
sess = tf.Session()
# Use the session to evaluate the greeting node
result = sess.run(greeting)
print(result)
sess.close()
Run it:
python3 tensorflowTest.py
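The command line should print the greeting (under Python 3 it is shown as a bytes literal):
b'Hello Google Tensorflow!'
If you see this, the installation works.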
Possible errors
Errors when running the test:
ImportError: numpy.core._multiarray_umath failed to import
ImportError: numpy.core.umath failed to import
This is a problem with the numpy library. I got this error with 1.14.0 and also with 1.13.3, while 1.17.4 worked fine, so check which version you have.
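A quick way to check the installed numpy against the versions mentioned above is sketched below. Only the three version numbers come from the text; the helper names are made up for illustration, and versions other than those three are untested here.

```python
import importlib


def version_tuple(version):
    """Turn '1.17.4' into (1, 17, 4), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


KNOWN_GOOD = "1.17.4"               # worked according to the text above
KNOWN_BAD = ("1.13.3", "1.14.0")    # failed according to the text above


def check_numpy():
    """Return a short message describing the installed numpy version."""
    try:
        numpy = importlib.import_module("numpy")
    except ImportError:
        return "numpy is not installed"
    installed = numpy.__version__
    if installed in KNOWN_BAD:
        return "numpy {} is one of the versions reported to fail".format(installed)
    if version_tuple(installed) < version_tuple(KNOWN_GOOD):
        return "numpy {} is older than {}, consider upgrading".format(installed, KNOWN_GOOD)
    return "numpy {} is not older than {}".format(installed, KNOWN_GOOD)


print(check_numpy())
```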
Errors during pip install:
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/cycler.py'
Consider using the `--user` option or check the permissions.
This is a permissions problem. Either use sudo or append --user to the command.
For example:
pip3 install scipy --user
sudo python3 -m pip install scipy
Training SSD.TensorFlow on your own dataset
References:
https://blog.csdn.net/czksnk/article/details/83010533
https://tensorflow.google.cn/install/source
https://blog.csdn.net/weixin_39881922/article/details/80569803
https://blog.csdn.net/w5688414/article/details/78395177
https://blog.csdn.net/liuyan20062010/article/details/78905517
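Following the pages above, I could only run the provided model for prediction. Training on my own data did not converge, and prediction filled the screen with boxes. I don't know whether I did something wrong. That GitHub project has several thousand stars, and I saw other people reporting convergence problems too. I then switched to a different project, which worked and is even a little simpler to set up.
GitHub page: https://github.com/HiKapok/SSD.TensorFlow/tree/AbsoluteCoord
The project has other versions, but the dataset I annotated uses absolute coordinates, and so does VOC, so download the absolute coordinate version.
You can download my modified version here; it includes an annotated dataset and test images.
Link: https://download.csdn.net/download/ourkix/12038648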
Creating the VOC dataset
Introduction
The directory structure looks like this:
|--VOCseahorse
    |--VOC2007
        |--Annotations
        |--ImageSets
            |--Layout
            |--Main
            |--Segmentation
        |--JPEGImages
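·Annotations holds the XML files, that is, the finished annotations (object position and class), created with the labelImg tool.
·ImageSets/Main holds four text files, test.txt, val.txt, train.txt, and trainval.txt, which list the file names of the test, validation, training, and training-plus-validation images respectively. Layout and Segmentation are not needed for now; you can put copies of the files from Main into them.
·JPEGImages holds the images.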
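To make the annotation format concrete, here is a small sketch that reads the object class and the absolute pixel coordinates from a VOC-style XML file of the kind labelImg writes. The sample file name, image size, and box values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation; real files live in Annotations/ and are written by labelImg.
SAMPLE_XML = """<annotation>
  <filename>seahorse_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>seahorse</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>360</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def read_boxes(xml_text):
    """Return (width, height, boxes) with boxes in absolute pixel coordinates."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append({
            "label": obj.find("name").text,
            "xmin": int(box.find("xmin").text),
            "ymin": int(box.find("ymin").text),
            "xmax": int(box.find("xmax").text),
            "ymax": int(box.find("ymax").text),
        })
    return width, height, boxes


width, height, boxes = read_boxes(SAMPLE_XML)
print(width, height, boxes)
```

The box values are pixel positions inside the image, not fractions of its size, which is what "absolute coordinates" refers to here.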
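If you downloaded the files I uploaded, this data is already in the dataset directory.
Generating the training data files
First change to the SSD.TensorFlow-AbsoluteCoord directory, then enter the dataset directory.
Create a file splitImage.py. It splits the images so that one part is used for training and another for testing.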
import os
import random

xmlfilepath = r'./VOCseahorse/VOC2007/Annotations'
saveBasePath = r'./VOCseahorse/VOC2007/'
trainval_percent = 0.8  # share of all images used for train + val
train_percent = 0.7     # share of train + val used for train

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)
print("train and val size", tv)
print("train size", tr)

# saveBasePath already ends in VOC2007/, so join from ImageSets onward
ftrainval = open(os.path.join(saveBasePath, 'ImageSets/Main/trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'ImageSets/Main/test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'ImageSets/Main/train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'ImageSets/Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
Change to ImageSets/Main and create the four files:
touch train.txt
touch test.txt
touch val.txt
touch trainval.txt
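The touch step can also be done from Python, which additionally makes sure the ImageSets/Main directory exists. This is a minimal sketch; prepare_main_dir is a made-up helper, and the demo runs in a temporary directory instead of the real dataset.

```python
import tempfile
from pathlib import Path

SPLIT_FILES = ("train.txt", "test.txt", "val.txt", "trainval.txt")


def prepare_main_dir(voc_root):
    """Create ImageSets/Main under voc_root along with the four split files."""
    main_dir = Path(voc_root) / "ImageSets" / "Main"
    main_dir.mkdir(parents=True, exist_ok=True)
    for name in SPLIT_FILES:
        # Like `touch`: creates the file if missing, leaves existing content alone.
        (main_dir / name).touch()
    return main_dir


# Demo in a throwaway directory; use ./VOCseahorse/VOC2007 for the real dataset.
demo_root = Path(tempfile.mkdtemp()) / "VOCseahorse" / "VOC2007"
main_dir = prepare_main_dir(demo_root)
print(sorted(p.name for p in main_dir.iterdir()))
```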
Go back to the dataset directory and run the script (this requires the images and their annotated XML files to be in place):
python3 splitImage.py
The split is now done.
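The split sizes follow directly from the two percentages in splitImage.py. As a worked example, the sketch below repeats the arithmetic for 10 annotation files; the count of 10 is an assumption chosen for illustration, not a figure stated in this article.

```python
def split_sizes(num_files, trainval_percent=0.8, train_percent=0.7):
    """Repeat the size arithmetic of splitImage.py without touching any files."""
    trainval = int(num_files * trainval_percent)
    train = int(trainval * train_percent)
    return {
        "trainval": trainval,
        "train": train,
        "val": trainval - train,
        "test": num_files - trainval,
    }


sizes = split_sizes(10)
print(sizes)  # {'trainval': 8, 'train': 5, 'val': 3, 'test': 2}
```

Because both steps round down with int(), small datasets can end up with noticeably fewer training images than the percentages suggest.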
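Edit dataset_common.py to set the classes and the numbers of training and validation images. If you use my dataset, comment out the original definitions and change the file as shown here. For the train and val counts, look at train.txt and val.txt in your Main folder and count how many images each one lists.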
#VOC_LABELS = {
# 'none': (0, 'Background'),
# 'aeroplane': (1, 'Vehicle'),
# 'bicycle': (2, 'Vehicle'),
# 'bird': (3, 'Animal'),
# 'boat': (4, 'Vehicle'),
# 'bottle': (5, 'Indoor'),
# 'bus': (6, 'Vehicle'),
# 'car': (7, 'Vehicle'),
# 'cat': (8, 'Animal'),
# 'chair': (9, 'Indoor'),
# 'cow': (10, 'Animal'),
# 'diningtable': (11, 'Indoor'),
# 'dog': (12, 'Animal'),
# 'horse': (13, 'Animal'),
# 'motorbike': (14, 'Vehicle'),
# 'person': (15, 'Person'),
# 'pottedplant': (16, 'Indoor'),
# 'sheep': (17, 'Animal'),
# 'sofa': (18, 'Indoor'),
# 'train': (19, 'Vehicle'),
# 'tvmonitor': (20, 'Indoor'),
#}
#COCO_LABELS = {
# "bench": (14, 'outdoor') ,
# "skateboard": (37, 'sports') ,
# "toothbrush": (80, 'indoor') ,
# "person": (1, 'person') ,
# "donut": (55, 'food') ,
# "none": (0, 'background') ,
# "refrigerator": (73, 'appliance') ,
# "horse": (18, 'animal') ,
# "elephant": (21, 'animal') ,
# "book": (74, 'indoor') ,
# "car": (3, 'vehicle') ,
# "keyboard": (67, 'electronic') ,
# "cow": (20, 'animal') ,
# "microwave": (69, 'appliance') ,
# "traffic light": (10, 'outdoor') ,
# "tie": (28, 'accessory') ,
# "dining table": (61, 'furniture') ,
# "toaster": (71, 'appliance') ,
# "baseball glove": (36, 'sports') ,
# "giraffe": (24, 'animal') ,
# "cake": (56, 'food') ,
# "handbag": (27, 'accessory') ,
# "scissors": (77, 'indoor') ,
# "bowl": (46, 'kitchen') ,
# "couch": (58, 'furniture') ,
# "chair": (57, 'furniture') ,
# "boat": (9, 'vehicle') ,
# "hair drier": (79, 'indoor') ,
# "airplane": (5, 'vehicle') ,
# "pizza": (54, 'food') ,
# "backpack": (25, 'accessory') ,
# "kite": (34, 'sports') ,
# "sheep": (19, 'animal') ,
# "umbrella": (26, 'accessory') ,
# "stop sign": (12, 'outdoor') ,
# "truck": (8, 'vehicle') ,
# "skis": (31, 'sports') ,
# "sandwich": (49, 'food') ,
# "broccoli": (51, 'food') ,
# "wine glass": (41, 'kitchen') ,
# "surfboard": (38, 'sports') ,
# "sports ball": (33, 'sports') ,
# "cell phone": (68, 'electronic') ,
# "dog": (17, 'animal') ,
# "bed": (60, 'furniture') ,
# "toilet": (62, 'furniture') ,
# "fire hydrant": (11, 'outdoor') ,
# "oven": (70, 'appliance') ,
# "zebra": (23, 'animal') ,
# "tv": (63, 'electronic') ,
# "potted plant": (59, 'furniture') ,
# "parking meter": (13, 'outdoor') ,
# "spoon": (45, 'kitchen') ,
# "bus": (6, 'vehicle') ,
# "laptop": (64, 'electronic') ,
# "cup": (42, 'kitchen') ,
# "bird": (15, 'animal') ,
# "sink": (72, 'appliance') ,
# "remote": (66, 'electronic') ,
# "bicycle": (2, 'vehicle') ,
# "tennis racket": (39, 'sports') ,
# "baseball bat": (35, 'sports') ,
# "cat": (16, 'animal') ,
# "fork": (43, 'kitchen') ,
# "suitcase": (29, 'accessory') ,
# "snowboard": (32, 'sports') ,
# "clock": (75, 'indoor') ,
# "apple": (48, 'food') ,
# "mouse": (65, 'electronic') ,
# "bottle": (40, 'kitchen') ,
# "frisbee": (30, 'sports') ,
# "carrot": (52, 'food') ,
# "bear": (22, 'animal') ,
# "hot dog": (53, 'food') ,
# "teddy bear": (78, 'indoor') ,
# "knife": (44, 'kitchen') ,
# "train": (7, 'vehicle') ,
# "vase": (76, 'indoor') ,
# "banana": (47, 'food') ,
# "motorcycle": (4, 'vehicle') ,
# "orange": (50, 'food')
# }
VOC_LABELS = {
    'none': (0, 'Background'),
    'seahorse': (1, 'animal')
}
# use dataset_inspect.py to get these summary
data_splits_num = {
    'train': 5,
    'val': 3,
}
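Instead of counting by hand, the values for data_splits_num can be read off train.txt and val.txt. This is a small sketch with a made-up helper, count_split_entries; the demo uses throwaway files with invented image names rather than the real ImageSets/Main.

```python
import tempfile
from pathlib import Path


def count_split_entries(main_dir, splits=("train", "val")):
    """Count the non-empty lines in <split>.txt for each requested split."""
    counts = {}
    for split in splits:
        lines = (Path(main_dir) / (split + ".txt")).read_text().splitlines()
        counts[split] = sum(1 for line in lines if line.strip())
    return counts


# Demo data; point main_dir at VOCseahorse/VOC2007/ImageSets/Main for real use.
demo_main = Path(tempfile.mkdtemp())
(demo_main / "train.txt").write_text("img_01\nimg_02\nimg_03\nimg_04\nimg_05\n")
(demo_main / "val.txt").write_text("img_06\nimg_07\nimg_08\n")
counts = count_split_entries(demo_main)
print(counts)  # {'train': 5, 'val': 3}
```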
Edit convert_tfrecords.py. This changes the default values; you can also leave them as they are and pass the arguments on the command line instead.
tf.app.flags.DEFINE_string('dataset_directory', './dataset/VOC',
'All datas directory')
tf.app.flags.DEFINE_string('train_splits', 'VOC2007',
'Comma-separated list of the training data sub-directory')
tf.app.flags.DEFINE_string('validation_splits', 'VOC2007TEST',
'Comma-separated list of the validation data sub-directory')
tf.app.flags.DEFINE_string('output_directory', './dataset/tfrecords',
'Output data directory')
Here we make a shell script so the data paths are easy to change. Change to the SSD.TensorFlow-AbsoluteCoord directory.
Create tf_converdata.sh:
touch tf_converdata.sh
chmod +x tf_converdata.sh
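Put the following in it. dataset_directory is the data directory and output_directory is where the training data is written; create any directory that does not exist yet.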
python3 dataset/convert_tfrecords.py --dataset_directory=dataset/VOCseahorse/ --output_directory=./dataset/tfrecords/
Run it:
./tf_converdata.sh
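The dataset is now ready.
Training
Download the pretrained model. If you use the project I uploaded, it is already in the model folder.
Download link: Baidu Netdisk, extraction code: tg64
A TX2 has enough memory to train without trouble; a Nano struggles a bit.
On the Nano, disable the graphical interface, but only do so when you are about to train.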
# Disable the Ubuntu graphical user interface
sudo systemctl set-default multi-user.target
sudo reboot
# Enable the Ubuntu graphical user interface
sudo systemctl set-default graphical.target
sudo reboot
Change to the SSD.TensorFlow-AbsoluteCoord directory and create the script train.sh:
touch train.sh
chmod +x train.sh
Write the following into it and save:
python3 train_ssd.py \
    --num_classes=2 \
    --batch_size=4 \
    --max_number_of_steps=5000 \
    --data_dir=./dataset/tfrecords \
    --model_dir=./logs \
    --save_summary_steps=2500 \
    --save_checkpoints_steps=5000 \
    --learning_rate=1e-3 \
    --checkpoint_path=./model
Start training:
./train.sh
This may fail with an error, usually because memory ran out. Just run train.sh again, and retry a few times if necessary.
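Retrying by hand can be automated with a small wrapper. The sketch below reruns a command until it exits successfully or a maximum number of attempts is reached; run_with_retries is a made-up helper, and the demo uses a harmless stand-in command so it runs anywhere. Whether a rerun of train.sh resumes from earlier progress depends on what has been saved in model_dir, which this sketch does not check.

```python
import subprocess
import sys
import time


def run_with_retries(command, max_attempts=5, delay_seconds=0):
    """Run command until it exits with code 0; return the attempt number used."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        print("attempt {} exited with code {}".format(attempt, result.returncode))
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError("command failed {} times: {}".format(max_attempts, command))


# Stand-in command for the demo; for training, pass ["./train.sh"] instead.
attempts_used = run_with_retries([sys.executable, "-c", "print('ok')"], max_attempts=3)
print("succeeded on attempt", attempts_used)
```

A short delay between attempts (for example delay_seconds=10) may give the system time to release memory before the next try.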
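Testing
I trained for 5000 steps here.
Edit simple_ssd_demo.py. As before, no changes are needed if you downloaded my upload. There is an images directory that holds the test images.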
# Copyright 2018 Changan Wang
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =============================================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import sys
import cv2
import tensorflow as tf
from scipy.misc import imread, imsave, imshow, imresize
import numpy as np
from net import ssd_net
from dataset import dataset_common
from preprocessing import ssd_preprocessing
from utility import anchor_manipulator
from utility import draw_toolbox
from utility import bbox_util
# scaffold related configuration
tf.app.flags.DEFINE_integer(
'num_classes', 21, 'Number of classes to use in the dataset.')
# model related configuration
tf.app.flags.DEFINE_integer(
'train_image_size', 300,
'The size of the input image for the model to use.')
tf.app.flags.DEFINE_string(
'data_format', 'channels_last', # 'channels_first' or 'channels_last'
'A flag to override the data format used in the model. channels_first '
'provides a performance boost on GPU but is not always compatible '
'with CPU. If left unspecified, the data format will be chosen '
'automatically based on whether TensorFlow was built for CPU or GPU.')
tf.app.flags.DEFINE_float(
'select_threshold', 0.2, 'Class-specific confidence score threshold for selecting a box.')
tf.app.flags.DEFINE_float(
'min_size', 4., 'The min size of bboxes to keep.')
tf.app.flags.DEFINE_float(
'nms_threshold', 0.45, 'Matching threshold in NMS algorithm.')
tf.app.flags.DEFINE_integer(
'nms_topk', 20, 'Number of total object to keep after NMS.')
tf.app.flags.DEFINE_integer(
'keep_topk', 200, 'Number of total object to keep for each image before nms.')
# checkpoint related configuration
tf.app.flags.DEFINE_string(
'checkpoint_path', './logs',
'The path to a checkpoint from which to fine-tune.')
tf.app.flags.DEFINE_string(
'model_scope', 'ssd300',
'Model scope name used to replace the name_scope in checkpoint.')
FLAGS = tf.app.flags.FLAGS
#CUDA_VISIBLE_DEVICES
def get_checkpoint():
    if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
        checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
    else:
        checkpoint_path = FLAGS.checkpoint_path
    return checkpoint_path

def main(_):
    with tf.Graph().as_default():
        out_shape = [FLAGS.train_image_size] * 2
        image_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
        shape_input = tf.placeholder(tf.int32, shape=(2,))
        features, output_shape = ssd_preprocessing.preprocess_for_eval(image_input, out_shape, data_format=FLAGS.data_format, output_rgb=False)
        features = tf.expand_dims(features, axis=0)
        output_shape = tf.expand_dims(output_shape, axis=0)
        all_anchor_scales = [(30.,), (60.,), (112.5,), (165.,), (217.5,), (270.,)]
        all_extra_scales = [(42.43,), (82.17,), (136.23,), (189.45,), (242.34,), (295.08,)]
        all_anchor_ratios = [(1., 2., .5), (1., 2., 3., .5, 0.3333), (1., 2., 3., .5, 0.3333), (1., 2., 3., .5, 0.3333), (1., 2., .5), (1., 2., .5)]
        # all_anchor_ratios = [(2., .5), (2., 3., .5, 0.3333), (2., 3., .5, 0.3333), (2., 3., .5, 0.3333), (2., .5), (2., .5)]
        with tf.variable_scope(FLAGS.model_scope, default_name=None, values=[features], reuse=tf.AUTO_REUSE):
            backbone = ssd_net.VGG16Backbone(FLAGS.data_format)
            feature_layers = backbone.forward(features, training=False)
            with tf.device('/cpu:0'):
                anchor_encoder_decoder = anchor_manipulator.AnchorEncoder(positive_threshold=None, ignore_threshold=None, prior_scaling=[0.1, 0.1, 0.2, 0.2])
                if FLAGS.data_format == 'channels_first':
                    all_layer_shapes = [tf.shape(feat)[2:] for feat in feature_layers]
                else:
                    all_layer_shapes = [tf.shape(feat)[1:3] for feat in feature_layers]
                all_layer_strides = [8, 16, 32, 64, 100, 300]
                total_layers = len(all_layer_shapes)
                anchors_height = list()
                anchors_width = list()
                anchors_depth = list()
                for ind in range(total_layers):
                    _anchors_height, _anchors_width, _anchor_depth = anchor_encoder_decoder.get_anchors_width_height(all_anchor_scales[ind], all_extra_scales[ind], all_anchor_ratios[ind], name='get_anchors_width_height{}'.format(ind))
                    anchors_height.append(_anchors_height)
                    anchors_width.append(_anchors_width)
                    anchors_depth.append(_anchor_depth)
                anchors_ymin, anchors_xmin, anchors_ymax, anchors_xmax, _ = anchor_encoder_decoder.get_all_anchors(tf.squeeze(output_shape, axis=0),
                                                                                anchors_height, anchors_width, anchors_depth,
                                                                                [0.5] * total_layers, all_layer_shapes, all_layer_strides,
                                                                                [0.] * total_layers, [False] * total_layers)
            location_pred, cls_pred = ssd_net.multibox_head(feature_layers, FLAGS.num_classes, anchors_depth, data_format=FLAGS.data_format)
            if FLAGS.data_format == 'channels_first':
                cls_pred = [tf.transpose(pred, [0, 2, 3, 1]) for pred in cls_pred]
                location_pred = [tf.transpose(pred, [0, 2, 3, 1]) for pred in location_pred]
            cls_pred = [tf.reshape(pred, [-1, FLAGS.num_classes]) for pred in cls_pred]
            location_pred = [tf.reshape(pred, [-1, 4]) for pred in location_pred]
            cls_pred = tf.concat(cls_pred, axis=0)
            location_pred = tf.concat(location_pred, axis=0)
        with tf.device('/cpu:0'):
            bboxes_pred = anchor_encoder_decoder.decode_anchors(location_pred, anchors_ymin, anchors_xmin, anchors_ymax, anchors_xmax)
            selected_bboxes, selected_scores = bbox_util.parse_by_class(tf.squeeze(output_shape, axis=0), cls_pred, bboxes_pred,
                                                            FLAGS.num_classes, FLAGS.select_threshold, FLAGS.min_size,
                                                            FLAGS.keep_topk, FLAGS.nms_topk, FLAGS.nms_threshold)
            labels_list = []
            scores_list = []
            bboxes_list = []
            for k, v in selected_scores.items():
                labels_list.append(tf.ones_like(v, tf.int32) * k)
                scores_list.append(v)
                bboxes_list.append(selected_bboxes[k])
            all_labels = tf.concat(labels_list, axis=0)
            all_scores = tf.concat(scores_list, axis=0)
            all_bboxes = tf.concat(bboxes_list, axis=0)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            init = tf.global_variables_initializer()
            sess.run(init)
            saver.restore(sess, get_checkpoint())
            path = './images/'
            image_names = sorted(os.listdir(path))
            #def detecte(image_names):
            # Walk through the test images in sorted order
            for f in image_names:
                np_image = imread(os.path.join(path, f))
                labels_, scores_, bboxes_, output_shape_ = sess.run([all_labels, all_scores, all_bboxes, output_shape], feed_dict = {image_input : np_image, shape_input : np_image.shape[:-1]})
                # Scale the boxes from the network input size back to the original image size
                bboxes_[:, 0] = bboxes_[:, 0] * np_image.shape[0] / output_shape_[0, 0]
                bboxes_[:, 1] = bboxes_[:, 1] * np_image.shape[1] / output_shape_[0, 1]
                bboxes_[:, 2] = bboxes_[:, 2] * np_image.shape[0] / output_shape_[0, 0]
                bboxes_[:, 3] = bboxes_[:, 3] * np_image.shape[1] / output_shape_[0, 1]
                img_to_draw = draw_toolbox.bboxes_draw_on_img(np_image, labels_, scores_, bboxes_, thickness=2)
                #print(bboxes_)
                #imshow(img_to_draw)
                cv2.imshow("SSD", img_to_draw)
                k = cv2.waitKey(0) & 0xff
                #Exit if ESC pressed
                if k == 27 : break
            # np_image = imread('./demo/test.jpg')
            # labels_, scores_, bboxes_, output_shape_ = sess.run([all_labels, all_scores, all_bboxes, output_shape], feed_dict = {image_input : np_image, shape_input : np_image.shape[:-1]})
            # bboxes_[:, 0] = bboxes_[:, 0] * np_image.shape[0] / output_shape_[0, 0]
            # bboxes_[:, 1] = bboxes_[:, 1] * np_image.shape[1] / output_shape_[0, 1]
            # bboxes_[:, 2] = bboxes_[:, 2] * np_image.shape[0] / output_shape_[0, 0]
            # bboxes_[:, 3] = bboxes_[:, 3] * np_image.shape[1] / output_shape_[0, 1]
            # img_to_draw = draw_toolbox.bboxes_draw_on_img(np_image, labels_, scores_, bboxes_, thickness=2)
            #print(bboxes_)
            # imshow(img_to_draw)
            #imsave('./demo/test_out.jpg', img_to_draw)

if __name__ == '__main__':
    tf.logging.set_verbosity(tf.logging.INFO)
    tf.app.run()
Create the script ssd.sh:
touch ssd.sh
chmod +x ssd.sh
Write the following into it:
python3 simple_ssd_demo.py \
    --num_classes=2
Run the test:
./ssd.sh
Press any key to move to the next image; press Esc to exit.
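One step in simple_ssd_demo.py that is easy to overlook is the rescaling of the predicted boxes from the network input size back to the original image size before drawing. The sketch below repeats that arithmetic in plain Python, assuming the (ymin, xmin, ymax, xmax) order implied by the demo code; the function name and the sample numbers are made up for illustration.

```python
def rescale_boxes(boxes, image_shape, network_shape):
    """Scale (ymin, xmin, ymax, xmax) boxes from network size to image size."""
    image_h, image_w = image_shape
    net_h, net_w = network_shape
    scaled = []
    for ymin, xmin, ymax, xmax in boxes:
        scaled.append((
            ymin * image_h / net_h,   # rows scale with the height ratio
            xmin * image_w / net_w,   # columns scale with the width ratio
            ymax * image_h / net_h,
            xmax * image_w / net_w,
        ))
    return scaled


# One invented box on a 300x300 network input, mapped to a 480x640 image.
scaled = rescale_boxes([(30, 60, 150, 240)], image_shape=(480, 640), network_shape=(300, 300))
print(scaled)  # [(48.0, 128.0, 240.0, 512.0)]
```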