1. Building the local dataset

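I first obtained the insect data from Baidu's PaddlePaddle platform. I had previously gotten the Faster R-CNN source code running on Google Colab and recorded that part of the process on CSDN. That run mostly followed someone else's tutorial step by step, with VOC2007 as the training and test data.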

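So here I want to convert my own dataset to the VOC2007 format and swap it in for the original data, then run training and testing to see how a custom dataset performs.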

Changes to the dataset:

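Modifying the xml files
- Several tag values in the xml files need to be changed.
The insect images are in jpeg format. After swapping in the dataset the program threw an error, so the image extension has to be changed to jpg. This is simple: in a Windows command prompt, go to the image directory and run ren *.jpeg *.jpg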
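For machines without the Windows `ren` command, the same rename can be sketched in Python. This is only a minimal sketch: the function name `rename_jpeg_to_jpg` is made up for illustration, and it assumes all images sit directly in one folder.

```python
from pathlib import Path


def rename_jpeg_to_jpg(image_dir):
    """Rename every *.jpeg file in image_dir to *.jpg and return the count."""
    count = 0
    for src in sorted(Path(image_dir).iterdir()):
        if src.is_file() and src.suffix.lower() == ".jpeg":
            dst = src.with_suffix(".jpg")
            if dst.exists():
                # Never overwrite an image that already has the target name.
                continue
            src.rename(dst)
            count += 1
    return count
```

Calling `rename_jpeg_to_jpg` on the JPEGImages folder returns how many files were renamed, which is easy to compare against the expected number of images.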
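When batch-editing tag values in the xml files with Python, a few points are worth stressing:
- When writing the modified file back, make sure the encoding is still the original UTF-8.
- Opening an xml file in Notepad shows Unix (LF) UTF-8 in the status bar, but after the Python batch edit it shows Windows (CR LF). The line-ending characters have changed, because Python's text-mode open() on Windows writes each newline as CR LF by default. I don't yet know whether this affects the experiment.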
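The line-ending point can be checked directly. Below is a minimal sketch, with helper names `write_text_lf` and `line_ending` invented for illustration: passing `newline="\n"` to `open()` turns off newline translation, so the file keeps LF endings even on Windows, and reading the bytes back in binary mode shows what was actually stored.

```python
import os
import tempfile


def write_text_lf(path, text):
    """Write text as UTF-8 with newline translation disabled (LF stays LF)."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        fh.write(text)


def line_ending(path):
    """Return 'CRLF' if the file contains a Windows line ending, else 'LF'."""
    with open(path, "rb") as fh:  # binary mode shows the bytes as stored
        return "CRLF" if b"\r\n" in fh.read() else "LF"


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.xml")
    write_text_lf(demo, "<annotation>\n</annotation>\n")
    print(line_ending(demo))
```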
I also want to record some code that may come in handy later:

Batch-modifying tag values in every xml file in a folder


# coding=utf-8
import os
import xml.dom.minidom

path = "C:/Users/Administrator/Desktop/AI_studio/PaddleDetection/dataset/insect/Annotations"
files = os.listdir(path)  # names of all entries in the folder

for xmlFile in files:  # iterate over the folder
    xml_path = os.path.join(path, xmlFile)  # full path; a bare name would be resolved against the working directory
    if not os.path.isdir(xml_path):  # only open entries that are not directories
        print(xmlFile)
        # parse the xml file into a DOM
        dom = xml.dom.minidom.parse(xml_path)
        root = dom.documentElement
        # fetch the <folder> and <filename> tags (the <name> handling is left commented out)
        # name = root.getElementsByTagName('name')
        folder = root.getElementsByTagName('folder')
        filename = root.getElementsByTagName('filename')
        # Update every occurrence of a tag in the file: the commented-out loop would set
        # each <name> to plane; the live code sets <folder> to VOC2007
        # for i in range(len(name)):
        #     print(name[i].firstChild.data)
        #     name[i].firstChild.data = 'plane'
        #     print(name[i].firstChild.data)
        folder[0].firstChild.data = "VOC2007"
        for i in range(len(filename)):
            # print(filename[i].firstChild.data)
            # splitext drops only the last extension, so names containing dots survive
            stem = os.path.splitext(filename[i].firstChild.data)[0]
            filename[i].firstChild.data = stem + ".jpg"
        # write the modified DOM back as UTF-8; newline='\n' keeps LF line endings on Windows
        with open(xml_path, 'w', encoding='UTF-8', newline='\n') as fh:
            dom.writexml(fh)
            print('written')
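After a batch edit like this, it can help to confirm that every annotation still points at an image that exists. The following is a rough sketch using the standard library's `xml.etree.ElementTree`. The function name `find_missing_images` and the assumption that `<filename>` is a direct child of the root element (as in VOC-style annotations) are mine, not something the original code states.

```python
import os
import xml.etree.ElementTree as ET


def find_missing_images(annotation_dir, image_dir):
    """Return (xml_name, filename) pairs whose <filename> has no image on disk."""
    missing = []
    for xml_name in sorted(os.listdir(annotation_dir)):
        if not xml_name.lower().endswith(".xml"):
            continue  # skip anything that is not an annotation file
        root = ET.parse(os.path.join(annotation_dir, xml_name)).getroot()
        # Assumes <filename> sits directly under the root element.
        filename = (root.findtext("filename") or "").strip()
        if not os.path.isfile(os.path.join(image_dir, filename)):
            missing.append((xml_name, filename))
    return missing
```

An empty list means every annotation resolves to an image; any pair returned names the xml file and the image it expected.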

Reading every image name in a folder and writing them to a txt file


import os

path = "C:/Users/Administrator/Desktop/AI_studio/PaddleDetection/dataset/insect/JPEGImages/test"
files = os.listdir(path)  # names of all entries in the folder

results = set()
for image_file in files:
    stem = os.path.splitext(image_file)[0]  # drop the extension
    results.add(stem)

print(len(results))

txt_path = "C:/Users/Administrator/Desktop/AI_studio/PaddleDetection/dataset/insect/ImageSets/test.txt"
# newline='\n' keeps LF line endings on Windows
with open(txt_path, 'w', encoding='UTF-8', newline='\n') as new_file:
    for stem in sorted(results):  # sorted so the output order is reproducible
        new_file.write(stem + '\n')

Splitting the dataset into training, validation and test sets


"""
Split the dataset by ratio into train, val and test,
generating train.txt, val.txt, test.txt and trainval.txt
"""
import os
import random

trainval_percent = 0.8
train_percent = 0.8
xmlfilepath = 'C:/Users/Administrator/Desktop/AI_studio/PaddleDetection/dataset/insect/Annotations'
txtsavepath = 'C:/Users/Administrator/Desktop/AI_studio/PaddleDetection/dataset/insect/ImageSets/Main'

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)  # renamed from `list` to avoid shadowing the builtin
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))  # sets make the membership tests below fast
trainval = set(trainval)

with open(txtsavepath + '/trainval.txt', 'w') as ftrainval, \
        open(txtsavepath + '/test.txt', 'w') as ftest, \
        open(txtsavepath + '/train.txt', 'w') as ftrain, \
        open(txtsavepath + '/val.txt', 'w') as fval:
    for i in indices:
        name = os.path.splitext(total_xml[i])[0] + '\n'  # file name without .xml
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)
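A quick sanity check on the generated lists can catch leakage between splits before training. This is a minimal sketch under the assumption that the four files are named train.txt, val.txt, test.txt and trainval.txt, as in the script above. The helper names `read_ids` and `check_splits` are hypothetical.

```python
import os


def read_ids(path):
    """Read one image id per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def check_splits(main_dir):
    """Return a list of problems found in the split files (empty if consistent)."""
    train = read_ids(os.path.join(main_dir, "train.txt"))
    val = read_ids(os.path.join(main_dir, "val.txt"))
    test = read_ids(os.path.join(main_dir, "test.txt"))
    trainval = read_ids(os.path.join(main_dir, "trainval.txt"))

    problems = []
    if train & val:
        problems.append("train and val overlap")
    if trainval & test:
        problems.append("trainval and test overlap")
    if trainval != train | val:
        problems.append("trainval is not the union of train and val")
    return problems
```

An empty return value means no image id appears on both sides of a split and trainval.txt is exactly train.txt plus val.txt.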

Converting file format from Windows to Unix

https://www.cnblogs.com/TurboWay/p/9687576.html

import sys
import os
import chardet

def turn(file):
    # read the raw bytes and close the file before rewriting it
    with open(file, 'rb') as f:
        data = f.read()
    encoding = chardet.detect(data)['encoding']
    if encoding is None:  # chardet could not guess (e.g. an empty file); leave it alone
        return
    data_str = data.decode(encoding)
    tp = 'LF'
    if '\r\n' in data_str:
        tp = 'CRLF'
        data_str = data_str.replace('\r\n', '\n')
    # rewrite only when the encoding or the line endings actually need changing
    if encoding.lower() not in ['utf-8', 'ascii'] or tp == 'CRLF':
        with open(file, 'w', newline='\n', encoding='utf-8') as f:
            f.write(data_str)
        print(f"{file}: ({tp},{encoding}) converted to (LF,utf-8) successfully!")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python3 etl_file_check.py /home/getway/script/hql")
    else:
        dr = sys.argv[1]
        for path in os.listdir(dr):
            file = os.path.join(dr, path)
            if os.path.isfile(file):
                turn(file)

Feb. 21: back to debugging

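Today, with the dataset sorted out, I started debugging the errors thrown by the training code. Luckily there are blog posts from more experienced people to help: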

https://www.cnblogs.com/wind-chaser/p/11359521.html

I won't record the problems already covered in that blog post. Below are the ones I ran into myself.

The error reported during training was:

cls = self._class_to_ind[obj.find('name').text.lower().strip()]

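Opinions online differ, and after several failed attempts I went and read the source code myself. A brief summary:
The code first reads from trainval.txt which images and xml files are to be used for training and validation. It then reads, from each corresponding xml file, the information under the object tags, i.e. every object in the image, and stores it in its own format (several lists, for example one for the box positions). Finally it stores each object's class, and this is where the problem lies: the program lowercases the class name it has read and then looks it up in a dictionary that maps each class to its index. What gets recorded as the class is that index. Because several of the class keys in the dictionary contained uppercase letters, the lowercased name never matched and the error kept occurring. OK.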
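The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch: the class names are hypothetical placeholders, the helpers `build_class_to_ind` and `lookup` are invented here, and the assumption is that the dictionary is built from a tuple of class names. Only the `.lower().strip()` normalisation comes from the error line quoted earlier. Lowercasing the keys is one way to make the lookup match.

```python
def build_class_to_ind(classes, lowercase_keys):
    """Map each class name to its index, optionally lowercasing the keys."""
    names = [c.lower() for c in classes] if lowercase_keys else list(classes)
    return dict(zip(names, range(len(names))))


def lookup(class_to_ind, xml_name_text):
    """Apply the same normalisation as the failing line: lower() then strip()."""
    return class_to_ind[xml_name_text.lower().strip()]


if __name__ == "__main__":
    # Hypothetical class list, for illustration only.
    classes = ("__background__", "Beetle", "Moth")

    broken = build_class_to_ind(classes, lowercase_keys=False)
    try:
        lookup(broken, "Beetle")
    except KeyError as err:
        # The lowercased name is not among the mixed-case keys.
        print("KeyError:", err)

    fixed = build_class_to_ind(classes, lowercase_keys=True)
    print(lookup(fixed, "Beetle"))
```

Equivalently, the class names in the training code's class list can simply be written in lowercase, since the xml text is lowercased at lookup time anyway.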

After fixing this error I resumed training and got the following error:

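I don't know how long I spent debugging this. Mainly because I couldn't find a similar case online, it drove me up the wall. I went through the instructions in the original author's GitHub repository looking for the steps, and found that an earlier version of the code had a script file that the new version lacks. The error above was probably a configuration problem related to the GPU, and it occurred in the ROIAlign part. That module originally belongs to Mask R-CNN. The original author also says they provide and implement several pooling methods, so the code presumably defaults to Mask R-CNN's better module in place of ROI Pooling. That said, none of this really led logically to the fix that follows.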

My gut feeling was that the module compilation had gone wrong, so I compiled again, but it still failed.

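I kept studying the directory structure. When the author's code is freshly cloned, the lib folder contains no build folder, so it must be produced during compilation. Recompiling hadn't helped most likely because the build output already existed, so the build was not actually redone. I deleted the build folder outright, recompiled, ran the training command again, and training really did start. That felt great!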
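The clean-up step can be scripted so that a stale build is never reused by accident. This is a minimal sketch, assuming the compiled output lives in a build folder directly under lib as described above. The function name `remove_stale_build` is made up, and the recompile command itself is not shown because the notes do not record it, so run whatever build step the repository documents afterwards.

```python
import shutil
from pathlib import Path


def remove_stale_build(lib_dir):
    """Delete lib_dir/build if it exists; return True if it was removed."""
    build_dir = Path(lib_dir) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)  # removes the folder and everything in it
        return True
    return False
```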

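Afterwards I tried train_net, test_net and demo. I'm not sure whether it's because the data fed to demo had already been used for training or for some other reason, but the results look almost too accurate. Later I'll sort through Baidu's AI insect-recognition dataset more carefully, put data that was not used in training into the project, and run demo on it to see how accurate it is.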

The results are as follows: