Preprocessing the Caltech Pedestrian Detection Dataset

Reference links:

  1. https://blog.csdn.net/a2008301610258/article/details/45873867
  2. https://github.com/hizhangp/caltech-pedestrian-converter
  3. http://www.kanadas.com/program-e/2015/06/converting_caltech_pedestrian.html

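Required functionality:

  1. Extract the frames from the seq-format video files and save them as .jpg images.
  2. Convert the vbb-format bounding box annotation files to txt files that models such as darknet and caffe-ssd can use for training.

Goal: training yolov3 and yolo-v3-tiny requires a large-scale pedestrian detection dataset, which is why the Caltech Pedestrian Detection dataset was chosen.

Extracting images from seq files and saving them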

# -*- coding: utf-8 -*-
import fnmatch
import os
import shutil

# JPEG/JFIF header bytes that mark the start of every frame in a .seq file
JPEG_PREFIX = b"\xFF\xD8\xFF\xE0\x00\x10\x4A\x46\x49\x46"

def open_save(file, savepath):
    # read the .seq file and save each embedded JPEG frame into savepath
    with open(file, 'rb') as f:
        data = f.read()
    # split the raw bytes at the JPEG prefix (kept as bytes, never converted to str)
    segments = data.split(JPEG_PREFIX)
    # delete the image folder if it already exists, then recreate it
    if os.path.exists(savepath):
        shutil.rmtree(savepath)
    os.makedirs(savepath)
    # every segment is an image except the first one, which holds the .seq header
    for count, img in enumerate(segments):
        if count == 0:
            continue
        # numbering starts at 1.jpg
        filenamewithpath = os.path.join(savepath, str(count) + '.jpg')
        with open(filenamewithpath, 'wb') as out:
            # put back the prefix that split() removed
            out.write(JPEG_PREFIX)
            out.write(img)

if __name__ == "__main__":
    # the dataset is organised as set00 ... set10
    for i in range(11):
        rootdir = "./set{:02}/".format(i)
        print(rootdir)
        # walk rootdir and pick up every .seq file together with its path
        for parent, dirnames, filenames in os.walk(rootdir):
            for filename in filenames:
                # only process files with the .seq suffix
                if fnmatch.fnmatch(filename, '*.seq'):
                    # full path of the .seq file
                    thefilename = os.path.join(parent, filename)
                    # image folder = .seq file path without the extension
                    thesavepath = os.path.join(parent, os.path.splitext(filename)[0])
                    print("Filename=" + thefilename)
                    print("Savepath=" + thesavepath)
                    open_save(thefilename, thesavepath)

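Result: the frames are extracted as images named like set00/V000/1.jpg. The image ids here correspond one-to-one with the frame ids of the bounding boxes generated below.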
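The splitting step can be checked without a real .seq file. The sketch below runs the same idea on made-up bytes: `split_frames` is a hypothetical helper (not part of the script above) and the payloads `frame-one` / `frame-two` are placeholders, not real JPEG data. It only illustrates that the first segment is dropped and that the prefix has to be put back in front of each remaining segment.

```python
# JPEG/JFIF prefix used as the split marker (same bytes as in the script above)
JPEG_PREFIX = b"\xFF\xD8\xFF\xE0\x00\x10\x4A\x46\x49\x46"


def split_frames(data, prefix=JPEG_PREFIX):
    """Split raw bytes at each prefix and return one byte string per frame.

    The first segment (everything before the first prefix) is treated as
    the container header and discarded. split() removes the prefix, so it
    is prepended again to every frame.
    """
    segments = data.split(prefix)
    return [prefix + segment for segment in segments[1:]]


# Synthetic stand-in for a .seq file: a header followed by two "frames".
fake_seq = b"HEADER" + JPEG_PREFIX + b"frame-one" + JPEG_PREFIX + b"frame-two"
frames = split_frames(fake_seq)

for index, frame in enumerate(frames, start=1):
    # index matches the file name the extraction script would use (1.jpg, 2.jpg, ...)
    print(index, len(frame))
```

Because the numbering starts at 1, the n-th frame found this way lines up with the 1-based frame index that MATLAB uses when the annotations are exported.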

Extracting bounding boxes from vbb files

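clear
clc
% Requires vbb.m (shipped with the Caltech dataset's MATLAB code) on the MATLAB path.
root_dir = 'E:/唐聖欽/person/annotations/';
path1 = dir(root_dir);
% start at 3 to skip the '.' and '..' directory entries
for i = 3:length(path1)
    dir1 = path1(i).name;
    path2 = dir([root_dir, dir1]);
    for j = 3:length(path2)
        ful_path = [root_dir, dir1, '/', path2(j).name];
        % skip anything that is not a .vbb file (e.g. .txt files left by an earlier run)
        [~, ~, ext] = fileparts(ful_path);
        if ~strcmp(ext, '.vbb'), continue; end
        temp = strsplit(ful_path, '.');
        save_path = strcat(temp(1), '.txt');
        save_path = save_path{1};
        A = vbb('vbbLoad', ful_path);
        c = fopen(save_path, 'w');
        % k = frame index, m = object index within the frame
        % (separate names so the outer loop variables i and j are not reused)
        for k = 1:A.nFrame
            iframe = A.objLists(1, k);
            iframe_data = iframe{1, 1};
            n1length = length(iframe_data);
            for m = 1:n1length
                iframe_dataj = iframe_data(m);
                % pos is the full box [x y w h]; use posv instead for the visible part
                if iframe_dataj.pos(1) ~= 0
                    fprintf(c, '%d %f %f %f %f\n', k, iframe_dataj.pos(1), iframe_dataj.pos(2), iframe_dataj.pos(3), iframe_dataj.pos(4));
                end
            end
        end
        fclose(c);
    end
end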

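Result: txt files such as annotations/set00/V000.txt are generated. Each line holds the space-separated fields id x y w h, where id is the frame id and (x, y) is the top-left corner of the box.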
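Darknet does not read this layout directly: it expects one label file per image, with lines of the form `class x_center y_center width height`, all normalised to the range 0 to 1. The following is a minimal sketch of that conversion, not the original author's code. The helper names (`to_yolo`, `group_by_frame`, `format_label`) are hypothetical, the 640x480 frame size and the single class id 0 (person) are assumptions, and boxes that extend past the image border are not clipped.

```python
from collections import defaultdict

# Assumed frame size of the Caltech videos; adjust if your images differ.
IMG_W, IMG_H = 640, 480


def to_yolo(x, y, w, h, img_w=IMG_W, img_h=IMG_H):
    """Convert a top-left pixel box (x, y, w, h) to normalised (cx, cy, w, h)."""
    cx = (x + w / 2.0) / float(img_w)
    cy = (y + h / 2.0) / float(img_h)
    return (cx, cy, w / float(img_w), h / float(img_h))


def group_by_frame(lines):
    """Parse 'id x y w h' lines and collect the converted boxes per frame id."""
    frames = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # ignore blank or malformed lines
        frame_id = int(parts[0])
        x, y, w, h = (float(value) for value in parts[1:])
        frames[frame_id].append(to_yolo(x, y, w, h))
    return dict(frames)


def format_label(boxes, class_id=0):
    """Render the boxes of one frame as the text of a darknet label file."""
    return "".join(
        "{} {:.6f} {:.6f} {:.6f} {:.6f}\n".format(class_id, *box) for box in boxes
    )


# Example with made-up annotation lines in the 'id x y w h' layout.
sample = [
    "1 0.000000 0.000000 640.000000 480.000000",
    "1 320.000000 240.000000 64.000000 48.000000",
    "3 0.000000 0.000000 64.000000 48.000000",
]
per_frame = group_by_frame(sample)
print(format_label(per_frame[1]))
```

In practice, the text returned by `format_label` for frame n would be written next to the matching image, for example as 1.txt beside 1.jpg, since the frame ids and image ids correspond one-to-one.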
