1. Pedestrian recognition dataset used with HOG+SVM

(1) INRIA Person Dataset: a visible-light dataset; sample size 128*64 (height*width)
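The 128*64 sample size matches the window that HOG+SVM pedestrian detectors are commonly trained on (64 pixels wide, 128 pixels high). As a rough sketch, the length of the HOG feature vector for one such window can be worked out from the window size. The cell, block, stride, and bin values below are commonly used HOG settings and are an assumption here, not something the dataset specifies; `hog_descriptor_length` is a hypothetical helper name.

```python
def hog_descriptor_length(win_w=64, win_h=128, cell=8, block=16, stride=8, nbins=9):
    """Return the number of values in a HOG descriptor for one window.

    All defaults are assumed (commonly used) HOG settings, not values
    taken from the INRIA dataset itself.
    """
    # Number of block positions along each axis as the block slides by `stride`.
    blocks_x = (win_w - block) // stride + 1
    blocks_y = (win_h - block) // stride + 1
    # Each block holds (block / cell)^2 cells, and each cell has an nbins histogram.
    cells_per_block = (block // cell) ** 2
    return blocks_x * blocks_y * cells_per_block * nbins


if __name__ == "__main__":
    print(hog_descriptor_length())  # 7 * 15 * 4 * 9 = 3780
```

With these assumed settings, each 128*64 sample becomes a 3780-dimensional vector, which is what the SVM is then trained on.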

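2. Infrared pedestrian dataset: KAIST Multispectral Pedestrian Detection Benchmark

GitHub: https://github.com/SoonminHwang/rgbt-ped-detection
This is a multispectral dataset: aligned RGB + thermal camera images, 640×480 pixels, in which all persons and cyclists are manually annotated. The annotations include temporal correspondence between bounding boxes, as in the Caltech Pedestrian Dataset.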
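Temporal correspondence means that boxes in different frames can be tied to the same person. A minimal sketch of that idea follows, grouping per-frame boxes into tracks by an object id. The `(frame, obj_id, x, y, w, h)` record layout and the name `group_into_tracks` are hypothetical and are not the dataset's actual annotation format.

```python
from collections import defaultdict


def group_into_tracks(boxes):
    """Group per-frame boxes into tracks keyed by object id.

    `boxes` is a list of (frame, obj_id, x, y, w, h) tuples; this record
    layout is an assumption made for illustration only.
    """
    tracks = defaultdict(list)
    for frame, obj_id, x, y, w, h in boxes:
        tracks[obj_id].append((frame, (x, y, w, h)))
    # Sort each track by frame index so it reads in temporal order.
    return {obj_id: sorted(items) for obj_id, items in tracks.items()}


if __name__ == "__main__":
    demo = [
        (1, "a", 105, 200, 30, 60),
        (0, "a", 100, 200, 30, 60),
        (0, "b", 400, 210, 25, 50),
    ]
    for obj_id, items in group_into_tracks(demo).items():
        print(obj_id, [frame for frame, _ in items])
```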
Many researchers are also working to improve pedestrian detection performance on this benchmark, for example:
- FusionRPN + BDT [CVPR '17]: 29.83%
- Halfway Fusion [BMVC '16]: 36.22%
- LateFusion CNN [ESANN '16]: 43.80%
- CMT-CNN [CVPR '17]: 49.55%
- Baseline, ACF+T+THOG [CVPR '15]: 54.40%
Other research that employs multi-modality has also been presented:
- Image-to-image translation [Arxiv '17]
- Calibrations

3. Infrared dataset from South China University of Technology (SCUT): SCUT_FIR_Pedestrian_Dataset

GitHub: https://github.com/SCUT-CV/SCUT_FIR_Pedestrian_Dataset
Performance of some algorithms on this dataset:

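Before running anything, run startup first so that all folders are added to the current path; the other functions can then be called directly.
extract_img_anno_scut
dbExtract_scut extracts images from the database seq files and, at the same time, extracts the annotations to txt files.
The ground truth is loaded from the annotations under pth, so the annotations folder must be placed under the pth path.
The extracted images are then placed in the images folder under tDir.
The seq files must be placed in the videos folder under pth.
Then simply run the extract_img_anno_scut function.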
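The folder layout described above is easy to get wrong, so it can help to check it before starting the extraction. The sketch below lists which of the three folders are missing. `missing_dirs` is a hypothetical helper, and treating the output images folder as something that must already exist is an assumption rather than a documented requirement of the extraction functions.

```python
import os
import tempfile


def missing_dirs(pth, t_dir):
    """Return the expected folders that do not exist yet."""
    expected = [
        os.path.join(pth, "annotations"),  # ground-truth annotations
        os.path.join(pth, "videos"),       # input seq files
        os.path.join(t_dir, "images"),     # extracted images are written here
    ]
    return [d for d in expected if not os.path.isdir(d)]


if __name__ == "__main__":
    # Demonstrate on a temporary folder that only contains "videos".
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "videos"))
        for path in missing_dirs(root, root):
            print("missing:", path)
```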

The results are as follows:

import cv2
import os
import glob

# Location of the SCUT images
src_img_dir = "F:\\pedestrain detection benchmark\\SCUT_FIR_Pedestrian_Dataset-master\\train02\\images\\"
# Location of the ground-truth txt files for the SCUT images
src_txt_dir = "F:\\pedestrain detection benchmark\\SCUT_FIR_Pedestrian_Dataset-master\\train02\\annotations\\"
# Paths of all images
img_Lists = glob.glob(os.path.join(src_img_dir, '*.jpg'))

img_basenames = []  # e.g. 100.jpg
for item in img_Lists:
    img_basenames.append(os.path.basename(item))

img_names = []  # e.g. 100
for item in img_basenames:
    temp1, temp2 = os.path.splitext(item)
    img_names.append(temp1)

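for img in img_names:
    # Open the corresponding annotation txt file
    with open(src_txt_dir + '/' + img + '.txt', "r") as f:
        line_count = 0
        for line in f.readlines():
            # Skip the first line (assumed to be a header)
            if line_count == 0:
                line_count += 1
                continue
            spt = line.split()
            if not spt:
                continue
            # A label such as "walk person" contains a space, so the box
            # starts at the first integer token, not always at index 1
            num = next(i for i, tok in enumerate(spt) if tok.lstrip('-').isdigit())
            cat_name = ' '.join(spt[:num])
            x_min = int(spt[num])
            y_min = int(spt[num + 1])
            bbox_wid = int(spt[num + 2])
            bbox_hei = int(spt[num + 3])

            if cat_name == "people" or cat_name == "walk person":
                if bbox_hei >= 10 and bbox_hei <= 30:
                    # Use a 1:2 (width:height) box centred vertically on the original box
                    bbox_hei_new = 2*bbox_wid
                    y_min = y_min-(bbox_hei_new-bbox_hei)//2
                    # Clamp so the crop stays inside the 576-pixel-high image
                    y_min = max(0, min(y_min, 576-bbox_hei_new))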

                    # Ensure a 1:2 aspect ratio, enlarging the box to 1.5 times its original size
                    '''
                    bbox_wid_new = int(bbox_wid)
                    bbox_hei_new = int(2 * bbox_wid_new)
                    # Prevent the box from going out of bounds
                    if y_min >= 0.25*bbox_hei and y_min + bbox_hei_new< 576:
                        y_min = y_min - 0.25*bbox_hei
                    elif y_min < 0.25*bbox_hei:
                        y_min = 0
                    else:
                        y_min = 576 - bbox_hei_new

                    if x_min >= 0.25 * bbox_wid and x_min + bbox_wid_new < 720:
                        x_min = x_min - 0.25 * bbox_wid
                    elif x_min < 0.25 * bbox_wid:
                        x_min = 0
                    else:
                        x_min = 720 - bbox_wid_new
                    '''

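                    # Read the image
                    img_data = cv2.imread((src_img_dir + '/' + img + '.jpg'), cv2.IMREAD_UNCHANGED)
                    cropped = img_data[int(y_min):int(y_min+bbox_hei_new),int(x_min):int(x_min+bbox_wid)]  # rows are y (height), columns are x (width)
                    # Include the box position in the file name so that several boxes from one image do not overwrite each other
                    cv2.imwrite("G:\\scut_pedestrain_crop\\train\\" + "test_" + str(img) + "_" + str(int(x_min)) + "_" + str(int(y_min)) + ".jpg", cropped)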
                    # cv2.rectangle(img_data, (x_min, y_min), (x_min+bbox_wid, y_min+bbox_hei), (255, 0, 0), 2)
                    # cv2.imshow('label', img_data)
                    # cv2.waitKey(0)


        '''
    # gt = open(src_txt_dir + '/gt_' + img + '.txt').read().splitlines()

    # write the region of image on xml file
    for img_each_label in gt:
        spt = img_each_label.split(' ')  # if the fields in the txt file are comma-separated, use img_each_label.split(',') instead


        xml_file.write('        <name>' + str(spt[4]) + '</name>\n')
        xml_file.write('        <pose>Unspecified</pose>\n')
        xml_file.write('        <truncated>0</truncated>\n')
        xml_file.write('        <difficult>0</difficult>\n')
        xml_file.write('        <bndbox>\n')
        xml_file.write('            <xmin>' + str(spt[0]) + '</xmin>\n')
        xml_file.write('            <ymin>' + str(spt[1]) + '</ymin>\n')
        xml_file.write('            <xmax>' + str(spt[2]) + '</xmax>\n')
        xml_file.write('            <ymax>' + str(spt[3]) + '</ymax>\n')
        xml_file.write('        </bndbox>\n')
        xml_file.write('    </object>\n')


# Get the names of all txt files under the given path
def file_name(file_dir):
    File_Name=[]
    img_dir = "F:\\pedestrain detection benchmark\\SCUT_FIR_Pedestrian_Dataset-master\\train02\\images\\"
    for files in os.listdir(file_dir):
        if os.path.splitext(files)[1] == '.txt':
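            # Build the image path for this annotation file without modifying img_dir
            img_path = img_dir + files[:-3]+"jpg"
            # File_Name.append(files)
            # Read the image in its original format; the source images are grayscale
            img = cv2.imread(img_path,cv2.IMREAD_UNCHANGED)
            # The label information still needs to be read in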



    return File_Name
txt_file_name=file_name("F:\\pedestrain detection benchmark\\SCUT_FIR_Pedestrian_Dataset-master\\train02\\annotations\\")
print("txt_file_name",txt_file_name)
        '''
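The parsing and box-adjustment steps of the script above can also be pulled out into small pure functions, which makes them easy to check without any images on disk. This is a sketch under stated assumptions: the annotation line is taken to be a label (possibly containing spaces) followed by integer x, y, width, height, and the 576-pixel image height is the value used in the script. `parse_box_line` and `crop_box_1to2` are hypothetical names.

```python
def parse_box_line(line):
    """Split one annotation line into (label, x, y, w, h).

    Assumes the label comes first and may contain spaces, followed by
    integer x, y, width, height; any further fields are ignored.
    """
    tokens = line.split()
    # The box starts at the first token that parses as an integer.
    first_num = next(i for i, t in enumerate(tokens) if t.lstrip("-").isdigit())
    label = " ".join(tokens[:first_num])
    x, y, w, h = (int(t) for t in tokens[first_num:first_num + 4])
    return label, x, y, w, h


def crop_box_1to2(x, y, w, h, img_h=576):
    """Return (x, new_y, w, new_h) with new_h = 2 * w.

    The new box is centred vertically on the old one and then clamped
    so that it stays inside an image img_h pixels high.
    """
    new_h = 2 * w
    new_y = y - (new_h - h) // 2
    new_y = max(0, min(new_y, img_h - new_h))
    return x, new_y, w, new_h


if __name__ == "__main__":
    label, x, y, w, h = parse_box_line("walk person 10 20 8 12 0 0 0 0 0")
    print(label, crop_box_1to2(x, y, w, h))
```

Clamping after centring keeps the crop inside the image at both the top and the bottom edge, so the slice passed to the image array never uses a negative start index.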

