A small data-processing tool for VOC-format datasets (Python script)

1. Source code

1.1 Interface description

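parse_vocxml: parses a voc_xml file and returns every annotated box as a list bboxes = [bbox_1, bbox_2, …], where each bounding box is bbox = [cls, x_min, y_min, x_max, y_max] and cls (short for class) is the category name.
del_specific_cls: deletes the annotation boxes of the specified classes from a voc_xml file. The parameter clss is short for classes and is a set.
change_cls_name: replaces old class names in a voc_xml file with the specified new names. The parameter cls_old2new_dict is a dictionary that maps each old name to its new name.
merge_xmls_for_same_image: merges several voc_xml files for the same image into one voc_xml file. The arguments are args = [xml_save_path, xml_1_path, xml_2_path, …], where xml_save_path is the output path and the remaining entries are the paths of the XML files to merge. Use case: one person labels class A, another person labels class B, and the results have to be combined at the end.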
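
To make the bbox layout described above concrete, here is a minimal sketch that parses a hand-written VOC-style annotation from a string. The sample XML, its values, and the helper name bboxes_from_string are assumptions made for illustration; they are not part of the script in section 1.2.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal VOC-style annotation (illustrative values only).
SAMPLE_VOC_XML = """\
<annotation>
    <filename>example.jpg</filename>
    <object>
        <name>cat</name>
        <bndbox>
            <xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax>
        </bndbox>
    </object>
    <object>
        <name>dog</name>
        <bndbox>
            <xmin>5</xmin><ymin>15</ymin><xmax>50</xmax><ymax>60</ymax>
        </bndbox>
    </object>
</annotation>
"""


def bboxes_from_string(xml_text):
    """Return [[cls, x_min, y_min, x_max, y_max], ...] from VOC XML text."""
    root = ET.fromstring(xml_text)
    bboxes = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        # Read the four corner coordinates in the order used by bbox.
        coords = [int(box.findtext(tag)) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        bboxes.append([obj.findtext('name')] + coords)
    return bboxes


print(bboxes_from_string(SAMPLE_VOC_XML))
# prints [['cat', 10, 20, 110, 220], ['dog', 5, 15, 50, 60]]
```

Each inner list starts with the class name, followed by the top-left and bottom-right corner of the box.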

Each interface in section 1.2 handles a single voc_xml file. To process a whole dataset, consider an approach like this:

# An example
xml_paths = [os.path.join(dir_path, var) for var in os.listdir(dir_path) if var.endswith('.xml')]
for xml_path in xml_paths:
    del_specific_cls(xml_path, {'cls_1', 'cls_2'})
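
The same loop can be wrapped in a small reusable helper so that any of the single-file interfaces can be applied to a directory. This is only a sketch: apply_to_xmls is a hypothetical name that does not appear in the original script, and it assumes the XML files sit directly inside dir_path (no recursion into subdirectories).

```python
import os


def apply_to_xmls(dir_path, func, *func_args):
    """Call func(xml_path, *func_args) for every .xml file directly inside dir_path.

    Returns the sorted list of paths that were processed.
    """
    xml_paths = sorted(
        os.path.join(dir_path, name)
        for name in os.listdir(dir_path)
        if name.endswith('.xml')
    )
    for xml_path in xml_paths:
        func(xml_path, *func_args)
    return xml_paths


# Hypothetical usage with the interfaces from section 1.2:
# apply_to_xmls('path/to/Annotations', del_specific_cls, {'cls_1', 'cls_2'})
# apply_to_xmls('path/to/Annotations', change_cls_name, {'cls_old': 'cls_new'})
```

Passing the function as an argument keeps the directory walk in one place, so switching from deletion to renaming only changes the call site.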

1.2 Code

import os
import xml.etree.ElementTree as ET


def parse_vocxml(xml_path):
    '''
    Parse a voc-xml file and return bboxes = [bbox_1, bbox_2, ...],
    where bbox = [cls, x_min, y_min, x_max, y_max].
    '''
    if not os.path.exists(xml_path):
        raise FileNotFoundError(xml_path)

    tree = ET.parse(xml_path)
    bboxes = []

    for var in tree.iter():
        if var.tag == 'object':
            cls, x_min, y_min, x_max, y_max = None, None, None, None, None
            for element in list(var):
                if element.tag == 'name':
                    cls = element.text
                elif element.tag == 'bndbox':
                    for coordinate in list(element):
                        if coordinate.tag == 'xmin':
                            x_min = int(coordinate.text)
                        elif coordinate.tag == 'ymin':
                            y_min = int(coordinate.text)
                        elif coordinate.tag == 'xmax':
                            x_max = int(coordinate.text)
                        elif coordinate.tag == 'ymax':
                            y_max = int(coordinate.text)
            bbox = [cls, x_min, y_min, x_max, y_max]
            bboxes.append(bbox)

    return bboxes


def del_specific_cls(xml_path, clss):
    '''
    Delete every object whose class is in clss = {cls_1, cls_2, ...}
    from a voc-xml file. The file is overwritten in place.
    '''
    if not os.path.exists(xml_path):
        raise FileNotFoundError(xml_path)

    tree = ET.parse(xml_path)
    root = tree.getroot()

    # <object> elements are direct children of the root in a voc-xml file;
    # findall() returns a list, so removing while looping over it is safe.
    for obj in root.findall('object'):
        if obj.findtext('name') in clss:
            root.remove(obj)

    tree.write(xml_path, encoding="utf-8", xml_declaration=True)


def change_cls_name(xml_path, cls_old2new_dict):
    '''
    Rename classes in a voc-xml file. The file is overwritten in place.
    cls_old2new_dict = {cls_old: cls_new} is a dictionary.
    '''
    if not os.path.exists(xml_path):
        raise FileNotFoundError(xml_path)

    tree = ET.parse(xml_path)
    root = tree.getroot()

    # Only the <name> children of each <object> are touched.
    for obj in root.iter('object'):
        for name in obj.findall('name'):
            if name.text in cls_old2new_dict:
                name.text = cls_old2new_dict[name.text]

    tree.write(xml_path, encoding="utf-8", xml_declaration=True)


def merge_xmls_for_same_image(*args):
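    '''
    args = (target_xml_save_path, xml_1, xml_2, ...)
    The tree of xml_1 is kept as the base; objects parsed from xml_2, ... are
    appended with pose, truncated and difficult reset to default values.
    '''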

    def _append_obj(root, bbox):
        '''
        bbox = [cls, x_min, y_min, x_max, y_max]
        '''
        obj = ET.Element('object')
        name = ET.SubElement(obj, 'name')
        name.text = bbox[0]
        pose = ET.SubElement(obj, 'pose')
        pose.text = 'Unspecified'
        truncated = ET.SubElement(obj, 'truncated')
        truncated.text = '0'
        difficult = ET.SubElement(obj, 'difficult')
        difficult.text = '0'
        bndbox = ET.SubElement(obj, 'bndbox')
        xmin = ET.SubElement(bndbox, 'xmin')
        xmin.text = str(bbox[1])
        ymin = ET.SubElement(bndbox, 'ymin')
        ymin.text = str(bbox[2])
        xmax = ET.SubElement(bndbox, 'xmax')
        xmax.text = str(bbox[3])
        ymax = ET.SubElement(bndbox, 'ymax')
        ymax.text = str(bbox[4])
        root.append(obj)
        return root

    # *args is always a tuple, so only its length needs checking.
    if len(args) < 2:
        raise ValueError('expected (target_xml_save_path, xml_1, ...), got len(args) < 2.')

    target_xml_save_path = args[0]

    tree = ET.parse(args[1])
    root = tree.getroot()

    for arg in args[2:]:
        bboxes = parse_vocxml(arg)
        for bbox in bboxes:
            _append_obj(root, bbox)

    tree.write(target_xml_save_path, encoding='utf-8', xml_declaration=True)


if __name__ == '__main__':
    dir_path = ''  # set this to the directory holding the voc-xml files
    xml_paths = [os.path.join(dir_path, var) for var in os.listdir(dir_path) if var.endswith('.xml')]
    for xml_path in xml_paths:
        del_specific_cls(xml_path, {'cls_1', 'cls_2'})
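
One thing to keep in mind about merge_xmls_for_same_image: its inner helper _append_obj always writes 'Unspecified', '0' and '0' for pose, truncated and difficult, so whatever values the merged files held for those fields are not carried over. If they matter, one possible alternative is to copy the object elements as they are. The function below is a sketch of that idea, not the original author's method; the name merge_objects_verbatim is hypothetical, and it assumes every object element is a direct child of the root.

```python
import copy
import xml.etree.ElementTree as ET


def merge_objects_verbatim(target_xml_save_path, base_xml_path, *other_xml_paths):
    """Append a copy of every <object> in other_xml_paths to the tree of base_xml_path.

    All child elements of each <object> (pose, truncated, difficult, ...) are
    preserved. Returns the number of objects in the merged tree.
    """
    tree = ET.parse(base_xml_path)
    root = tree.getroot()
    for path in other_xml_paths:
        for obj in ET.parse(path).getroot().findall('object'):
            root.append(copy.deepcopy(obj))
    tree.write(target_xml_save_path, encoding='utf-8', xml_declaration=True)
    return len(root.findall('object'))


# Hypothetical usage:
# merge_objects_verbatim('merged.xml', 'labels_A.xml', 'labels_B.xml')
```

Like the original, this keeps the header fields (filename, size and so on) of the first input file and does not check that the inputs describe the same image.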

2. Reference

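Building an operation class for VOC-format datasets, part 3: deleting the labels of a specified class and renaming a specified label class
GitHub project (with usage instructions): https://github.com/A-mockingbird/VOCtype-datasetOperation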
