Original article. Please notify the author before reposting; unauthorized copies will be pursued.

An Introduction to XML, Parsing It, and Using the xml Package

Contents:

1 A brief introduction to XML

2 Using xml.etree.ElementTree

The xml package is used to parse XML files and extract the content stored in them.

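1 A brief introduction to XML

1.1 A brief introduction to XML

XML (eXtensible Markup Language):

An extensible markup language designed for transporting and storing data.

The common XML programming interfaces are DOM and SAX; the two handle XML files in different ways.

Python offers three ways to parse XML files:

SAX (Simple API for XML): an event-driven interface that reads the document as a stream and reports each element to handler callbacks, so the whole document never has to be held in memory.
DOM (Document Object Model): maps the XML data into memory as a tree (relatively slow and memory-hungry) and works on the XML through operations on that tree.
ElementTree: resembles a lightweight DOM with a convenient, friendly API; the code is easy to use, fast, and light on memory.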
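
To make the difference between the three approaches concrete, here is a minimal sketch that reads the same value with each of them. The sample string XML_TEXT and the RankHandler class are made up for this illustration; only the standard-library modules xml.sax, xml.dom.minidom, and xml.etree.ElementTree are used.

```python
import xml.sax
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this comparison.
XML_TEXT = '<data><country name="Panama"><rank>68</rank></country></data>'


class RankHandler(xml.sax.ContentHandler):
    """SAX: collect the text of every <rank> element as events stream by."""

    def __init__(self):
        super().__init__()
        self.in_rank = False
        self.ranks = []

    def startElement(self, name, attrs):
        if name == 'rank':
            self.in_rank = True
            self.ranks.append('')

    def characters(self, content):
        # Text may arrive in several chunks, so append to the current entry.
        if self.in_rank:
            self.ranks[-1] += content

    def endElement(self, name):
        if name == 'rank':
            self.in_rank = False


# SAX: event-driven, no tree is built.
handler = RankHandler()
xml.sax.parseString(XML_TEXT.encode('utf-8'), handler)
sax_ranks = handler.ranks

# DOM: the whole document becomes a tree of nodes.
dom = xml.dom.minidom.parseString(XML_TEXT)
dom_ranks = [node.firstChild.data for node in dom.getElementsByTagName('rank')]

# ElementTree: a lighter tree with a simpler API.
root = ET.fromstring(XML_TEXT)
et_ranks = [rank.text for rank in root.iter('rank')]

print(sax_ranks, dom_ranks, et_ranks)
```

All three reach the same data; the ElementTree version needs the least code, which is one reason the rest of the article uses it.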

1.2 XML syntax structure

The rest of this article uses ElementTree to show how to parse an XML file.

2 Using xml.etree.ElementTree

The test XML file, test.xml, contains the following data:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

import xml.etree.ElementTree as ET

2.1 Reading an XML file and getting the root element

ET.parse() accepts either the path of an XML file or a file handle opened on that file.

parse(source: {read}, parser: Any = None) -> ElementTree

ET.parse() with the path of an XML file:

tree = ET.parse('./test.xml')
root = tree.getroot()
print(type(root), root)
# <class 'xml.etree.ElementTree.Element'> <Element 'data' at 0x00000243ECFE5958>
# (the memory address differs from run to run)

ET.parse() with a file handle opened on the XML file:

with open('./test.xml', 'rb') as f:  # the with block closes the handle after parsing
    tree = ET.parse(f)
root = tree.getroot()
print(root)
# <Element 'data' at 0x00000243ECFE5958>
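
The two call styles can be tried without a test.xml on disk. The sketch below writes a small, made-up document to a temporary file and parses it by path and by file handle; it also uses ET.fromstring(), which parses XML held in a string and returns the root Element directly. The names XML_TEXT and sample.xml are assumptions for this example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document for this example.
XML_TEXT = '<data><country name="Singapore"><rank>4</rank></country></data>'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.xml')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(XML_TEXT)

    # 1) Parse by file path.
    root_from_path = ET.parse(path).getroot()

    # 2) Parse from an open file handle.
    with open(path, 'rb') as f:
        root_from_handle = ET.parse(f).getroot()

# 3) Parse from a string; fromstring() returns the root Element itself.
root_from_string = ET.fromstring(XML_TEXT)

print(root_from_path.tag, root_from_handle.tag, root_from_string.tag)
```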

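2.2 Getting child elements and their attributes

For an Element that has child elements:

The tag attribute gives the child element's name, returned as a string.
The attrib attribute gives the attributes defined on the child element, returned as a dictionary of key-value pairs.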

for child in root:
    print(child.tag, child.attrib)
    print(type(child.tag), type(child.attrib))
'''
country {'name': 'Liechtenstein'}
<class 'str'> <class 'dict'>
country {'name': 'Singapore'}
<class 'str'> <class 'dict'>
country {'name': 'Panama'}
<class 'str'> <class 'dict'>
'''
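
Because an Element behaves like a sequence of its direct children, len() and indexing also work on it. The following is a small sketch on a made-up two-country document (XML_TEXT is an assumption for this example):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for this example.
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
</data>"""

root = ET.fromstring(XML_TEXT)

child_count = len(root)                              # number of direct children
names = [child.attrib['name'] for child in root]     # attrib is a plain dict
grandchild_tags = [sub.tag for sub in root[0]]       # children of the first country

print(child_count, names, grandchild_tags)
```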

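2.3 Getting the data stored in an element

The text attribute returns the data stored between an element's tags, as a string. For an element that only contains child elements, such as country, text is just the whitespace that precedes its first child.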

print(root.tag)
print(root.attrib)  # the root element has no attributes, so this is an empty dict
print(root[0])
print(root[0].text, type(root[0].text))
print(root[0][1])
# root[0] is the first child of the root (country); root[0][1] is country's second child (year)
print(root[0][1].text, type(root[0][1].text))
'''
data
{}
<Element 'country' at 0x0000018269E1F9A8>

         <class 'str'>
<Element 'year' at 0x00000274B9790A48>
2008 <class 'str'>
'''
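
The different values text can take are easy to mix up, so here is a short sketch on a made-up document (XML_TEXT is an assumption for this example). It shows whitespace-only text on a container element, real data on a leaf element, and None on an empty element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for this example.
XML_TEXT = """<data>
    <country name="Panama">
        <rank>68</rank>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(XML_TEXT)
country = root[0]
rank = country[0]
neighbor = country[1]

print(repr(country.text))   # only the line break and indentation before <rank>
print(repr(rank.text))      # the stored data, always a str
print(repr(neighbor.text))  # an empty element has no text at all

# text is a string, so convert it before doing arithmetic.
rank_number = int(rank.text)
```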

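2.4 Finding the data stored in specific elements

Element.iter() recursively traverses the whole subtree below an element (its children, their children, and so on).

For example, to recursively find every 'neighbor' element under the root: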

for neighbor in root.iter('neighbor'):
    # print(neighbor.tag)  # always prints neighbor
    print(neighbor.attrib)
'''
{'name': 'Austria', 'direction': 'E'}
{'name': 'Switzerland', 'direction': 'W'}
{'name': 'Malaysia', 'direction': 'N'}
{'name': 'Costa Rica', 'direction': 'W'}
{'name': 'Colombia', 'direction': 'E'}
'''
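
The recursion matters because findall() with a bare tag name only looks at direct children. A minimal sketch of the difference, on a made-up document (XML_TEXT is an assumption for this example):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for this example.
XML_TEXT = """<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML_TEXT)

# neighbor elements are grandchildren of the root, so a direct-child search finds none.
direct = root.findall('neighbor')

# iter() walks the whole subtree in document order.
recursive = [n.get('name') for n in root.iter('neighbor')]

# findall() can also search all descendants when given a path such as './/neighbor'.
via_path = [n.get('name') for n in root.findall('.//neighbor')]

print(direct, recursive, via_path)
```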

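1. Finding the data stored in a given element

rootElement.find('childElement_name'): returns the first matching child Element, or None if nothing matches; the stored data is then available as a string through its text attribute.

2. Getting the value of an element's attribute

rootElement.get('rootElementAttrib_name'): returns the value of the named attribute as a string, or None if the attribute is missing.

for country in root.iter('country'):
    # country.find('rank') returns the rank child element of country;
    # .text then gives the value stored in it
    rank = country.find('rank').text
    print(type(rank), rank)
    # country.get('name') returns the value of country's name attribute
    name = country.get('name')
    print(type(name), name)

'''
<class 'str'> 1
<class 'str'> Liechtenstein
<class 'str'> 4
<class 'str'> Singapore
<class 'str'> 68
<class 'str'> Panama
'''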
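
Since find() and get() both return None when nothing matches, calling .text on a missing child raises AttributeError. The sketch below shows one way to guard against that, using findtext() and the default argument of get(). The document, including the country "Atlantis" and the continent attribute, is invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the second country has no rank and no continent.
XML_TEXT = """<data>
    <country name="Liechtenstein" continent="Europe"><rank>1</rank></country>
    <country name="Atlantis"/>
</data>"""

root = ET.fromstring(XML_TEXT)

rows = []
for country in root.iter('country'):
    # find() returns None when there is no matching child, so check before using .text.
    rank_element = country.find('rank')
    rank = rank_element.text if rank_element is not None else None

    # findtext() combines find() and .text and accepts a default value.
    rank_or_default = country.findtext('rank', default='n/a')

    # get() accepts a default for a missing attribute.
    continent = country.get('continent', 'unknown')

    rows.append((country.get('name'), rank, rank_or_default, continent))

for row in rows:
    print(row)
```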

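2.5 Modifying an XML file

2.5.1 Modifying the XML file

To change the data stored in an element, assign to its text attribute.
To change an element's attributes, use the Element.set() method.

for rank in root.iter('rank'):
    new_rank = int(rank.text) + 1  # add 1 to the value stored in the rank element
    rank.text = str(new_rank)  # text must be a string, so convert back
    rank.set('updated', 'yes')  # set the attribute updated="yes" on the rank element

tree.write('output.xml')

The modified file looks like this: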

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
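
The same modification can be checked in memory without writing a file. This is a minimal sketch on a made-up one-country document (XML_TEXT is an assumption for this example); ET.tostring() with encoding='unicode' returns the serialized tree as a str.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for this example.
XML_TEXT = '<data><country name="Singapore"><rank>4</rank></country></data>'

root = ET.fromstring(XML_TEXT)

for rank in root.iter('rank'):
    rank.text = str(int(rank.text) + 1)  # text must stay a string
    rank.set('updated', 'yes')           # adds or overwrites the attribute

result = ET.tostring(root, encoding='unicode')
print(result)
# <data><country name="Singapore"><rank updated="yes">5</rank></country></data>
```

When writing to disk, tree.write() also accepts encoding and xml_declaration arguments, for example tree.write('output.xml', encoding='utf-8', xml_declaration=True), if the output should start with an XML declaration.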

2.5.2 Removing elements from the XML file

Element.remove() deletes an element. Suppose we want to remove every country with a rank higher than 50:

for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output2.xml')

The result after removal:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>
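
One detail in the removal loop is worth spelling out: findall() returns a list of matches collected up front, so removing elements from root inside the loop does not disturb the iteration. A self-contained sketch of the same idea, on a made-up document (XML_TEXT is an assumption for this example):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for this example.
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
    <country name="Panama"><rank>68</rank></country>
</data>"""

root = ET.fromstring(XML_TEXT)

# findall() builds the list of matches first, so it is safe to remove while looping over it.
for country in root.findall('country'):
    if int(country.findtext('rank')) > 50:
        root.remove(country)  # remove() only accepts a direct child of root

remaining = [country.get('name') for country in root]
print(remaining)
```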

Reference 1: https://docs.python.org/zh-cn/3/library/xml.html
