The lxml module

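lxml parses XML and HTML. HTML is not strictly a subset of XML (real pages are often not well-formed), but lxml includes a lenient HTML parser, so HTML documents can be analyzed with lxml and XPath instead of regular expressions.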
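The difference between the two parsers can be seen in a small sketch. The snippet string below is a made-up fragment (not part of the sample document) with an unclosed <br>: the strict XML parser should reject it, while the HTML parser recovers and still builds a tree.

```python
from lxml import etree

# Hypothetical fragment: acceptable HTML, but not well-formed XML,
# because <br> is never closed.
snippet = "<p>first line<br>second line</p>"

# The strict XML parser raises XMLSyntaxError on the mismatched tags.
try:
    etree.fromstring(snippet)
    xml_ok = True
except etree.XMLSyntaxError:
    xml_ok = False

# The lenient HTML parser recovers and returns the root element.
dom = etree.HTML(snippet)
br_count = len(dom.xpath('//br'))

print(xml_ok)
print(br_count)
```

This is why the complete examples further down call etree.HTML rather than the XML parser.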
Sample document

<bookstore>
        <li id='test3'> li test3</li>
        <book>
          <title>Harry Potter</title>
          <author>J K. Rowling</author>
          <year>2005</year>
          <price>29.99</price>
          <li>li test1</li>
          <li id='test2'>li test2</li>
        </book>
</bookstore>
<test>
    <li id='test3'>li test4</li>
</test>

lxml examples (XPath expressions)

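Example: find the title "Harry Potter"
/bookstore/book/title
Example: find all li elements that are direct children of book
/bookstore/book/li
Example: find all li elements inside bookstore
/bookstore/book/li|/bookstore/li    (| is the union operator: nodes matched by either path)
/bookstore//li    (// selects li descendants of bookstore at any depth)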
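The absolute paths above can be tried with a minimal sketch. It assumes that only the <bookstore> element of the sample is parsed, and that it is parsed as XML, so that paths starting at /bookstore apply unchanged (the full sample has two top-level elements, so it is not a well-formed XML document by itself).

```python
from lxml import etree

# Assumption: only the <bookstore> part of the sample, parsed as XML.
xml = """<bookstore>
    <li id='test3'> li test3</li>
    <book>
      <title>Harry Potter</title>
      <author>J K. Rowling</author>
      <year>2005</year>
      <price>29.99</price>
      <li>li test1</li>
      <li id='test2'>li test2</li>
    </book>
</bookstore>"""
root = etree.fromstring(xml)

# Direct path to the title text
title = root.xpath('/bookstore/book/title/text()')
# li elements that are children of book only
book_li = root.xpath('/bookstore/book/li/text()')
# Union of two explicit paths
union_li = root.xpath('/bookstore/book/li/text()|/bookstore/li/text()')
# Any li below bookstore, regardless of depth
all_li = root.xpath('/bookstore//li/text()')

print(title)
print(book_li)
print(union_li)
print(all_li)
```

For this document the union and the // form select the same three li elements; the // form is shorter and keeps working if the nesting depth changes.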
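Example: find every li in the document
//li
Example: find all li elements that have an id attribute
//li[@id]
Example: find all li elements whose id attribute equals test3
//li[@id='test3']
Example: get the id attribute values and the text of all li elements
//li/@id    (returns the attribute values)
//li/text()    (returns the text content of the elements)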
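Attribute predicates need plain ASCII quotes around the value, which is easy to get wrong when the expression is embedded in a Python string. The sketch below shows one way to quote it, plus lxml's XPath variables as an alternative; the variable name wanted is only an illustration.

```python
from lxml import etree

html = """<bookstore>
    <li id='test3'> li test3</li>
    <book>
      <title>Harry Potter</title>
      <author>J K. Rowling</author>
      <year>2005</year>
      <price>29.99</price>
      <li>li test1</li>
      <li id='test2'>li test2</li>
    </book>
</bookstore>
<test>
    <li id='test3'>li test4</li>
</test>"""
dom = etree.HTML(html)

# Double-quoted Python string, single-quoted XPath literal.
matches = dom.xpath("//li[@id='test3']")
texts = [li.text for li in matches]

# Alternative: pass the value as an XPath variable ($wanted) so that
# no quoting is needed inside the expression itself.
wanted = 'test3'  # hypothetical name, chosen for this sketch
via_variable = dom.xpath('//li[@id=$wanted]', wanted=wanted)
texts_via_variable = [li.text for li in via_variable]

print(texts)
print(texts_via_variable)
```

Both queries should select the same two li elements, one under bookstore and one under test.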

A complete example:

from lxml import etree
html = '''    <bookstore>
            <li id='test3'> li test3</li>
            <book>
              <title>Harry Potter</title>
              <author>J K. Rowling</author>
              <year>2005</year>
              <price>29.99</price>
              <li>li test1</li>
              <li id='test2'>li test2</li>
            </book>
    </bookstore>
    <test>
        <li id='test3'>li test4</li>
    </test>'''

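# etree.HTML uses the lenient HTML parser and returns the root element
dom = etree.HTML(html)
# Text content of every li in the document
ret = dom.xpath('//li/text()')
print(ret)
# Value of the id attribute of every li that has one
ret = dom.xpath('//li/@id')
print(ret)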
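One detail worth checking when moving from the bare XPath expressions to etree.HTML: the HTML parser wraps the input in an html root element, so an absolute path such as /bookstore/... is not expected to match, while paths starting with // still do. A small sketch of that behavior:

```python
from lxml import etree

html = """<bookstore>
    <li id='test3'> li test3</li>
    <book>
      <title>Harry Potter</title>
      <author>J K. Rowling</author>
      <year>2005</year>
      <price>29.99</price>
      <li>li test1</li>
      <li id='test2'>li test2</li>
    </book>
</bookstore>
<test>
    <li id='test3'>li test4</li>
</test>"""
dom = etree.HTML(html)

# The root returned by etree.HTML is the implied <html> element.
root_tag = dom.tag

# Absolute path from the document root: bookstore is not the root here.
absolute = dom.xpath('/bookstore/book/li')

# Starting with // finds bookstore wherever the parser placed it.
relative = dom.xpath('//bookstore/book/li/text()')

print(root_tag)
print(absolute)
print(relative)
```

This is one reason the complete examples here use //li rather than a path anchored at /bookstore.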

A complete example that iterates over the matched elements:

from lxml import etree
html = '''    <bookstore>
            <li id='test3'> li test3</li>
            <book>
              <title>Harry Potter</title>
              <author>J K. Rowling</author>
              <year>2005</year>
              <price>29.99</price>
              <li>li test1</li>
              <li id='test2'>li test2</li>
            </book>
    </bookstore>
    <test>
        <li id='test3'>li test4</li>
    </test>'''

dom = etree.HTML(html)
# Select the li elements themselves (not their text), restricted to those with an id
ret = dom.xpath('//li[@id]')
for li in ret:
    print(li.text)                       # text content of the element
    print(li.attrib['id'])               # value of the id attribute
    print(etree.tostring(li).decode())   # the element serialized back to markup
    print('=' * 50)
