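Python Basics 25: Using JSONPath

25 Processing JSON Data with Python

25.1 Introduction to JSON

25.1.1 What Is JSON

    JSON stands for JavaScript Object Notation. It is a lightweight data-interchange format based on a subset of ECMAScript, and it stores and represents data as text that is completely independent of any programming language. Its concise, clearly layered structure makes JSON an ideal data-interchange language. Its main strengths are that it is easy for people to read, easy for machines to parse and generate, and compact enough to speed up network transfer.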

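25.1.2 The Two Structures of JSON

    Put simply, JSON can be understood as the arrays and objects of JavaScript. With these two structures, arbitrarily complex data can be represented.

25.1.2.1 Arrays

    In JavaScript an array is defined with square brackets [ ]. A typical definition looks like this: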

let array=["Surpass","28","Shanghai"];

    To read a value from an array, use its index. Elements can be numbers, strings, arrays, objects, and so on.
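
    The same array can be read from Python. The sketch below assumes the array arrives as JSON text; the variable names are illustrative and not part of the original article.

```python
import json

# The JavaScript array from above, written as JSON text.
array_text = '["Surpass", "28", "Shanghai"]'

# json.loads turns a JSON array into a Python list.
array = json.loads(array_text)

first = array[0]   # indexes start at 0
last = array[-1]   # negative indexes count from the end (a Python feature)
print(first, last)
```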

25.1.2.2 Objects

    In JavaScript an object is defined with curly braces { }. A typical definition looks like this:

let personInfo={
  name:"Surpass",
  age:28,
  location:"Shanghai"
}

    An object is a collection of key/value pairs. In JavaScript a value is read simply with variable.key. A value can be a number, string, array, object, and so on.
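
    As a rough Python counterpart, the sketch below parses the same object from JSON text and reads values by key. Note that JSON text, unlike the JavaScript literal above, requires the keys to be quoted. The variable names are assumptions made for illustration.

```python
import json

# The object from above as JSON text: keys must be double-quoted strings.
person_text = '{"name": "Surpass", "age": 28, "location": "Shanghai"}'

# json.loads turns a JSON object into a Python dict.
person_info = json.loads(person_text)

name = person_info["name"]          # raises KeyError if the key is absent
age = person_info.get("age")        # .get returns None if the key is absent
email = person_info.get("email")    # no such key here, so this is None
print(name, age, email)
```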

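25.1.3 Supported Data Types

    JSON supports the following main data types:

  • Arrays: written with square brackets
  • Objects: written with curly braces
  • Numbers (integer and floating point), booleans, and null
  • Strings: must use double quotes, never single quotes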
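
    The double-quote rule is easy to check with Python's standard json module. The snippet below is a small sketch with made-up variable names: the first text parses, the second is rejected because it uses single quotes.

```python
import json

valid_text = '{"name": "Surpass"}'
invalid_text = "{'name': 'Surpass'}"   # single quotes are not valid JSON

parsed = json.loads(valid_text)

try:
    json.loads(invalid_text)
    error = None
except json.JSONDecodeError as exc:
    # The parser reports where the text stopped being valid JSON.
    error = str(exc)

print(parsed)
print(error)
```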

    Multiple values are separated by commas. JSON types map to Python types as follows:

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None
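
    The mapping can be observed directly by parsing one value of each kind and inspecting the resulting Python types. This is a minimal sketch; the key names are arbitrary.

```python
import json

# One value of each JSON type.
text = '{"obj": {"k": 1}, "arr": [1, 2], "s": "hi", "i": 28, "r": 8.95, "t": true, "f": false, "n": null}'

values = json.loads(text)

# Map each key to the name of the Python type json.loads produced.
type_names = {key: type(value).__name__ for key, value in values.items()}
print(type_names)
```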

25.2 JSON Support in Python

25.2.1 Python and JSON Data Types

    In Python, JSON data is mainly handled with the json module. Import the module before using it:

import json

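    The json module provides four main functions: json.dumps and json.dump serialize a Python object to a JSON string or to a file, and json.loads and json.load parse JSON from a string or from a file.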
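
    A minimal sketch of the four functions of the standard json module (dumps, loads, dump, load) follows. An in-memory io.StringIO buffer stands in for a real file, which is an assumption made to keep the example self-contained.

```python
import io
import json

person = {"name": "Surpass", "age": 28, "location": "Shanghai"}

# dumps: Python object -> JSON string
text = json.dumps(person)

# loads: JSON string -> Python object
restored = json.loads(text)

# dump / load do the same with file-like objects.
buffer = io.StringIO()
json.dump(person, buffer)
buffer.seek(0)               # rewind before reading back
from_file = json.load(buffer)

print(text)
```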

    While JSON is being processed, Python's native types and JSON types are converted into each other. The conversion tables are as follows:

  • Python to JSON
Python    JSON
dict      object
list      array
tuple     array
str       string
int       number
float     number
True      true
False     false
None      null
  • JSON to Python
JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None
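
    The two tables are not symmetric: both list and tuple become a JSON array, and an array always comes back as a list. The sketch below, using an invented record, shows the encoded text and the round trip.

```python
import json

record = {
    "tags": ("a", "b"),   # tuple -> array
    "active": True,       # True  -> true
    "deleted": False,     # False -> false
    "note": None,         # None  -> null
    "price": 8.95,        # float -> number
    "count": 3,           # int   -> number
}

encoded = json.dumps(record)
decoded = json.loads(encoded)

print(encoded)
print(type(decoded["tags"]).__name__)
```

    Because the tuple comes back as a list, decoded is not equal to the original record, which matters if code relies on tuples after a round trip.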

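25.2.2 Common Methods of the json Module

    For details on Python's built-in json module, see my earlier article: https://www.cnblogs.com/surpassme/p/13034972.html

25.3 Processing JSON Data with JSONPath

    The built-in json module is easy and very convenient for simple JSON data, but working with large, deeply nested JSON data is still laborious. In this section we use a third-party tool called JSONPath to process JSON data.

25.3.1 What Is JSONPath

    JSONPath is an expression language for querying JSON data. It is often used to extract values from deeply nested JSON, and its usage closely resembles XPath, the expression language used for XML data.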
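
    To see what such an expression saves, here is roughly what a recursive search looks like when written by hand with only the standard library. find_key is a hypothetical helper, not part of any library; it is similar in spirit to the JSONPath expression $..author.

```python
def find_key(node, key):
    """Collect every value stored under `key`, at any depth of dicts and lists."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending: the value itself may contain more matches.
            found.extend(find_key(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_key(item, key))
    return found


nested = {
    "store": {
        "book": [{"author": "Nigel Rees"}, {"author": "Evelyn Waugh"}],
        "bicycle": {"color": "red"},
    }
}
authors = find_key(nested, "author")
print(authors)
```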

25.3.2 Installation

    Install it as follows:

# pip install -U jsonpath

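25.3.3 JSONPath Syntax

    JSONPath syntax is very similar to XPath. The two compare as follows:

XPath    JSONPath            Description
/        $                   Root node/element
.        @                   Current node/element
/        . or []             Child operator
..       n/a                 Parent operator
//       ..                  Recursive descent through child elements
*        *                   Wildcard: all elements
@        n/a                 Attribute access; JSON data has no attributes
[]       []                  Subscript operator (simple selection such as an array index or selection by content)
|        [,]                 Union operator: select several names or indexes at once
n/a      [start:end:step]    Array slice operator
[]       ?()                 Filter expression
n/a      ()                  Script expression
()       n/a                 Grouping; not supported by JSONPath

The table above is taken from the original JSONPath article: https://goessner.net/articles/JsonPath/

    We use the following sample data to compare the two: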

{ "store": 
  {
    "book": [ 
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}
XPath                   JSONPath                  Result
/store/book/author      $.store.book[*].author    The authors of all books under the book node
//author                $..author                 All authors
/store/*                $.store.*                 The elements of store: book and bicycle
/store//price           $.store..price            Every price in store
//book[3]               $..book[2]                The third book
//book[last()]          $..book[(@.length-1)]     The last book
                        $..book[-1:]
//book[position()<3]    $..book[0,1]              The first two books
                        $..book[:2]
//book[isbn]            $..book[?(@.isbn)]        Books that have an isbn
//book[price<10]        $..book[?(@.price<10)]    Books with a price below 10
//*                     $..*                      All elements

In XPath, indexes start at 1, whereas in JSONPath they start at 0.
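
    JSONPath's zero-based indexes and slices behave much like Python list indexing. The sketch below uses a plain list of the four book titles (no JSONPath library involved) to show the Python counterparts of three expressions from the table.

```python
books = [
    "Sayings of the Century",
    "Sword of Honour",
    "Moby Dick",
    "The Lord of the Rings",
]

third = books[2]        # like $..book[2]; XPath would write //book[3]
last = books[-1:]       # like $..book[-1:], a one-element list
first_two = books[:2]   # like $..book[:2]

print(third, last, first_two)
```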

An online JSONPath playground is available at http://jsonpath.com/
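
    The two filter expressions in the table, ?(@.isbn) and ?(@.price<10), correspond roughly to the list comprehensions below. This is a plain-Python sketch over a shortened copy of the sample books, not the way any JSONPath library implements filters.

```python
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

# Roughly $..book[?(@.isbn)]: keep the books that have an isbn key.
with_isbn = [book["title"] for book in books if "isbn" in book]

# Roughly $..book[?(@.price<10)]: keep the books cheaper than 10.
cheap = [book["title"] for book in books if book["price"] < 10]

print(with_isbn)
print(cheap)
```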

25.3.4 Using JSONPath

    The basic form of a call is as follows:

jsonPath(obj, expr [, args])

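    Its parameters are as follows:

  • obj (object|array):

    The JSON data object

  • expr (string):

    The JSONPath expression

  • args (object|undefined):

    Changes the output format, for example whether values or paths are returned.

args.resultType accepts the output formats "VALUE", "PATH", and "IPATH". In the Python jsonpath package used below, the same option is passed as the result_type keyword argument.

  • Return type (array|false):

    An array means that data was matched; false means that nothing matched.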
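
    Because "no match" is reported as false rather than as an empty array, calling code usually has to check the result before iterating over it. One possible way to handle this is a small wrapper; as_list is a hypothetical name, and the literal values below stand in for real return values of the library.

```python
def as_list(result):
    """Return the matches as a list, mapping False (nothing matched) to []."""
    # Compare by identity so that a genuine empty list is passed through as-is.
    return [] if result is False else result


matched = as_list(["Nigel Rees", "Evelyn Waugh"])   # stands in for a successful query
unmatched = as_list(False)                          # stands in for a query with no match

for author in matched + unmatched:
    print(author)
```

    With the Python package, the wrapper would be applied as as_list(jsonpath(data, "$..surpass")), so that loops work without a separate check.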

25.3.5 Usage in Python

from jsonpath import jsonpath
import json

data = {
    "store":
        {
            "book": [
                {
                    "category": "reference",
                    "author": "Nigel Rees",
                    "title": "Sayings of the Century",
                    "price": 8.95
                },
                {
                    "category": "fiction",
                    "author": "Evelyn Waugh",
                    "title": "Sword of Honour",
                    "price": 12.99
                },
                {
                    "category": "fiction",
                    "author": "Herman Melville",
                    "title": "Moby Dick",
                    "isbn": "0-553-21311-3",
                    "price": 8.99
                },
                {
                    "category": "fiction",
                    "author": "J. R. R. Tolkien",
                    "title": "The Lord of the Rings",
                    "isbn": "0-395-19395-8",
                    "price": 22.99
                }
            ],
            "bicycle": {
                "color": "red",
                "price": 19.95
            }
        }
}

# Get the author of every book under the book node
getAllBookAuthor=jsonpath(data,"$.store.book[*].author")
print(f"getAllBookAuthor is :{json.dumps(getAllBookAuthor,indent=4)}")
# Get every author anywhere in the data
getAllAuthor=jsonpath(data,"$..author")
print(f"getAllAuthor is {json.dumps(getAllAuthor,indent=4)}")
# Get the elements of store: book and bicycle
getAllStoreElement=jsonpath(data,"$.store.*")
print(f"getAllStoreElement is {json.dumps(getAllStoreElement,indent=4)}")
# Get every price in store
getAllStorePriceA=jsonpath(data,"$[store]..price")
getAllStorePriceB=jsonpath(data,"$.store..price")
print(f"getAllStorePrictA is {getAllStorePriceA}\ngetAllStorePriceB is {getAllStorePriceB}")
# Get the third book
getThirdBookInfo=jsonpath(data,"$..book[2]")
print(f"getThirdBookInfo is {json.dumps(getThirdBookInfo,indent=4)}")
# Get the last book
getLastBookInfo=jsonpath(data,"$..book[-1:]")
print(f"getLastBookInfo is {json.dumps(getLastBookInfo,indent=4)}")
# Get the first two books
getFirstAndSecondBookInfo=jsonpath(data,"$..book[:2]")
print(f"getFirstAndSecondBookInfo is {json.dumps(getFirstAndSecondBookInfo,indent=4)}")
# Keep only the books that have an isbn
getWithFilterISBN=jsonpath(data,"$..book[?(@.isbn)]")
print(f"getWithFilterISBN is {json.dumps(getWithFilterISBN,indent=4)}")
# Keep only the books with a price below 10
getWithFilterPrice=jsonpath(data,"$..book[?(@.price<10)]")
print(f"getWithFilterPrice is {json.dumps(getWithFilterPrice,indent=4)}")
# All elements
getAllElement=jsonpath(data,"$..*")
print(f"getAllElement is {json.dumps(getAllElement,indent=4)}")
# When nothing matches, the function returns False
noMatchElement=jsonpath(data,"$..surpass")
print(f"noMatchElement is {noMatchElement}")
# Change the output format: return paths instead of values
controlleOutput=jsonpath(data,expr="$..author",result_type="PATH")
print(f"controlleOutput is {json.dumps(controlleOutput,indent=4)}")

    The final output is as follows:

getAllBookAuthor is :[
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllAuthor is [
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllStoreElement is [
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    }
]
getAllStorePrictA is [8.95, 12.99, 8.99, 22.99, 19.95]
getAllStorePriceB is [8.95, 12.99, 8.99, 22.99, 19.95]
getThirdBookInfo is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getLastBookInfo is [
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getFirstAndSecondBookInfo is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    }
]
getWithFilterISBN is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getWithFilterPrice is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getAllElement is [
    {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    },
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    },
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    },
    "reference",
    "Nigel Rees",
    "Sayings of the Century",
    8.95,
    "fiction",
    "Evelyn Waugh",
    "Sword of Honour",
    12.99,
    "fiction",
    "Herman Melville",
    "Moby Dick",
    "0-553-21311-3",
    8.99,
    "fiction",
    "J. R. R. Tolkien",
    "The Lord of the Rings",
    "0-395-19395-8",
    22.99,
    "red",
    19.95
]
noMatchElement is False
controlleOutput is [
    "$['store']['book'][0]['author']",
    "$['store']['book'][1]['author']",
    "$['store']['book'][2]['author']",
    "$['store']['book'][3]['author']"
]
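
    The PATH strings in the last result can be turned back into values. The sketch below is one possible approach using only the standard library; resolve and TOKEN are hypothetical names, and the parser assumes paths in exactly the bracketed form printed above (quoted keys and numeric indexes).

```python
import re

# Matches either ['key'] or [index] in a path such as $['store']['book'][0]['author'].
TOKEN = re.compile(r"\['([^']*)'\]|\[(\d+)\]")


def resolve(data, path):
    """Follow a bracketed path string step by step into `data`."""
    node = data
    for key, index in TOKEN.findall(path):
        # Exactly one of the two groups is non-empty for each step.
        node = node[int(index)] if index else node[key]
    return node


sample = {"store": {"book": [{"author": "Nigel Rees"}, {"author": "Evelyn Waugh"}]}}
value = resolve(sample, "$['store']['book'][1]['author']")
print(value)
```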

Original article: https://www.jianshu.com/p/a69a9cf293bd

This article is also published on my WeChat subscription account, woaitest. If you enjoy my articles, you are welcome to follow it.
