JMESPath: A Query Language for JSON

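Introduction

JMESPath is a query language for JSON. Its main features:

  • Complete specification
    The JMESPath language is described by a full specification with an ABNF grammar, so the syntax is precisely defined.
  • Compliance test suite
    JMESPath has a full suite of data-driven test cases, which keeps the different libraries consistent with each other.
  • Libraries in multiple languages
    Every JMESPath library passes the full compliance test suite. Libraries exist for several languages, including Python, PHP, JavaScript, and Lua.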


Installation

pip install jmespath

I have also written a GUI to make learning JMESPath easier (code repository link).


API

The Python jmespath library provides two functions for evaluating expressions:

def search(expression, data, options=None)

import jmespath

path = jmespath.search('foo.bar', {'foo': {'bar': 'baz'}})
print(path)
# baz



def compile(expression)

As with the re module, you can use compile() to compile a JMESPath expression once and then reuse it for repeated searches

import jmespath

expression = jmespath.compile('foo.bar')
print(expression.search({'foo': {'bar': 'baz'}}))
print(expression.search({'foo': {'bar': 'other'}}))
# baz
# other




Basic Expressions

The simplest JMESPath expression is an identifier, which selects a key in a JSON object:

Select the key a

expression

a

value

{"a": "foo", "b": "bar", "c": "baz"}

result

"foo"




If the key does not exist, the result is null (None in Python)

import jmespath

expression = jmespath.compile('d')  # select the key d, which is not present
value = {"a": "foo", "b": "bar", "c": "baz"}
print(expression.search(value))
# None




A subexpression returns a nested value in the JSON document

expression

a.b.c.d

value

{"a": {"b": {"c": {"d": "value"}}}}

result

"value"




Index expressions work like array access and are zero-based

expression

[1]

value

["a", "b", "c", "d", "e", "f"]

result

"b"

Negative indices count from the end, for example [-1] or [-2] (same value as above)

expression

[-2]

result

"e"




Combining identifiers, subexpressions, and index expressions

expression

a.b.c[0].d[1][0]

value

{"a": {
  "b": {
    "c": [
      {"d": [0, [1, 2]]},
      {"d": [3, 4]}
    ]
  }
}}

result

1
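
To make the evaluation order of the combined expression concrete, here is a plain-Python sketch of how identifiers, subexpressions, and index expressions behave. It is not how the jmespath library is implemented; get_path and index are hypothetical helpers written only for this illustration.

```python
from functools import reduce


def get_path(data, dotted):
    """Hypothetical helper: follow a dotted chain of keys, giving None on a miss."""
    def step(current, key):
        # An identifier applied to a non-object, or to a missing key, gives null.
        return current.get(key) if isinstance(current, dict) else None
    return reduce(step, dotted.split("."), data)


def index(data, i):
    """Hypothetical helper: mimic an index expression such as [1] or [-2]."""
    if not isinstance(data, list):
        return None
    try:
        return data[i]
    except IndexError:
        # An out-of-range index gives null instead of raising.
        return None


doc = {"a": {"b": {"c": [{"d": [0, [1, 2]]}, {"d": [3, 4]}]}}}

# a.b.c[0].d[1][0], evaluated one step at a time
c = get_path(doc, "a.b.c")        # the list stored under a.b.c
d = get_path(index(c, 0), "d")    # [0, [1, 2]]
print(index(index(d, 1), 0))      # 1
print(get_path(doc, "a.x"))       # None (missing key)
print(index(c, 5))                # None (index out of range)
```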




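Slicing

A slice selects a subset of an array:

  • The range is half-open: the start position is included, the stop position is excluded
  • A slice takes three parameters: [start:stop:step]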




Take the first five elements of an array

expression ([0:5] and [:5] are equivalent)

[:5]

value

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

result

[
  0,
  1,
  2,
  3,
  4
]




Take every second element (step of 2)

expression

[::2]

result

[
  0,
  2,
  4,
  6,
  8
]




Reverse order

expression

[::-1]

result

[
  9,
  8,
  7,
  6,
  5,
  4,
  3,
  2,
  1,
  0
]
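
JMESPath slices follow the same start:stop:step convention as Python list slices, so the three examples above can be reproduced with ordinary Python slicing. This is only a comparison sketch; the variable names are chosen for illustration.

```python
numbers = list(range(10))          # [0, 1, ..., 9], the value used above

first_five = numbers[:5]           # same as the JMESPath slice [:5] or [0:5]
every_other = numbers[::2]         # same as [::2]
reversed_numbers = numbers[::-1]   # same as [::-1]

print(first_five)         # [0, 1, 2, 3, 4]
print(every_other)        # [0, 2, 4, 6, 8]
print(reversed_numbers)   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```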




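Projections

Projections are one of the key features of JMESPath. There are five kinds:

  • List projections
  • Slice projections
  • Object projections
  • Flatten projections
  • Filter projections

List and Slice Projections

The wildcard expression [*] creates a list projection

Select the key people, take every element, then select the key first from each. Elements that have no first key evaluate to null and are left out of the result.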

expression

people[*].first

value

{
  "people": [
    {"first": "James", "last": "d"},
    {"first": "Jacob", "last": "e"},
    {"first": "Jayden", "last": "f"},
    {"missing": "different"}
  ],
  "foo": {"bar": "baz"}
}

result

[
  "James",
  "Jacob",
  "Jayden"
]




Slice projection: take the first two elements, then select the key first from each

expression

people[:2].first

result

[
  "James",
  "Jacob"
]
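
A list or slice projection can be pictured as "apply the right-hand expression to every element and drop the nulls". The sketch below shows that idea in plain Python; project is a hypothetical helper, not part of the jmespath API.

```python
def project(items, key):
    """Hypothetical helper: apply `.key` to each element and drop null results."""
    results = (item.get(key) if isinstance(item, dict) else None for item in items)
    return [r for r in results if r is not None]


people = [
    {"first": "James", "last": "d"},
    {"first": "Jacob", "last": "e"},
    {"first": "Jayden", "last": "f"},
    {"missing": "different"},
]

print(project(people, "first"))       # people[*].first -> ['James', 'Jacob', 'Jayden']
print(project(people[:2], "first"))   # people[:2].first -> ['James', 'Jacob']
```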

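Object Projections

The wildcard * can also be applied to the values of an object



Select the key ops, take the value under every key, then select numArgs from each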

expression

ops.*.numArgs

value

{
  "ops": {
    "functionA": {"numArgs": 2},
    "functionB": {"numArgs": 3},
    "functionC": {"variadic": true}
  }
}

result

[
  2,
  3
]
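
In plain Python, the object projection above corresponds roughly to iterating over the values of a dict. This sketch relies on Python preserving dict insertion order; JSON objects themselves are unordered, so the order of a real result should not be relied on.

```python
ops = {
    "functionA": {"numArgs": 2},
    "functionB": {"numArgs": 3},
    "functionC": {"variadic": True},
}

# ops.*.numArgs: look at every value, keep those that have a numArgs key
num_args = [v["numArgs"] for v in ops.values() if isinstance(v, dict) and "numArgs" in v]
print(num_args)   # [2, 3]
```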

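Flatten Projections

The flatten operator [] merges the sublists of the current result into a single list. It works as follows:

  • Create an empty result list
  • Iterate over the elements of the current result
  • If the current element is not a list, append it to the end of the result list
  • If the current element is a list, append each of its elements to the end of the result list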




With list projections alone, the result is not flat

expression

reservations[*].instances[*].state

value

{
  "reservations": [
    {
      "instances": [
        {"state": "running"},
        {"state": "stopped"}
      ]
    },
    {
      "instances": [
        {"state": "terminated"},
        {"state": "running"}
      ]
    }
  ]
}

result

[
  [
    "running",
    "stopped"
  ],
  [
    "terminated",
    "running"
  ]
]




Using the flatten projection []

expression

reservations[].instances[].state

result

[
  "running",
  "stopped",
  "terminated",
  "running"
]




Flatten a nested list by one level

expression

[]

value

[
  [0, 1],
  2,
  [3],
  4,
  [5, [6, 7]]
]

result

[
  0,
  1,
  2,
  3,
  4,
  5,
  [
    6,
    7
  ]
]




Flatten by two levels

expression

[][]

result

[
  0,
  1,
  2,
  3,
  4,
  5,
  6,
  7
]
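
The four flatten steps listed earlier in this section translate almost line for line into Python. The function below is a sketch of those steps only (flatten_once is a hypothetical name); applying it twice reproduces the [][] result.

```python
def flatten_once(items):
    """Hypothetical helper: one application of the [] flatten operator."""
    result = []                   # start with an empty result list
    for item in items:            # iterate over the current result
        if isinstance(item, list):
            result.extend(item)   # a list: append each of its elements
        else:
            result.append(item)   # not a list: append the element itself
    return result


nested = [[0, 1], 2, [3], 4, [5, [6, 7]]]

one_level = flatten_once(nested)
two_levels = flatten_once(one_level)

print(one_level)    # [0, 1, 2, 3, 4, 5, [6, 7]]
print(two_levels)   # [0, 1, 2, 3, 4, 5, 6, 7]
```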

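Filter Projections

A filter projection works like a conditional statement. It uses the ? operator, with the syntax [? <expression> <comparator> <expression>]

The comparison operators are: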

  • ==
  • !=
  • <
  • <=
  • >
  • >=




Keep only the machines whose state is running

expression

machines[?state=='running'].name

value

{
  "machines": [
    {"name": "a", "state": "running"},
    {"name": "b", "state": "stopped"},
    {"name": "b", "state": "running"}
  ]
}

result

[
  "a",
  "b"
]
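
The filter above can be read as "keep the elements for which the comparison holds, then continue with the rest of the expression". Here is a plain-Python sketch of that reading. It assumes every element carries a comparable value under the key; filter_projection and COMPARATORS are hypothetical names used only here.

```python
import operator

# The six comparators listed above, mapped to Python's operator functions.
COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def filter_projection(items, key, comparator, literal):
    """Hypothetical helper: keep elements where `key <comparator> literal` is true."""
    compare = COMPARATORS[comparator]
    return [item for item in items if compare(item.get(key), literal)]


machines = [
    {"name": "a", "state": "running"},
    {"name": "b", "state": "stopped"},
    {"name": "b", "state": "running"},
]

# machines[?state=='running'].name
running = filter_projection(machines, "state", "==", "running")
print([m["name"] for m in running])   # ['a', 'b']
```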




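Pipe Expressions

A pipe stops the projection on its left and passes the whole result to the expression on its right, so you can operate on the projected list as a whole



Take every element of people, select first from each, then take the first element of the result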

expression

people[*].first | [0]

value

{
  "people": [
    {"first": "James", "last": "d"},
    {"first": "Jacob", "last": "e"},
    {"first": "Jayden", "last": "f"},
    {"missing": "different"}
  ],
  "foo": {"bar": "baz"}
}

result

"James"

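The effect of the pipe is easiest to see as two separate steps. In this plain-Python sketch (variable names are illustrative only), the projection is finished first and the index is then applied to the resulting list, rather than to each projected element.

```python
people = [
    {"first": "James", "last": "d"},
    {"first": "Jacob", "last": "e"},
    {"first": "Jayden", "last": "f"},
    {"missing": "different"},
]

# Left of the pipe: people[*].first, a projection that produces a list.
firsts = [p["first"] for p in people if "first" in p]

# Right of the pipe: [0] indexes the projected list as a whole.
first_of_all = firsts[0] if firsts else None

print(firsts)         # ['James', 'Jacob', 'Jayden']
print(first_of_all)   # James
```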



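MultiSelect (Building New Data)

Multiselect creates new JSON elements that do not exist in the original document


Multiselect list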

expression

people[].[name, state.name]

value

{
  "people": [
    {
      "name": "a",
      "state": {"name": "up"}
    },
    {
      "name": "b",
      "state": {"name": "down"}
    },
    {
      "name": "c",
      "state": {"name": "up"}
    }
  ]
}

result

[
  [
    "a",
    "up"
  ],
  [
    "b",
    "down"
  ],
  [
    "c",
    "up"
  ]
]




Multiselect hash: give each selected value a key

expression

people[].{Name: name, State: state.name}

result

[
  {
    "Name": "a",
    "State": "up"
  },
  {
    "Name": "b",
    "State": "down"
  },
  {
    "Name": "c",
    "State": "up"
  }
]
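
Both multiselect forms build a new structure per element. The comprehension below is a plain-Python sketch of the two expressions above, assuming every person has name and state.name as in the sample value.

```python
people = [
    {"name": "a", "state": {"name": "up"}},
    {"name": "b", "state": {"name": "down"}},
    {"name": "c", "state": {"name": "up"}},
]

# people[].[name, state.name]: a new list per person
as_lists = [[p["name"], p["state"]["name"]] for p in people]

# people[].{Name: name, State: state.name}: a new object per person
as_dicts = [{"Name": p["name"], "State": p["state"]["name"]} for p in people]

print(as_lists)   # [['a', 'up'], ['b', 'down'], ['c', 'up']]
print(as_dicts)
```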




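Functions

JMESPath supports function expressions; see the built-in functions section of the JMESPath specification for the full list.

Length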

expression

length(people)

value

{
  "people": [
    {
      "name": "b",
      "age": 30,
      "state": {"name": "up"}
    },
    {
      "name": "a",
      "age": 50,
      "state": {"name": "down"}
    },
    {
      "name": "c",
      "age": 40,
      "state": {"name": "up"}
    }
  ]
}

result

3




Return the name of the oldest person

expression

max_by(people, &age).name

value

{
  "people": [
    {
      "name": "b",
      "age": 30
    },
    {
      "name": "a",
      "age": 50
    },
    {
      "name": "c",
      "age": 40
    }
  ]
}

result

"a"




Combining functions with filters

expression

myarray[?contains(@, 'foo') == `true`]

value

{
  "myarray": [
    "foo",
    "foobar",
    "barfoo",
    "bar",
    "baz",
    "barbaz",
    "barfoobaz"
  ]
}

result

[
  "foo",
  "foobar",
  "barfoo",
  "barfoobaz"
]
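
For readers who think in Python, the three function examples in this section have close counterparts in the standard library. The mapping below is a rough sketch for intuition, not a description of how jmespath evaluates these functions.

```python
people = [
    {"name": "b", "age": 30},
    {"name": "a", "age": 50},
    {"name": "c", "age": 40},
]
myarray = ["foo", "foobar", "barfoo", "bar", "baz", "barbaz", "barfoobaz"]

count = len(people)                                    # length(people)
oldest = max(people, key=lambda p: p["age"])["name"]   # max_by(people, &age).name
with_foo = [s for s in myarray if "foo" in s]          # myarray[?contains(@, 'foo')]

print(count)      # 3
print(oldest)     # a
print(with_foo)   # ['foo', 'foobar', 'barfoo', 'barfoobaz']
```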




Example

Select the people older than 20, keeping only their name and age

expression

people[?age > `20`].{name:name, age:age}

value

{
  "people": [
    {
      "age": 20,
      "other": "foo",
      "name": "Bob"
    },
    {
      "age": 25,
      "other": "bar",
      "name": "Fred"
    },
    {
      "age": 30,
      "other": "baz",
      "name": "George"
    }
  ]
}

result

[
  {
    "name": "Fred",
    "age": 25
  },
  {
    "name": "George",
    "age": 30
  }
]




Wrapper Function

import json
import jmespath


def parse(expression, value):
    '''Evaluate a JMESPath expression against a JSON string.

    :param expression: JMESPath expression string
    :param value: JSON string
    :return: the query result, or None if the expression or the JSON is invalid

    >>> parse('a', '{"a": "foo", "b": "bar", "c": "baz"}')  # basic expression
    'foo'
    >>> parse('[:5]', '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]')  # slice
    [0, 1, 2, 3, 4]
    >>> parse("machines[?state=='running'].name", '{"machines": [{"name": "a", "state": "running"}, {"name": "b", "state": "stopped"}]}')  # projection
    ['a']
    >>> parse('people[*].first | [0]', '{"people": [{"first": "James"}, {"first": "Jacob"}]}')  # pipe expression
    'James'
    >>> parse('people[].{id: name}', '{"people": [{"name": "James"}, {"name": "Jacob"}]}')  # multiselect
    [{'id': 'James'}, {'id': 'Jacob'}]
    >>> parse('length(people)', '{"people": [{"name": "James"}, {"name": "Jacob"}]}')  # function expression
    2
    '''
    try:
        expression = jmespath.compile(expression)
        value = json.loads(value)
        result = expression.search(value)
    except Exception:
        # Invalid expression, malformed JSON, or an evaluation error
        return None
    else:
        return result



