Elasticsearch index meta-fields (7.4.1)

_id field

Uniquely identifies a document; the value must be no longer than 512 bytes.
The value of the _id field can be searched with certain queries (term, terms, match, query_string, simple_query_string):

//Define the mapping
PUT identity_id_index
{
  "mappings": {
    "properties": {
      "text":{
        "type": "text"
      }
    }
  }
}

PUT identity_id_index/_doc/1
{
  "text":"doc with id 1"
}

PUT identity_id_index/_doc/2
{
  "text":"doc with id 2"
}
//Documents can be looked up by _id with a terms query
GET identity_id_index/_search
{
  "query": {
    "terms":{
      "_id":["1","2"]
    }
  }
}
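
The 512-byte limit on _id mentioned above is a limit on bytes, not characters, so IDs containing multi-byte characters reach it sooner. The following is a minimal Python sketch of a client-side pre-check; it assumes the limit is measured on the UTF-8 encoding of the ID, and the names MAX_ID_BYTES and id_fits are hypothetical helpers, not part of any Elasticsearch client.

```python
MAX_ID_BYTES = 512  # limit stated above for the _id field


def id_fits(doc_id: str) -> bool:
    """Return True if doc_id is within the limit, measured in UTF-8 bytes (assumption)."""
    return len(doc_id.encode("utf-8")) <= MAX_ID_BYTES


print(id_fits("1"))          # short ASCII id
print(id_fits("a" * 513))    # 513 one-byte characters
print(id_fits("文" * 171))   # 171 characters, but 3 bytes each in UTF-8
```

A check like this could run before indexing so that oversized IDs are caught in application code rather than by a rejected request.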

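The _id value can also be used for sorting and aggregations, but doing so loads a large amount of data into memory and is generally discouraged. If sorting or aggregating on the ID is required, define an additional field in the document (with doc_values set to true) whose value duplicates the _id value;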

//Define an additional field id whose value is the same as the _id field value
PUT identity_id_dup_index
{
  "mappings": {
    "properties": {
      "text":{
        "type": "text"
      },
      "id":{
        "type": "keyword"
      }
    }
  }
}

PUT identity_id_dup_index/_doc/1
{
  "text":"doc with id 1",
  "id":"1"
}

PUT identity_id_dup_index/_doc/2
{
  "text":"doc with id 2",
  "id":"2"
}

//Aggregate and sort on the id field here instead of the _id field
GET identity_id_dup_index/_search
{
  "aggs": {
    "id_aggs": {
      "terms": {
        "field": "id",
        "size": 10
      }
    }
  },
  "sort": [
    {
      "id": {
        "order": "desc"
      }
    }
  ]
}

//Aggregating or sorting on the _id field triggers a deprecation warning that recommends using a separate field instead
//Deprecation: Loading the fielddata on the _id field is deprecated and will be removed in future versions. If you require sorting or aggregating on this field you should also include the id in the body of your documents, and map this field as a keyword field that has [doc_values] enabled
GET identity_id_index/_search
{
  "aggs": {
    "id_aggs": {
      "terms": {
        "field": "_id",
        "size": 10
      }
    }
  },
  "sort": [
    {
      "_id": {
        "order": "desc"
      }
    }
  ]
}

_routing field

A document is routed to a particular shard using the following formula:

shard_num = hash(_routing) % num_primary_shards

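By default the _routing value is the same as the _id value. If a custom routing value was specified at index time, the same routing value must also be supplied for get, delete, and update operations; otherwise the document may not be found.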
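
To make the formula and the lookup caveat concrete, here is a minimal Python sketch. It is not Elasticsearch's implementation: zlib.crc32 is only a stand-in for the real hash function (Elasticsearch uses Murmur3 internally), and shard_for and shard_for_doc are hypothetical helper names.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int) -> int:
    """shard_num = hash(_routing) % num_primary_shards, with CRC32 as a stand-in hash."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shard_for_doc(doc_id: str, num_primary_shards: int, routing=None) -> int:
    """Use the custom routing value if given; otherwise fall back to the _id."""
    key = routing if routing is not None else doc_id
    return shard_for(key, num_primary_shards)


NUM_SHARDS = 5  # assumed number of primary shards for the illustration

# Indexed with a custom routing value "user1" ...
indexed_on = shard_for_doc("1", NUM_SHARDS, routing="user1")
# ... but fetched without routing, so the shard is derived from the _id instead.
looked_up_on = shard_for_doc("1", NUM_SHARDS)

print("indexed on shard", indexed_on)
print("get without routing looks on shard", looked_up_on)
print("same shard:", indexed_on == looked_up_on)
```

Whenever the two shard numbers differ, a get without the routing value looks on the wrong shard, which is why the routing value has to accompany get, delete, and update requests.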

//Search with custom routing values
GET /index_type_1/_search?routing=type1,type2
{
  "query":{
    "match": {
      "title": "document"
    }
  }
}

The query above runs only on the shards that the routing values type1 and type2 hash to, which avoids searching every shard of the index;
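
The effect of a comma-separated routing parameter can be sketched as follows. This is an illustration only: zlib.crc32 replaces Elasticsearch's real hash function, the shard count of 5 is an assumption, and shards_to_search is a hypothetical helper name.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Stand-in for hash(_routing) % num_primary_shards."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shards_to_search(routing_param: str, num_primary_shards: int) -> list:
    """Map each comma-separated routing value to its shard and de-duplicate."""
    values = [v for v in routing_param.split(",") if v]
    return sorted({shard_for(v, num_primary_shards) for v in values})


targets = shards_to_search("type1,type2", 5)
print("shards searched:", targets, "out of", 5)
```

With two routing values at most two shards are visited, however many primary shards the index has.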
The routing value can be made mandatory through the mapping; the document's unique ID is still specified separately:

PUT index_type_3
{
  "mappings": {
    "_routing": {
      "required": true
    }
  }
}

PUT index_type_3/_doc/1?routing=doc
{
  "text":"doc in type 3"
}

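_source field

The _source field holds the original JSON document. The _source field itself is not indexed (and therefore cannot be searched), but its value is stored so that it can be returned when executing fetch requests such as get or search;

Although the _source field is very convenient, it also adds storage overhead to the index. It can be disabled with the enabled parameter:

//Set the enabled parameter to false in the index mapping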
PUT doc_source_index
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}

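With enabled set to false, Elasticsearch no longer supports the following features:

No.  Feature
1    The update, update_by_query, and reindex APIs
2    Highlighting
3    Reindexing from one Elasticsearch index to another, whether to change mappings or analysis, or to upgrade an index to a new major version
4    Automatic repair of index corruption
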
PUT doc_source_index/_doc/1
{
  "text":"doc source enabled test3"
}

//_update_by_query is no longer supported: [doc_source_index][_doc][1] didn't store _source
POST doc_source_index/_update_by_query

//_update is no longer supported: [_doc][1]: document source missing
POST doc_source_index/_update/1
{
  "doc":{
    "text":"test update"
  }
}
//Same error as above
POST doc_source_index/_update/1
{
  "script": "ctx._source.text = 'test update'"
}
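
Why both update requests fail can be sketched in a few lines of Python. Conceptually, an update reads the stored source, applies the change, and re-indexes the result, so there is nothing to start from when _source was never stored. This is a simplified illustration, not Elasticsearch code: apply_partial_update is a hypothetical name, and the merge here is shallow.

```python
def apply_partial_update(stored_source, partial_doc):
    """Merge partial_doc into the stored source; fail if no source was stored."""
    if stored_source is None:
        # Mirrors the situation reported as "document source missing"
        raise ValueError("document source missing")
    merged = dict(stored_source)   # copy so the stored version stays untouched
    merged.update(partial_doc)     # shallow merge (simplification)
    return merged


# _source enabled: the stored document is available and can be merged.
print(apply_partial_update({"text": "doc source enabled test3"}, {"text": "test update"}))

# _source disabled: nothing was stored, so the update cannot proceed.
try:
    apply_partial_update(None, {"text": "test update"})
except ValueError as err:
    print("update failed:", err)
```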

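Including or excluding fields in the _source value
After a document has been indexed but before its _source field is stored, the contents of _source can be pruned to fit specific use cases;

tips: Removing fields from _source has the same drawbacks as disabling _source, in particular that documents can no longer be reindexed from one Elasticsearch index into another; consider using _source filtering instead;

//The includes/excludes parameters support wildcards; the following keeps fields ending in count and fields under meta (excluding meta.desc and meta.other.*)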
PUT doc_source_ie_index
{
  "mappings": {
    "_source": {
      "includes": [
        "*.count",
        "meta.*"
      ],
      "excludes": [
        "meta.desc",
        "meta.other.*"
      ]
    }
  }
}


PUT doc_source_ie_index/_doc/1
{
  "request":{
    "count":10,
    "foo":"bar"
  },
  "meta":{
    "name":"es _source",
    "desc":"es _source includes/excludes",
    "other":{
      "foo":"one",
      "baz":"two"
    }
  }
}

//Although meta.other.foo is not part of the stored _source, it is still indexed, so it can be used as a search condition
GET doc_source_ie_index/_search
{
  "query": {
    "match": {
      "meta.other.foo": "one"
    }
  }
}
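
Which fields survive the includes/excludes rules above can be worked out with a small Python sketch. It is a simplification rather than Elasticsearch's actual filtering logic: field paths are flattened to dotted names and matched with fnmatchcase, the result is returned as a flat dictionary instead of a nested document, and flatten and prune_source are hypothetical helper names.

```python
from fnmatch import fnmatchcase


def flatten(obj, prefix=""):
    """Yield (dotted_path, value) for every leaf of a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def prune_source(doc, includes, excludes):
    """Keep leaves matching an include pattern (if any are given) and no exclude pattern."""
    kept = {}
    for path, value in flatten(doc):
        included = not includes or any(fnmatchcase(path, p) for p in includes)
        excluded = any(fnmatchcase(path, p) for p in excludes)
        if included and not excluded:
            kept[path] = value
    return kept


doc = {
    "request": {"count": 10, "foo": "bar"},
    "meta": {
        "name": "es _source",
        "desc": "es _source includes/excludes",
        "other": {"foo": "one", "baz": "two"},
    },
}

stored = prune_source(doc, ["*.count", "meta.*"], ["meta.desc", "meta.other.*"])
print(stored)  # {'request.count': 10, 'meta.name': 'es _source'}
```

Only request.count and meta.name remain in the pruned source, while the search on meta.other.foo still matches because pruning _source does not affect what gets indexed.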

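_type field

The _type field has been deprecated since Elasticsearch 6.0.0;
Every document is assigned a _type field and an _id field; the _type field is indexed so that documents can be found quickly by type name;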

PUT doc_type_index/_doc/1
{
  "text":"doc with type '_doc'"
}

PUT doc_type_index/_doc/2
{
  "text":"it's a nice day"
}

//Warning: Deprecation: [types removal] Using the _type field in queries and aggregations is deprecated, prefer to use a field instead.
//Warning: Deprecation: [types removal] Looking up doc types [_type] in scripts is deprecated.
//As with the _id field, if a type is needed it is better to define a dedicated field to represent it
GET doc_type_index/_search
{
  "query": {
    "term": {
      "_type": {
        "value": "_doc"
      }
    }
  },
  "aggs": {
    "types": {
      "terms": {
        "field": "_type",
        "size": 10
      }
    }
  },
  "sort": [
    {
      "_type": {
        "order": "desc"
      }
    }
  ],
  "script_fields": {
    "type": {
      "script": {
        "lang": "painless",
        "source": "doc['_type']"
      }
    }
  }
}