Index field type parameters_7_4_4

meta parameter

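Metadata is attached to a field and is opaque to Elasticsearch; it exists only so that multiple applications working on the same index can share information about its fields (units, for example). The metadata can be updated through a mapping update, in which case the new meta overrides the existing one.
A meta object may hold at most 5 key-value pairs. Both keys and values must be strings; a key may be at most 20 characters long and a value at most 50 characters long.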

//The meta below breaks two of these rules; each violation is rejected with an exception like the following
//[meta] values can only be strings, but got Boolean[false] for field [length]
//[meta] can't have more than 5 entries, but got 6 on field [length]
PUT param_meta_index
{
  "mappings": {
    "properties": {
      "length":{
        "type": "long",
        "meta":{
          "unit":"cm",
          "default":"10",
          "shard":false,
          "addr":"hz",
          "date":"2020-05-30",
          "limit":"50"
        }
      }
    }
  }
}
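
The limits described above can be checked on the client side before a mapping is sent. The following is a minimal Python sketch, not Elasticsearch's own validation code: the function name `validate_meta` and the wording of the returned messages are assumptions made for illustration, and only the three limits (5 entries, 20-character keys, 50-character string values) come from the text.

```python
# Hypothetical client-side check of the meta limits described above.
# The function name and message wording are illustrative, not Elasticsearch output.
MAX_ENTRIES = 5
MAX_KEY_LENGTH = 20
MAX_VALUE_LENGTH = 50


def validate_meta(field_name, meta):
    """Return a list of rule violations; an empty list means the meta is acceptable."""
    errors = []
    if len(meta) > MAX_ENTRIES:
        errors.append(
            f"field [{field_name}]: more than {MAX_ENTRIES} entries ({len(meta)})"
        )
    for key, value in meta.items():
        if len(key) > MAX_KEY_LENGTH:
            errors.append(f"field [{field_name}]: key [{key}] is too long")
        if not isinstance(value, str):
            errors.append(
                f"field [{field_name}]: value of [{key}] is "
                f"{type(value).__name__}, not a string"
            )
        elif len(value) > MAX_VALUE_LENGTH:
            errors.append(f"field [{field_name}]: value of [{key}] is too long")
    return errors


# Same meta as in the request above: six entries, one of them a boolean.
bad_meta = {
    "unit": "cm",
    "default": "10",
    "shard": False,
    "addr": "hz",
    "date": "2020-05-30",
    "limit": "50",
}

for problem in validate_meta("length", bad_meta):
    print(problem)
```

Removing the `shard` entry leaves five string values, which passes all three checks.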

fields parameter

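It is often necessary to access the same field in different ways; the fields parameter (multi-fields) lets a field be mapped with additional types to meet this need.
For example, a string field mapped as text supports full-text search; if the same field must also support sorting or aggregations, it needs a keyword mapping as well, and fields handles exactly this case.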

//city.raw is the keyword version of the city field
PUT /param_fields_index
{
  "mappings": {
    "properties": {
      "city":{
        "type": "text",
        "fields": {
          "raw":{
            "type":"keyword"
          }
        }
      }
    }
  }
}

PUT param_fields_index/_doc/1
{
  "city":"Hang Zhou"
}

PUT param_fields_index/_doc/2
{
  "city":"Shang Hai"
}

PUT param_fields_index/_doc/3
{
  "city":"Su Zhou"
}

//city supports full-text search; city.raw supports sorting and aggregations
GET param_fields_index/_search
{
  "query": {
    "match": {
      "city": "Zhou"
    }
  },
  "sort": [
    {
      "city.raw": {
        "order": "desc"
      }
    }
  ],
  "aggs": {
    "cities": {
      "terms": {
        "field": "city.raw",
        "size": 10
      }
    }
  }
}
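
To make the difference between the two sub-fields concrete, here is a toy Python model of the request above. It is only a sketch under simplifying assumptions: `analyze_text` approximates the standard analyzer with lowercase word splitting, the keyword sub-field is modelled as the unchanged string, and the function names are invented for this example.

```python
import re
from collections import Counter

# The three documents indexed above, keyed by _id.
docs = {"1": "Hang Zhou", "2": "Shang Hai", "3": "Su Zhou"}


def analyze_text(value):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", value.lower())


def search_city(query):
    """Match on the analyzed `city` tokens, then sort and bucket on the raw value."""
    query_tokens = set(analyze_text(query))
    hits = [
        doc_id
        for doc_id, city in docs.items()
        if query_tokens & set(analyze_text(city))
    ]
    # city.raw keeps the whole string, so it can be used as a sort key (desc).
    hits.sort(key=lambda doc_id: docs[doc_id], reverse=True)
    # A terms aggregation on city.raw counts whole values, not single words.
    buckets = Counter(docs[doc_id] for doc_id in hits)
    return hits, dict(buckets)


hits, buckets = search_city("Zhou")
print(hits)
print(buckets)
```

The match runs against individual words, while sorting and bucketing use the complete city name; that split is what the `city` / `city.raw` pair provides.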

Multi-field mappings do not change the original JSON stored in the _source field

GET param_fields_index/_search

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "param_fields_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "city" : "Hang Zhou"
        }
      },
      {
        "_index" : "param_fields_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "city" : "Shang Hai"
        }
      },
      {
        "_index" : "param_fields_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "city" : "Su Zhou"
        }
      }
    ]
  }
}

Multi-fields can also be added to a field that already exists in the index mapping

PUT param_fields_index/_mapping
{
  "properties":{
    "city":{
      "type":"text",
      "fields":{
        "unit":{
          "type":"keyword"
        }
      }
    }
  }
}

GET param_fields_index/_mapping

{
  "param_fields_index" : {
    "mappings" : {
      "properties" : {
        "city" : {
          "type" : "text",
          "fields" : {
            "raw" : {
              "type" : "keyword"
            },
            "unit" : {
              "type" : "keyword"
            }
          }
        }
      }
    }
  }
}

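Multi-fields with multiple analyzers
Another use case for multi-fields is analyzing the same field in different ways to improve relevance. For example, a field can be indexed with the standard analyzer (which splits text into words) and again with the english analyzer (which reduces words to their stems).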

//desc uses the default standard analyzer; desc.english uses the english analyzer
PUT param_fields_analyzer_index
{
  "mappings": {
    "properties": {
      "desc":{
        "type": "text",
        "fields": {
          "english":{
            "type":"text",
            "analyzer":"english"
          }
        }
      }
    }
  }
}

//Add two docs that differ only in the case of the first letter
PUT param_fields_analyzer_index/_doc/1
{
  "desc":"Once the RestClient has been created"
}


PUT param_fields_analyzer_index/_doc/2
{
  "desc":"once the RestClient has been created"
}

//Query desc and desc.english together
GET param_fields_analyzer_index/_search
{
  "query": {
    "multi_match": {
      "query": "Once the RestClient has been created",
      "fields": [
        "desc",
        "desc.english"
      ],
      "type": "most_fields"
    }
  }
}

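In both docs the desc field indexes the term created, whereas desc.english indexes only its stem (creat with the english analyzer's stemmer) because that field stores stemmed terms. A query for create is stemmed the same way at search time, so it still matches both docs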

GET param_fields_analyzer_index/_search
{
  "query": {
    "match": {
      "desc.english": "create"
    }
  }
}
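
The effect of indexing one value with two analyzers can be imitated in a few lines of Python. This is a deliberately crude sketch: `standard_like` and `english_like` are invented names, the stop-word set is a small illustrative subset, and `crude_stem` is a toy suffix stripper rather than the stemmer that the english analyzer actually uses. It happens to agree with that stemmer on create and created, but not in general.

```python
import re

# Small illustrative subset of English stop words (an assumption, not the full list).
STOP_WORDS = {"a", "an", "the", "and", "of", "to"}


def standard_like(text):
    """Lowercase word tokens, roughly what the standard analyzer produces here."""
    return re.findall(r"\w+", text.lower())


def crude_stem(token):
    """Toy suffix stripping; NOT the real english stemmer."""
    if len(token) > 4 and token.endswith("ed"):
        return token[:-2]
    if len(token) > 4 and token.endswith("e"):
        return token[:-1]
    return token


def english_like(text):
    """Drop stop words, then stem each remaining token."""
    return [crude_stem(t) for t in standard_like(text) if t not in STOP_WORDS]


doc = "Once the RestClient has been created"
print(standard_like(doc))   # terms indexed under desc
print(english_like(doc))    # terms indexed under desc.english
print(english_like("create"))  # the query term goes through the same analysis
```

Because the query string is analyzed with the same chain as the indexed text, create and created end up as the same term in the desc.english sub-field, while the desc field only ever sees created.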

normalizer parameter

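analyzer can only be used on text fields, while normalizer can only be used on keyword fields. The two parameters serve a similar purpose, except that a normalizer guarantees the analysis chain produces exactly one token.
The normalizer is applied before the field is indexed, and it is applied again at search time (when the field is queried with a match query or a term-level query).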

//normalizer applies only to keyword fields; setting it on any other type fails with: Mapping definition for [desc_message] has unsupported parameters:  [normalizer : custom_normalizer]
PUT param_normalizer_index
{
  "settings": {
    "analysis": {
      "normalizer":{
        "custom_normalizer":{
          "type":"custom",
          "char_filter":[],
          "filter":["lowercase","asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "desc_message":{
        "type": "text",
        "normalizer": "custom_normalizer"
      }
    }
  }
}


//Define a custom normalizer on a keyword field
PUT param_normalizer_index
{
  "settings": {
    "analysis": {
      "normalizer":{
        "custom_normalizer":{
          "type":"custom",
          "char_filter":[],
          "filter":["lowercase","asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "desc_message":{
        "type": "keyword",
        "normalizer": "custom_normalizer"
      }
    }
  }
}


PUT param_normalizer_index/_doc/1
{
  "desc_message":"déjàvu"
}

PUT param_normalizer_index/_doc/2
{
  "desc_message":"dejavu"
}

PUT param_normalizer_index/_doc/3
{
  "desc_message":"DEJAVU"
}

PUT param_normalizer_index/_doc/4
{
  "desc_message":"dejau"
}

//term and match queries return the same results: at search time the normalizer first converts 'DEJAVU' to dejavu, then the query runs
GET param_normalizer_index/_search
{
  "query": {
    "term": {
      "desc_message": {
        "value": "DEJAVU"
      }
    }
  }
}

GET param_normalizer_index/_search
{
  "query": {
    "match": {
      "desc_message": "DEJAVU"
    }
  }
}


//A terms aggregation shows the tokens actually stored in the index, here 'dejavu'
GET param_normalizer_index/_search
{
  "size": 0, 
  "aggs": {
    "desc_terms": {
      "terms": {
        "field": "desc_message"
      }
    }
  }
}

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "desc_terms" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "dejavu",
          "doc_count" : 3
        },
        {
          "key" : "dejau",
          "doc_count" : 1
        }
      ]
    }
  }
}
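
The bucket counts above can be reproduced outside Elasticsearch with a small Python approximation of the `lowercase` + `asciifolding` chain. This is a sketch only: the function `custom_normalizer` is a stand-in written for this example, and stripping combining marks after Unicode decomposition covers accented Latin letters such as those in déjàvu but is narrower than the real asciifolding filter.

```python
import unicodedata
from collections import Counter


def custom_normalizer(value):
    """Approximate lowercase + asciifolding: one input string, one output token."""
    lowered = value.lower()
    # Decompose accented characters, then drop the combining accent marks.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The four desc_message values indexed above.
values = ["déjàvu", "dejavu", "DEJAVU", "dejau"]

# Index time: each value is stored as its normalized token.
buckets = Counter(custom_normalizer(v) for v in values)
print(dict(buckets))

# Search time: the query value goes through the same normalizer before matching.
query_token = custom_normalizer("DEJAVU")
print(query_token, buckets[query_token])
```

Normalizing on both the indexing and the query side is why a term query for DEJAVU finds three documents even though only one of them was written in upper case.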