This article is based on Elasticsearch (ES) 7.1.
To test empty values, I used the following values: null, "null", "", and [ ].
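The test code is long, so here is the conclusion first. Across the field types I tested, null, "", and [ ] can all be indexed, but the documents cannot then be found by searching for those values. For some data types, "null" cannot be converted to the field's type, so indexing fails with an error. For keyword, text, and other fields that accept strings, "null" is treated as an ordinary string and can be both indexed and searched. One caveat: the exists query documentation lists empty strings such as "" among the values that count as existing, so on a string field "" is not the same as a missing value.
The reason empty values cannot be searched directly is summed up by a line from Elasticsearch: The Definitive Guide: "If a field has no values, how is it stored in an inverted index?" In practice, a field with no value has no entry in the inverted index: "it isn't stored at all".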
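Note that on ES 2.x you could use the exists and missing queries to find non-null and null values respectively. They are the equivalents of IS NOT NULL and IS NULL in a relational database. The missing query is no longer available in 7.1, however; using it directly fails with the error "no [query] registered for [missing]".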
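In 7.x the usual stand-in for missing is an exists query wrapped in the must_not clause of a bool query. A minimal sketch, assuming the ac_blog1 index and its views field that are created later in this article: the first request looks for documents where views has an indexed value (is not null), the second for documents where it does not (is null):
GET ac_blog1/_search
{
  "query": {
    "exists": {
      "field": "views"
    }
  }
}
GET ac_blog1/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "views"
        }
      }
    }
  }
}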
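When designing an application, you can use the null_value mapping parameter to give null values a default. It resembles a DEFAULT value in a relational database, but it is not the same thing (see point 3 below). Keep the following three points in mind:
1. In ES, null_value only takes effect when the field is explicitly set to null. An empty array such as [ ] or an empty string such as "" does not trigger it.
2. The null_value must match the field's data type. For example, a date field cannot use a string that does not parse as a date.
3. null_value only causes the field to be stored in the inverted index under the null_value, so that the document can be found by searching. It does not replace the actual value in the _source JSON document.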
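The three points above can be tried out with a sketch like the following. The index name ac_blog2 and the placeholder value "NULL" are assumptions made for illustration and are not part of the original tests. Going by the rules above, the final term query should find the document indexed with an explicit null but not the one indexed with [ ], and the matching hit's _source should still show null rather than "NULL":
PUT ac_blog2
{
  "mappings": {
    "properties": {
      "author": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}
POST ac_blog2/_doc
{
  "author": null
}
POST ac_blog2/_doc
{
  "author": []
}
GET ac_blog2/_search
{
  "query": {
    "term": {
      "author": "NULL"
    }
  }
}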
Create the test index:
PUT ac_blog1
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "body": {
        "type": "text"
      },
      "author": {
        "type": "keyword"
      },
      "views": {
        "type": "long"
      }
    }
  }
}
Index some data:
POST ac_blog1/_doc
{
  "views": null
}
POST ac_blog1/_doc
{
  "views": []
}
POST ac_blog1/_doc
{
  "views": ""
}
As a check, fetch all the data:
GET ac_blog1/_search
{
  "query": {
    "match_all": {}
  },
  "size": 100
}
Response:
{
  "took" : 355,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "HFBiSW0Bf1cVbYphJHEo",
        "_score" : 1.0,
        "_source" : {
          "views" : null
        }
      },
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "HVBiSW0Bf1cVbYphPHEa",
        "_score" : 1.0,
        "_source" : {
          "views" : [ ]
        }
      },
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "HlBiSW0Bf1cVbYphRXGX",
        "_score" : 1.0,
        "_source" : {
          "views" : ""
        }
      }
    ]
  }
}
As you can see, all the documents have been indexed. Now let's try to search for them.
Testing null:
GET ac_blog1/_search
{
  "query": {
    "term": {
      "views": null
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "field name is null or empty"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "field name is null or empty"
  },
  "status": 400
}
Testing [ ]:
GET ac_blog1/_search
{
  "query": {
    "term": {
      "views": []
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[term] query does not support array of values",
        "line": 4,
        "col": 15
      }
    ],
    "type": "parsing_exception",
    "reason": "[term] query does not support array of values",
    "line": 4,
    "col": 15
  },
  "status": 400
}
Testing "":
GET ac_blog1/_search
{
  "query": {
    "term": {
      "views": ""
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {\n \"term\" : {\n \"views\" : {\n \"value\" : \"\",\n \"boost\" : 1.0\n }\n }\n}",
        "index_uuid": "f_2YYPS6RAaew5bXcQwlzQ",
        "index": "ac_blog1"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "ac_blog1",
        "node": "oJRDxfVrQlGOJ9eqCGozDg",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n \"term\" : {\n \"views\" : {\n \"value\" : \"\",\n \"boost\" : 1.0\n }\n }\n}",
          "index_uuid": "f_2YYPS6RAaew5bXcQwlzQ",
          "index": "ac_blog1",
          "caused_by": {
            "type": "number_format_exception",
            "reason": "empty String"
          }
        }
      }
    ]
  },
  "status": 400
}
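Because views is a long field, the "null" case cannot be tested on it: indexing fails with an error saying that null cannot be converted to long. This check is clearly done by ES itself rather than being a feature of the underlying Lucene. So let's switch to the keyword field author for this test: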
POST ac_blog1/_doc
{
  "author": "null"
}
GET ac_blog1/_search
{
  "query": {
    "term": {
      "author": "null"
    }
  }
}
Response:
{
  "took" : 416,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "H1BoSW0Bf1cVbYphtHF9",
        "_score" : 0.2876821,
        "_source" : {
          "author" : "null"
        }
      }
    ]
  }
}
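For reference, the conversion error on the long field mentioned above can be reproduced with a request along these lines. This is a sketch rather than part of the original test run, so no response is shown; ES is expected to reject the document with a parsing error because "null" is not a valid long:
POST ac_blog1/_doc
{
  "views": "null"
}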
That's all.