Part 0: Overview

1. Simple search: query string search

2. Powerful search: query DSL

3. Filtering within a search: query filter

4. Full-text search: full-text search

5. Exact phrase search: phrase search

6. Highlighted search: highlight search

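Part 1: Preparation

Environment setup:

See the previous article, 《技.艺.道：elasticsearch概念梳理及基础操作》 (Elasticsearch concepts and basic operations).

Data setup: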

PUT /ecommerce/product/1
{
    "name" : "gaolujie yagao",
    "desc" : "gaoxiao meibai",
    "price" : 30,
    "producer" : "gaolujie producer",
    "tags" : ["meibai","fangzhu"]
}

PUT /ecommerce/product/2
{
    "name" : "jiajieshi yagao",
    "desc" : "youxiao meibai",
    "price" : 35,
    "producer" : "gaolujie producer",
    "tags" : ["meibai","fangzhu"]
}

PUT /ecommerce/product/3
{
    "name" : "zhonghua yagao",
    "desc" : "caoben meibai",
    "price" : 20,
    "producer" : "gaolujie producer",
    "tags" : ["qingxin"]
}

PUT /ecommerce/product/4
{
    "name" : "heiren yagao",
    "desc" : "heiren meibai",
    "price" : 50,
    "producer" : "heiren yagao producer",
    "tags" : ["meibai"]
}

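Part 2: Details

1. query string search

Purpose: simple conditional searches.

Basic syntax: GET /yourindex/yourtype/_search

Example, search all products: GET /ecommerce/product/_search

Conditional query syntax: GET /yourindex/yourtype/_search?q=name:yagao&sort=price:desc

Example, search for products whose name field contains "yagao", with results sorted by the "price" field in descending order: GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

Query result: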

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao meibai",
          "price": 35,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "good gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben meibai",
          "price": 20,
          "producer": "gaolujie producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}

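Explanation of the query result:

took: how long this query took, in milliseconds.
timed_out: whether the query timed out; here it did not.
_shards: the data is split into 5 shards, so a search request is sent to every primary shard (or to one of that shard's replica shards instead).
hits.total: the number of results, here 3 documents.
hits.hits: contains the full data of each document that matched the search.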
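
To make the request format and the response fields above concrete, here is a minimal Python sketch. It does not talk to a real cluster: `build_search_path` and `summarize` are hypothetical helper names, and `sample_response` is a trimmed copy of the result shown above, so treat it as an illustration of the structure rather than a client implementation.

```python
import json
from urllib.parse import urlencode


def build_search_path(index, doc_type, field, term, sort_field, order):
    """Build the path and query string of a query string search request."""
    # safe=":" keeps the colons in "name:yagao" and "price:desc" readable.
    params = urlencode(
        {"q": f"{field}:{term}", "sort": f"{sort_field}:{order}"}, safe=":"
    )
    return f"/{index}/{doc_type}/_search?{params}"


# Trimmed copy of the response shown in the article (assumed shape).
sample_response = json.loads("""
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_id": "2", "_source": {"name": "jiajieshi yagao", "price": 35}},
      {"_id": "1", "_source": {"name": "good gaolujie yagao", "price": 30}},
      {"_id": "3", "_source": {"name": "zhonghua yagao", "price": 20}}
    ]
  }
}
""")


def summarize(response):
    """Pull out the fields explained above from a search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards": response["_shards"]["total"],
        "total": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


path = build_search_path("ecommerce", "product", "name", "yagao", "price", "desc")
summary = summarize(sample_response)
print(path)
print(summary)
```

The ids come back in the order 2, 1, 3 because the sample is sorted by price in descending order (35, 30, 20).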

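2. query DSL (Domain Specific Language)

Purpose: complex conditional searches.

Query all data:

GET /ecommerce/product/_search
{
	"query": {"match_all" : {} }
}

Query the data whose "name" field contains "yagao", with results sorted by the "price" field in ascending order: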

GET /ecommerce/product/_search
{
	"query" : {
		"match" : {
			"name" : "yagao"
		}
	},
	"sort" : [
	  {
		  "price":"asc"
	  }
	]
}

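Pagination query ("from" is the offset of the first result to return, "size" is the number of results per page):

Example: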

GET /ecommerce/product/_search
{
	"query" :  {"match_all" : {} },
	"from" : 1,
	"size" : 2
}
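
The effect of "from" and "size" can be pictured as a plain list slice. The sketch below is a simplification under stated assumptions: `paginate` is a hypothetical helper, and the id order is only illustrative, since the order of a match_all result is not something the article specifies.

```python
def paginate(hits, from_, size):
    """Return one page: skip `from_` results, then take up to `size`."""
    return hits[from_:from_ + size]


# Illustrative result order for the four sample products (assumed).
all_ids = ["1", "2", "3", "4"]

page = paginate(all_ids, from_=1, size=2)
print(page)  # ['2', '3']
```

With from = 1 and size = 2, the first result is skipped and the next two are returned.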

Return only the specified fields:

GET /ecommerce/product/_search
{
	"query": {"match_all" : {} },
	"_source":["name","price"]
}
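
Conceptually, "_source" with a field list trims each returned document down to those fields. A minimal sketch, assuming a hypothetical `filter_source` helper and ignoring features such as wildcard patterns:

```python
def filter_source(source, fields):
    """Keep only the listed top-level fields of a document."""
    return {key: value for key, value in source.items() if key in fields}


product = {
    "name": "jiajieshi yagao",
    "desc": "youxiao meibai",
    "price": 35,
    "producer": "gaolujie producer",
    "tags": ["meibai", "fangzhu"],
}

trimmed = filter_source(product, ["name", "price"])
print(trimmed)  # {'name': 'jiajieshi yagao', 'price': 35}
```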

3. query filter

Purpose: filter the data.

Find the products whose name contains "yagao" and whose price is greater than 30:

GET /ecommerce/product/_search
{
	"query" : {
		"bool" : {
			"must" : {
				"match" : {
					"name" : "yagao"
				}
			},
			"filter": {
				"range":{
					"price" : {"gt":30}
				}
			}
		}
	}
}
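
The bool query above combines two steps: "must" decides which documents match, and "filter" then drops those outside the price range. The following sketch mimics that over the sample data. `search` and `PRODUCTS` are hypothetical names, and whitespace splitting stands in for real text analysis.

```python
# Sample products from the data preparation step (id, name, price only).
PRODUCTS = [
    {"id": "1", "name": "gaolujie yagao", "price": 30},
    {"id": "2", "name": "jiajieshi yagao", "price": 35},
    {"id": "3", "name": "zhonghua yagao", "price": 20},
    {"id": "4", "name": "heiren yagao", "price": 50},
]


def search(products, term, price_gt):
    """Mimic bool: must match `term` in name, then filter price > price_gt."""
    # "must" step: the name has to contain the term as a whole word.
    matched = [p for p in products if term in p["name"].split()]
    # "filter" step: a yes/no condition applied to the matched documents.
    return [p for p in matched if p["price"] > price_gt]


result_ids = [p["id"] for p in search(PRODUCTS, "yagao", 30)]
print(result_ids)  # ['2', '4']
```

Product 1 costs exactly 30, so "gt" (strictly greater than) excludes it.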

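4. full-text search

Purpose: full-text retrieval. Any document whose "producer" field contains either yagao or producer is returned, so documents that match only part of the search string are found as well.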

GET /ecommerce/product/_search
{
	"query":{
		"match" : {
			"producer":"yagao producer"
		}
	}
}

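Explanation of the result:

max_score: the highest relevance score among the results of this query.

hits.hits._score: the relevance score of each individual document.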
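
The "match any term" behaviour can be sketched in a few lines of Python. This is a deliberate simplification: `match` is a hypothetical function, tokens are split on whitespace, and the score is just the number of matched terms, which is not how Elasticsearch actually computes _score.

```python
# "producer" field of the four sample products, keyed by document id.
PRODUCERS = {
    "1": "gaolujie producer",
    "2": "gaolujie producer",
    "3": "gaolujie producer",
    "4": "heiren yagao producer",
}


def match(docs, query):
    """Return (doc_id, matched_term_count) for docs matching ANY query term."""
    terms = query.lower().split()
    results = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        overlap = sum(1 for term in terms if term in tokens)
        if overlap:
            results[doc_id] = overlap
    # Best match first; ties broken by id to keep the order stable.
    return sorted(results.items(), key=lambda item: (-item[1], item[0]))


ranked = match(PRODUCERS, "yagao producer")
print(ranked)  # [('4', 2), ('1', 1), ('2', 1), ('3', 1)]
```

All four documents are returned because each contains "producer"; document 4 ranks first in this toy scoring because it also contains "yagao".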

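5. phrase search

Purpose: phrase matching. A document is returned only when the specified field contains the complete search phrase.

This is the opposite of full-text search. Full-text search splits the input search string into terms and looks each one up in the inverted index; a document is returned as long as it matches any one of those terms.

Phrase search requires the text of the specified field to contain the whole search string, with the same terms in the same order; only then does the document count as a match and get returned.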

GET /ecommerce/product/_search
{
	"query":{
		"match_phrase":{
			"producer":"yagao producer"
		}
	}
}
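
For contrast with the full-text case, here is a minimal sketch of phrase matching: the query terms must appear next to each other and in order. `match_phrase` below is a hypothetical Python function that only shares its name with the query type, and whitespace splitting again stands in for real analysis.

```python
# "producer" field of the four sample products, keyed by document id.
PRODUCERS = {
    "1": "gaolujie producer",
    "2": "gaolujie producer",
    "3": "gaolujie producer",
    "4": "heiren yagao producer",
}


def match_phrase(docs, phrase):
    """Return ids of docs whose tokens contain the phrase terms consecutively."""
    terms = phrase.lower().split()
    width = len(terms)
    hits = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        # Slide a window of `width` tokens across the document.
        windows = (tokens[i:i + width] for i in range(len(tokens) - width + 1))
        if any(window == terms for window in windows):
            hits.append(doc_id)
    return hits


phrase_hits = match_phrase(PRODUCERS, "yagao producer")
print(phrase_hits)  # ['4']
```

Only document 4 contains "yagao" immediately followed by "producer", whereas the full-text query with the same string matches every document.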

6. highlight search

Purpose: highlight the matched keywords in the search results.

GET /ecommerce/product/_search
{
	"query":{
		"match" : {
			"producer":"producer"
		}
	},
	"highlight":{
		"fields" : {
			"producer" : {}
		}
	}
}

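Looking at your output, nothing appears highlighted, right? Did you write the query wrong? Don't worry: if the response contains no "error" and simply shows no visible highlighting, your query is fine. Highlighting here means the keyword will be rendered highlighted in HTML; Elasticsearch only wraps each matched keyword in the output in a tag (<em> by default).

Output: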

{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "name": "good gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "gaolujie <em>producer</em>"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben meibai",
          "price": 20,
          "producer": "gaolujie producer",
          "tags": [
            "qingxin"
          ]
        },
        "highlight": {
          "producer": [
            "gaolujie <em>producer</em>"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.1805489,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao meibai",
          "price": 35,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "gaolujie <em>producer</em>"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.14638957,
        "_source": {
          "name": "heiren yagao",
          "desc": "heiren meibai",
          "price": 50,
          "producer": "heiren yagao producer",
          "tags": [
            "meibai"
          ]
        },
        "highlight": {
          "producer": [
            "heiren yagao <em>producer</em>"
          ]
        }
      }
    ]
  }
}
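
What the highlighter does to each field value can be approximated with a small regular expression. This is only a sketch: `highlight` is a hypothetical helper, the `pre` and `post` parameters are names chosen for illustration, and the <em> tags are the Elasticsearch default.

```python
import re


def highlight(text, term, pre="<em>", post="</em>"):
    """Wrap every whole-word occurrence of `term` in the given tags."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre}{m.group(0)}{post}", text)


fragment = highlight("heiren yagao producer", "producer")
print(fragment)  # heiren yagao <em>producer</em>
```

A web page that inserts this fragment as HTML will show the keyword emphasized, which is where the visible highlighting comes from.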

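7. Inverted index

The point of an inverted index is "finding documents by word"; it is the foundation of full-text search.

What is an inverted index:

To understand what an inverted index is, let's work through it step by step.

First, what is an index?

I can't tell you exactly what an index is, but I can tell you what it resembles. It is like the table of contents of a book: one look at it tells us which page holds the content we want. So an index is a tool that lets us find things faster. There are many classic implementations of this idea, such as B-tree indexes, unique indexes and so on. A table of contents has to be printed on paper too, so a book with one takes up more pages. Likewise, in a database, a table with an index takes up more storage. This is trading space for time: spend more storage to save search time.

Inverted index:

Value of the words field in document A: hello today dog cat

Value of the words field in document B: boby free hello fish

Value of the words field in document C: shell today hello

Value of the words field in document D: hi fish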

===>

Word in the words field	Documents it appears in
hello	A,B,C
today	A,C
dog	A
cat	A
boby	B
free	B
fish	B,D
shell	C
hi	D
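
The table above can be built mechanically. Here is a minimal sketch in Python, assuming whitespace tokenization; `build_inverted_index` and `DOCS` are hypothetical names, and a real search engine stores much more per term (positions, frequencies and so on).

```python
from collections import defaultdict

# The four example documents and their "words" field.
DOCS = {
    "A": "hello today dog cat",
    "B": "boby free hello fish",
    "C": "shell today hello",
    "D": "hi fish",
}


def build_inverted_index(docs):
    """Map each word to the list of document ids that contain it."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for word in text.split():
            if doc_id not in index[word]:
                index[word].append(doc_id)
    return dict(index)


index = build_inverted_index(DOCS)
print(index["hello"])  # ['A', 'B', 'C']
print(index["fish"])   # ['B', 'D']
```

Looking up a word is now a single dictionary access instead of a scan over every document.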

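Does that make sense?

To put it even more plainly, let me tell a story.

It was late one night last winter. Our product had to go live, so naturally we were working overtime. It was very cold that day, and the manager decided to treat everyone to a late-night snack. At that hour only barbecue was still available, so we decided to order some and started picking dishes.

Xiao Zhang said: I want eggplant, potato, oyster mushroom, large squid, lamb, dumplings

Xiao Li said: I want garlic chives, rice cake, lamb, potato, pork kidney, dumplings

Xiao Wang said: I want pork belly, large squid, rice cake, lamb, beef

The manager said: OK, I'll have eggplant, rice cake, lamb, wonton.

The manager then tidied up everyone's choices, worried that the restaurant owner would get them wrong if they were sent over as they were. That produced the following order:

Dish	Ordered by
Eggplant	Xiao Zhang, the manager
Potato	Xiao Zhang, Xiao Li
Oyster mushroom	Xiao Zhang
Large squid	Xiao Zhang, Xiao Wang
Lamb	Xiao Zhang, Xiao Li, Xiao Wang, the manager
Dumplings	Xiao Zhang, Xiao Li
Garlic chives	Xiao Li
Rice cake	Xiao Li, Xiao Wang, the manager
Pork kidney	Xiao Li
Pork belly	Xiao Wang
Beef	Xiao Wang
Wonton	The manager

Yes, just like that. Let's pretend I'm the one who ordered the wonton!

In short:

Forward index: find the words by document

Inverted index: find the documents by word

Summary: there are many search methods; use whichever one suits the scenario.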
