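Elasticsearch concepts (a search engine that combines data storage and data analysis)
1. Cluster: one or more nodes organised together.
2. Node: a single server within the cluster. In older versions its default name was a randomly chosen comic-book (Marvel) character.
3. Shard: an index can be split into several parts. This allows horizontal partitioning and capacity scaling, and because several shards answer requests, it improves performance and throughput.
4. Replica: a backup kept in X copies, which can be allocated across several servers.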
How Elasticsearch maps onto a relational database:
index - database
type - table
documents - rows
fields - columns
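All Elasticsearch operations below are run in Kibana.
Creating an index (the "database") in Kibana:
When creating an index, note that setting names are joined with underscores, not spaces or hyphens, and that built-in keywords carry a leading "_" (for example _settings).
number_of_shards: the shard count cannot be changed after the index is created
number_of_replicas: the replica count can be changed later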
PUT lagou
{
"settings": {
"index": {
"number_of_shards":5,
"number_of_replicas":1
}
}
}
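As a quick check on what these two settings imply, the sketch below counts how many shard copies the cluster has to place. The function name total_shard_copies is made up for illustration; the only thing it relies on is that each primary shard gets number_of_replicas replica copies.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Primaries plus replicas: every primary gets number_of_replicas copies."""
    if number_of_shards < 1 or number_of_replicas < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return number_of_shards * (1 + number_of_replicas)


# Settings used for the lagou index: 5 primaries, 1 replica each.
print(total_shard_copies(5, 1))  # 10
```

Raising number_of_replicas later changes only the second factor, which is why it can be adjusted on a live index while the primary count cannot.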
Retrieving information
# GET retrieves the index settings
GET lagou/_settings
GET _all/_settings
Saving documents (inserting data); each ID is unique
# PUT saves a document under an explicit ID
PUT lagou/job/1
{
"title":"python研发",
"salary_min":1000000,
"city":"深圳",
"company":{
"name":"baidu百度",
"addr":"南山"
},
"publish_date":"2017-4-16",
"comment":15
}
# POST inserts a document; use POST when no ID is specified and Elasticsearch generates one
POST lagou/job/
{
"title":"python研发",
"salary_min":2000000,
"city":"深圳",
"company":{
"name":"baidu百度",
"addr":"南山"
},
"publish_date":"2017-4-16",
"comment":15
}
Modifying data
# Method 1: overwrite the document by resending every field
PUT lagou/job/AV6PBQZEIcVdkiaxlbUt
{
"title":"python研发",
"salary_min":2000000,
"city":"北京",
"company":{
"name":"baidu百度",
"addr":"南山"
},
"publish_date":"2017-4-16",
"comment":15
}
# Method 2: partial update that changes only the listed fields; note the "doc" wrapper
POST lagou/job/1/_update
{
"doc":{
"company":{
"addr":"USA"
}
}
}
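To see why the partial update keeps company.name while changing company.addr, here is a local sketch of a recursive merge. It only illustrates the merge behaviour (nested objects merged key by key, other values replaced); merge_partial is a hypothetical helper, not the server's implementation.

```python
import copy


def merge_partial(existing: dict, partial: dict) -> dict:
    """Recursively merge `partial` into a copy of `existing`.

    Nested objects are merged key by key; any other value (lists included)
    replaces the old one.
    """
    merged = copy.deepcopy(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"title": "python研发", "city": "深圳",
          "company": {"name": "baidu百度", "addr": "南山"}}
updated = merge_partial(stored, {"company": {"addr": "USA"}})
print(updated["company"])  # {'name': 'baidu百度', 'addr': 'USA'}
```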
Deleting data
# Delete a document
DELETE index/type/id
Batch operations: Elasticsearch provides _mget and _bulk
GET _mget
{
"docs":[
{
"_index":"lagou",
"_type":"job",
"_id":1
},
{
"_index":"lagou",
"_type":"job",
"_id":2
},
{
"_index":"lagou",
"_type":"jobs",
"_id":1
}
]
}
# Putting the index in the GET path lets you omit _index from each entry; the same applies to the type
GET lagou/_mget
{
"docs":[
{
"_type":"job",
"_id":1
},
{
"_type":"job",
"_id":2
},
{
"_type":"jobs",
"_id":1
}
]
}
# _bulk runs several operations in one request and supports index, delete, create, and update. Each JSON object must stay on a single line, with no line breaks inside it.
# Format: action_and_meta_data\n
#         optional_source\n
POST _bulk
{"create":{"_index":"test","_type":"type1","_id":1}}
{"field":"value"}
{"delete":{"_index":"test","_type":"type1","_id":1}}
{"index":{"_index":"test","_type":"type1","_id":1}}
{"field":"value"}
{"update":{"_index":"test","_type":"type1","_id":1}}
{"doc":{"field":"value"}}
# Do not break the JSON objects across lines
POST _bulk
{"index":{"_index":"lagou","_type":"job","_id":1}}
{"title":"python研发","salary_min":1000000,"city":"深圳","company":{"name":"baidu百度","company_addr":"南山"},"publish_date":"2017-4-16","comment":15}
{"index":{"_index":"lagou","_type":"job","_id":2}}
{"title":"C++研发","salary_min":2000000,"city":"北京","company":{"name":"阿里巴巴","company_addr":"丰台"},"publish_date":"2017-4-17","comment":16}
{"index":{"_index":"lagou","_type":"job","_id":3}}
{"title":"Java研发","salary_min":3000000,"city":"广州","company":{"name":"腾讯","company_addr":"花都"},"publish_date":"2017-4-18","comment":17}
{"index":{"_index":"lagou","_type":"job","_id":4}}
{"title":"PHP研发","salary_min":4000000,"city":"上海","company":{"name":"小米","company_addr":"陆家嘴"},"publish_date":"2017-4-19","comment":18}
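Writing these one-line pairs by hand is error-prone, so a small script can generate them. The sketch below builds a bulk body of index actions with the standard json module; build_bulk_body is a hypothetical helper name, and sending the body to the cluster is left out.

```python
import json


def build_bulk_body(index: str, doc_type: str, docs: dict) -> str:
    """Return a newline-delimited bulk body of index actions for {doc_id: source}."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        # Without indent, json.dumps emits no raw newlines, so each object stays on one line.
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("lagou", "job", {
    1: {"title": "python研发", "city": "深圳"},
    2: {"title": "C++研发", "city": "北京"},
})
print(body, end="")
```

Each document contributes exactly two lines, the action line followed by its source, which matches the format used in the request above.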
Viewing the analyzer output
# "analyzer" selects the analyzer; "text" holds the content to analyze
GET _analyze
{
"analyzer": "ik_max_word",
"text": "需要分析的内容"
}
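Creating mappings: define the field types and their attributes up front. Once a mapping exists, the type of an existing field cannot be changed, but new fields can still be added.
When range, term, and similar queries misbehave, a poorly defined mapping is very likely the cause.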
Built-in types
String types: text, keyword
Numeric types: long, integer, short, byte, double, float
Date type: date
Boolean type: boolean
Binary type: binary
Complex types: object, nested
Geo types: geo_point, geo_shape
Specialised types: ip, completion
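Built-in attributes (the main ones)
store: yes means the field value is stored
index: yes means the field is indexed and analyzed
null_value: a default value to use when the field is null
analyzer: sets the analyzer; use ik for Chinese tokenization
include_in_all: by default every field is included in the _all field; set this to false to keep the field out of _all searches
format: the pattern string for date formats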
Creating mappings:
# create mappings
PUT lagou
{
"mappings":{
"job":{
"properties":{
"title":{
"type":"text",
"analyzer":"ik_max_word"
},
"salary_min":{
"type":"integer"
},
"city":{
"type":"keyword"
},
"company":{
"properties":{
"name":{
"type":"text",
"analyzer":"ik_max_word"
},
"company_addr":{
"type":"text"
}
}
},
"publish_date":{
"type":"date",
"format":"yyyy-MM-dd"
},
"comment":{
"type":"integer"
}
}
}
}
}
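The sample documents write dates such as 2017-4-16 without zero padding. Whether a given date pattern accepts that depends on the parser, so one cautious option is to normalise dates on the client before indexing. A minimal sketch, assuming year-month-day input; normalize_date is a hypothetical helper:

```python
from datetime import datetime


def normalize_date(raw: str) -> str:
    """Parse a year-month-day string and return it zero-padded as YYYY-MM-DD."""
    return datetime.strptime(raw, "%Y-%m-%d").strftime("%Y-%m-%d")


print(normalize_date("2017-4-16"))  # 2017-04-16
```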
Retrieving mapping information:
GET _all/_mappings
GET index/_mappings
For attribute details and search options, consult the Elasticsearch reference documentation.
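Simple searches in Elasticsearch
1. Basic queries: use Elasticsearch's built-in query types; they take part in scoring
2. Compound queries: combine several queries into one
3. Filters: use filter clauses to narrow the data without affecting scoring
1. match query: the input string is tokenized by the analyzer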
GET index/_search
{
"query": {
"match": {
"FIELD": "TEXT"
}
}
}
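2. term query: the query value is used as given, with no analysis (awkward on analyzed text fields, where the indexed tokens can differ from the raw value)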
GET index/_search
{
"query": {
"term": {
"FIELD": "TEXT"
}
}
}
3. terms query: matches documents that contain any of the keywords listed for FIELD
GET lagou/_search
{
"query": {
"terms": {
"FIELD": []
}
}
}
4. "from" and "size" control pagination: from is the zero-based offset of the first result to show, and size is the number of results to show
GET index/_search
{
"query": {
"match": {
"FIELD": "TEXT"
}
},
"from":1,
"size":10
}
5. match_all: returns every document
GET lagou/_search
{
"query": {
"match_all": {}
}
}
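For the from/size pagination in item 4, user interfaces usually think in page numbers rather than offsets. The sketch below converts a 1-based page number into the two parameters; page_to_from_size and its default page size of 10 are assumptions made for illustration.

```python
def page_to_from_size(page: int, page_size: int = 10) -> dict:
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must both be at least 1")
    # "from" is a zero-based offset, so page 1 starts at 0.
    return {"from": (page - 1) * page_size, "size": page_size}


body = {"query": {"match_all": {}}, **page_to_from_size(3)}
print(body)  # {'query': {'match_all': {}}, 'from': 20, 'size': 10}
```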
Phrase queries in Elasticsearch
1. match_phrase
# slop takes an integer: the maximum distance allowed between the terms after tokenization
GET lagou/_search
{
"query": {
"match_phrase": {
"FIELD": {
"query": "TEXT",
"slop": integer
}
}
}
}
2.multi_match
GET lagou/_search
{
"query": {
"multi_match": {
"query":"TEXT",
"fields":["FIELD1","FIELD2"]
}
}
}
Sorting search results in Elasticsearch
sort orders the results; order sets the direction: asc for ascending, desc for descending
GET index/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"FIELD": {
"order": "desc"
}
}
]
}
Range queries in Elasticsearch
gte: greater than or equal to; gt: greater than; lte: less than or equal to; lt: less than; boost: the weight of the query
GET lagou/_search
{
"query": {
"range": {
"FIELD": {
"gt": integer,
"lt": integer
}
}
},
"sort": [{
"FIELD": {
"order": "desc"
}
}
]
}
Pattern matching in Elasticsearch
wildcard provides pattern matching; a "*" in value acts as the wildcard
GET lagou/_search
{
"query": {
"wildcard": {
"FIELD": {
"value": "TEXT",
"boost": integer
}
}
}
}
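To get a feel for what a wildcard pattern selects, the matching can be approximated locally with Python's fnmatch module, where "*" stands for any run of characters and "?" for exactly one. This is only an approximation: fnmatch also gives "[...]" a special meaning, so keep brackets out of the patterns, and the real query matches against indexed terms rather than raw strings. The term list is made up.

```python
from fnmatch import fnmatchcase


def wildcard_match(term: str, pattern: str) -> bool:
    """Case-sensitive match where * is any run of characters and ? is one character."""
    return fnmatchcase(term, pattern)


terms = ["python", "pytorch", "php", "java"]
print([t for t in terms if wildcard_match(t, "py*")])  # ['python', 'pytorch']
print([t for t in terms if wildcard_match(t, "p?p")])  # ['php']
```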
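Bool queries in Elasticsearch
filter: filters on fields without taking part in scoring; documents that do not match the clauses in the array are dropped
must: every clause in the array has to match ("AND")
should: one or more clauses in the array have to match ("OR")
must_not: none of the clauses in the array may match ("NOT")
"bool":{
"filter":[],
"must":[],
"should":[],
"must_not":[]
}
Fetching all data by default (match_all under must)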
GET test_data/position/_search
{
"query": {
"bool": {
"must": [{
"match_all":{}
}],
"filter": {
"terms": {
"FIELD": ["query"]
}
},
"must_not": [{
"match": {
"FIELD": "query"
}
}],
"should": [{
"match": {
"FIELD": "query"
}
}]
}
}
}
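When the clauses come from user input, it can help to assemble the bool body in code and drop the clause types that are empty. A minimal sketch: bool_query is a hypothetical helper, and the field names title, city, and salary_min are borrowed from the lagou examples earlier in these notes.

```python
import json


def bool_query(must=None, should=None, must_not=None, filters=None) -> dict:
    """Build a bool query body, leaving out clause types that were not given."""
    clauses = {"must": must, "should": should, "must_not": must_not, "filter": filters}
    return {"query": {"bool": {name: list(items)
                               for name, items in clauses.items() if items}}}


body = bool_query(
    must=[{"match": {"title": "python"}}],
    filters=[{"term": {"city": "深圳"}}],
    must_not=[{"range": {"salary_min": {"lt": 1000000}}}],
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted as the body of a GET index/_search request in Kibana.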
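Nested queries
Finding empty values: exists matches documents in which the field has a non-null value, so place it under must_not rather than filter to find documents where the field is empty. "field" takes the name of the field (column).
GET test_data/position/_search
{
"query": {
"bool": {
"filter": {
"exists": {
"field": "FIELD"
}
}
}
}
}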