Spring Boot in 20 Days (Day 10)
Spring Boot and Search
ElasticSearch
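Basic concepts:
- Index: a collection of documents, similar to a database in MySQL.
- Type: different types can be defined inside an index; a type is similar to a table in MySQL. (Mapping types were deprecated in Elasticsearch 7.x and removed in 8.x.)
- Document: similar to a single stored row in MySQL, in JSON format. Each type in an index can hold many documents.
- Shards: when the data volume is large, an index is split horizontally into shards so that it can scale out and search faster.
- Replicas: copies of shards that protect against losing a shard's data; searches can also run in parallel on the replicas, which improves performance.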
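To make the shard idea concrete, here is a minimal Java sketch of how a document id could be mapped to one primary shard. It assumes the simplified rule shard = hash(routing) mod number_of_primary_shards. Elasticsearch itself hashes the routing value with Murmur3; `String.hashCode()` is only a stand-in here, and the class and method names are hypothetical.

```java
final class ShardRoutingSketch {
    // Simplified routing rule: hash the routing value (by default the document id)
    // and take it modulo the number of primary shards.
    // Assumption: String.hashCode() stands in for the Murmur3 hash Elasticsearch uses.
    static int shardFor(String routing, int primaryShards) {
        if (primaryShards <= 0) {
            throw new IllegalArgumentException("primaryShards must be > 0");
        }
        // floorMod keeps the result in [0, primaryShards) even for negative hash codes.
        return Math.floorMod(routing.hashCode(), primaryShards);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"1", "2", "3", "42"}) {
            System.out.println("doc " + id + " -> shard " + shardFor(id, 5));
        }
    }
}
```

Because the result depends on the number of primary shards, changing that number would send existing ids to different shards, which is one reason the primary shard count is chosen when the index is created.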
ElasticSearch query syntax
_cat API
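- cat: lists all the commands supported by the _cat API
- cat health: checks the health of the ES cluster
- cat count: quickly returns the number of documents in the cluster or in an index
- cat indices: shows information about every index in the cluster, including the shard count, the document count, the storage size used…
- For the other cat APIs, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/cat.html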
Search API
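Query methods:
REST request URI: a lightweight, quick URI query syntax
REST request body: a JSON query format that can carry many constraints
“query”: the query element of the request body lets us search with the Query DSL.
“term”: checks whether a document contains an exact value; the search value is not analyzed (tokenized).
“match”: analyzes (tokenizes) the search value, then scores each match for relevance (TF/IDF in older versions; BM25 is the default since Elasticsearch 5.0).
“match_phrase”: matches a phrase, meaning all the terms must appear in the same order as in the search value.
“bool”: combines other queries with Boolean logic, usually through must, should, and must_not (AND, OR, NOT), to build complex queries.
“range”: restricts a field to a given range of values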
"range": {
"FIELD": { # the field to filter on
"gte": 1, # gte: >=, gt: >
"lte": 10 # lte: <=, lt: <
}
}
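“from”: the offset into the search results at which to start returning hits; it defaults to 0 (the first hit).
“size”: the maximum number of hits to return; it defaults to 10. (Ordering the results is done with “sort”, not “size”.)
“_source”: selects which fields of each hit are included in the output
“script_fields”: lets us use a script to compute a value that is not stored in the document, for example computing cti from install/click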
"script_fields": {
"FIELD": { # name of the computed value
"script": { # the script that performs the computation
}
}
}
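“aggs”: aggregations computed over the documents matched by the search query; they can be nested to express complex requirements
"aggs": {
"NAME": { # name of the result
"AGG_TYPE": { # the aggregation type to apply
TODO: # the aggregation body, which names the field to aggregate on
},
TODO: # a nested "aggs" block can go here
}
}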
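The way “from” and “size” work together for paging can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, the request body is assembled by plain string concatenation without JSON escaping, and the field names are taken from the examples later in this post.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class PaginationSketch {
    // "from" is the zero-based offset of the first hit on the requested page.
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    // Renders a request body with match_all, from, size and a _source filter.
    // Assumption: field names contain no characters that need JSON escaping.
    static String searchBody(int page, int size, String... sourceFields) {
        String fields = Arrays.stream(sourceFields)
                .map(f -> "\"" + f + "\"")
                .collect(Collectors.joining(", "));
        return "{ \"query\": { \"match_all\": {} }, "
                + "\"from\": " + fromOffset(page, size) + ", "
                + "\"size\": " + size + ", "
                + "\"_source\": [" + fields + "] }";
    }

    public static void main(String[] args) {
        // Third page, ten hits per page, only two fields returned.
        System.out.println(searchBody(3, 10, "username", "yyyymmdd"));
    }
}
```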
Query and filter context
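How a query clause behaves depends on whether it is used in a query context or in a filter context.
- Query context: the clause answers "how well does this document match the query?". Every hit carries a _score value that expresses how well it matches.
- Filter context: the clause only answers whether the document matches (yes or no). ES caches filter context results internally to improve performance, so a filter context is usually faster than a query context.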
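As a rough illustration of the two contexts, the Java sketch below assembles a bool query in which the match clause sits under must (query context, contributes to _score) and the range clause sits under filter (filter context, yes or no only). The class and method names are hypothetical and the JSON is built by plain concatenation without escaping, so treat it as a sketch rather than a client implementation.

```java
final class BoolContextSketch {
    // Builds a bool query: "must" clauses are scored, "filter" clauses are not.
    // Assumption: arguments contain no characters that need JSON escaping.
    static String boolQuery(String field, String text, String rangeField, int gte, int lte) {
        return "{ \"query\": { \"bool\": { "
                + "\"must\": [ { \"match\": { \"" + field + "\": \"" + text + "\" } } ], "
                + "\"filter\": [ { \"range\": { \"" + rangeField + "\": "
                + "{ \"gte\": " + gte + ", \"lte\": " + lte + " } } } ] "
                + "} } }";
    }

    public static void main(String[] args) {
        // Score by username relevance, but only among documents with 1 <= age <= 10.
        System.out.println(boolQuery("username", "zhangsan", "age", 1, 10));
    }
}
```

Moving a yes/no condition from must to filter does not change which documents match; it only stops that clause from influencing the score.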
ElasticSearch query examples
_cat API query examples
Checking cluster health with the _cat API
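A minimal sketch of this health check from Java, using only the JDK's `java.net.http` client (Java 11+). It assumes ES is reachable at http://localhost:9200, as in the other examples here, and requests `/_cat/health?v`, where `v` asks for a header row. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class CatHealthSketch {
    // Builds (but does not send) the GET request for the _cat health endpoint.
    static HttpRequest catHealthRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cat/health?v"))
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Sending requires a running cluster; the base URL is an assumption.
        HttpRequest request = catHealthRequest("http://localhost:9200");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```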
_search API query examples
Creating an index
URI
PUT localhost:9200/test
Output
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "test"
}
Inserting data
URI
PUT localhost:9200/test/user/1
Body
{
"username" : "張三",
"password" : "123546",
"age" : "18",
"yyyymmdd" : "2017-08-07T16:00:00"
}
Output
{
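"_index": "test", # index
"_type": "user", # type
"_id": "1", # unique id; if omitted (with POST), ES generates a unique id automatically
"_version": 1, # version, incremented by 1 each time the document with this id is written again
"result": "created", # result of the operation
"_shards": { # how many shard copies the write was applied to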
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
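The same insert can be sketched from Java with the JDK's `java.net.http` client. The snippet assumes the index, type and id from the example above and a cluster at http://localhost:9200; the class and method names are hypothetical. Note the Content-Type header, which Elasticsearch 6.0 and later require for requests with a body.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class IndexDocumentSketch {
    // Builds (but does not send) a PUT request that stores a JSON document under a given id.
    static HttpRequest indexRequest(String baseUrl, String index, String type, String id, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/" + type + "/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String json = "{ \"username\": \"張三\", \"password\": \"123546\", "
                + "\"age\": \"18\", \"yyyymmdd\": \"2017-08-07T16:00:00\" }";
        HttpRequest request = indexRequest("http://localhost:9200", "test", "user", "1", json);
        // Sending requires a running cluster.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```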
Querying data
Querying all documents
URI
GET localhost:9200/test/user/_search?q=*
Request Body
GET localhost:9200/test/user/_search
{
"query":{
"match_all":{}
}
}
Output
{
"took": 1121,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "test",
"_type": "user",
"_id": "1",
"_score": 1.0,
"_source": {
"username": "張三",
"password": "123546",
"age": "18",
"yyyymmdd": "2017-08-07T16:00:00"
}
}
]
}
}
Querying a specific field and sorting by a field
URI
GET localhost:9200/test/user/_search?q=username:張三&sort=yyyymmdd:asc
Request Body
GET localhost:9200/test/user/_search
{
"query": {
"match": {
"username": "張三"
}
},
"sort": [
{
"yyyymmdd": {
"order": "desc"
}
}
]
}
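When the URI form is sent from code, the non-ASCII search value has to be percent-encoded. The Java sketch below builds such a URI with `URLEncoder`; the class and method names are hypothetical, and only the search value is encoded because the field and sort names here are plain ASCII.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class UriSearchSketch {
    // Builds a URI search of the form <base>/_search?q=field:value&sort=field:order.
    // Parameters are separated by a single '&'.
    static String searchUri(String base, String field, String value, String sortField, String order) {
        String q = field + ":" + URLEncoder.encode(value, StandardCharsets.UTF_8);
        return base + "/_search?q=" + q + "&sort=" + sortField + ":" + order;
    }

    public static void main(String[] args) {
        System.out.println(
                searchUri("http://localhost:9200/test/user", "username", "張三", "yyyymmdd", "asc"));
    }
}
```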
Querying a specific field and selecting the output fields
Request Body
GET localhost:9200/test/user/_search
{
"query": {
"match": {
"username": "張三"
}
},
"sort": [
{
"yyyymmdd": {
"order": "desc"
}
}
],
"_source": [
"yyyymmdd",
"username"
]
}
Output
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "test",
"_type": "user",
"_id": "1",
"_score": null,
"_source": {
"yyyymmdd": "2017-08-07T16:00:00",
"username": "張三"
},
"sort": [
1502121600000
]
}
]
}
}
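The number in the "sort" array is the date field's value in milliseconds since the epoch; a date string without a time zone is interpreted as UTC. A small Java sketch that reproduces the conversion (the class and method names are hypothetical):

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

final class SortValueSketch {
    // Parses an ISO local date-time and converts it to epoch milliseconds, treating it as UTC.
    static long toEpochMillisUtc(String isoDateTime) {
        return LocalDateTime.parse(isoDateTime).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        // The yyyymmdd value of the sample document.
        System.out.println(toEpochMillisUtc("2017-08-07T16:00:00")); // prints 1502121600000
    }
}
```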
Combining queries with bool
Request Body
GET localhost:9200/test/user/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"username": "張三"
}
}
],
"must_not": [
{
"range": {
"age": {
"gt" : 18
}
}
}
],
"should": [
{
"match": {
"yyyymmdd": "2017-08-07T16:00:00"
}
}
]
}
},
"sort": [
{
"yyyymmdd": {
"order": "desc"
}
}
],
"_source": [
"yyyymmdd",
"username"
],
"highlight": {
"fields": {
"username": {}
}
}
}
Output
{
"took": 328,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "test",
"_type": "user",
"_id": "1",
"_score": null,
"_source": {
"yyyymmdd": "2017-08-07T16:00:00",
"username": "張三"
},
"highlight": {
"username": [ # the highlighted field
"<em>張</em><em>三</em>"
]
},
"sort": [
1502121600000
]
}
]
}
}
Aggregation queries
The following example resembles an aggregate query in SQL: for each day, it sums the installs of each type.
Request Body
PUT /rta_daily_report/campaign/164983850_rba_20170808?pretty
{
"cid": 164983850,
"advertiser_id": 799,
"trace_app_id": "com.zeptolab.cats.google",
"network_cid": "6656665",
"platform": 1,
"direct": 2,
"last_second_domain": "",
"jump_type": 2,
"direct_trace_app_id": "",
"mode": 0,
"third": "kuaptrk.com",
"hops": 9,
"yyyymmdd": "2017-08-07T16:00:00",
"type": "rba",
"click": 2
}
GET /rta_daily_report/campaign/_search
{
"size": 0,
"aggs": {
"sum_install": {
"date_histogram": {
"field": "yyyymmdd",
"interval": "day"
},
"aggs": {
"types": {
"terms": {
"field": "type.keyword",
"size": 10
},
"aggs": {
"install": {
"sum": {
"field": "install"
}
}
}
}
}
}
}
}
Output
"aggregations": {
"sum_install": {
"buckets": [
{
"key_as_string": "2017-07-31T00:00:00.000Z",
"key": 1501459200000,
"doc_count": 659553,
"types": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "rba",
"doc_count": 321811,
"install": {
"value": 73835
}
},
{
"key": "m_normal",
"doc_count": 321711,
"install": {
"value": 18964
}
}
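What this nested aggregation computes can be mirrored in memory: bucket the documents by UTC day (date_histogram), then by type (terms), and sum install inside each bucket (sum). The Java sketch below does exactly that on a list of objects; the Report class, the method name and the sample values in main are hypothetical and are not taken from the output above.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class AggregationSketch {
    // Minimal stand-in for a report document: timestamp, type and install count.
    static final class Report {
        final long timestampMillis;
        final String type;
        final long install;

        Report(long timestampMillis, String type, long install) {
            this.timestampMillis = timestampMillis;
            this.type = type;
            this.install = install;
        }
    }

    // Outer buckets: one per UTC day. Inner buckets: one per type. Metric: sum of install.
    static Map<LocalDate, Map<String, Long>> sumInstallByDayAndType(List<Report> reports) {
        Map<LocalDate, Map<String, Long>> buckets = new TreeMap<>();
        for (Report r : reports) {
            LocalDate day = Instant.ofEpochMilli(r.timestampMillis)
                    .atZone(ZoneOffset.UTC)
                    .toLocalDate();
            buckets.computeIfAbsent(day, d -> new TreeMap<>())
                    .merge(r.type, r.install, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long day = 1501459200000L; // 2017-07-31T00:00:00Z
        List<Report> reports = List.of(
                new Report(day, "rba", 3),
                new Report(day + 3_600_000L, "rba", 4),
                new Report(day, "m_normal", 5));
        System.out.println(sumInstallByDayAndType(reports));
    }
}
```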
Script queries
The following example uses the click and install fields of each document to compute a value that is not stored in the document.
GET /rta_daily_report/campaign/_search?pretty
{
"query" : {
"bool": {
"must": [
{
"range": {
"click": {
"gt": 0
}
}
},
{
"range": {
"install": {
"gt": 0
}
}
}
]
}},
"size": 100,
"script_fields": {
"cti": {
"script": {
"lang": "painless",
"inline": "1.0 * doc['install'].value / doc['click'].value"
}
}
}
}
Output
"hits": {
"total": 23036,
"max_score": 2,
"hits": [
{
"_index": "rta_daily_report",
"_type": "campaign",
"_id": "160647918_rta_20170801",
"_score": 2,
"fields": {
"cti": [
0.0005970149253731343
]
}
},
{
"_index": "rta_daily_report",
"_type": "campaign",
"_id": "162293741_rta_20170801",
"_score": 2,
"fields": {
"cti": [
0.00007796055196070789
]
}
},
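The `1.0 *` in the script matters: for integer fields the doc values are whole numbers, so dividing them directly would be integer division. A small Java sketch of the same arithmetic (the class and method names are hypothetical; the guard mirrors the `click > 0` range clause in the query):

```java
final class CtiSketch {
    // Click-to-install ratio; multiplying by 1.0 first forces floating-point division.
    static double cti(long install, long click) {
        if (click <= 0) {
            throw new IllegalArgumentException("click must be > 0");
        }
        return 1.0 * install / click;
    }

    public static void main(String[] args) {
        System.out.println(cti(1, 2)); // prints 0.5
        System.out.println(1L / 2L);   // prints 0, which is what integer division would give
    }
}
```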
Integrating Elasticsearch with Spring Boot 2.x
Adding the dependencies
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.12</version>
</dependency>
Entity class
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import java.io.Serializable;
/**
* @Description : TODO
* @Author : Weleness
* @Date : 2020/05/30
*/
// index name and type
@Document(indexName = "student",type = "weleness",shards = 1,replicas = 0)
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Student implements Serializable {
private static final long serialVersionUID = 4315447603254880943L;
// used as the primary key (the document id)
@Id
private String id;
@Field(type = FieldType.Keyword)
private String name;
@Field(type = FieldType.Integer)
private Integer age;
@Field(type = FieldType.Double)
private Double score;
@Field(type = FieldType.Text)
private String info;
}
repository
import com.github.springbootelasticsearch.bean.Student;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import java.util.List;
/**
* @author Weleness
* @date 2020/05/30
* @description TODO
*/
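// This interface provides the basic CRUD operations; we can also add our own custom methods
public interface EsRepository extends ElasticsearchRepository<Student,String> {
/**
* Finds students whose age falls within a range
*
* @param min minimum age
* @param max maximum age
* @return the students that match
*/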
List<Student> findByAgeBetween(Integer min, Integer max);
}
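Spring Data derives the query for findByAgeBetween from the method name. As a rough in-memory picture of the intended semantics, the Java sketch below keeps the ages that fall inside the range. It assumes both bounds are inclusive, which is how the Spring Data Elasticsearch reference describes the Between keyword; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

final class AgeBetweenSketch {
    // Keeps the ages within [min, max]; both bounds are assumed to be inclusive.
    static List<Integer> agesBetween(List<Integer> ages, int min, int max) {
        return ages.stream()
                .filter(age -> age >= min && age <= max)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(agesBetween(List.of(10, 11, 15, 19, 20), 11, 19)); // prints [11, 15, 19]
    }
}
```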
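Tests
import com.github.springbootelasticsearch.bean.Student;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.junit.jupiter.api.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.Page;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
// EsRepository also needs an import if it lives in a different package from this test
@SpringBootTest
class SpringbootElasticsearchApplicationTests {
Logger logger = LoggerFactory.getLogger(SpringbootElasticsearchApplicationTests.class);
@Autowired
private EsRepository esRepository;
@Autowired
private ElasticsearchRestTemplate elasticsearchRestTemplate;
// Create the index and its mapping
@Test
void contextLoads() throws IOException {
elasticsearchRestTemplate.createIndex(Student.class);
elasticsearchRestTemplate.putMapping(Student.class);
}
// Insert one document
@Test
void save(){
Student student = new Student("1","張三",18,75.5,"撒vu");
Student save = esRepository.save(student);
logger.info("【save】= {}", save);
}
// Insert documents in bulk
@Test
void saveList(){
List<Student> list = new ArrayList<>();
for (int i = 0; i < 10; i++) {
Student student = new Student(i+"","小明"+i,10+i,75.6+i,"shabi"+i);
list.add(student);
}
esRepository.saveAll(list);
}
// Query all documents
@Test
void search(){
Iterable<Student> all = esRepository.findAll();
for (Student student : all) {
logger.info("[student] = {}",student);
}
}
// Delete by id
@Test
void delete(){
esRepository.deleteById("1");
}
// Advanced query with a query builder
@Test
void advanceSearch(){
MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("id", "0");
esRepository.search(matchQueryBuilder).forEach(student -> {
logger.info("[student]={}",student);
});
}
// Custom advanced query
@Test
void customAdvanceSelect(){
// Build the query: match everything, sorted by age in descending order
NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder();
queryBuilder.withQuery(QueryBuilders.matchAllQuery());
queryBuilder.withSort(SortBuilders.fieldSort("age").order(SortOrder.DESC));
Page<Student> search = esRepository.search(queryBuilder.build());
search.forEach(student -> {
logger.info("[student]={}",student);
});
}
}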
That's all…