Elasticsearch Query API

Mr.Lee jack

2019-08-08 17:19

1. Background

Elasticsearch (ES) is a non-relational database.

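2. Advantages of ES

1. Storage optimization

In-memory term index optimized with a finite state transducer (FST)

In essence, an FST combines a prefix tree with shared suffixes. This structure stores terms compactly in memory while keeping them searchable: lookups have the same time complexity as a trie, and merging common suffixes saves additional memory.

Traditional bitmap optimization

A bitmap records document IDs: each bit corresponds to one document and indicates whether that document is present.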
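
A minimal sketch of the bitmap idea, using a Python integer as the bit set. The helper names (to_bitmap, from_bitmap) and the sample document IDs are made up for illustration; the structures inside ES are more elaborate than this.

```python
def to_bitmap(doc_ids):
    """Set bit i for every document ID i in doc_ids."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def from_bitmap(bits):
    """Return the sorted document IDs whose bit is set."""
    doc_ids = []
    doc_id = 0
    while bits:
        if bits & 1:
            doc_ids.append(doc_id)
        bits >>= 1
        doc_id += 1
    return doc_ids


# Hypothetical sets of matching documents for two terms.
term_a = to_bitmap([1, 3, 5, 8])
term_b = to_bitmap([3, 4, 8, 9])

# Documents matching both terms: a single bitwise AND on the bitmaps.
both = from_bitmap(term_a & term_b)
print(both)
```

With bitmaps, combining two document sets costs one bitwise operation per machine word, which is why this representation is attractive when many documents match.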

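2. Combined-query optimization

To run a combined query over several terms, for example an AND query, ES effectively combines the skip lists built from each term's posting list.

An AND inside a term query is a combined query over skip lists. For example, to find articles whose Address field contains both Road and District, look up the posting list (inverted list) of each keyword, iterate over the shorter list, and build a skip list over the longer one. Each article in the shorter list then only has to be looked up in the skip list.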
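
The intersection step described above can be sketched as follows. Note that this sketch swaps the skip list for binary search over a sorted Python list (bisect), which plays the same role of jumping ahead in the longer list. The function name and the sample posting lists are assumptions for illustration.

```python
from bisect import bisect_left


def intersect(postings_a, postings_b):
    """Intersect two sorted lists of document IDs."""
    # Iterate over the shorter list, search in the longer one.
    short, long_ = sorted((postings_a, postings_b), key=len)
    result = []
    start = 0  # both lists are sorted, so never search behind this position
    for doc_id in short:
        pos = bisect_left(long_, doc_id, start)
        if pos == len(long_):
            break  # all remaining IDs are larger than everything in long_
        if long_[pos] == doc_id:
            result.append(doc_id)
        start = pos
    return result


# Hypothetical posting lists for the two keywords.
road = [2, 7, 15, 40]
district = [1, 2, 3, 7, 9, 11, 15, 20, 33, 41]
print(intersect(road, district))
```

The cost is driven by the length of the shorter list, which is the reason for picking it as the one to iterate over.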

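3. Algorithms

ES has a built-in text analyzer (Analyzer) that runs as a pipeline, for example: Character Filter (replace special characters) -> Tokenizer (split into tokens) -> Token Filter (process each token)

This kind of pipeline is very common in data analysis and natural language processing.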
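
To make the three stages concrete, here is a toy pipeline in plain Python. It only mimics the shape of an analyzer; the replacement rule, the regular expression, and the stop-word list are assumptions chosen for the example and are not ES defaults.

```python
import re


def char_filter(text):
    """Character filter: replace special characters before tokenizing."""
    return text.replace("&", " and ")


def tokenizer(text):
    """Tokenizer: split the text into word tokens."""
    return re.findall(r"\w+", text)


def token_filter(tokens):
    """Token filter: lowercase each token and drop a few stop words."""
    stop_words = {"the", "a", "an"}
    lowered = (token.lower() for token in tokens)
    return [token for token in lowered if token not in stop_words]


def analyze(text):
    """Run the three stages in order."""
    return token_filter(tokenizer(char_filter(text)))


print(analyze("The Road & the District"))
```

The tokens that come out of the last stage are the terms that end up in the inverted index.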

4. Clustering, sharding, primary/replica backup, and disaster tolerance

3. Disadvantages of ES

1. Wasted space

Because ES lays out these structures by document ID, a lot of space is wasted when the documents are sparse.
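
A rough back-of-the-envelope sketch of that waste, comparing an uncompressed bitmap with a plain list of IDs. The 4-byte ID size and the sample IDs are assumptions, and real engines typically use compressed encodings that reduce the gap.

```python
def bitmap_bytes(max_doc_id):
    """An uncompressed bitmap needs one bit per possible ID, 0..max_doc_id."""
    return max_doc_id // 8 + 1


def id_list_bytes(doc_ids, bytes_per_id=4):
    """A plain list stores only the matching IDs (assumed 4 bytes each)."""
    return len(doc_ids) * bytes_per_id


# Three matching documents spread over a range of one million IDs.
sparse = [10, 500_000, 999_999]
print(bitmap_bytes(max(sparse)))  # bytes for the bitmap
print(id_list_bytes(sparse))      # bytes for the ID list
```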

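4. RESTful operations in ES

Host: 127.0.0.1:9200

Index used in the examples: twitter

Note: if the index has no mapping defined, ES defines one automatically based on the structure of the data you write.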

1. Write a document by _id; the write fails if the _id already exists
	PUT twitter/_create/1
	{
	    "user" : "kimchy",
	    "post_date" : "2009-11-15T14:12:12",
	    "message" : "trying out Elasticsearch"
	}
2. Update or write a document by _id (no error if it already exists; the whole document is replaced)
	PUT twitter/_doc/1?timeout=5m
	{
	    "user" : "kimchy",
	    "post_date" : "2009-11-15T14:12:12",
	    "message" : "trying out Elasticsearch"
	}

	timeout=5m sets how long the request may wait before it times out
3. Append a new document (ES generates a random _id automatically)
	POST twitter/_doc/
	{
	    "user" : "kimchy",
	    "post_date" : "2009-11-15T14:12:12",
	    "message" : "trying out Elasticsearch"
	}
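
The three write requests above could be issued from Python roughly like this. It is a sketch that only uses the standard library; build_request is a helper name invented here, and the requests are built but not sent so that the snippet runs without a cluster.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9200"  # host used in this post


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request for the ES REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        f"{BASE_URL}/{path}",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


doc = {
    "user": "kimchy",
    "post_date": "2009-11-15T14:12:12",
    "message": "trying out Elasticsearch",
}

# 1. Fails if _id 1 already exists.
create_req = build_request("PUT", "twitter/_create/1", doc)
# 2. Creates the document or replaces it as a whole.
upsert_req = build_request("PUT", "twitter/_doc/1?timeout=5m", doc)
# 3. ES generates the _id.
auto_id_req = build_request("POST", "twitter/_doc/", doc)

# To actually send one of them (requires a running cluster):
# with urllib.request.urlopen(create_req) as resp:
#     print(json.load(resp))
```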
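4. Get a document by _id
	GET twitter/_doc/0?_source=false
		_source=false omits the document source from the response
		_source_includes=message,post_date returns only the listed source fields

	GET twitter/_source/1?_source_includes=message,post_date
		returns only the document source (no metadata), limited to the fields listed in _source_includes
5. Delete a document by _id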
	DELETE /twitter/_doc/1?timeout=5m

6. Update specific fields of a document
	POST /twitter/_update/1
	{
	    "doc" : {
	        "user" : "lijiacai",
	        "age": 12
	    },
	    "detect_noop": false,
	    "doc_as_upsert" : true
	}
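	"detect_noop": false disables no-op detection, so the update is always applied. With the default (true), if the document already holds the values given in "doc" (for example, name is already new_name before the request is sent), the whole update request is ignored and the result element in the response is noop
	doc_as_upsert: true means: if the document exists, update these fields; if it does not exist, insert "doc" as a new document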
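
The interplay of the two flags can be illustrated with a small local simulation. This is not how ES is implemented: apply_partial_update is a made-up function, it merges only top-level fields (ES merges nested objects recursively), and it raises KeyError where ES would answer with an error response.

```python
def apply_partial_update(existing, doc, detect_noop=True, doc_as_upsert=False):
    """Simulate a partial update; returns (stored_document, result)."""
    if existing is None:
        if doc_as_upsert:
            return dict(doc), "created"  # "doc" becomes the new document
        raise KeyError("document missing")
    merged = {**existing, **doc}  # shallow merge, a simplification
    if detect_noop and merged == existing:
        return existing, "noop"  # nothing would change, so skip the write
    return merged, "updated"


stored = {"user": "kimchy", "message": "trying out Elasticsearch"}
partial = {"user": "lijiacai", "age": 12}

stored, result = apply_partial_update(stored, partial)
print(result)  # first call changes the document

_, result_again = apply_partial_update(stored, partial)
print(result_again)  # same values again, with detect_noop left at True

_, result_forced = apply_partial_update(stored, partial, detect_noop=False)
print(result_forced)  # same values again, with detect_noop=False
```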

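7. Get multiple documents by _id
	GET /twitter/_mget  or  /twitter/_doc/_mget

	_index in the body corresponds to twitter in the URL: if the URL does not name the index, give it in the body, and vice versa. The same applies to the endpoints below. (_type is deprecated as of ES 7.0 and can be left out)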
	{
	    "docs" : [
	        {
	            "_index" : "twitter",
	            "_type" : "_doc",
	            "_id" : "1"
	        },
	        {
	            "_index" : "twitter",
	            "_type" : "_doc",
	            "_id" : "2"
	        }
	    ]
	}

8. Filter the returned fields per document in a multi-get
	GET /test/_mget
	{
	    "docs" : [
	        {
	            "_id" : "1"
	        },
	        {
	            "_id" : "2",
	            "_source" : {
	                "include": ["others"],
	                "exclude": ["others.name"]
	            }
	        }
	    ]
	}
	include  fields to return
	exclude  fields to leave out


9. Bulk write
	POST _bulk
		{ "index" : { "_index" : "test", "_id" : "1" } }
		{ "field1" : "value1" }
		{ "delete" : { "_index" : "test", "_id" : "2" } }
		{ "create" : { "_index" : "test", "_id" : "3" } }
		{ "field1" : "value3" }
		{ "update" : {"_id" : "1", "_index" : "test"} }
		{ "doc" : {"field2" : "value2"} }
		
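	Note the format: the body is not one nested JSON object but newline-delimited JSON. Each action line and each document line is its own JSON object, lines are separated by newline characters, and the body must end with a newline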
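
Because the newline-delimited format is easy to get wrong by hand, here is a small sketch that serializes the same four actions. The function name to_bulk_body and the (action, source) pair layout are choices made for this example.

```python
import json


def to_bulk_body(actions):
    """Serialize (action, source) pairs into a newline-delimited bulk body."""
    lines = []
    for action, source in actions:
        lines.append(json.dumps(action))
        if source is not None:  # delete actions have no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline


actions = [
    ({"index": {"_index": "test", "_id": "1"}}, {"field1": "value1"}),
    ({"delete": {"_index": "test", "_id": "2"}}, None),
    ({"create": {"_index": "test", "_id": "3"}}, {"field1": "value3"}),
    ({"update": {"_index": "test", "_id": "1"}}, {"doc": {"field2": "value2"}}),
]

body = to_bulk_body(actions)
print(body, end="")
```

When sending this body, use a Content-Type of application/x-ndjson (or application/json) and pass the string as-is; pretty-printing it would break the one-object-per-line rule.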


10. Delete documents matching a query
	
	POST twitter,other_index/_delete_by_query
	{
	  "query": { 
	    "match": {
	      "message": "some message"
	    }
	  }
	}

	Several indices can be given, separated by commas

11. Search documents by query conditions
	POST /twitter/_search
	{
	    "query": {
	        "bool" : {
	            "must" : {
	                "query_string" : {
	                    "query" : "some query string here"
	                }
	            },
	            "filter" : {
	                "term" : { "user" : "kimchy" }
	            }
	        }
	    }
	}
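
The same request body can be assembled in Python before sending it. In a bool query, the must clause contributes to the relevance score while the filter clause only includes or excludes documents. The function name and its parameters below are illustrative, not part of any client library.

```python
import json


def bool_query(query_text, user):
    """Build a bool query: scored query_string match plus an exact-term filter."""
    return {
        "query": {
            "bool": {
                "must": {"query_string": {"query": query_text}},
                "filter": {"term": {"user": user}},
            }
        }
    }


body = bool_query("some query string here", "kimchy")
print(json.dumps(body, indent=4))
```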

12. Request-body search
	For other parameters, see: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html
	GET /twitter/_search
	{
	    "query" : {
	        "term" : { "user" : "kimchy" }
	    }
	}

13. Get the mapping (the "table schema") of an index
	GET /twitter/_mapping

14. Check cluster health
	GET /_cluster/health

15. List index aliases
	GET /_cat/aliases?v

16. Count the documents in the cluster
	GET /_cat/count?v

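17. URI search with the q parameter
	GET /twitter/_search?q=field:value
	Combine conditions on several fields with AND or OR
	Parameters:
		sort, from, size, q, _source, and others
	For details, see the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html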
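
When the q value contains spaces, colons, or AND/OR, it has to be URL-encoded. A sketch using the standard library follows; uri_search_url is a hypothetical helper, and the query text and parameter values are examples only.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:9200"  # host used in this post


def uri_search_url(index, q, **params):
    """Build a URI search URL; urlencode escapes spaces and colons in the values."""
    query = urlencode({"q": q, **params})
    return f"{BASE_URL}/{index}/_search?{query}"


url = uri_search_url(
    "twitter",
    "user:kimchy AND message:elasticsearch",
    size=5,
    sort="post_date:desc",
)
print(url)
```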
18. Count the documents in an index
	GET /twitter/_count?q=user:kimchy
