Elasticsearch 8/7/6 Features by Version

Elasticsearch 8/7/6 Features by Version - MyOldTime's personal space - OSCHINA - Chinese open-source technology community

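Version	New feature	Description
8.1	Doc-values-only search on numeric, date, keyword, ip, and boolean fields	Numeric, date, keyword, ip, and boolean fields can serve term (equality) and range queries even when they are kept as doc values only, that is, when the field is not indexed (index set to false in the mappings). Queries on such fields are slower, but indexing speed and disk usage improve, while aggregations and sorting continue to use doc values.
8	7.x REST API compatibility	Elasticsearch 8 changes the REST API but remains compatible with 7.x REST API requests.
	Security features are enabled and configured by default	Security settings are now on by default: user authentication, user authorization, TLS encryption between nodes, and TLS encryption for communication with Kibana. Earlier versions had them off by default.
	Better protection for system indices	Protection of system indices is enabled by default; system indices cannot be accessed through the API.
	New kNN search API	Adds support for k-nearest neighbor (kNN) search.
	Storage savings for keyword, match_only_text, and text fields	Optimizes the storage structure to reduce space usage. match_only_text fields shrink by 14.4%, and overall disk usage drops by 3.5%.
	Faster indexing of geo_point, geo_shape, and range fields	geo_point fields, geo_shape fields, and range field types (such as integer_range and date_range) index 10%-15% faster.
	PyTorch model support for natural language processing (NLP)	Adds support for PyTorch models for natural language processing (NLP).
	SQL: Support for cross-cluster search	SQL queries can run across clusters.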
7.17	https://www.elastic.co/guide/en/elasticsearch/reference/7.17/release-highlights.html
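7.16	Search: Improved can-match phase for scalability	If a search hits a large number of shards, it includes a pre-filter stage called the can-match phase. In this phase, Elasticsearch checks whether each affected shard contains data that could match the search query; if not, it does not run the query on that shard. Previously, the coordinating node sent a separate request for every shard checked in the can-match phase, so a search that had to check thousands of shards forced the coordinating node to handle thousands of requests, causing high overhead. In 7.16, the coordinating node sends one request per data node in the can-match phase. That request covers the can-match check for all affected shards on the node, which significantly reduces the number of requests and the associated overhead.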
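7.15	Index disk usage API	A new API that reports the disk usage of each field in an index and of the index itself.
	Search vector tile API	Generates vector tiles for map (geospatial) data.
	Composite runtime fields	A single runtime field script can emit multiple fields; works with grok and dissect patterns.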
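7.14	Cross-cluster EQL search	EQL supports cross-cluster search.
	Async SQL search	Supports asynchronous SQL searches; for queries over large data sets, results can be returned asynchronously.
	Transforms: support for top metrics	Transforms support the top_metrics aggregation, which can improve performance when grouping by multiple fields. Previously this required a script.
	Anomaly detection: reset job API	An API to reset anomaly detection jobs.
	New field type match_only_text	A new field type, match_only_text.
	More memory-efficient composite aggregations	Composite aggregations use less memory. Composite aggregations on fields no longer use global ordinals.
	New migrate to data tiers routing API	An API to migrate to data tier routing.
	New terms enum API	A new API for enumerating terms.
	Automatic database updates for the GeoIP processor	GeoIP databases are updated automatically.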
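7.13	Frozen tier is now GA	The frozen tier and shared snapshot cache are now generally available. The ILM phases are hot, warm, cold, frozen, and delete; frozen is the phase added here.
	Index runtime fields	Runtime fields let you create fields dynamically at query time from other fields and document properties. These query-time fields favor flexibility over speed and can be changed at any time.
	Match IPv4 and IPv6 addresses against CIDR ranges in Painless	Painless scripts can match IP addresses against CIDR ranges, network masks, and so on, directly through the CIDR API.
	New combined_fields query type	A new query type, combined_fields.
	Faster terms aggregations	Terms aggregations are faster under certain conditions; see the official documentation for which ones.
	Data frame analytics and inference are generally available	In 7.13 you can train outlier detection, regression, and classification models, and then use those models to run inference on input data.
	Trained model aliases	Introduces aliases for trained models.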
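7.12	Frozen tier and shared snapshot caches	The frozen tier and shared snapshot caches arrive as experimental features; generally available in 7.13.
	Analyze snapshot repositories	Adds a repository analysis API.
	EQL: Case-insensitive in lookups and functions	EQL supports case-insensitive lookups and functions.
	EQL: like and regex keywords	EQL introduces the like and regex keywords.
	Retention policy for transforms
	Hyperparameter importance	Roughly: some hyperparameters matter more than others during training, and these can be specified to speed up training.
	Search-time runtime fields support for transforms	Roughly: transforms can use runtime_mappings when querying fields.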
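7.11	Runtime fields	Adds support for runtime_mappings.
	Speed improvements to the date histogram	When the date histogram is the top-level aggregation and has no sub-aggregations, it is 85% faster.
	Cross-cluster replication (CCR) now supports data streams	Cross-cluster replication now works with data streams.
	New audit record for security configuration changes via API	Adds audit logging for security configuration changes made through the API.
	EQL: Wildcard and list lookup support for the : operator	The : operator supports wildcards and list lookups (a value can be matched against a list of wildcard patterns).
	New garbage collection defaults for small heaps	Heaps smaller than 8 GB use new GC settings to improve performance.
	Data frame analytics is now beta!	Data frame analytics is in beta; generally available in 7.13.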
	Latest document transform
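7.10	Indexing speed improvement	Improves indexing speed by 20%. The gain is smaller for full-text search and other analysis-heavy use cases.
	More space-efficient indices	Improves compression, reducing storage by 0-10%. Configurable.
	Data tiers	Introduces data tiers (hot, warm, and cold; frozen followed in 7.13), which can be managed through ILM. Delete is an ILM phase rather than a data tier.
	AUC ROC evaluation metrics for classification analysis
	Custom feature processors in data frame analytics
	Points in time (PITs) for search	Introduces point-in-time (PIT) searches, which run successive requests against a consistent view of the data as it existed when the PIT was opened.
	Request-level circuit breakers on coordinating nodes
	EQL: Case-sensitivity and the : operator	EQL adds case sensitivity and the : operator.
	REST API access to system indices is deprecated	Deprecates access to system indices through the REST API, which lines up with the system index protection in Elasticsearch 8.
	New thread pools for system indices	Reworks the thread pools into system_read and system_write, so that reads and writes run in separate pools.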
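7.9	Fixed retries for cross-cluster replication	Fixes a retry problem in cross-cluster replication.
	Fixed index throttling	Fixes index throttling.
	EQL	Support for a new query language, EQL (Event Query Language).
	Data streams	A combination of indices, templates, rollover, and ILM for time-series data that makes real-time data streams easier to manage. Roughly, it is unified management of backing indices, lifecycle, templates, and rollover.
	Enable fully concurrent snapshot operations	Snapshot operations can now run fully concurrently.
	Improve speed and memory usage of multi-bucket aggregations	Improves the speed and memory usage of multi-bucket aggregations.
	Allow index filtering in field capabilities API	Allows an index filter in the field capabilities API. The field capabilities API returns the capabilities of fields across the indices matched by an index name or wildcard pattern; the index filter restricts the response to indices that can match a given query.
	Support terms and rare_terms aggregations in transforms
	Optimize date_histograms across daylight savings time	Improves date_histogram performance.
	Improved resilience to network disruption	Adds a disconnect-handling mechanism that improves recovery from network disruptions.
	Wildcard field optimised for wildcard queries	A wildcard field type optimized for wildcard queries.
	Indexing metrics and back pressure	Tracks metrics for in-flight indexing requests and exposes them through an API; a configurable limit can apply back pressure to reduce load.
	Inference in pipeline aggregations	Appears to relate to running trained-model inference (prediction) inside pipeline aggregations.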
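7.8	Composable index templates	Composable templates break configuration into components, which makes templates more flexible to use.
	Geo improvements	Improvements to geo features.
	Add support for t-test aggregations	Adds a new aggregation, t-test.
	Expose aggregation usage in feature usage API	Provides an API that reports how many times each aggregation has been used since the last restart.
	Support value_count and avg aggregations over histogram fields	Histogram fields support the avg and value_count aggregations.
	Reduce aggregation memory consumption	Reduces the memory used by aggregations.
	Scalar functions now supported in SQL aggregations	SQL aggregations support scalar functions.
	Increase the performance and scalability of transforms with throttling	Improves transform performance.
	Better estimates for machine learning model memory usage	Better estimates of the memory used by machine learning models.
	Additional loss functions for regression
	Extended upload limit and explanations for Data Visualizer	Raises the Data Visualizer upload size limit to 1 GB.
	Fixed out-of-memory error when using cross-cluster replication with large documents	Fixes an out-of-memory error caused by replicating large documents across clusters.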
7.7	Fixed index corruption on shrunk indices
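	Significant reduction of heap usage of segments	Reduces the memory required to open Lucene segments.
	Transforms – now in GA!	Transforms are generally available. They resemble INSERT INTO ... SELECT ... FROM: a query transforms the data and the result is stored in another index.
	Introducing multiclass classification	A preview of multiclass classification in machine learning.
	Feature importance at inference time	Feature importance can now be computed at inference time.
	Finer memory control for bucket aggregations	Finer-grained memory control for bucket aggregations.
	A new way of searching: asynchronously	Adds asynchronous search.
	Password protection for the keystore
	A new aggregation: top_metrics	A new aggregation, top_metrics.
	Query speed-up for sorted queries on time-based indices	Speeds up sorted queries on time-based indices.
	A new aggregation: boxplot	A new aggregation, boxplot.
	AArch64 support	Adds support for AArch64.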
7.6	New histogram field type
	Optimized sorting on long field types	Optimizes sorting on long fields.
	Simplifying and operationalizing machine learning
	Cross-cluster search in transforms	Transforms can search across clusters.
7.5	Enrich processor
	Shape support in SQL
	Snapshot lifecycle management retention
	Pause cross-cluster replication	Adds APIs to pause and resume index replication.
	Machine learning classification analysis
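7.4	Results pinning	Promotes selected documents so that they rank at the top of the search results.
	New shape field type	A new shape field type.
	Aggregations on range fields	Histogram and date histogram aggregations now support range field types.
	Cumulative cardinality aggregation	A new aggregation type.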
	Snapshot lifecycle management	We’re introducing snapshot lifecycle management (SLM), which allows an administrator to define policies, via API or Kibana UI, that manage when and how often snapshots are taken. You can use SLM to ensure that appropriate, recent backups are ready if disaster strikes or you need to restore Elasticsearch data.
	API key management
	TLS settings for email notifications
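	Automatic query cancellation	A search is cancelled automatically when its request is terminated.
	Support for AdoptOpenJDK	Ships bundled with AdoptOpenJDK 13.
	Regression analysis - Experimental	Experimental machine learning regression analysis.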
	New vector distance functions for document script scoring - Experimental
7.3	Voting-only master nodes	Adds voting-only nodes.
	Reloading of search-time synonyms
	New flattened field type
	Functions on vector fields
	Prefix and wildcard support for intervals	The intervals query supports prefix and wildcard matching.
	Rare terms aggregation	A new aggregation type.
	Aliases are replicated via cross-cluster replication
	SQL supports frozen indices
	Fixed memory leak when using templates in document-level security
	More memory-efficient aggregations on keyword fields	Terms aggregations on keyword fields use less memory.
	Data frames: transform and pivot your streaming data	Beta. Transforms and pivots streaming data into a new index.
	Discover your most unusual data using outlier detection
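7.2	Data frames	The beta version of the INSERT INTO ... SELECT ... FROM style feature that became generally available as Transforms in 7.7.
	Closed indices are now replicated	Closed indices can now be replicated.
	Geo features in SQL
	OpenId Connect authentication realm	Authentication-related.
	Search as you type field mapping type	Adds the search_as_you_type field type.
	Distance Feature Query	The distance_feature query scores results by distance: the closer a document's date or location is to a given origin, the higher it scores.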
7.1	TLS is now licensed under the Elastic Basic license
7.1	RBAC is now licensed under the Elastic Basic license
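7.0	Adaptive replica selection enabled by default	Search requests are routed adaptively. Previously, requests were distributed across nodes in round-robin fashion; now Elasticsearch considers node load and sends requests to the less loaded nodes.
	Skip shard refreshes if a shard is "search idle"	Optimizes searching and index refreshes.
	Default to one shard	The default number of shards changes from 5 to 1.
	Lucene 8	Upgrades to Lucene 8.
	Introduce the ability to minimize round-trips in cross-cluster search	Cross-cluster search can minimize round trips, which makes it faster.
	New cluster coordination implementation	A new approach to cluster coordination.
	Better support for small heaps (the real-memory circuit breaker)
	Cross-cluster replication is production-ready	Cross-cluster replication is generally available.
	Index lifecycle management is production-ready	ILM is generally available.
	SQL is production-ready	SQL is generally available.
	High-level REST client is feature-complete	The High-level REST client is feature-complete.
	Support nanosecond timestamps	Supports nanosecond-resolution timestamps.
	Faster retrieval of top hits
	Support for TLS 1.3
	Bundle JDK in Elasticsearch distribution	Starting with 7.0, Elasticsearch ships with a bundled JDK and uses it by default. The goal is easier deployment, even for people who do not know that Elasticsearch is a Java project.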
	Rank features
	JSON logging
	Script score query (aka function score 2.0)
	https://www.elastic.co/guide/en/elasticsearch/reference/6.8/release-highlights.html
6.8.13	Fixed retries for cross-cluster replication
6.8.11	Fixed out-of-memory error when using cross-cluster replication with large documents
6.8.0	TLS is now licensed under the Elastic Basic license
6.8.0	RBAC is now licensed under the Elastic Basic license
6.7.0	Cross-cluster replication
	Index lifecycle management
	Elasticsearch SQL
6.6.0	Index lifecycle management (Beta)
	Frozen indices
	BKD-backed geoshapes
6.5.0	Audit security events in new structured logs
	Discover the structure of text files
	Improved machine learning results for partitioned multi-metric jobs
	Find multi-bucket anomalies in machine learning jobs
	Create source-only snapshots
	Apply token filters conditionally
	Use ODBC to connect to Elasticsearch SQL
	Delegate authorization to other realms
	Cross-cluster replication (beta*)
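	Monitor Elasticsearch with Metricbeat (beta*)	Metricbeat can be used to collect monitoring data.
6.4.0	Analysis	Adds an option to index phrases on text fields. Adds Korean language support. Adds a multiplexer token filter.
	Mappings	Fields can have aliases; adds the _ignored meta field.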
	Rank Eval API
	Search
6.3.0	License management and X-Pack code	X-Pack ships in the default distribution, and its Basic-license features are free.
	SQL
	Rollups
	Java 10 Support
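To illustrate the 8.1 entry on doc-values-only search from the table above, here is a minimal sketch in Python that only builds the JSON request bodies and does not contact a cluster. The field names (`status_code`, `client_ip`, `created_at`) and the helper functions are hypothetical and chosen for illustration; the `index: false` mapping parameter and the `term` and `range` queries are the parts the table entry refers to.

```python
import json


def doc_values_only_mapping(fields):
    """Build a create-index body whose fields are not indexed.

    `fields` maps a (hypothetical) field name to its Elasticsearch type.
    """
    properties = {}
    for name, field_type in fields.items():
        # "index": False disables the index structure for the field;
        # doc_values keeps its default (enabled) for these field types.
        properties[name] = {"type": field_type, "index": False}
    return {"mappings": {"properties": properties}}


def term_query(field, value):
    """Build a search body with an equality (term) query."""
    return {"query": {"term": {field: {"value": value}}}}


def range_query(field, gte, lt):
    """Build a search body with a half-open range query."""
    return {"query": {"range": {field: {"gte": gte, "lt": lt}}}}


# Hypothetical fields of the kinds listed in the 8.1 entry.
mapping = doc_values_only_mapping(
    {"status_code": "integer", "client_ip": "ip", "created_at": "date"}
)
by_status = term_query("status_code", 404)
by_date = range_query("created_at", "2022-01-01", "2022-02-01")

print(json.dumps(mapping, indent=2))
print(json.dumps(by_status))
print(json.dumps(by_date))
```

The printed bodies could be sent with any HTTP client to the create-index and `_search` endpoints of an 8.1 or later cluster. On earlier versions, a query against a field mapped with `index: false` is expected to be rejected.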

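The runtime fields entries (7.11 and 7.13) in the table above can be sketched the same way. The example below builds a search body that defines a runtime field through `runtime_mappings` for a single request. The field names `size_bytes` and `size_kb`, the Painless expression, and the helper function are assumptions made for illustration, and nothing is sent to a cluster.

```python
import json


def search_with_runtime_field(field_name, field_type, painless_source, query):
    """Build a _search body that defines one runtime field for this request."""
    return {
        # The runtime field exists only for this request; the index
        # mapping and the stored documents are left unchanged.
        "runtime_mappings": {
            field_name: {
                "type": field_type,
                "script": {"source": painless_source},
            }
        },
        "query": query,
        # Ask for the computed value to be returned with each hit.
        "fields": [field_name],
    }


# Hypothetical example: derive kilobytes from a numeric size_bytes field
# and filter on the derived value.
body = search_with_runtime_field(
    "size_kb",
    "double",
    "emit(doc['size_bytes'].value / 1024.0)",
    {"range": {"size_kb": {"gte": 100}}},
)

print(json.dumps(body, indent=2))
```

Because the field is computed at query time, changing the script only means sending a different request body. This is the flexibility-for-speed trade-off that the 7.13 entry describes.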