Elasticsearch from Getting Started to Giving Up (Part 5) -- Java API [based on the official 7.5 documentation]

Click to view the original post (includes source code and images): http://note.youdao.com/noteshare?id=c52ed63c837df7658e2939e06d69ad04&sub=58B8DFA324AF48B0ABA7F2F2C8DD4ACD

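1. Overview

This section describes the Java API that Elasticsearch provides. All Elasticsearch operations are executed using a Client object. All operations are completely asynchronous in nature: each one either accepts a listener or returns a future.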
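The two asynchronous call styles mentioned in the overview can be sketched with plain JDK classes. This is only a model: `AsyncStyles`, `Listener` and `getDocument` are hypothetical names invented for illustration and are not part of the Elasticsearch client. In the examples later in this post, the future style corresponds to calling `get()` and the listener style to passing a listener to `execute(...)`.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins, NOT Elasticsearch classes: they only model the
// two asynchronous call styles (returned future vs. listener callback).
class AsyncStyles {

    // Callback contract with one method for success and one for failure.
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    // Style 1: the operation returns a future that the caller can block on.
    static CompletableFuture<String> getDocument(String id) {
        return CompletableFuture.supplyAsync(() -> "doc-" + id);
    }

    // Style 2: the operation takes a listener and returns immediately.
    static void getDocument(String id, Listener<String> listener) {
        getDocument(id).whenComplete((response, error) -> {
            if (error != null) {
                listener.onFailure(new RuntimeException(error));
            } else {
                listener.onResponse(response);
            }
        });
    }

    public static void main(String[] args) {
        // Future style: join() blocks until the result is available.
        System.out.println(getDocument("1").join());

        // Listener style: the callback fires once the result is ready.
        CompletableFuture<String> seen = new CompletableFuture<>();
        getDocument("2", new Listener<String>() {
            @Override public void onResponse(String response) { seen.complete(response); }
            @Override public void onFailure(Exception e) { seen.completeExceptionally(e); }
        });
        System.out.println(seen.join());
    }
}
```

The future style is convenient in tests and scripts where blocking is acceptable; the listener style avoids tying up the calling thread during long-running operations.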

2. Maven

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>7.5.0</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>2.11.1</version>
</dependency>

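3. Creating the Client

You can use the Java client to:

perform standard index, get, delete and search operations on an existing cluster
perform administrative tasks on a running cluster

Obtaining an Elasticsearch client is simple. The most common way is to create a TransportClient that connects to the cluster.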

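import java.net.InetAddress
import org.elasticsearch.common.settings.Settings
import org.elasticsearch.common.transport.TransportAddress
import org.elasticsearch.transport.client.PreBuiltTransportClient
import scala.collection.JavaConverters._

// Cluster settings (only needed when the cluster name is not the default)
// val settings = Settings.builder().put("cluster.name", "elasticSearch").build()

// Create the client
val client = new PreBuiltTransportClient(Settings.EMPTY)
  .addTransportAddress(new TransportAddress(InetAddress.getByName("localhost"), 9300))
  //.addTransportAddress(new TransportAddress(InetAddress.getByName("host2"), 9300))

// Print the nodes the client is connected to
client.connectedNodes().asScala.foreach(println(_))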

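4. Generating JSON documents

There are several ways to generate a JSON document:

Manually (do it yourself), using a native byte[] or a String

Using a Map, which is automatically converted to its JSON equivalent

Using a third-party library such as Jackson to serialize your beans

Using the built-in helper XContentFactory.jsonBuilder()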

// Manually, as a String
String json = "{" +
        "\"user\":\"kimchy\"," +
        "\"postDate\":\"2013-01-30\"," +
        "\"message\":\"trying out Elasticsearch\"" +
    "}";
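A hand-built string like the one above is only valid JSON while the values contain no double quotes, backslashes or control characters. As a minimal sketch of what the do-it-yourself approach has to handle, the helper below escapes those characters before concatenating. `ManualJson`, `escape` and `tweet` are hypothetical names made up for this example and use only the JDK.

```java
// Hypothetical helper, not part of Elasticsearch: shows the escaping that a
// hand-built JSON string needs before arbitrary values can be inserted.
class ManualJson {

    // Escapes the characters that must not appear raw inside a JSON string.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \ u XXXX form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Builds the same three-field document as the manual example.
    static String tweet(String user, String postDate, String message) {
        return "{"
            + "\"user\":\"" + escape(user) + "\","
            + "\"postDate\":\"" + escape(postDate) + "\","
            + "\"message\":\"" + escape(message) + "\""
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(tweet("kimchy", "2013-01-30", "trying out \"Elasticsearch\""));
    }
}
```

The Map, Jackson and jsonBuilder() approaches take care of this escaping themselves, which is one reason to prefer them over manual concatenation.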

// Using a Map
Map<String, Object> json = new HashMap<String, Object>();
json.put("user", "kimchy");
json.put("postDate", new Date());
json.put("message", "trying out Elasticsearch");

// Using Jackson to serialize a bean
import com.fasterxml.jackson.databind.*;

// Instantiate a JSON mapper
ObjectMapper mapper = new ObjectMapper(); // create once, reuse

// Generate JSON
byte[] json = mapper.writeValueAsBytes(yourbeaninstance);

// Using the built-in helper
import static org.elasticsearch.common.xcontent.XContentFactory.*;

XContentBuilder builder = jsonBuilder()
    .startObject()
        .field("user", "kimchy")
        .field("postDate", new Date())
        .field("message", "trying out Elasticsearch")
    .endObject();

5. Index API

import java.util.Date
import org.elasticsearch.common.xcontent.{XContentFactory, XContentType}

// Build the JSON document
val builder = XContentFactory.jsonBuilder
  .startObject()
  .field("user", "kimchy")
  .field("postDate", new Date())
  .field("message", "trying out Elasticsearch")
  .endObject()
// Index names must be lowercase, hence "my-index" rather than "myIndex"
val response = client.prepareIndex("my-index", "_doc", "1")
  .setSource(builder)
  .get()

// Alternative: index a JSON string and let Elasticsearch generate the id
val json = "{" +
  "\"user\":\"kimchy\"," +
  "\"postDate\":\"2013-01-30\"," +
  "\"message\":\"trying out Elasticsearch\"" +
  "}"
val jsonResponse = client.prepareIndex("twitter", "_doc").setSource(json, XContentType.JSON).get

// Inspect the response
val index = response.getIndex
val `type` = response.getType
val id = response.getId
println(s"--------${index}--${`type`}--${id}")

6. GET API

@Test
def testGetAPI(): Unit = {
  // Fetch the document with id "1" from "my-index"
  val response: GetResponse = client.prepareGet("my-index", "_doc", "1").get()
  println(response.toString)
}

7. DELETE API

/**
 * The delete by query API deletes a given set of documents based on the result of a query.
 */
@Test
def deleteByQueryAPI(): Unit = {
  val response = new DeleteByQueryRequestBuilder(client, DeleteByQueryAction.INSTANCE)
    .filter(QueryBuilders.matchQuery("gender", "male")) // query condition
    .source("persons")                                  // index
    .get()                                              // execute
  val deleted = response.getDeleted                     // number of deleted documents
}

/**
 * Delete by query can be a long-running operation. To run it asynchronously,
 * call execute instead of get and provide a listener.
 */
def deleteByQueryAsync(): Unit = {
  new DeleteByQueryRequestBuilder(client, DeleteByQueryAction.INSTANCE)
    .filter(QueryBuilders.matchQuery("gender", "male"))
    .source("persons")
    .execute(new ActionListener[BulkByScrollResponse] {
      override def onResponse(response: BulkByScrollResponse): Unit = {
        val deleted = response.getDeleted
      }
      override def onFailure(e: Exception): Unit = {
        // handle the failure
      }
    })
}

8. UPDATE API

import org.elasticsearch.script.Script

/**
 * Update a document by merging a partial document
 */
def testUpdateRequest(): Unit = {
  val req = new UpdateRequest()
  req.index("index")
  req.id("1")
  req.doc(
    XContentFactory.jsonBuilder
      .startObject()
      .field("user", "kimchy")
      .field("postDate", new Date())
      .field("message", "trying out Elasticsearch")
      .endObject()
  )
  client.update(req).get()
}

/**
 * Update a document with a script
 */
def testUpdateRequestByScript(): Unit = {
  val req = new UpdateRequest("index", "1").script(
    new Script("ctx._source.gender = \"male\"")
  )
  client.update(req).get()
}

def testPrepareUpdate(): Unit = {
  // With a script; get() executes the request
  client.prepareUpdate("index", "_doc", "1")
    .setScript(new Script("ctx._source.gender = \"male\""))
    .get()
  // Without a script
  client.prepareUpdate("index", "_doc", "1")
    .setDoc(
      XContentFactory.jsonBuilder
        .startObject()
        .field("user", "kimchy")
        .field("postDate", new Date())
        .field("message", "trying out Elasticsearch")
        .endObject()
    )
    .get()
}

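9. UPSERT API

/**
 * Upsert: update the document if it exists, otherwise index the upsert document
 */
def testUpsert(): Unit = {
  // Document to index when no document with id "1" exists yet.
  // Note: new IndexRequest("index", "1") would treat "1" as the type, not the id.
  val indexRequest = new IndexRequest("index").id("1").source(
    XContentFactory.jsonBuilder
      .startObject()
      .field("user", "kimchy")
      .field("message", "trying out Elasticsearch")
      .endObject()
  )
  // Partial document to merge when the document already exists
  val upsert = new UpdateRequest("index", "1").doc(
    XContentFactory.jsonBuilder
      .startObject()
      .field("user", "kimchy")
      .field("postDate", new Date())
      .field("message", "trying out Elasticsearch")
      .endObject()
  ).upsert(indexRequest)
  client.update(upsert).get()
}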

10. Multi Get

import scala.collection.JavaConverters._

/**
 * Multi Get
 */
def testMultiGET(): Unit = {
  val responses = client.prepareMultiGet()
    .add("twitter", "_doc", "1")           // get by a single id
    .add("twitter", "_doc", "2", "3", "4") // by a list of ids for the same index
    .add("another", "_doc", "foo")         // get from another index
    .get()
  // MultiGetResponse is a Java Iterable, so convert it before iterating over the result set
  for (item <- responses.asScala) {
    val response = item.getResponse
    // you can check if the document exists
    if (response.isExists) {
      // access the _source field
      val json = response.getSourceAsString
    }
  }
}

11. Bulk API

/**
 * The bulk API allows indexing and deleting several documents in a single request.
 */
def testBulkAPI(): Unit = {
  val bulk: BulkRequestBuilder = client.prepareBulk()
  // Add a first index operation
  bulk.add(client.prepareIndex("index", "_doc", "1").setSource(
    XContentFactory.jsonBuilder
      .startObject()
      .field("user", "kimchy")
      .field("postDate", new Date())
      .field("message", "trying out Elasticsearch")
      .endObject()
  ))
  // Add a second index operation
  bulk.add(client.prepareIndex("index", "_doc", "2").setSource(
    XContentFactory.jsonBuilder
      .startObject()
      .field("user", "kimchy")
      .field("postDate", new Date())
      .field("message", "trying out Elasticsearch")
      .endObject()
  ))
  if (bulk.get().hasFailures) {
    // process failures by iterating through each bulk response item
  }
}

12. Search API

/**
 * Search API
 */
def testSearch(): Unit = {
  // Every setting below is optional
  client.prepareSearch("index1", "index2")
    .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
    .setQuery(QueryBuilders.termsQuery("FIELD", "FIELD_VALUE"))          // query
    .setPostFilter(QueryBuilders.rangeQuery("FIELD_1").from(10).to(20)) // filter
    .setFrom(0)
    .setSize(10)
    .setExplain(true)
    .get()
}

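13. Scrolls API

/**
 * Scrolls API
 */
def scrollAPI(): Unit = {
  var scrollResp = client.prepareSearch("index1")
    .addSort("FIELD", SortOrder.ASC)
    .setQuery(QueryBuilders.termQuery("FIELD", "FIELD_VALUE"))
    .setScroll(new TimeValue(60000)) // keep the scroll context alive for 60 s
    .setSize(100)                    // maximum number of hits returned per scroll
    .get()
  // Scroll until no hits are returned
  do {
    // handle each hit of the current page
    scrollResp.getHits.getHits.foreach(hit => println(hit.getSourceAsString))
    scrollResp = client.prepareSearchScroll(scrollResp.getScrollId)
      .setScroll(new TimeValue(60000))
      .execute()
      .actionGet()
  } while (scrollResp.getHits.getHits.length != 0) // zero hits mark the end of the scroll
}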

14. MultiSearch API

/**
 * MultiSearch API
 */
def multiSearch(): Unit = {
  val srb1 = client.prepareSearch("index01")
    .setQuery(QueryBuilders.queryStringQuery("elasticsearch"))
    .setSize(1)
  val srb2 = client.prepareSearch("index02")
    .setQuery(QueryBuilders.matchQuery("name", "kimchy"))
    .setSize(1)
  val sr: MultiSearchResponse = client.prepareMultiSearch()
    .add(srb1)
    .add(srb2)
    .get()
  // You will get all individual responses from MultiSearchResponse#getResponses()
  var nbHits = 0L
  for (item <- sr.getResponses) {
    // item.getResponse is the SearchResponse of one sub-search
    val response = item.getResponse
    nbHits += response.getHits.getTotalHits.value
  }
}

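15. Aggregations

import org.elasticsearch.search.aggregations.bucket.histogram.Histogram
import org.elasticsearch.search.aggregations.bucket.terms.Terms

/**
 * Aggregations
 * How to add two aggregations to a search
 */
def aggregations(): Unit = {
  val sr = client.prepareSearch("index01")
    .setQuery(QueryBuilders.matchAllQuery)
    .addAggregation(AggregationBuilders.terms("agg1").field("field"))
    .addAggregation(AggregationBuilders.dateHistogram("agg2").field("birth").calendarInterval(DateHistogramInterval.YEAR))
    .get
  // Get your aggregation results. The explicit types matter in Scala:
  // without them the generic get() is inferred as Nothing and fails at runtime.
  val agg1: Terms = sr.getAggregations.get("agg1")
  val agg2: Histogram = sr.getAggregations.get("agg2")
}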

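16. TerminateAfter: terminating a query early

/**
 * Terminate a query early.
 * terminateAfter is the maximum number of documents to collect for each shard;
 * once it is reached, query execution terminates early.
 * If it is set, you can check whether the operation terminated early
 * by calling isTerminatedEarly() on the SearchResponse object.
 */
def setTerminateAfter(): Unit = {
  val sr = client.prepareSearch("INDEX")
    .setTerminateAfter(1000) // finish after 1000 docs per shard
    .get()
  if (sr.isTerminatedEarly()) {
    // We finished early
  }
}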
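Because the limit applies to each shard separately, a search over several shards can still collect more documents than the limit in total. The toy model below illustrates that reading of the description above. It is only a sketch: shards are plain lists, `TerminateAfterModel`, `Result` and `search` are hypothetical names, nothing here calls Elasticsearch, and treating a shard that holds exactly `limit` documents as not terminated early is a modeling choice rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: shards are plain lists, nothing here calls Elasticsearch.
class TerminateAfterModel {

    static final class Result {
        final List<String> collected = new ArrayList<>();
        boolean terminatedEarly = false;
    }

    // Collects at most `limit` documents from each shard independently.
    static Result search(List<List<String>> shards, int limit) {
        Result result = new Result();
        for (List<String> shard : shards) {
            int count = 0;
            for (String doc : shard) {
                if (count == limit) {
                    // This shard still had documents left, so it stopped early.
                    result.terminatedEarly = true;
                    break;
                }
                result.collected.add(doc);
                count++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> shards = Arrays.asList(
            Arrays.asList("a1", "a2", "a3"),
            Arrays.asList("b1"));
        Result r = search(shards, 2);
        // Three documents are collected although the limit is 2.
        System.out.println(r.collected + " terminatedEarly=" + r.terminatedEarly);
    }
}
```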

17. Search Template

import scala.collection.JavaConverters._
import scala.collection.mutable

/**
 * Search template, stored in the cluster
 */
def searchTemplate_01(): Unit = {
  val template_params = new mutable.HashMap[String, Object]()
  template_params.put("param_gender", "male")

  // Store the template in the cluster state under the id "template_gender".
  // It is wrapped in a "script" object with lang "mustache", the same
  // stored-script format that the Cluster API section uses.
  client.admin().cluster().preparePutStoredScript()
    .setId("template_gender")
    .setContent(new BytesArray(
      "{\n" +
      "  \"script\": {\n" +
      "    \"lang\": \"mustache\",\n" +
      "    \"source\": {\n" +
      "      \"query\": {\n" +
      "        \"match\": {\n" +
      "          \"gender\": \"{{param_gender}}\"\n" +
      "        }\n" +
      "      }\n" +
      "    }\n" +
      "  }\n" +
      "}"), XContentType.JSON).get()

  val sr = new SearchTemplateRequestBuilder(client)
    .setScript("template_gender")            // id of the stored template
    .setScriptType(ScriptType.STORED)        // template stored in the cluster state
    .setScriptParams(template_params.asJava) // parameters
    .setRequest(new SearchRequest())         // set the execution context (ie. define the index name here)
    .get()                                   // execute and get the template response
    .getResponse()                           // get from the template response the search response itself
}

/**
 * Search template, inline
 */
def searchTemplate_02(): Unit = {
  val template_params = new mutable.HashMap[String, Object]()
  template_params.put("param_gender", "male")

  // The template source is passed directly with the request
  new SearchTemplateRequestBuilder(client)
    .setScript("{\n" +
      "  \"query\" : {\n" +
      "    \"match\" : {\n" +
      "      \"gender\" : \"{{param_gender}}\"\n" +
      "    }\n" +
      "  }\n" +
      "}")
    .setScriptType(ScriptType.INLINE)
    .setScriptParams(template_params.asJava)
    .setRequest(new SearchRequest())
    .get()
    .getResponse()
}

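18. Structuring Aggregations

import scala.collection.JavaConverters._
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram
import org.elasticsearch.search.aggregations.bucket.terms.Terms
import org.elasticsearch.search.aggregations.metrics.Avg

/**
 * Structuring aggregations
 * This example nests three aggregations:
 * a Terms aggregation named "by_country",
 * a Date Histogram aggregation named "by_year",
 * an Average aggregation named "avg_children"
 */
def structuringAggregations(): Unit = {
  // Build the nested aggregation
  val sr: SearchResponse = client.prepareSearch()
    .addAggregation(
      AggregationBuilders.terms("by_country").field("country")
        .subAggregation(AggregationBuilders.dateHistogram("by_year")
          .field("dateOfBirth")
          .calendarInterval(DateHistogramInterval.YEAR)
          .subAggregation(AggregationBuilders.avg("avg_children").field("children"))
        )
    )
    .execute().actionGet()

  // Handle the response by walking country buckets, then year buckets, then the average.
  // The names must match the ones used when building the request.
  val byCountry = sr.getAggregations.get[Terms]("by_country")
  for (country <- byCountry.getBuckets.asScala) {
    val byYear = country.getAggregations.get[Histogram]("by_year")
    for (year <- byYear.getBuckets.asScala) {
      val avgChildren = year.getAggregations.get[Avg]("avg_children").getValue
      println(s"${country.getKeyAsString} ${year.getKeyAsString}: $avgChildren")
    }
  }
}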

19. Metrics Aggregations

/**
 * Metrics aggregations
 */
def MetricsAggr(): Unit = {
  val min: MinAggregationBuilder = AggregationBuilders
    .min("agg")
    .field("height")
  val max: MaxAggregationBuilder = AggregationBuilders
    .max("agg")
    .field("height")
  val sum = AggregationBuilders
    .sum("agg")
    .field("height")
  val avg = AggregationBuilders
    .avg("agg")
    .field("height")
  val stats = AggregationBuilders
    .stats("agg")
    .field("height")
  val extendedStats = AggregationBuilders
    .extendedStats("agg")
    .field("height")
  val count = AggregationBuilders
    .count("agg")
    .field("height")
  // val aggregation = AggregationBuilders
  //   .scriptedMetric("agg")
  //   .initScript(new Script("state.heights = []"))
  //   .mapScript(new Script("state.heights.add(doc.gender.value == 'male' ? doc.height.value : -1.0 * doc.height.value)"))
  //   .combineScript(new Script("double heights_sum = 0.0; for (t in state.heights) { heights_sum += t } return heights_sum"))
}

20. Bucket aggregations

/**
 * Bucket aggregations
 */
def testBucketAggregations(): Unit = {
  // A single filter bucket
  AggregationBuilders.filter("agg", QueryBuilders.termQuery("gender", "male"))
  // Several named filter buckets
  val aggregation: AggregationBuilder = AggregationBuilders
    .filters("agg",
      new FiltersAggregator.KeyedFilter("men", QueryBuilders.termQuery("gender", "male")),
      new FiltersAggregator.KeyedFilter("women", QueryBuilders.termQuery("gender", "female")))
  // Usage
  // val agg: Filters = sr.getAggregations.get("agg")
  // for (entry <- agg.getBuckets.asScala) {
  //   val key = entry.getKeyAsString   // bucket key
  //   val docCount = entry.getDocCount // doc count
  //   logger.info("key [{}], doc_count [{}]", key, docCount)
  // }
}

21. Query API

import org.elasticsearch.index.query.QueryBuilders._

/**
 * Query API
 */
def testQuery(): Unit = {
  // match all documents
  matchAllQuery()
  matchQuery("field_name", "kimchy elasticsearch")
  multiMatchQuery("kimchy elasticsearch", "field_user", "field_message")
  termQuery("field_name", "kimchy")
  termsQuery("tags", "field_blue", "field_pill")
  rangeQuery("price")
    .from(5)
    .to(10)
    .includeLower(true)
    .includeUpper(false)
  rangeQuery("age")
    .gte("10")
    .lt("20")
  existsQuery("field_name")
  // matches documents that contain terms with the specified prefix
  prefixQuery("field_brand", "prefix_heine")
  wildcardQuery("user", "k?mch*")
  // matches documents that contain terms matching the specified regular expression
  regexpQuery("name.first", "s.*y")
  fuzzyQuery("name", "kimchy")
  // matches only the documents with the given ids, across all types
  idsQuery()
    .addIds("1", "4", "100")
  // wraps another query and returns a constant score equal to the
  // query boost for every document that matches the wrapped query
  constantScoreQuery(termQuery("name", "kimchy"))
    .boost(2.0f)
  boolQuery()
    .must(termQuery("content", "test1"))
    .must(termQuery("content", "test4"))
    .mustNot(termQuery("content", "test2"))
    .should(termQuery("content", "test3"))
    .filter(termQuery("content", "test5"))
  disMaxQuery()
    .add(termQuery("name", "kimchy"))
    .add(termQuery("name", "elasticsearch"))
    .boost(1.2f)
    .tieBreaker(0.7f)
}

22. Admin API

import scala.collection.JavaConverters._
import org.elasticsearch.common.xcontent.XContentType

/**
 * Admin API
 */
def testAdmin(): Unit = {
  val indicesAdminClient = client.admin.indices

  /**
   * Create index. The prepareCreate calls below are alternatives:
   * an index with a given name can only be created once.
   */
  client.admin.indices.prepareCreate("twitter").get
  // create the index with settings
  client.admin().indices().prepareCreate("twitter")
    .setSettings(Settings.builder()
      .put("index.number_of_shards", 3)
      .put("index.number_of_replicas", 2)
    )
    .get()

  /**
   * Put mapping
   */
  // add the mapping while creating the index
  client.admin().indices().prepareCreate("twitter")
    .addMapping("_doc", "message", "type=text")
    .get()
  // add the mapping to an existing index
  client.admin.indices.preparePutMapping("twitter").setType("_doc")
    .setSource("{\n" +
      "  \"properties\": {\n" +
      "    \"name\": {\n" +
      "      \"type\": \"text\"\n" +
      "    }\n" +
      "  }\n" +
      "}", XContentType.JSON).get
  // You can also provide the type in the source document
  client.admin.indices.preparePutMapping("twitter").setType("_doc")
    .setSource("{\n" +
      "  \"_doc\":{\n" +
      "    \"properties\": {\n" +
      "      \"name\": {\n" +
      "        \"type\": \"text\"\n" +
      "      }\n" +
      "    }\n" +
      "  }\n" +
      "}", XContentType.JSON).get

  /**
   * Refresh
   */
  client.admin.indices.prepareRefresh().get                     // all indices
  client.admin.indices.prepareRefresh("twitter").get            // one index
  client.admin.indices.prepareRefresh("twitter", "company").get // several indices

  /**
   * Get settings
   */
  val response = client.admin.indices.prepareGetSettings("company", "employee").get
  // getIndexToSettings is a Java Iterable of cursors, so convert it before iterating
  for (cursor <- response.getIndexToSettings.asScala) {
    val index = cursor.key
    val settings = cursor.value
    val shards = settings.getAsInt("index.number_of_shards", null)
    val replicas = settings.getAsInt("index.number_of_replicas", null)
  }

  /**
   * Update index settings
   */
  client.admin().indices().prepareUpdateSettings("twitter")
    .setSettings(Settings.builder()
      .put("index.number_of_replicas", 0)
    )
    .get()
}

23. Cluster API

import scala.collection.JavaConverters._
import org.elasticsearch.common.bytes.BytesArray
import org.elasticsearch.common.xcontent.XContentType

/**
 * Cluster API
 */
def testClusterApi(): Unit = {
  val clusterAdminClient = client.admin.cluster

  /**
   * Health
   * The cluster health API gives a very simple status of the health of the cluster.
   * It can also provide some technical information about the cluster state per index.
   */
  val healths = client.admin.cluster.prepareHealth().get
  val clusterName = healths.getClusterName
  val numberOfDataNodes = healths.getNumberOfDataNodes
  val numberOfNodes = healths.getNumberOfNodes
  for (health <- healths.getIndices.values.asScala) {
    val index = health.getIndex
    val numberOfShards = health.getNumberOfShards
    val numberOfReplicas = health.getNumberOfReplicas
    val status = health.getStatus
  }

  /**
   * The stored script API lets you interact with scripts and templates stored in Elasticsearch.
   * It can be used to create, update, get and delete stored scripts and templates.
   * The put stored script API allows setting the language of the stored script.
   * If none is provided, the default scripting language is used.
   */
  val response1 = client.admin.cluster.preparePutStoredScript.setId("script1")
    .setContent(new BytesArray("{\"script\": {\"lang\": \"painless\", \"source\": \"_score * doc['my_numeric_field'].value\"} }"), XContentType.JSON).get
  val response2 = client.admin.cluster.prepareGetStoredScript.setId("script1").get
  val response3 = client.admin.cluster.prepareDeleteStoredScript.setId("script1").get
}
