Elasticsearch: Indexing and Retrieval (Part 1)

Original

Haqiu.Huang

2020-06-15 15:04

1. Introduction to Indexes

You know, for search (and analysis)

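An Elasticsearch cluster can contain multiple indexes, and each index can in turn contain multiple types. These types hold multiple documents, and each document has multiple fields. (Mapping types as described here belong to older releases: they were deprecated in Elasticsearch 7.x and removed in 8.0.)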

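Index (noun):
As noted above, an index is like a database in a traditional relational database: it is the place to store related documents. The plural of index is indices or indexes.
Index (verb):
To index a document is to store it in an index (noun) so that it can be retrieved and queried. This is much like the INSERT keyword in SQL, except that if the document already exists, the new document replaces the old one.
Inverted index:
Relational databases add an index, such as a B-tree index, to specific columns to speed up data retrieval. Elasticsearch and Lucene use a structure called an inverted index for the same purpose. By default, every field in a document is indexed (has an inverted index) and is therefore searchable. A field without an inverted index cannot be searched.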
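
To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is an illustration only, not how Lucene is implemented: `build_inverted_index` is a hypothetical helper, the analysis step is just lowercasing and splitting on whitespace, and the sample texts are the "about" values of the three employee documents indexed later in this article.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis (an assumption for this sketch): lowercase, split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# The "about" fields of the three employee documents used in this article.
about = {
    1: "I love to go rock climbing",
    2: "I like to collect rock albums",
    3: "I like to build cabinets",
}

inverted = build_inverted_index(about)
print(inverted["rock"])  # [1, 2]
print(inverted["like"])  # [2, 3]
```

Looking up a term is a single dictionary access that returns the matching document IDs, which is why a field with no inverted index cannot be searched this way.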

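The data is stored as employee documents: one document represents one employee. The act of storing data in Elasticsearch is called indexing, but before indexing a document we need to decide where to store it. For the employee directory, the requirements are:

Index one document per employee, containing all of that employee's information.
Each document is of type employee.
That type lives in the megacorp index.
That index resides in our Elasticsearch cluster.

In Kibana, the operations are as follows: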

PUT /megacorp/employee/1
{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        25,
    "about" :      "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}
PUT /megacorp/employee/2
{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}

PUT /megacorp/employee/3
{
    "first_name" :  "Douglas",
    "last_name" :   "Fir",
    "age" :         35,
    "about":        "I like to build cabinets",
    "interests":  [ "forestry" ]
}
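
Outside Kibana, the same request is an ordinary HTTP PUT with a JSON body. The sketch below only builds the request with Python's standard library and does not send it. `build_index_request` is a hypothetical helper, and `http://localhost:9200` is an assumed address for a local node, not something stated in this article.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) the HTTP request that indexes one document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


employee = {
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": ["sports", "music"],
}

# Assumed address of a local Elasticsearch node.
request = build_index_request("http://localhost:9200", "megacorp", "employee", 1, employee)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually send it, which requires a reachable cluster that still accepts the index/type/ID path used in this article.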

2. Retrieving Documents

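Execute an HTTP GET request and specify the document's address: its index, type, and ID. With these three pieces of information, the original JSON document can be returned: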

GET /megacorp/employee/1

The response contains some metadata about the document, plus a _source field holding the original JSON document for the employee John Smith:

{
  "_index" :   "megacorp",
  "_type" :    "employee",
  "_id" :      "1",
  "_version" : 1,
  "found" :    true,
  "_source" :  {
      "first_name" :  "John",
      "last_name" :   "Smith",
      "age" :         25,
      "about" :       "I love to go rock climbing",
      "interests":  [ "sports", "music" ]
  }
}
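
A client usually wants only the original document out of this response. The following is a minimal sketch of that step, assuming the response shape shown above. `extract_source` is a hypothetical helper, and treating `"found": false` as "no document" is this sketch's choice.

```python
import json


def extract_source(response_text):
    """Return the original document from a GET response, or None if it was not found."""
    response = json.loads(response_text)
    if not response.get("found"):
        return None
    return response["_source"]


# The response body shown above, as a raw JSON string.
sample = """
{
  "_index": "megacorp",
  "_type": "employee",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": ["sports", "music"]
  }
}
"""

document = extract_source(sample)
print(document["first_name"], document["last_name"])  # John Smith
```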

Use a search request to find all employees:

GET /megacorp/employee/_search

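The response contains all three documents in the hits array. A search returns ten results by default. The response not only tells us which documents matched but also includes the documents themselves, which is all the information needed to display the search results to the end user: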

{
   "took":      6,
   "timed_out": false,
   "_shards": { ... },
   "hits": {
      "total":      3,
      "max_score":  1,
      "hits": [
         {
            "_index":         "megacorp",
            "_type":          "employee",
            "_id":            "3",
            "_score":         1,
            "_source": {
               "first_name":  "Douglas",
               "last_name":   "Fir",
               "age":         35,
               "about":       "I like to build cabinets",
               "interests": [ "forestry" ]
            }
         },
         {
            "_index":         "megacorp",
            "_type":          "employee",
            "_id":            "1",
            "_score":         1,
            "_source": {
               "first_name":  "John",
               "last_name":   "Smith",
               "age":         25,
               "about":       "I love to go rock climbing",
               "interests": [ "sports", "music" ]
            }
         },
         {
            "_index":         "megacorp",
            "_type":          "employee",
            "_id":            "2",
            "_score":         1,
            "_source": {
               "first_name":  "Jane",
               "last_name":   "Smith",
               "age":         32,
               "about":       "I like to collect rock albums",
               "interests": [ "music" ]
            }
         }
      ]
   }
}
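
Reading such a response in client code comes down to walking the nested hits array. The sketch below assumes the response shape shown above, with a numeric `hits.total`. `summarize_hits` is a hypothetical helper, and the sample dictionary is a trimmed copy of the response.

```python
def summarize_hits(response):
    """Return (total, [(id, full name), ...]) from a parsed search response."""
    hits = response["hits"]
    rows = []
    for hit in hits["hits"]:
        source = hit["_source"]
        full_name = source["first_name"] + " " + source["last_name"]
        rows.append((hit["_id"], full_name))
    return hits["total"], rows


# Trimmed copy of the search response shown above.
search_response = {
    "took": 6,
    "timed_out": False,
    "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
            {"_id": "3", "_score": 1, "_source": {"first_name": "Douglas", "last_name": "Fir"}},
            {"_id": "1", "_score": 1, "_source": {"first_name": "John", "last_name": "Smith"}},
            {"_id": "2", "_score": 1, "_source": {"first_name": "Jane", "last_name": "Smith"}},
        ],
    },
}

total, rows = summarize_hits(search_response)
print(total)  # 3
for doc_id, name in rows:
    print(doc_id, name)
```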

Searching for employees whose last name is Smith involves a query-string search, because the query is passed to the search API as a URL parameter:

GET /megacorp/employee/_search?q=last_name:Smith

Use the _search endpoint in the request path and assign the query itself to the q= parameter. The response returns all the Smiths:

{
   ...
   "hits": {
      "total":      2,
      "max_score":  0.30685282,
      "hits": [
         {
            ...
            "_source": {
               "first_name":  "John",
               "last_name":   "Smith",
               "age":         25,
               "about":       "I love to go rock climbing",
               "interests": [ "sports", "music" ]
            }
         },
         {
            ...
            "_source": {
               "first_name":  "Jane",
               "last_name":   "Smith",
               "age":         32,
               "about":       "I like to collect rock albums",
               "interests": [ "music" ]
            }
         }
      ]
   }
}
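
When the query-string search is issued from code instead of Kibana, the value of q should be URL-encoded. Here is a minimal sketch using Python's standard library. `build_search_url` is a hypothetical helper and `http://localhost:9200` is an assumed address for a local node.

```python
from urllib.parse import urlencode


def build_search_url(base_url, index, doc_type, query):
    """Build a query-string search URL for the _search endpoint."""
    # urlencode percent-encodes reserved characters such as ':' in the query.
    query_string = urlencode({"q": query})
    return f"{base_url}/{index}/{doc_type}/_search?{query_string}"


url = build_search_url("http://localhost:9200", "megacorp", "employee", "last_name:Smith")
print(url)  # http://localhost:9200/megacorp/employee/_search?q=last_name%3ASmith
```

The colon in `last_name:Smith` becomes `%3A` on the wire, and the server decodes it back before parsing the query.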
