Table of Contents
- Grey Mall - Advanced Distributed Part 1
- Full-Text Search - ElasticSearch
Grey Mall - Advanced Distributed Part 1
Gitee repo: https://gitee.com/lin_g_g_hui/grey_mall
Full-Text Search - ElasticSearch
Installing with Docker
1. Pull the images
docker pull elasticsearch:7.4.2
docker pull kibana:7.4.2  (Kibana is the visualization tool for Elasticsearch)
2. Create the instance
1. Create the external Elasticsearch config files
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >> /mydata/elasticsearch/config/elasticsearch.yml
Note: there must be a space before 0.0.0.0 (after the colon).
If the container fails to start with:
Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
check the logs with docker logs elasticsearch,
then grant read/write/execute to all users and groups:
chmod -R 777 /mydata/elasticsearch/
2. Run Elasticsearch
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2
Note: -e ES_JAVA_OPTS="-Xms64m -Xmx512m" caps the JVM heap.
Check available memory with:
free -m
3. Run Kibana
docker run --name kibana -e ELASTICSEARCH_URL=http://192.168.80.133:9200 -p 5601:5601 -d kibana:7.4.2
Note: replace the host with your own machine's address, e.g.
http://192.168.80.133:9200/
4. Verify by visiting host + port
http://192.168.80.133:9200/ -> success if JSON is returned
http://192.168.80.133:9200/_cat/nodes -> list the nodes
- Note: if the Kibana UI at http://192.168.80.133:5601/ reports an error connecting to Elasticsearch:
the URL used when running Kibana in step 3 must be the IP of the Elasticsearch container inside Docker. Look it up with:
docker inspect d66aba8770af | grep IPAddress
Result: the IP of container d66aba8770af is 172.17.0.4
Re-run Kibana:
docker run --name kibana -e ELASTICSEARCH_URL=http://172.17.0.4:9200 -p 5601:5601 -d kibana:7.4.2
Then adjust the Kibana config inside the container:
docker exec -it kibana /bin/bash
cd /usr/share/kibana/config/
vi kibana.yml
Change elasticsearch.hosts to your ES container's IP, and set
xpack.monitoring.ui.container.elasticsearch.enabled to false
3. Preliminary retrieval
You can use Postman: turn the GET etc. requests below into host + port URLs.
1. _cat
GET /_cat/nodes : list all nodes
GET /_cat/health : check ES health
GET /_cat/master : show the master node
GET /_cat/indices : list all indices  ==> show databases;
2. Index a document (save)
Saving a document means choosing which index and type it is stored under, and which unique id identifies it. For example, save document 1 under the external type of the customer index:
PUT customer/external/1
Document 1's body:
{
"name": "Wei-xhh"
}
Both PUT and POST work.
POST creates: if no id is given, one is auto-generated; if an id is given and already exists, that document is updated and the version number incremented.
PUT can create or update, but must always specify an id (omitting it is an error); since the id is required, PUT is usually used for updates.
Response:
{
"_index": "customer",
"_type": "external",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
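The PUT/POST semantics just described can be sketched as a toy in Python (purely illustrative: `index_doc` and its in-memory store are hypothetical, not any ES client API):

```python
import uuid

# In-memory stand-in for an index: maps _id -> (_version, _source).
_store = {}

def index_doc(method, source, doc_id=None):
    """Mimic ES indexing semantics: POST auto-generates a missing id,
    PUT requires one; re-indexing an existing id bumps _version."""
    if method == "PUT" and doc_id is None:
        raise ValueError("PUT requires an explicit document id")
    if doc_id is None:                       # POST without id -> new doc
        doc_id = uuid.uuid4().hex
    version = _store[doc_id][0] + 1 if doc_id in _store else 1
    _store[doc_id] = (version, source)
    return {"_id": doc_id, "_version": version,
            "result": "created" if version == 1 else "updated"}

r1 = index_doc("PUT", {"name": "Wei-xhh"}, "1")  # created, _version 1
r2 = index_doc("PUT", {"name": "Wei-xhh"}, "1")  # updated, _version 2
```

Real ES responses also carry _seq_no and _primary_term, which the next section uses for optimistic locking.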
3. Query a document
GET customer/external/1
Response:
{
"_index": "customer", // which index
"_type": "external", // which type
"_id": "1", // document id
"_version": 2, // version number
"_seq_no": 1, // concurrency-control field, incremented on every update; used for optimistic locking
"_primary_term": 1, // similar; changes when the primary shard is reassigned, e.g. after a restart
"found": true,
"_source": { // the actual content
"name": "Wei-xhh"
}
}
Updates can carry ?if_seq_no=0&if_primary_term=1
Optimistic locking under concurrency:
1. Xiao Ming updates document 1 ->
http://192.168.80.133:9200/customer/external/1?if_seq_no=0&if_primary_term=1
2. Xiao Hong updates document 1 ->
http://192.168.80.133:9200/customer/external/1?if_seq_no=0&if_primary_term=1
What happens:
Xiao Ming's update succeeds, and the document's seq_no is bumped automatically.
Xiao Hong, unaware that Xiao Ming already changed it, fails with error code 409.
Xiao Hong must then re-query document 1 to learn its current seq_no.
Having found seq_no=5, Xiao Hong resends:
http://192.168.80.133:9200/customer/external/1?if_seq_no=5&if_primary_term=1
and this time the update succeeds.
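That retry dance can be simulated with a toy document class (a sketch only; in reality the check happens server-side in ES and a stale write surfaces as HTTP 409):

```python
class Doc:
    """Toy document with ES-style optimistic concurrency control."""
    def __init__(self, source):
        self.source, self.seq_no, self.primary_term = source, 0, 1

    def update(self, source, if_seq_no, if_primary_term):
        # Reject writers holding a stale seq_no/primary_term, like ES's 409.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            return 409
        self.source = source
        self.seq_no += 1          # every successful write bumps seq_no
        return 200

doc = Doc({"name": "Wei-xhh"})
ming = doc.update({"name": "ming"}, if_seq_no=0, if_primary_term=1)   # 200
hong = doc.update({"name": "hong"}, if_seq_no=0, if_primary_term=1)   # 409
retry = doc.update({"name": "hong"}, if_seq_no=doc.seq_no,
                   if_primary_term=1)        # 200 after re-reading seq_no
```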
4. Update a document
POST customer/external/1/_update
This compares with the existing document; if nothing changed, nothing is done:
{
"doc":{
"name":"wei-xhh6666"
}
}
Result: when the document is unchanged, "result": "noop"
and _version and _seq_no do not change:
{
"_index": "customer",
"_type": "external",
"_id": "1",
"_version": 5,
"result": "noop",
"_shards": {
"total": 0,
"successful": 0,
"failed": 0
},
"_seq_no": 7,
"_primary_term": 1
}
Or:
POST customer/external/1
which does not compare with the existing document:
{
"name":"wei-xhh666"
}
Or:
PUT customer/external/1
which likewise does not compare with the existing document:
{
"name":"wei-xhh66"
}
An update can also add new properties at the same time.
5. Delete a document & index
DELETE customer/external/1
DELETE customer
6. The _bulk batch API
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"wei-xhh"}
{"index":{"_id":"2"}}
{"name":"wei-xhh66"}
Syntax:
{action: {metadata}}\n
{request body}\n
{action: {metadata}}\n
{request body}\n
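That line-oriented (NDJSON) format can be generated mechanically; a sketch (the `bulk_body` helper is hypothetical; official clients ship their own bulk helpers):

```python
import json

def bulk_body(pairs):
    """Build an NDJSON _bulk body from (action, doc) pairs. `doc` is None
    for actions like delete that carry no request body. The API requires
    a trailing newline after the last line."""
    lines = []
    for action, doc in pairs:
        lines.append(json.dumps(action))
        if doc is not None:
            lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

body = bulk_body([
    ({"index": {"_id": "1"}}, {"name": "wei-xhh"}),
    ({"index": {"_id": "2"}}, {"name": "wei-xhh66"}),
])
del_body = bulk_body([({"delete": {"_id": "3"}}, None)])
```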
A more complex example:
POST /_bulk
{ "delete":{ "_index":"website", "_type":"blog", "_id":"123"}}
{ "create":{ "_index":"website", "_type":"blog", "_id":"123"}}
{ "title":"my first blog post"}
{ "index":{ "_index":"website", "_type":"blog"}}
{ "title":"my second blog post"}
{ "update":{ "_index":"website", "_type":"blog", "_id":"123"}}
{ "doc":{ "title":"my updated blog post"}}
7. Sample test data
https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json
The link may be unreachable;
I have a copy of the data, so message me if you cannot access it.
POST /bank/account/_bulk
4. Advanced retrieval
1. Search API
ES supports two basic ways of searching:
- sending the search parameters in the REST request URI (uri + search params)
- sending them in the REST request body (uri + request body)
(Tip: to make the container restart automatically after a reboot: docker update <container-id> --restart=always)
The first way:
GET bank/_search?q=*&sort=account_number:asc
The second way:
GET bank/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"account_number": "asc"
},
{
"balance": "desc"
}
]
}
2. Query DSL (the domain-specific query language)
1. Typical structure of a query statement:
{
QUERY_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
- When targeting a particular field, the structure is:
{
QUERY_NAME:{
FIELD_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
}
Example:
GET bank/_search
{
"query": {"match_all": {}},
"sort": [
{
"balance": {
"order": "asc"
}
}
],
"from": 5,
"size": 3,
"_source": ["balance","age"] // return only these fields
}
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "749",
"_score" : null,
"_source" : {
"balance" : 1249,
"age" : 36
},
"sort" : [
1249
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "402",
"_score" : null,
"_source" : {
"balance" : 1282,
"age" : 32
},
"sort" : [
1282
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "315",
"_score" : null,
"_source" : {
"balance" : 1314,
"age" : 33
},
"sort" : [
1314
]
}
]
}
}
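The from/size window in the request above is how page-style paging is usually built; a minimal sketch (`search_body` and its zero-based page numbering are assumptions, not an ES API):

```python
def search_body(page, size, sort_field="balance", order="asc", fields=None):
    """Build a request body for page-style paging: ES skips `from`
    documents and returns the next `size` (pages numbered from 0)."""
    body = {
        "query": {"match_all": {}},
        "sort": [{sort_field: {"order": order}}],
        "from": page * size,
        "size": size,
    }
    if fields:                 # _source limits which fields come back
        body["_source"] = fields
    return body

b = search_body(page=2, size=3, fields=["balance", "age"])  # docs 6..8
```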
2. match
Exact-value query (on a non-text field):
GET bank/_search
{
"query": {
"match": {
"account_number": "20"
}
}
}
Full-text query -> the query string is analyzed and matched term by term:
GET bank/_search
{
"query": {
"match": {
"address": "Kings"
}
}
}
3. match_phrase -> phrase matching (a stricter form of match: the query is matched as one phrase rather than as independent terms)
// phrase matching, not term-by-term
GET bank/_search
{
"query": {
"match_phrase": {
"address": "mill lane"
}
}
}
Result:
{
"took" : 1058,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 9.507477,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 9.507477,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "[email protected]",
"city" : "Urie",
"state" : "IL"
}
}
]
}
}
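The contrast between match and match_phrase can be imitated with a deliberately crude in-memory analyzer (a toy stand-in for ES's real analysis chain, not how ES is implemented):

```python
def analyze(text):
    """Crude stand-in for the standard analyzer: lowercase and split."""
    return text.lower().replace("-", " ").split()

def match(field_value, query):
    # match: the query is analyzed too, and any shared term is a hit.
    return any(t in analyze(field_value) for t in analyze(query))

def match_phrase(field_value, query):
    # match_phrase: all query terms must appear contiguously, in order.
    doc, q = analyze(field_value), analyze(query)
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))

addr = "198 Mill Lane"
m   = match(addr, "mill road")          # True: "mill" alone matches
mp  = match_phrase(addr, "mill road")   # False: no contiguous "mill road"
mp2 = match_phrase(addr, "mill lane")   # True
```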
4. multi_match: match across multiple fields
GET bank/_search
{
"query": {
"multi_match": {
"query": "mill movice",
"fields": ["address","city"]
}
}
}
Result:
{
"took" : 12,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 5.4032025,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "[email protected]",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 5.4032025,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "[email protected]",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "[email protected]",
"city" : "Movico",
"state" : "MT"
}
}
]
}
}
5. bool compound queries
A compound query can combine any other query clauses, including other compound ones. This means compound clauses can be nested inside each other and express very complex logic.
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "F"
}
},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"age": "18"
}
}
],
"should": [
{
"match": {
"lastname": "Wallace"
}
}
]
}
}
}
Result:
{
"took" : 109,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.1104345,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 6.1104345,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "[email protected]",
"city" : "Movico",
"state" : "MT"
}
}
]
}
}
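Clause lists like these are easy to assemble programmatically; a sketch (the `bool_query` helper is hypothetical):

```python
def bool_query(must=None, must_not=None, should=None, filter_=None):
    """Assemble a bool compound query, dropping empty clause lists.
    (`filter_` has a trailing underscore only to dodge the Python builtin.)"""
    clauses = {"must": must, "must_not": must_not,
               "should": should, "filter": filter_}
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}

q = bool_query(
    must=[{"match": {"gender": "F"}}, {"match": {"address": "mill"}}],
    must_not=[{"match": {"age": "18"}}],
    should=[{"match": {"lastname": "Wallace"}}],
)
```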
6. filter: result filtering
Not every query needs to produce a relevance score, especially clauses used only for filtering documents. To avoid computing scores unnecessarily, Elasticsearch automatically detects these situations and optimizes query execution.
GET bank/_search
{
"query": {
"bool": {
"filter": {
"range": {
"age": {
"gte": 19,
"lte": 30
}
}
}
}
}
}
7. term, similar to match
For full-text retrieval, prefer match -> use on text fields.
For exact-value retrieval, prefer term -> use on non-text fields.
GET bank/_search
{
"query": {
"term": {
"age": "28"
}
}
}
Exact matching with match on the keyword sub-field:
GET bank/_search
{
"query": {
"match": {
"address.keyword": "789 Madison Street"
}
}
}
8. aggregations (running aggregations)
Aggregations provide the ability to group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a search can return hits and aggregation results at the same time, in a single response: you can run one query plus several aggregations and get all of their results back in one concise round trip, avoiding extra network hops.
- Example: search for everyone whose address contains "mill", returning their age distribution and average age. (To suppress the hit details as well, you would add "size": 0 to the request.)
GET bank/_search
{
"query": {
"match": {
"address": "mill"
}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": {
"field": "age"
}
}
}
}
Result:
{
"took" : 4643,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "[email protected]",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 5.4032025,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "[email protected]",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 5.4032025,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "[email protected]",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "[email protected]",
"city" : "Movico",
"state" : "MT"
}
}
]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 38,
"doc_count" : 2
},
{
"key" : 28,
"doc_count" : 1
},
{
"key" : 32,
"doc_count" : 1
}
]
},
"ageAvg" : {
"value" : 34.0
}
}
}
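As a sanity check, the ageAvg metric can be recomputed by hand from the ageAgg terms buckets in the response above:

```python
# Buckets exactly as returned by the ageAgg terms aggregation above.
buckets = [{"key": 38, "doc_count": 2},
           {"key": 28, "doc_count": 1},
           {"key": 32, "doc_count": 1}]

total_docs = sum(b["doc_count"] for b in buckets)               # 4 hits
weighted_sum = sum(b["key"] * b["doc_count"] for b in buckets)  # 38*2 + 28 + 32
age_avg = weighted_sum / total_docs                             # 34.0, matching ageAvg
```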
Aggregations can be nested inside one another (the same principle applies).
3. Mapping
1. Field types (and the removal of mapping types)
In 7.x the type is optional;
in 8.0 it will be removed entirely.
Documents are then stored directly under an index.
Dropping types improves the efficiency with which ES processes data.
2. Mapping
A mapping defines how a document, and the properties it contains, are stored and indexed.
For example, a mapping specifies:
- which string properties should be treated as full-text fields
- which properties contain numbers, dates, or geo-locations
- whether all properties of the document can be indexed
- the format of dates
- custom rules for dynamically added properties
View the mapping info:
GET bank/_mapping
Modify the mapping info (see below).
3. Under the new (typeless) versions
- Create a mapping
Define the property types for the my_index index:
PUT /my_index
{
"mappings": {
"properties": {
"age": {"type": "integer"},
"email":{"type": "keyword"},
"name":{"type": "text"}
}
}
}
- Add a new field mapping
PUT /my_index/_mapping
{
"properties": {
"employee-id": {
"type": "keyword",
"index": false
}
}
}
- Update a mapping
An existing mapped field cannot be updated in place; to change it you must create a new index and migrate the data.
- Data migration
First create the target index (e.g. new_twitter) with the correct mapping, then migrate the data as follows.
Example: changing the mapping under bank.
- Create the new index, specifying the mapping rules:
PUT /newbank
{
"mappings": {
"properties": {
"account_number": {
"type": "long"
},
"address": {
"type": "text"
},
"age": {
"type": "integer"
},
"balance": {
"type": "long"
},
"city": {
"type": "keyword"
},
"email": {
"type": "keyword"
},
"employer": {
"type": "keyword"
},
"firstname": {
"type": "text"
},
"gender": {
"type": "text"
},
"lastname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"state": {
"type": "keyword"
}
}
}
}
Result:
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "newbank"
}
- Migrate the data:
POST _reindex
{
"source": {
"index": "bank",
"type": "account"
},
"dest": {
"index": "newbank"
}
}
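Whether a change actually forces this create-and-reindex dance can be spotted by diffing the two mappings (a hypothetical helper, assuming flat single-level properties):

```python
def changed_fields(old_props, new_props):
    """Return fields whose type differs between two mappings' properties;
    such changes cannot be applied in place and require _reindex."""
    return sorted(
        f for f in old_props
        if f in new_props and old_props[f]["type"] != new_props[f]["type"]
    )

old = {"age": {"type": "text"}, "email": {"type": "keyword"}}
new = {"age": {"type": "integer"}, "email": {"type": "keyword"}}
needs_reindex = changed_fields(old, new)   # ["age"]
```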
4. Tokenization (analysis)
A tokenizer receives a stream of characters, splits it into independent tokens (usually individual words), and outputs a stream of tokens.
For example, the whitespace tokenizer splits text on whitespace: it turns the text "Quick brown fox!" into [Quick, brown, fox!].
The tokenizer is also responsible for recording the order, or position, of each term (used for phrase and word-proximity queries), and the start and end character offsets of the original word each term represents (used for highlighting matched text).
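The whitespace example above, including positions and character offsets, can be reproduced with a short sketch (a simplification of real tokenizer output):

```python
import re

def whitespace_tokenize(text):
    """Emit tokens with position and character offsets, in the shape of
    (simplified) whitespace-tokenizer output."""
    return [
        {"token": m.group(), "position": i,
         "start_offset": m.start(), "end_offset": m.end()}
        for i, m in enumerate(re.finditer(r"\S+", text))
    ]

tokens = whitespace_tokenize("Quick brown fox!")
# -> Quick (0..5), brown (6..11), fox! (12..16)
```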
- The standard tokenizer
POST _analyze
{
"tokenizer": "standard",
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
- Install your own tokenizer (ik)
http://github.com/medcl/elasticsearch-analysis-ik/
Download the release matching your ES version (copying the download link into a download manager can make it much faster).
Enter the container:
docker exec -it <container-id> /bin/bash
- Unzip the ik archive:
unzip elasticsearch-analysis-ik-7.4.2.zip
Fix the permissions:
chmod -R 777 ik/
Then restart elasticsearch.
- Using ik
ik_smart
POST _analyze
{
"tokenizer": "ik_smart",
"text": "歡迎您的到來"
}
Result:
{
"tokens" : [
{
"token" : "歡迎您",
"start_offset" : 0,
"end_offset" : 3,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "的",
"start_offset" : 3,
"end_offset" : 4,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "到來",
"start_offset" : 4,
"end_offset" : 6,
"type" : "CN_WORD",
"position" : 2
}
]
}
ik_max_word
POST _analyze
{
"tokenizer": "ik_max_word",
"text": "歡迎您的到來"
}
Extras: install wget and unzip
yum install wget
yum install unzip
- Custom dictionary
Edit IKAnalyzer.cfg.xml under /usr/share/elasticsearch/plugins/ik/config
(you can edit the externally mounted file directly),
then restart the container. If your test then fails, see the last point of step 5 below.
5. Install nginx (to host the custom dictionary)
- Start a throwaway nginx instance, just to copy its configuration out:
docker run -p 80:80 --name nginx -d nginx:1.10
- Copy the config out of the container into the current directory:
docker container cp nginx:/etc/nginx .
(note the trailing dot, with a space before it)
- Rename the directory: mv nginx conf, then move conf under /mydata/nginx
- Stop the original container: docker stop nginx
- Remove it: docker rm <container-id>
- Create the new nginx:
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10
- Visit the host address
Create index.html under /mydata/nginx/html
- Create the dictionary text file:
mkdir es
vi fenci.txt
Then check that it is served:
http://192.168.80.133/es/fenci.txt
- Caveat: just as Kibana earlier had to reach Elasticsearch by its Docker-assigned IP, my Docker setup required the container IP rather than the host IP in IKAnalyzer.cfg.xml.
With that change, the configuration succeeds.
6. Elasticsearch-Rest-Client
1. 9300: TCP
- spring-data-elasticsearch:transport-api.jar
- tied to the Spring Boot version
- discouraged since 7.x and slated for removal after 8
2. 9200: HTTP
- JestClient: unofficial; updated slowly
- RestTemplate: simulates HTTP requests, but many ES operations must be wrapped by hand, which is tedious
- HttpClient: same as above
- Elasticsearch-Rest-Client: the official RestClient; it wraps the ES operations, the API is clearly layered, and it is easy to pick up
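Since port 9200 speaks plain HTTP, any HTTP client can talk to ES directly. A minimal sketch with Python's standard urllib (host, index, and query are placeholders; this only builds the request object, and actually sending it requires a running cluster):

```python
import json
import urllib.request

def search_request(host, index, body):
    """Prepare (but do not send) a _search request for the 9200 HTTP API."""
    return urllib.request.Request(
        url=f"http://{host}:9200/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # POST is accepted for _search; GET-with-body is awkward in many clients
    )

req = search_request("192.168.80.133", "bank",
                     {"query": {"match": {"address": "mill"}}})
# urllib.request.urlopen(req) would execute it against a live cluster
```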