Elasticsearch: Speeding Up Slow Highlighting on Large Fields

Original

2020-02-22 01:52

When designing the mapping for a large field, add the term_vector parameter, as follows:

"description": {
  "similarity": "customize_bm25",
  "type": "text",
  "store": true,
  "analyzer": "my_jieba_index_analyzer",
  "search_analyzer": "my_jieba_search_analyzer",
  "term_vector": "with_positions_offsets"
}

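With this parameter configured, highlighting becomes noticeably faster, because the positions and offsets needed for highlighting are stored at index time instead of being recomputed by re-analyzing the field text on every request.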
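To observe the effect, a search request along the following lines could be used. This is a minimal sketch: the query text is a placeholder, and the choice of the `fvh` highlighter is an assumption, since the post does not say which highlighter was used. `fvh` requires `term_vector` to be set to `with_positions_offsets`; the default `unified` highlighter also uses term vectors automatically when they are available.

```json
{
  "query": {
    "match": {
      "description": "placeholder query text"
    }
  },
  "highlight": {
    "fields": {
      "description": {
        "type": "fvh"
      }
    }
  }
}
```

Comparing the `took` value of the response before and after adding `term_vector` gives a rough idea of the gain.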

However, certain query terms may then trigger an error.

The error is caused by the way Lucene handles whitespace in the field.

Solution: remove the whitespace terms with a token filter.

While adding the whitespace filter, however, I ran into a problem: with the jieba analyzer, even the following stop filter could not remove the whitespace terms:

"my_stop_filter": {
  "ignore_case": "true",
  "type": "stop",
  "stopwords": [
    " ",
    "的",
    "得",
    "地"
  ]
}
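One way to check which terms an analyzer actually emits is the `_analyze` API. A minimal sketch, assuming the body below is sent to `GET /<index>/_analyze` on the index that defines the analyzer (the index name and the sample text are placeholders, not taken from the post):

```json
{
  "analyzer": "my_jieba_index_analyzer",
  "text": "placeholder text with spaces"
}
```

If the response still contains a token whose `token` value is a single space, the whitespace term is getting past the stop filter.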

With the ik analyzer the filter does work, so I switched to ik and defined two analyzers:

"my_ik_index_analyzer": {
  "filter": [
    "my_stop_filter"
  ],
  "type": "custom",
  "tokenizer": "ik_max_word"
},
"my_ik_search_analyzer": {
  "filter": [
    "my_stop_filter"
  ],
  "type": "custom",
  "tokenizer": "ik_smart"
}
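For context, token filters and analyzers are declared under `settings.analysis` in the index settings. The following is a sketch of how the filter and the two ik analyzers might fit together in the body of an index-creation request, assuming the ik plugin is installed. The `customize_bm25` similarity referenced by the mapping is not shown in the post and is omitted here.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stop_filter": {
          "type": "stop",
          "ignore_case": true,
          "stopwords": [" ", "的", "得", "地"]
        }
      },
      "analyzer": {
        "my_ik_index_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["my_stop_filter"]
        },
        "my_ik_search_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["my_stop_filter"]
        }
      }
    }
  }
}
```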

The mapping for the large field is then defined as follows:

"description": {
  "similarity": "customize_bm25",
  "type": "text",
  "store": true,
  "analyzer": "my_ik_index_analyzer",
  "search_analyzer": "my_ik_search_analyzer",
  "term_vector": "with_positions_offsets"
}

With this in place, the error described above no longer occurs.
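To confirm that no whitespace terms remain, the stored term vectors of a document can be inspected with the `_termvectors` API. A minimal sketch, assuming the body below is sent to `GET /<index>/_termvectors/<id>` (the index name and document ID are placeholders) after the documents have been reindexed with the new analyzer:

```json
{
  "fields": ["description"],
  "positions": true,
  "offsets": true,
  "term_statistics": false,
  "field_statistics": false
}
```

The response lists each stored term of `description` together with its positions and character offsets, so a leftover whitespace term would show up directly.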

done......

yingchenwy
