Integrating Spring Boot with Elasticsearch 5.x and the IK analyzer for full-text search

I will cover this in three parts:

Part 1: setting up an Elasticsearch environment on Windows, plus the supporting plugins.

Part 2: integrating Spring Boot with Elasticsearch (which gives some tokenization capability).

Part 3: integrating Spring Boot with Elasticsearch and the IK analyzer for full-field search (full tokenization).

(My second article, on how Spring Boot 2.x creates Elasticsearch indices, is more practical: it drops the Postman-based way of creating indices, and the project's GitHub address is attached there.)

Part 1: Installing and deploying Elasticsearch and elasticsearch-head on Windows

Now let's install and deploy Elasticsearch 5.5.3 directly on Windows.

The official Elasticsearch download page: https://www.elastic.co/cn/downloads/elasticsearch

The steps are as follows:

From the download page, fetch the zip archive for the matching version.

Unzip it to some path on disk; the path must not contain Chinese characters.

Enter the bin directory and double-click the elasticsearch.bat file to start Elasticsearch. You can also start it from a console by typing elasticsearch.bat and pressing Enter.

Once startup finishes, open localhost:9200 in the browser. If you get a response like the one below, Elasticsearch 5.5.3 has been installed and started successfully.
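The root endpoint returns a small JSON document. A trimmed sketch of a typical 5.5.3 response is below (the name, cluster_name, and build fields will differ on your machine):

```
{
  "name" : "node-es",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "5.5.3",
    "lucene_version" : "6.6.0"
  },
  "tagline" : "You Know, for Search"
}
```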

Next, configure Elasticsearch 5.5.3 so that elasticsearch-head can access it. elasticsearch-head is a plugin that visualizes the data in Elasticsearch.

To use the elasticsearch-head plugin on Windows, you need a Node.js environment.

Install Node.js. Download: https://nodejs.org/en/download/

Install grunt:

  • grunt is a convenient build tool that handles packaging, compression, testing, task running and so on; in 5.x the head plugin is started via grunt, so grunt must be installed.
  • Note: switch to the Node.js installation directory [mine is C:\Program Files\nodejs] and run: npm install -g grunt-cli
  • -g means install globally
  • Check the version number with: grunt --version

elasticsearch-head download page: https://github.com/mobz/elasticsearch-head

Again, download the zip package and extract it to a chosen path with no Chinese characters. (I put both elasticsearch and elasticsearch-head under the E: drive.)

①: Modify the elasticsearch-head configuration. Find Gruntfile.js under elasticsearch-head-master and add hostname: '0.0.0.0':

connect: {
    server: {
        options: {
            hostname: '0.0.0.0',
            port: 9100,
            base: '.',
            keepalive: true
        }
    }
}

②: Modify Elasticsearch's configuration. Go into Elasticsearch's config directory and edit elasticsearch.yml.

cluster.name: elasticsearch
node.name: node-es

network.host: 127.0.0.1
http.port: 9200
transport.tcp.port: 9300

# Add these new parameters so the head plugin can access es
http.cors.enabled: true
http.cors.allow-origin: "*"

All seven configuration lines above need to be added to elasticsearch.yml; I have deliberately split them into three groups. The settings in the first group can be seen directly on the page returned after Elasticsearch starts. The last two, http.cors.enabled: true and http.cors.allow-origin: "*", enable cross-origin access for the head plugin and must not be omitted.

Restart Elasticsearch. Then, in the elasticsearch-head-master directory, run: grunt server to start the head plugin.

Then open localhost:9100 in the browser.

At this point, the installation and configuration of elasticsearch and elasticsearch-head are complete.

For installing and configuring elasticsearch and elasticsearch-head I recommend this write-up, which my own configuration follows: https://www.cnblogs.com/puke/p/7145687.html?utm_source=itdadao&utm_medium=referral

 

Part 2: Integrating Spring Boot with Elasticsearch

A look at the Spring Boot documentation shows three ways to integrate with Elasticsearch:

  1. Connecting to Elasticsearch by REST clients
  2. Connecting to Elasticsearch by Using Jest
  3. Connecting to Elasticsearch by Using Spring Data

Here I use the third option, Spring Data Elasticsearch.

Create a new Spring Boot project; my Spring Boot version is 2.0.0.

My pom.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <!-- You can change your Spring Boot version right here; my project was generated with 2.1.6 and I changed it to 2.0.0 -->
        <version>2.0.0.RELEASE</version>     
        <relativePath/>
    </parent>
    <groupId>com.gosuncn</groupId>
    <artifactId>esdemo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>esdemo</name>
    <description>Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
        <elasticsearch.version>5.5.3</elasticsearch.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>transport</artifactId>
            <version>5.5.3</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

Then check the version of the elasticsearch jar that Spring Boot imports.

Spring Data Elasticsearch must be version-matched with Elasticsearch; if the versions don't match, the client cannot connect.

Attentive readers may have noticed that my Elasticsearch already contains a local_library index. We will use that index as the example.

First, let me explain how my local_library index was created: with Postman, submit a PUT request to create the local_library index and the book type under it.
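The request itself is a single PUT to the index URL (the JSON mapping shown below goes in the request body; host and port are the defaults configured earlier):

```
PUT localhost:9200/local_library
```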

The content of the request body is as follows: under the local_library index, it creates a book type.

{
    "mappings":{
        "book":{
            "properties":{
                "book_id":{
                    "type":"long",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "book_code":{
                    "type":"text",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "book_name":{
                    "type":"text",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "book_price":{
                    "type":"integer",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "book_author":{
                    "type":"text",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "book_desc":{
                    "type":"text",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                }
            }
        }
    }
}

You can check the result in elasticsearch-head.

Now that we have an index and a type, we can start inserting data. There are two ways to insert data: the first uses Postman, the second inserts from a test class in the Spring Boot project.

①: An example of inserting data with Postman:
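A request of that shape might look like the following; the document values here are illustrative, taken from the book data that appears in the query results later:

```
PUT localhost:9200/local_library/book/1

{
    "book_id": 1,
    "book_code": "A0001",
    "book_name": "琴帝",
    "book_price": 180,
    "book_author": "唐家三少",
    "book_desc": "最好看的玄幻小說!"
}
```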

The effect of the insert can be seen in the data screenshot above.

②: Inserting data with the entity class (covering both inserts and queries)

Create a Library entity class (note: the entity class uses the Lombok plugin):

@Data
@NoArgsConstructor
@AllArgsConstructor
@Document(indexName = "local_library", type = "book")
public class Library {
    /**
     *     index: whether the field is analyzed
     *     analyzer: analyzer used when storing (indexing)
     *          ik_max_word
     *          ik_smart
     *     searchAnalyzer: analyzer used when searching
     *     store: whether the field is stored
     *     type: the field's data type
     */

    @Id
    private Integer book_id;
    private String book_code;    
    private String book_name;   
    private Integer book_price;   
    private String book_author;    
    private String book_desc;

}

Then write a LibraryRepository (the ID generic matches the entity's Integer @Id field):

@Repository
public interface LibraryRepository extends ElasticsearchRepository<Library, Integer> {
}

Then insert data in the test class:

import com.gosuncn.esdemo.domin.Library;
import com.gosuncn.esdemo.repository.LibraryRepository;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.QueryStringQueryBuilder;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Sort;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.test.context.junit4.SpringRunner;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

@RunWith(SpringRunner.class)
@SpringBootTest
public class EsdemoApplicationTests {

    @Autowired
    ElasticsearchTemplate elasticsearchTemplate;
    @Autowired
    LibraryRepository libraryRepository;

    /**
     * Insert data
     */
    @Test
    public void testInsert(){
        libraryRepository.save(new Library(42, "A00042", "明史簡述", 59, "吳晗", "吳晗背景uniworsity厲害"));
        libraryRepository.save(new Library(43, "A00043", "傅雷家書", 99, "傅聰", "都是NB,class大家u"));
        libraryRepository.save(new Library(24, "A00942", "時間簡史", 169, "霍金", "教授宇宙大爆發的59年曆史"));
        libraryRepository.save(new Library(25, "A00925", "我的前半生", 39, "方舟89子", "都是生活,每晚9點"));
        libraryRepository.save(new Library(29, "A00029", "圍9城", 139, "錢鍾書", "你想出城?不存在的"));
    }
}

That is the second way of inserting data.

Now that the data is in place, we can query it. The query code is as follows:

    // Query across all fields, without paging
    @Test
    public void testSearch(){
        try {
            String searchStr = "三西阿";
            QueryStringQueryBuilder builder = new QueryStringQueryBuilder(searchStr);
            Iterable<Library> search = libraryRepository.search(builder);
            Iterator<Library> iterator = search.iterator();
            while (iterator.hasNext()){
                System.out.println("--> 數據:"+iterator.next());
            }
        }catch (Exception e){
            System.out.println("---> 異常信息: "+e);
        }
    }
    // Query across all fields, with paging
    @Test
    public void testSearchByPage(){
        try {
            String searchStr = "三西阿";
            QueryStringQueryBuilder builder = new QueryStringQueryBuilder(searchStr);
            Iterable<Library> search = libraryRepository.search(builder, PageRequest.of(0,2));
            Iterator<Library> iterator = search.iterator();
            while (iterator.hasNext()){
                System.out.println("--> 數據:"+iterator.next());
            }
        }catch (Exception e){
            System.out.println("---> 異常信息: "+e);
        }
    }

The result of running the first test:

--> 數據:Library{book_id=3, book_code='A0003', book_name='西遊記', book_price=99, book_author='吳承恩', book_desc='冒險小說!'}
--> 數據:Library{book_id=6, book_code='A0006', book_name='歡樂頌', book_price=399, book_author='阿耐', book_desc='都是生活真實隱射!'}
--> 數據:Library{book_id=7, book_code='A0007', book_name='都挺好', book_price=299, book_author='阿耐', book_desc='折射現實家庭矛盾!'}
--> 數據:Library{book_id=5, book_code='A0005', book_name='三國演義', book_price=198, book_author='羅貫中', book_desc='三國霸王遊戲!'}
--> 數據:Library{book_id=18, book_code='A00018', book_name='編譯原理', book_price=79, book_author='趙建華', book_desc='三十多小時,98.5'}
--> 數據:Library{book_id=1, book_code='A0001', book_name='琴帝', book_price=180, book_author='唐家三少', book_desc='最好看的玄幻小說!'}
--> 數據:Library{book_id=12, book_code='A00012', book_name='三重門', book_price=69, book_author='韓寒', book_desc='這是一個批評現實的真實故事'}

Part 3: Using the IK analyzer

Everything before this was the easy part; this is where the real work of full-text search begins.

3.1 Why use the IK analyzer

The IK analyzer splits your query string into individual terms. For example, for the query 「我愛中華人民共和國」, tokenizing with ik_max_word yields: 【「我」, 「愛」, 「中華人民共和國」, 「中華人民」, 「中華」, 「華人」, 「人民共和國」, 「人民」, 「共和國」, 「共和」, 「國」】. See the result below (obtained with Postman).

The response body is as follows:

{
    "tokens": [
        {
            "token": "我",
            "start_offset": 0,
            "end_offset": 1,
            "type": "CN_CHAR",
            "position": 0
        },
        {
            "token": "愛",
            "start_offset": 1,
            "end_offset": 2,
            "type": "CN_CHAR",
            "position": 1
        },
        {
            "token": "中華人民共和國",
            "start_offset": 2,
            "end_offset": 9,
            "type": "CN_WORD",
            "position": 2
        },
        {
            "token": "中華人民",
            "start_offset": 2,
            "end_offset": 6,
            "type": "CN_WORD",
            "position": 3
        },
        {
            "token": "中華",
            "start_offset": 2,
            "end_offset": 4,
            "type": "CN_WORD",
            "position": 4
        },
        {
            "token": "華人",
            "start_offset": 3,
            "end_offset": 5,
            "type": "CN_WORD",
            "position": 5
        },
        {
            "token": "人民共和國",
            "start_offset": 4,
            "end_offset": 9,
            "type": "CN_WORD",
            "position": 6
        },
        {
            "token": "人民",
            "start_offset": 4,
            "end_offset": 6,
            "type": "CN_WORD",
            "position": 7
        },
        {
            "token": "共和國",
            "start_offset": 6,
            "end_offset": 9,
            "type": "CN_WORD",
            "position": 8
        },
        {
            "token": "共和",
            "start_offset": 6,
            "end_offset": 8,
            "type": "CN_WORD",
            "position": 9
        },
        {
            "token": "國",
            "start_offset": 8,
            "end_offset": 9,
            "type": "CN_CHAR",
            "position": 10
        }
    ]
}

A further note on the difference between ik_max_word and ik_smart:

ik_max_word splits the text at the finest granularity.
ik_smart splits the text at the coarsest granularity.

ik_max_word and ik_smart are two different analyzers.
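As a quick comparison (runnable once the IK plugin from section 3.2 is installed; in 5.x the parameters may go directly on the URL), analyzing the same sentence with ik_smart yields only the coarsest split, roughly 「我」, 「愛」, 「中華人民共和國」:

```
GET localhost:9200/_analyze?analyzer=ik_smart&pretty=true&text=我愛中華人民共和國
```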

3.2 Installing the IK analyzer

Downloading and installing the IK analyzer (note: the IK analyzer version must exactly match the Elasticsearch version).

GitHub download address: https://github.com/medcl/elasticsearch-analysis-ik/releases

Find the 5.5.3 version.

Download the elasticsearch-analysis-ik-5.5.3.zip package and unzip it; the extracted folder contains the plugin files.

Next, copy all of those files into an ik folder that you create yourself under Elasticsearch's plugins directory. (Why create a dedicated ik folder? Over time you will add many plugins to Elasticsearch, and every plugin lives under plugins; giving each plugin its own folder keeps the directory layout clear.)

Then restart Elasticsearch.

If the startup log shows the ik plugin being loaded, the plugin was installed successfully.

To verify: once Elasticsearch has started, enter the following directly in the browser: http://localhost:9200/_analyze?analyzer=ik_max_word&pretty=true&text=我愛中華人民共和國

Note:

If the IK plugin is a 6.x version, you can only test it with Postman, and the query parameters must go in the request body, as below.
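The body form of the request looks like this (reusing the earlier sample sentence):

```
POST localhost:9200/_analyze

{
  "analyzer": "ik_max_word",
  "text": "我愛中華人民共和國"
}
```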

If instead you put the query parameters directly on the URL, for example: http://localhost:9200/_analyze?analyzer=ik_max_word&pretty=true&text=大膽愛,不用地阿達女,加

the following exception is thrown:

{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "request body or source parameter is required"
      }
    ],
    "type": "parse_exception",
    "reason": "request body or source parameter is required"
  },
  "status": 400
}

3.3 Custom analyzers

The ik_max_word and ik_smart analyzers described above still don't give the effect I want. For example, suppose I query for 「cla好好學習,15成功」.

Browser input: http://localhost:9200/_analyze?analyzer=ik_max_word&pretty=true&text=cla好好學習,15成功

I want the query split into 【"c", "l", "a", "好", "好", "學", "習", ",", "1", "5", "成", "功"】 — single letters, single characters, single digits, the smallest possible granularity. ik_max_word cannot satisfy that requirement, so we define our own analyzer.

We use NGram to define the analyzer. Official docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/analysis-ngram-tokenizer.html

The content at that link is important; if you can't read the English, a browser translation plugin can render it in Chinese.

Note this sentence: "They are useful for querying languages that don’t use spaces or that have long compound words, like German."

I'll use the official example directly. The input to analyze is "Quick Fox":

POST _analyze
{
  "tokenizer": "ngram",
  "text": "Quick Fox"
}

The tokenization result is:

[ Q, Qu, u, ui, i, ic, c, ck, k, "k ", " ", " F", F, Fo, o, ox, x ]

The ngram tokenizer takes three parameters:

①: min_gram: the minimum gram length; defaults to 1.

②: max_gram: the maximum gram length; defaults to 2.

③: token_chars: the character classes that are allowed inside a token; any character not in one of the listed classes becomes a split point. Five classes are available: letter, digit, whitespace, punctuation, symbol.
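To make min_gram/max_gram concrete, here is a small plain-Java sketch (my own illustration, not Elasticsearch code; the class and method names are made up) of what the ngram tokenizer emits for "Quick Fox" with min_gram=1, max_gram=2 and every token_chars class enabled:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of ngram tokenization: at every start position, emit all
// substrings whose length lies between minGram and maxGram.
public class NgramSketch {

    public static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                tokens.add(text.substring(start, start + len));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Emits the same 17 grams listed above: Q, Qu, u, ui, ..., ox, x
        System.out.println(ngrams("Quick Fox", 1, 2));
    }
}
```

In real Elasticsearch, token_chars additionally controls splitting: with all five classes listed (as in the index definition that follows), every character is kept and the output matches this sketch; drop whitespace from the list and the space becomes a split point instead.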

With those parameters understood, let's build the example. (The example defines a custom analyzer called myanalyzer, and creates the my_user index with a user type.)

PUT localhost:9200/my_user/

{
    "settings": {
        "analysis": {
            "analyzer": {
                "myanalyzer": {
                    "tokenizer": "mytokenizer"
                }
            },
            "tokenizer": {
                "mytokenizer": {
                    "type": "ngram",
                    "min_gram": 1,
                    "max_gram": 2,
                    "token_chars": [
                        "letter",
                        "digit",
                        "whitespace",
                        "punctuation",
                        "symbol"
                    ]
                }
            }
        }
    },
    "mappings":{
        "user":{
            "properties":{
                "id":{
                    "type":"long",
                    "store": true,
                    "index": false,
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "username":{
                    "type":"text",
                    "store": true,
                    "index": true,
                    "analyzer": "myanalyzer",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "password":{
                    "type":"text",
                    "store": true,
                    "index": true,
                    "analyzer": "myanalyzer",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "age":{
                    "type":"integer",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                },
                "ip":{
                    "type":"text",
                    "store": true,
                    "index": true,
                    "analyzer": "myanalyzer",
                    "fields":{
                        "keyword":{
                            "type":"keyword"
                        }
                    }
                }
            }
        }
    }
}

Postman screenshot:

Or you can use this form:

POST localhost:9200/my_user/_analyze?analyzer=myanalyzer&pretty=true

{
  "text": "2 Quick Fo18陳xes姥爺."
}

Next, use Postman to query against this newly built my_user index (type user):

POST localhost:9200/my_user/_analyze

{
  "analyzer": "myanalyzer",
  "text": "2 Quick Fo18陳xes姥爺."
}


// Result
{
    "tokens": [
        {
            "token": "2",
            "start_offset": 0,
            "end_offset": 1,
            "type": "word",
            "position": 0
        },
        {
            "token": "2 ",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 1
        },
        {
            "token": " ",
            "start_offset": 1,
            "end_offset": 2,
            "type": "word",
            "position": 2
        },
        {
            "token": " Q",
            "start_offset": 1,
            "end_offset": 3,
            "type": "word",
            "position": 3
        },
        {
            "token": "Q",
            "start_offset": 2,
            "end_offset": 3,
            "type": "word",
            "position": 4
        },
        {
            "token": "Qu",
            "start_offset": 2,
            "end_offset": 4,
            "type": "word",
            "position": 5
        },
        {
            "token": "u",
            "start_offset": 3,
            "end_offset": 4,
            "type": "word",
            "position": 6
        },
        {
            "token": "ui",
            "start_offset": 3,
            "end_offset": 5,
            "type": "word",
            "position": 7
        },
        {
            "token": "i",
            "start_offset": 4,
            "end_offset": 5,
            "type": "word",
            "position": 8
        },
        {
            "token": "ic",
            "start_offset": 4,
            "end_offset": 6,
            "type": "word",
            "position": 9
        },
        {
            "token": "c",
            "start_offset": 5,
            "end_offset": 6,
            "type": "word",
            "position": 10
        },
        {
            "token": "ck",
            "start_offset": 5,
            "end_offset": 7,
            "type": "word",
            "position": 11
        },
        {
            "token": "k",
            "start_offset": 6,
            "end_offset": 7,
            "type": "word",
            "position": 12
        },
        {
            "token": "k ",
            "start_offset": 6,
            "end_offset": 8,
            "type": "word",
            "position": 13
        },
        {
            "token": " ",
            "start_offset": 7,
            "end_offset": 8,
            "type": "word",
            "position": 14
        },
        {
            "token": " F",
            "start_offset": 7,
            "end_offset": 9,
            "type": "word",
            "position": 15
        },
        {
            "token": "F",
            "start_offset": 8,
            "end_offset": 9,
            "type": "word",
            "position": 16
        },
        {
            "token": "Fo",
            "start_offset": 8,
            "end_offset": 10,
            "type": "word",
            "position": 17
        },
        {
            "token": "o",
            "start_offset": 9,
            "end_offset": 10,
            "type": "word",
            "position": 18
        },
        {
            "token": "o1",
            "start_offset": 9,
            "end_offset": 11,
            "type": "word",
            "position": 19
        },
        {
            "token": "1",
            "start_offset": 10,
            "end_offset": 11,
            "type": "word",
            "position": 20
        },
        {
            "token": "18",
            "start_offset": 10,
            "end_offset": 12,
            "type": "word",
            "position": 21
        },
        {
            "token": "8",
            "start_offset": 11,
            "end_offset": 12,
            "type": "word",
            "position": 22
        },
        {
            "token": "8陳",
            "start_offset": 11,
            "end_offset": 13,
            "type": "word",
            "position": 23
        },
        {
            "token": "陳",
            "start_offset": 12,
            "end_offset": 13,
            "type": "word",
            "position": 24
        },
        {
            "token": "陳x",
            "start_offset": 12,
            "end_offset": 14,
            "type": "word",
            "position": 25
        },
        {
            "token": "x",
            "start_offset": 13,
            "end_offset": 14,
            "type": "word",
            "position": 26
        },
        {
            "token": "xe",
            "start_offset": 13,
            "end_offset": 15,
            "type": "word",
            "position": 27
        },
        {
            "token": "e",
            "start_offset": 14,
            "end_offset": 15,
            "type": "word",
            "position": 28
        },
        {
            "token": "es",
            "start_offset": 14,
            "end_offset": 16,
            "type": "word",
            "position": 29
        },
        {
            "token": "s",
            "start_offset": 15,
            "end_offset": 16,
            "type": "word",
            "position": 30
        },
        {
            "token": "s姥",
            "start_offset": 15,
            "end_offset": 17,
            "type": "word",
            "position": 31
        },
        {
            "token": "姥",
            "start_offset": 16,
            "end_offset": 17,
            "type": "word",
            "position": 32
        },
        {
            "token": "姥爺",
            "start_offset": 16,
            "end_offset": 18,
            "type": "word",
            "position": 33
        },
        {
            "token": "爺",
            "start_offset": 17,
            "end_offset": 18,
            "type": "word",
            "position": 34
        },
        {
            "token": "爺.",
            "start_offset": 17,
            "end_offset": 19,
            "type": "word",
            "position": 35
        },
        {
            "token": ".",
            "start_offset": 18,
            "end_offset": 19,
            "type": "word",
            "position": 36
        }
    ]
}

Postman screenshot:

The result obtained: tokenization succeeded.

3.4 Rebuilding the Spring Boot side to use the myanalyzer analyzer for full-text search

Create a new User entity (corresponding to the my_user index and user type in Elasticsearch):

@Data
@NoArgsConstructor
@AllArgsConstructor
@Document(indexName = "my_user", type = "user")
public class User {
    /**
     *     index: whether the field is analyzed
     *     analyzer: analyzer used when storing (indexing)
     *     searchAnalyzer: analyzer used when searching
     *     store: whether the field is stored
     *     type: the field's data type
     */

    @Id
    @Field(store = true, index = false, type = FieldType.Integer)
    private Integer id;
    @Field(store = true, index = true, type = FieldType.Text, analyzer = "myanalyzer", searchAnalyzer = "myanalyzer")
    private String username;
    @Field(store = true, index = true, type = FieldType.Text, analyzer = "myanalyzer", searchAnalyzer = "myanalyzer")
    private String password;
    @Field(store = true, index = true, type = FieldType.Integer)
    private Integer age;
    @Field(store = true, index = true, type = FieldType.Text, analyzer = "myanalyzer", searchAnalyzer = "myanalyzer")
    private String ip;
    
}

Create a UserRepository interface (again the ID generic matches the entity's Integer @Id field).

@Repository
public interface UserRepository extends ElasticsearchRepository<User, Integer> {
}

Test it in the test class (inserting data):

@RunWith(SpringRunner.class)
@SpringBootTest
public class UserTest {

    @Autowired
    UserRepository userRepository;

    // Insert data
    @Test
    public void testInsert(){
        userRepository.save(new User(3, "高新興", "gao45", 18, "我登錄的ip地址是:127.145.0.11"));
        userRepository.save(new User(4, "神州@數碼", "shen18", 18, "我登錄的ip地址是:127.124.0.11"));
        userRepository.save(new User(6, "西南大學", "xida", 18, "我登錄的ip地址是:127.126.0.11"));
        userRepository.save(new User(7, "北京大學", "beida", 18, "我記錄的ip地址是:127.127.0.11"));
        userRepository.save(new User(8, "姚#明", "yao210", 18, "我登錄的@#%ip地址是:127.248.0.11"));
        userRepository.save(new User(9, "鄧紫棋", "dengml", 18, "我使用的ip地址是:127.249.0.11"));
        userRepository.save(new User(10, "李榮浩", "li06", 18, "我使用的@ip地址是:127.234.0.11"));
        userRepository.save(new User(11, "陳奕迅", "19ch8en", 18, "我登錄的ip地址是:127.219.0.11"));
        userRepository.save(new User(12, "周杰倫", "xiayu2014", 18, "我登錄的ip地址是:127.0.0.11"));
        userRepository.save(new User(13, "林俊杰", "zho99", 18, "我登錄,的ip地址是:127.111.0.11"));
        userRepository.save(new User(137, "林薇因", "zho99", 18, "我登錄,的ip地址是:127.111.0.11"));
    }
}

After inserting, the data shown in Elasticsearch is as follows:

Querying the data:

    @Test
    public void testQueryByStr(){
        try {
            String searchStr = "陳夏天u馬立,@45";
            QueryStringQueryBuilder builder = new QueryStringQueryBuilder(searchStr);

            // The key is the line below
            builder.analyzer("myanalyzer").field("username").field("password").field("ip");
            Iterable<User> search = userRepository.search(builder);
            Iterator<User> iterator = search.iterator();
            while (iterator.hasNext()){
                System.out.println("---> 匹配數據: "+iterator.next());
            }
        }catch (Exception e){
            System.out.println("---> 異常信息 "+e);
        }
    }

The query results:

---> 匹配數據: User(id=33, username=陳%喜華, password=gao45, age=18, ip=我登錄的ip地址是:127.145.0.11)
---> 匹配數據: User(id=3, username=高新興, password=gao45, age=18, ip=我登錄的ip地址是:127.145.0.11)
---> 匹配數據: User(id=35, username=馬@,#立志, password=ling009, age=18, ip=我記錄的ip地址是:127.125.0.11)
---> 匹配數據: User(id=48, username=郭才瑩, password=yao210, age=18, ip=我登錄的,@#%ip地址是:127.248.0.11)
---> 匹配數據: User(id=126, username=夏雨, password=xiayu2014, age=18, ip=我登錄的ip地址是:127.0.0.11)
---> 匹配數據: User(id=12, username=周杰倫, password=xiayu2014, age=18, ip=我登錄的ip地址是:127.0.0.11)
---> 匹配數據: User(id=115, username=朱&@#%夏宇, password=19ch8en, age=18, ip=我登錄的ip地址是:127.219.0.11)
---> 匹配數據: User(id=8, username=姚#明, password=yao210, age=18, ip=我登錄的@#%ip地址是:127.248.0.11)
---> 匹配數據: User(id=10, username=李榮浩, password=li06, age=18, ip=我使用的@ip地址是:127.234.0.11)
---> 匹配數據: User(id=104, username=黃小羣, password=li06, age=18, ip=我使用的@ip地址是:127.234.0.11)
---> 匹配數據: User(id=36, username=陳耀鵬, password=xida, age=18, ip=我登錄的ip地址是:127.126.0.11)
---> 匹配數據: User(id=11, username=陳奕迅, password=19ch8en, age=18, ip=我登錄的ip地址是:127.219.0.11)
---> 匹配數據: User(id=137, username=林薇因, password=zho99, age=18, ip=我登錄,的ip地址是:127.111.0.11)
---> 匹配數據: User(id=13, username=林俊杰, password=zho99, age=18, ip=我登錄,的ip地址是:127.111.0.11)
---> 匹配數據: User(id=4, username=神州@數碼, password=shen18, age=18, ip=我登錄的ip地址是:127.124.0.11)
---> 匹配數據: User(id=5, username=嶺南師範, password=ling009, age=18, ip=我記錄的ip地址是:127.125.0.11)
---> 匹配數據: User(id=9, username=鄧紫棋, password=dengml, age=18, ip=我使用的ip地址是:127.249.0.11)
---> 匹配數據: User(id=34, username=鍾楚瑩, password=shen18, age=18, ip=我登錄的ip地址是:127.124.0.11)
---> 匹配數據: User(id=49, username=黃羣, password=dengml, age=18, ip=我使用的ip地址是:127.249.0.11)

With that, the integration of Spring Boot with Elasticsearch and the IK analyzer for full-text search is complete.

Let's stop here for now. The article isn't written all that well, but I hope you can follow it. If anything is unclear, leave a comment below and I will update the article promptly.
