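1. Overview
This article describes how to integrate Spring Boot with Elasticsearch. The Spring Boot version is 2.1.9.RELEASE and the Elasticsearch version is 7.6.1.
Reference: the official documentation
If you need the source code of the project in this article, leave a comment.
2. Integration
2.1 Adding the Maven dependencies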
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.6.1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>7.6.1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>7.6.1</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- Lombok is included only to keep the later code short. It is unrelated to the integration and can be left out. -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
<scope>provided</scope>
<version>1.18.8</version>
</dependency>
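spring-boot-starter-parent can either be inherited by declaring it as the parent, or imported with type pom inside dependencyManagement. Either approach works; use one of the two snippets that follow.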
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.9.RELEASE</version>
<relativePath/>
</parent>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.9.RELEASE</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
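The ES client needs log4j on the classpath to run. Because the project inherits from spring-boot-starter-parent, which already covers log4j2, the dependency does not have to be declared explicitly. (Conversely, when integrating ES into a plain Spring project, log4j must be declared explicitly.)
Note: this article connects to ES with the REST clients, so all the configuration and API calls that follow are built around them. The official documentation also describes Jest as a way to connect to ES; open that page and search for "Connecting to Elasticsearch by Using Jest".
2.2 Configuration
Choose one of three approaches: YAML, properties, or a configuration class.
2.2.1 yaml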
server:
  port: 8080
spring:
  application:
    name: spring-boot-es-demo
  elasticsearch:
    rest:
      username: user
      password: 123456
      uris: https://127.0.0.1:9200
2.2.2 properties
server.port=8080
spring.application.name=spring-boot-es-demo
spring.elasticsearch.rest.username=user
spring.elasticsearch.rest.password=123456
spring.elasticsearch.rest.uris=https://127.0.0.1:9200
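2.2.3 Configuration class
- Custom settings (these go in the YAML or properties file)
#============================================================================
# Elasticsearch core settings
#============================================================================
# HTTP connect timeout (ms)
elasticsearch.connectTimeout=1000
# Socket timeout (ms)
elasticsearch.socketTimeout=30000
# Timeout for obtaining a connection from the pool (ms)
elasticsearch.connectionRequestTimeout=500
# Maximum total connections
elasticsearch.maxConnTotal=100
# Maximum connections per route
elasticsearch.maxConnPerRoute=100
# Maximum task execution time (unit: hours)
elasticsearch.executeTimeout=8
# Username
elasticsearch.username=admin
# Password
elasticsearch.password=123456
# Comma-separated host:port list read through getHttpHost() in the configuration class
# (example value, assumed here; the original listing omits this key)
elasticsearch.httpHost=127.0.0.1:9200
- ESProperties maps the ES-related settings
(PS: this class is optional. You can keep the settings in Disconf or Apollo and read them from your own configuration center instead.)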
@Getter
@Setter
@ConfigurationProperties(prefix = "elasticsearch")
@Configuration
public class ESProperties {
/**
* HTTP connect timeout
*/
private String connectTimeout;
/**
* Socket timeout
*/
private String socketTimeout;
/**
* Timeout for obtaining a connection from the pool
*/
private String connectionRequestTimeout;
/**
* Maximum total connections
*/
private String maxConnTotal;
/**
* Maximum connections per route
*/
private String maxConnPerRoute;
/**
* Username
*/
private String username;
/**
* Password
*/
private String password;
/**
* Elasticsearch HTTP address(es), as a comma-separated host:port list
*/
private String httpHost;
}
- Elasticsearch configuration class
@RequiredArgsConstructor(onConstructor = @__(@Autowired))
@Configuration
public class ElasticsearchConfig {
private final ESProperties esProperties;
@Bean
public RestHighLevelClient clientDev() {
final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(
esProperties.getUsername(), esProperties.getPassword()
));
// Create the builder for the ES client
RestClientBuilder builder = RestClient.builder(httpHostHandlerDev());
// Request configuration for the underlying asynchronous HTTP client
builder.setRequestConfigCallback(builder1 -> {
// Connect timeout in ms
builder1.setConnectTimeout(Integer.parseInt(esProperties.getConnectTimeout()));
// Socket (read) timeout in ms
builder1.setSocketTimeout(Integer.parseInt(esProperties.getSocketTimeout()));
// Timeout in ms for obtaining a connection from the pool
builder1.setConnectionRequestTimeout(Integer.parseInt(esProperties.getConnectionRequestTimeout()));
return builder1;
});
// Connection pool configuration for the underlying asynchronous HTTP client
builder.setHttpClientConfigCallback(httpAsyncClientBuilder -> {
// Maximum total connections
httpAsyncClientBuilder.setMaxConnTotal(Integer.parseInt(esProperties.getMaxConnTotal()));
// Maximum connections per route
httpAsyncClientBuilder.setMaxConnPerRoute(Integer.parseInt(esProperties.getMaxConnPerRoute()));
// Attach the credentials
httpAsyncClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
return httpAsyncClientBuilder;
});
return new RestHighLevelClient(builder);
}
/**
* To support a clustered ES deployment, the hosts are returned as an HttpHost array
*/
private HttpHost[] httpHostHandlerDev() {
String[] hosts = esProperties.getHttpHost().split(",");
HttpHost[] httpHosts = new HttpHost[hosts.length];
for (int i = 0; i < hosts.length; i++) {
String ip = hosts[i].split(":")[0];
int port = Integer.parseInt(hosts[i].split(":")[1]);
httpHosts[i] = new HttpHost(ip, port, "http");
}
return httpHosts;
}
}
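The host-parsing method above assumes that every entry has the exact form host:port. As a minimal sketch of a more defensive variant, the following JDK-only code tolerates whitespace, a trailing comma, and a missing port. The names HostListParser and HostAndPort are hypothetical, and the fallback port 9200 is an assumption for illustration, not something the original configuration defines.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses "host1:9200,host2:9201" into host/port pairs. */
final class HostListParser {

    /** Simple value holder; in the configuration class each pair would become an HttpHost. */
    static final class HostAndPort {
        final String host;
        final int port;

        HostAndPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Assumed fallback used when an entry carries no port. */
    static final int DEFAULT_PORT = 9200;

    static List<HostAndPort> parse(String csv) {
        if (csv == null || csv.trim().isEmpty()) {
            throw new IllegalArgumentException("host list must not be empty");
        }
        List<HostAndPort> result = new ArrayList<>();
        for (String entry : csv.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled comma
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            if (host.isEmpty()) {
                throw new IllegalArgumentException("missing host in entry: " + trimmed);
            }
            // Integer.parseInt throws NumberFormatException for a malformed port
            int port = colon < 0 ? DEFAULT_PORT : Integer.parseInt(trimmed.substring(colon + 1));
            result.add(new HostAndPort(host, port));
        }
        if (result.isEmpty()) {
            throw new IllegalArgumentException("host list contains no usable entry");
        }
        return result;
    }
}
```

In the configuration class, each parsed pair could then be turned into new HttpHost(host, port, "http"), exactly as the loop above does.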
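3. API calls
In the snippets of this chapter, esClient is assumed to be an injected RestHighLevelClient (such as the bean defined above) and log a logger, for example the one generated by Lombok's @Slf4j.
3.1 Checking whether an index exists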
public boolean existIndex(String indexName) throws IOException {
GetIndexRequest request = new GetIndexRequest(indexName);
return esClient.indices().exists(request, RequestOptions.DEFAULT);
}
3.2 Creating an index
public void createIndex(String indexName, int numberOfShards, int numberOfReplicas) throws IOException {
if (!existIndex(indexName)) {
CreateIndexRequest request = new CreateIndexRequest(indexName);
// settings section
request.settings(Settings.builder()
// Number of primary shards allocated when the index is created
.put("index.number_of_shards", numberOfShards)
// Number of replica shards allocated for each primary shard
.put("index.number_of_replicas", numberOfReplicas)
);
// mapping section: besides a JSON string, a Map or an XContentBuilder can also be used
request.mapping("{\n" +
" \"properties\": {\n" +
" \"message\": {\n" +
" \"type\": \"text\"\n" +
" }\n" +
" }\n" +
"}", XContentType.JSON);
// Create the index (synchronous)
// CreateIndexResponse response = esClient.indices().create(request, RequestOptions.DEFAULT);
// Create the index (asynchronous)
esClient.indices().createAsync(request, RequestOptions.DEFAULT, new ActionListener<CreateIndexResponse>() {
@Override
public void onResponse(CreateIndexResponse createIndexResponse) {
log.debug("Result: " + createIndexResponse);
}
@Override
public void onFailure(Exception e) {
log.error("Failure reason: " + e.getMessage());
}
});
}
}
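The asynchronous variant returns immediately and reports its outcome only through the two listener callbacks, which makes it awkward when the caller needs the result later. One possible way to handle that, sketched here with the JDK only, is to bridge the callbacks into a CompletableFuture. The Listener interface is a stand-in that merely mirrors the onResponse/onFailure pair used above so that the snippet compiles without the Elasticsearch dependency; ListenerBridge and toFuture are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical helper that turns a callback-style call into a CompletableFuture. */
final class ListenerBridge {

    /** Stand-in with the same two callbacks as the listener passed to createAsync. */
    interface Listener<T> {
        void onResponse(T response);

        void onFailure(Exception e);
    }

    /** Starts the async call and exposes its eventual outcome as a future. */
    static <T> CompletableFuture<T> toFuture(Consumer<Listener<T>> asyncCall) {
        CompletableFuture<T> future = new CompletableFuture<>();
        asyncCall.accept(new Listener<T>() {
            @Override
            public void onResponse(T response) {
                future.complete(response); // success path
            }

            @Override
            public void onFailure(Exception e) {
                future.completeExceptionally(e); // failure path
            }
        });
        return future;
    }
}
```

With the real client, the anonymous class would implement the client's own listener type instead of the stand-in, and the lambda passed to toFuture would hand that listener to createAsync.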
3.3 Updating the index settings
public void updateIndexSettings(String indexName) throws IOException {
UpdateSettingsRequest request = new UpdateSettingsRequest(indexName);
String settingKey = "index.number_of_replicas";
int settingValue = 2;
Settings.Builder settingsBuilder = Settings.builder().put(settingKey, settingValue);
request.settings(settingsBuilder);
// When true, settings that already exist on the index are left unchanged (default: false).
// Keep false here so that an already-set index.number_of_replicas is actually overwritten.
request.setPreserveExisting(false);
// Update the settings (synchronous)
//esClient.indices().putSettings(request, RequestOptions.DEFAULT);
// Update the settings (asynchronous)
esClient.indices().putSettingsAsync(request, RequestOptions.DEFAULT, new ActionListener<AcknowledgedResponse>() {
@Override
public void onResponse(AcknowledgedResponse acknowledgedResponse) {
log.debug("Result: " + acknowledgedResponse);
}
@Override
public void onFailure(Exception e) {
log.error("Failure reason: " + e.getMessage());
}
});
}
3.4 Updating the index mapping
public void putIndexMapping(String indexName) throws IOException {
PutMappingRequest request = new PutMappingRequest(indexName);
XContentBuilder builder = XContentFactory.jsonBuilder();
builder.startObject();
{
builder.startObject("properties");
{
builder.startObject("new_parameter");
{
builder.field("type", "text");
builder.field("analyzer", "ik_max_word");
}
builder.endObject();
}
builder.endObject();
}
builder.endObject();
request.source(builder);
// Add the mapping (synchronous)
//AcknowledgedResponse putMappingResponse = esClient.indices().putMapping(request, RequestOptions.DEFAULT);
// Add the mapping (asynchronous)
esClient.indices().putMappingAsync(request, RequestOptions.DEFAULT, new ActionListener<AcknowledgedResponse>() {
@Override
public void onResponse(AcknowledgedResponse acknowledgedResponse) {
log.debug("Result: " + acknowledgedResponse);
}
@Override
public void onFailure(Exception e) {
log.error("Failure reason: " + e.getMessage());
}
});
}
3.5 Adding a Document
Using a JSON string
public void addDocument1(String indexName) throws IOException {
IndexRequest request = new IndexRequest(indexName);
request.id("1");
String jsonString = "{" +
"\"user\":\"kimchy\"," +
"\"postDate\":\"2020-03-28\"," +
"\"message\":\"trying out Elasticsearch\"" +
"}";
request.source(jsonString, XContentType.JSON);
request.routing("routing");
esClient.index(request, RequestOptions.DEFAULT);
}
Using a Map
public void addDocument2(String indexName) throws IOException{
Map<String, Object> jsonMap = new HashMap<>();
jsonMap.put("user", "kimchy");
jsonMap.put("postDate", new Date());
jsonMap.put("message", "trying out Elasticsearch");
IndexRequest indexRequest = new IndexRequest(indexName).id("1").source(jsonMap);
indexRequest.routing("routing");
esClient.indexAsync(indexRequest, RequestOptions.DEFAULT, new ActionListener<IndexResponse>() {
@Override
public void onResponse(IndexResponse indexResponse) {
log.debug("Result: " + indexResponse);
}
@Override
public void onFailure(Exception e) {
log.error("Failure reason: " + e.getMessage());
}
});
}
3.6 Updating a Document
public void updateDocument(String indexName) throws IOException{
// Pass the index name and the id of the Document to update
UpdateRequest request = new UpdateRequest(indexName, "1");
// The partial document is merged into the stored one: existing fields are overwritten, missing fields are added
// The update content can be supplied in four forms: JSON string, Map, XContentBuilder, or key-value pairs
// JSON string
// String jsonString = "{" +
// "\"updated\":\"2020-03-29\"," +
// "\"reason\":\"daily update\"" +
// "}";
// request.doc(jsonString, XContentType.JSON);
// Map
// Map<String, Object> jsonMap = new HashMap<>();
// jsonMap.put("updated", new Date());
// jsonMap.put("reason", "daily update");
// request.doc(jsonMap);
// XContentBuilder
// XContentBuilder builder = XContentFactory.jsonBuilder();
// builder.startObject();
// builder.timeField("updated", new Date());
// builder.field("reason", "daily update");
// builder.endObject();
// request.doc(builder);
// Key-value pairs
request.doc("updated", new Date(),"reason", "daily update");
// Send the update request synchronously
esClient.update(request, RequestOptions.DEFAULT);
}
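The merge behaviour mentioned in the comments of the update method can be illustrated without a cluster. The sketch below models it with plain maps: nested objects are merged recursively, while any other value is replaced. It is only an approximation for intuition, not Elasticsearch's own code, and the class name PartialUpdateSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of how a partial document is merged into a stored one. */
final class PartialUpdateSketch {

    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> existing, Map<String, Object> partial) {
        // Start from a copy so the stored document is not modified in place
        Map<String, Object> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : partial.entrySet()) {
            Object current = result.get(entry.getKey());
            Object incoming = entry.getValue();
            if (current instanceof Map && incoming instanceof Map) {
                // Both sides are objects: merge them field by field
                result.put(entry.getKey(),
                        merge((Map<String, Object>) current, (Map<String, Object>) incoming));
            } else {
                // Plain values (and arrays) are replaced; unknown keys are added
                result.put(entry.getKey(), incoming);
            }
        }
        return result;
    }
}
```

Applied to the document indexed in 3.5, merging a partial document that contains only reason and message would keep user, overwrite message, and add reason.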
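3.7 Deleting Documents
public void deleteDocument(String indexName) throws IOException{
// The request must name the index (or indices) to delete from
DeleteByQueryRequest deleteByQueryRequest = new DeleteByQueryRequest(indexName);
// Condition that the documents to delete must satisfy
deleteByQueryRequest.setQuery(new TermQueryBuilder("user", "kimchy"));
// Continue on version conflicts instead of aborting
deleteByQueryRequest.setConflicts("proceed");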
esClient.deleteByQuery(deleteByQueryRequest, RequestOptions.DEFAULT);
}
3.8 Batch operations with the bulk API
public void bulkDocument(String indexName) throws IOException{
BulkRequest request = new BulkRequest();
// Delete operation
request.add(new DeleteRequest(indexName, "3"));
// Update operation
request.add(new UpdateRequest(indexName, "2")
.doc(XContentType.JSON,"other", "test"));
// Plain index (PUT) operation: replaces the whole document or creates it
request.add(new IndexRequest(indexName).id("4")
.source(XContentType.JSON,"field", "baz"));
esClient.bulk(request, RequestOptions.DEFAULT);
}
3.9 Searching for documents whose description contains Dubbo, filtered to ages between 15 and 40
public void searchDocument(String indexName) throws IOException{
SearchRequest searchRequest = new SearchRequest(indexName);
BoolQueryBuilder booleanQueryBuilder = QueryBuilders.boolQuery();
// Filter: keep only documents whose age is between 15 and 40
booleanQueryBuilder.filter(QueryBuilders.rangeQuery("age").from(15).to(40));
// bool must clause: the description field has to contain Dubbo
booleanQueryBuilder.must(QueryBuilders.matchQuery("description", "Dubbo"));
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(booleanQueryBuilder);
sourceBuilder.from(0);
sourceBuilder.size(5);
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
searchRequest.source(sourceBuilder);
// Send the request synchronously
esClient.search(searchRequest, RequestOptions.DEFAULT);
}
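4. Extensions
4.1 IK analyzer
4.1.1 Downloading the IK analyzer
Download IK, choosing the version that matches your Elasticsearch version according to the IK/Elasticsearch version compatibility table.
4.1.2 Installing the IK analyzer
Compile and package the downloaded project. The packaged artifact, elasticsearch-analysis-ik-7.6.1.zip, ends up in the target/releases directory.
In the ES installation directory, find the plugins directory and create a subdirectory named ik by hand. Then unzip elasticsearch-analysis-ik-7.6.1.zip into the ik directory and restart ES.
4.1.3 Hot-word updates in IK
If new entries are defined directly in a local custom dictionary file, ES has to be restarted after every change before they take effect, which is very disruptive for the whole system. IK can instead load a remote dictionary. The approach used here is to modify IK's initialization method so that, when ES starts and loads IK, a new thread is started that periodically polls a MySQL database and syncs the hot words from the DB into ES. Note: hot words can only be added or updated dynamically; existing hot words cannot be deleted this way.
- Edit pom.xml so that the ES version in the dependencies matches the version actually in use.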
<properties>
<!-- Keep this in line with the ES version of your environment -->
<elasticsearch.version>7.6.1</elasticsearch.version>
... other settings omitted
</properties>
- In org.wltea.analyzer.dic.Dictionary, add a scheduled-task thread pool and modify the IK initialization method initial()
HotDicLoadingTask is defined in the fourth step of this list
private static ScheduledExecutorService hotDictionaryTaskPool = Executors.newScheduledThreadPool(1);
public static synchronized void initial(Configuration cfg) {
if (singleton == null) {
synchronized (Dictionary.class) {
if (singleton == null) {
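... code omitted
// Start a custom thread that queries the hot words from the remote DB
// It starts after a 10-second delay and then runs every 5 seconds
// A real project does not need to run this frequently
hotDictionaryTaskPool.scheduleAtFixedRate(new HotDicLoadingTask(),
10, 5, TimeUnit.SECONDS);
... code omitted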
}
}
}
}
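- In org.wltea.analyzer.dic.Dictionary, add the following methods:
/**
* Loads the remote hot words
*/
public void reloadHotDictionary() {
// Load the custom remote main dictionary, equivalent to configuring an ext_dict dictionary in IKAnalyzer.cfg.xml
this.loadMainHotDicFromDB();
// Load the custom remote stop-word dictionary, equivalent to configuring an ext_stopwords dictionary in IKAnalyzer.cfg.xml
// Stop words are words such as prepositions
this.loadStopWordDicFromDB();
}
/**
* Holds the settings read from the external configuration file
*/
private static Properties properties = new Properties();
/*
* Production environments rarely switch databases, so the MySQL driver is registered directly
*/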
static {
try {
logger.info("Register mysql database driver");
DriverManager.registerDriver(new com.mysql.cj.jdbc.Driver());
} catch (SQLException e) {
logger.error("Register mysql database driver error: ", e);
}
}
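/**
* Queries the database and loads the custom hot words,
* filling them into the _MainDict dictionary segments
*/
private void loadMainHotDicFromDB() {
try {
// Load the external custom configuration file
// getDictRoot() returns the IK base path, which corresponds to $ES_HOME/plugins/ik/config
Path file = PathUtils.get(getDictRoot(), "hot_dict_db_source.properties");
// Read the settings into Properties; try-with-resources closes the stream on every poll
try (java.io.InputStream in = new FileInputStream(file.toFile())) {
properties.load(in);
}
logger.info("properties information: " + properties);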
} catch (IOException e) {
logger.error("can not find properties: ", e);
}
try (Connection connection = DriverManager.getConnection(properties.getProperty("db.url"),
properties.getProperty("db.username"),
properties.getProperty("db.password"));
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(properties.getProperty("db.reload.mainHotDic.sql"))
) {
while (resultSet.next()) {
String word = resultSet.getString("word");
logger.info("hot word from DB: " + word);
// _MainDict keeps the dictionary in memory and corresponds to the contents of main.dic
_MainDict.fillSegment(word.trim().toCharArray());
}
} catch (SQLException e) {
logger.error("execute sql or connect db exception: ", e);
}
}
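/**
* Queries the database and loads the custom stop words,
* filling them into the _StopWords dictionary segments
*/
private void loadStopWordDicFromDB() {
try {
Path path = PathUtils.get(getDictRoot(), "hot_dict_db_source.properties");
// try-with-resources closes the stream on every poll
try (java.io.InputStream in = new FileInputStream(path.toFile())) {
properties.load(in);
}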
} catch (IOException e) {
logger.error("can not find properties: ", e);
}
try (Connection connection = DriverManager.getConnection(properties.getProperty("db.url"),
properties.getProperty("db.username"),
properties.getProperty("db.password"));
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(properties.getProperty("db.reload.stopWordDic.sql"))
) {
while (resultSet.next()) {
String word = resultSet.getString("word");
logger.info("stop word from DB: " + word);
// _StopWords keeps the stop words in memory and corresponds to stopword.dic
_StopWords.fillSegment(word.toCharArray());
}
} catch (SQLException e) {
logger.error("execute sql or connect db exception: ", e);
}
}
private String getProperty(String key){
if(props!=null){
return props.getProperty(key);
}
return null;
}
- Define a custom task class dedicated to hot-word updates
public class HotDicLoadingTask implements Runnable{
private static final Logger LOGGER = ESPluginLoggerFactory.getLogger(HotDicLoadingTask.class.getName());
@Override
public void run() {
LOGGER.info("====================reload hot dictionary from mysql database====================");
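/*
Dictionary is a singleton in IK and its constructors are private: the instance can only be obtained through getSingleton() and only initialized through initial()
reloadHotDictionary() loads the remote hot words
*/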
Dictionary.getSingleton().reloadHotDictionary();
}
}
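One detail worth keeping in mind with scheduleAtFixedRate: if a run of the task throws, the executor suppresses all later runs. The loading methods catch IOException and SQLException, but an unchecked exception would still escape, for example a NullPointerException from word.trim() when the nullable word column contains NULL. The following JDK-only sketch shows one way to guard against that. GuardedTask is a hypothetical wrapper, and the failing lambda in main is merely a stand-in for HotDicLoadingTask.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper that keeps a periodic task alive when a single run fails. */
final class GuardedTask implements Runnable {
    private final Runnable delegate;
    private final AtomicInteger failures = new AtomicInteger();

    GuardedTask(Runnable delegate) {
        this.delegate = delegate;
    }

    @Override
    public void run() {
        try {
            delegate.run();
        } catch (RuntimeException e) {
            // Count and report the failure instead of letting it cancel the schedule
            failures.incrementAndGet();
            System.err.println("hot-word reload failed, retrying on next run: " + e);
        }
    }

    int failureCount() {
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        AtomicInteger runs = new AtomicInteger();
        // Stand-in for HotDicLoadingTask: fails on every run
        GuardedTask task = new GuardedTask(() -> {
            runs.incrementAndGet();
            throw new IllegalStateException("database unavailable");
        });
        pool.scheduleAtFixedRate(task, 0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(350);
        pool.shutdownNow();
        // Without the guard the task would have run only once
        System.out.println("runs=" + runs.get() + ", failures=" + task.failureCount());
    }
}
```

In initial(), the scheduled task could then be wrapped as new GuardedTask(new HotDicLoadingTask()), keeping the same delay and period.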
- Add the configuration file config/hot_dict_db_source.properties, which provides the MySQL connection settings
db.url=jdbc:mysql://127.0.0.1:3306/hotdic?useUnicode=true&characterEncoding=utf8&serverTimezone=GMT%2B8
db.username=root
db.password=123456
db.reload.mainHotDic.sql=select word from tb_main_hot_dic
db.reload.stopWordDic.sql=select word from tb_stop_word_dic
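Because both loading methods read these keys on every poll, a missing or misspelled key would only surface later as a failure inside the JDBC calls. As a minimal sketch, the settings could be validated right after loading. HotDictConfig is a hypothetical helper that uses only the JDK; the key names are the ones from the properties file above.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

/** Hypothetical helper: loads the hot-dictionary settings and checks the required keys. */
final class HotDictConfig {

    /** Keys taken from hot_dict_db_source.properties. */
    static final String[] REQUIRED_KEYS = {
        "db.url",
        "db.username",
        "db.password",
        "db.reload.mainHotDic.sql",
        "db.reload.stopWordDic.sql"
    };

    static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        // try-with-resources closes the reader even when loading fails
        try (Reader in = source) {
            props.load(in);
        }
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalStateException("Missing required property: " + key);
            }
        }
        return props;
    }
}
```

Inside the plugin, the reader would be opened on the file resolved from getDictRoot(), for example with java.nio.file.Files.newBufferedReader.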
- Add the MySQL driver dependency
Open pom.xml and add the following dependency:
<!-- MySQL driver -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.19</version>
</dependency>
- Modify the assembly descriptor so that the MySQL driver is packaged into the zip.
Open src/main/assemblies/plugin.xml and add the following inside the dependencySets element:
<dependencySet>
<outputDirectory/>
<useProjectArtifact>true</useProjectArtifact>
<useTransitiveFiltering>true</useTransitiveFiltering>
<includes>
<include>mysql:mysql-connector-java</include>
</includes>
</dependencySet>
- MySQL database scripts:
create database hotdic charset=utf8;
use hotdic;
DROP TABLE IF EXISTS `tb_main_hot_dic`;
CREATE TABLE `tb_main_hot_dic` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`word` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tb_stop_word_dic`;
CREATE TABLE `tb_stop_word_dic` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`word` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
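- Modify the JDK permission configuration
The JDK restricts what externally running applications may do. By default, an application such as ES needs the corresponding permission when it uses JDK core components (such as ClassLoader) or uses the JDK's networking to reach other applications (such as Socket connections). This step modifies the local JDK so that locally started applications hold those permissions when they access JDK internals or reach external resources through the JDK, which avoids permission errors.
Open $JAVA_HOME/jre/lib/security/java.policy and add the following inside the grant block: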
permission java.lang.RuntimePermission "createClassLoader";
permission java.lang.RuntimePermission "getClassLoader";
permission java.net.SocketPermission "127.0.0.1:3306","connect,resolve";
permission java.lang.RuntimePermission "setContextClassLoader";
Finally, package IK and replace the contents of /plugins/ik with the contents of the resulting zip.