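Spring Boot with Elasticsearch 5.6.12 on Windows: installation, and importing MySQL, CSV, PDF, and Word data into ES

Our system's queries were slow, so I wanted to use Elasticsearch to speed them up, and also to be able to import PDF and CSV files into Elasticsearch.

The system uses Spring Boot 2.0.6.RELEASE, so the first step is to find out which Elasticsearch version matches it.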

........
<parent>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-parent</artifactId>
       <version>2.0.6.RELEASE</version>
       <relativePath/> <!-- lookup parent from repository -->
</parent>
<dependencies>
      <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-elasticsearch</artifactId>
      </dependency>
      <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
      </dependency>
......

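In IntelliJ IDEA, the Maven dependency list shows the resolved version number.

Spring Boot 2.0.6 corresponds to Elasticsearch 5.6.12.

So the full stack is:

1. Spring Boot 2.0.6

2. JDK 8

3. Elasticsearch, version 5.6.12

4. elasticsearch-head (a visualization plugin; to avoid errors, keep it on the same version as Elasticsearch), version 5.6.12

5. logstash-5.6.12, which can import MySQL (or other databases) and CSV files into Elasticsearch

6. fscrawler-es5-2.6, which can import PDF and Word files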

 

References

1. Reference code (slightly modified): https://github.com/fonxian/spring-elasticsearch-example

2. Elasticsearch and Logstash can both be downloaded here: https://www.elastic.co/cn/downloads/past-releases

3. Elasticsearch download: https://www.elastic.co/cn/downloads/past-releases/elasticsearch-5-6-12

4. Logstash download: https://www.elastic.co/cn/downloads/past-releases/logstash-5-6-12

5. Guide to installing elasticsearch-head (Node.js must be installed first): https://www.cnblogs.com/asker009/p/10045125.html

6. fscrawler download and installation instructions: https://fscrawler.readthedocs.io/en/fscrawler-2.6/

7. I also put a copy of the installation packages on Baidu Netdisk

Link: https://pan.baidu.com/s/1CzHfg0lvPI81kxgWpkLHxg
Extraction code: aake

Installation instructions are omitted here.

 

Starting the services

1. Start Elasticsearch

Double-click E:\soft\elasticsearch-5.6.12\bin\elasticsearch.bat

Then open http://localhost:9200/

The page shows:

{
  "name" : "3cgkZDo",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "7Np7tHwwSpy_T-khLB7utA",
  "version" : {
    "number" : "5.6.12",
    "build_hash" : "cfe3d9f",
    "build_date" : "2018-09-10T20:12:43.732Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}

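Only once Elasticsearch is running does the following setting in the project's application.properties (in IntelliJ IDEA) take effect; otherwise the application reports an error. Note that 9300 is the transport port used by the Java client, while 9200 is the HTTP port used by the browser.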

spring.data.elasticsearch.cluster-nodes = localhost:9300
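
Because the application fails when Elasticsearch is not up, it can help to check the node before starting Spring Boot. The following is only a minimal sketch using the JDK's own HTTP classes, not part of the reference project: the class name EsVersionCheck and its helper methods are made up for illustration, and the regular expression simply picks the first "number" field in the response shown above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: checks that the local node answers and reports 5.6.12.
class EsVersionCheck {
    // Matches the first "number" field, which sits inside the "version" object.
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns the version number found in the JSON body, or null if absent. */
    static String extractVersion(String json) {
        Matcher m = NUMBER.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** Performs a plain HTTP GET and returns the response body as a string. */
    static String httpGet(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String expected = "5.6.12";
        try {
            String actual = extractVersion(httpGet("http://localhost:9200/"));
            System.out.println(expected.equals(actual)
                    ? "Elasticsearch " + actual + " is running"
                    : "Unexpected version: " + actual);
        } catch (IOException e) {
            System.out.println("Elasticsearch is not reachable: " + e.getMessage());
        }
    }
}
```

The check goes through the HTTP port 9200; it does not prove that the transport port 9300 configured above is reachable, although on a default local install both are opened by the same process.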

2. Start elasticsearch-head

DOS commands:

Microsoft Windows [Version 10.0.17763.615]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\Users\lunmei>cd E:\soft\elasticsearch-head-master

C:\Users\lunmei>e:

E:\soft\elasticsearch-head-master>grunt server
Running "connect:server" (connect) task
Waiting forever...
Started connect web server on http://localhost:9100

Then open http://localhost:9100/ in a browser.

 

3. Import CSV and MySQL data with Logstash

Create the file E:\soft\logstash-5.6.12\bin\logstash-csv.conf. It imports the file C:\Users\lunmei\Desktop\sys_role.csv into Elasticsearch:

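input {
  file {
    # Absolute path to the CSV file. If no events show up on Windows,
    # forward slashes may be needed here instead of backslashes.
    path => ["C:\Users\lunmei\Desktop\sys_role.csv"]
    # Read the file from the start rather than only newly appended lines
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    # Field names assigned to the CSV columns, in order
    columns => ["id","name","value","tips","status","create_time","update_time"]
  }
  mutate {
    convert => {
      "id" => "string"
    }
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "role"
    # Use the CSV id as the document ID, so re-imports overwrite instead of duplicating
    document_id => "%{id}"
    document_type => "role"
  }
  # Also print every event to the console for debugging
  stdout {
    codec => rubydebug
  }
}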
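
To make it clearer what the csv filter does with each line, here is a rough Java sketch of the same mapping. It is only an illustration, not how Logstash is implemented: the class CsvRowSketch and the sample row are invented (the real contents of sys_role.csv are not shown in this post), and quoted fields containing commas are deliberately not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the column mapping done by the csv filter.
class CsvRowSketch {
    // Column names copied from the csv filter in logstash-csv.conf.
    static final String[] COLUMNS =
            {"id", "name", "value", "tips", "status", "create_time", "update_time"};

    /**
     * Splits one CSV line on commas and pairs each value with its column name.
     * Simplification: quoted fields that contain commas are not handled.
     */
    static Map<String, String> toEvent(String line) {
        String[] values = line.split(",", -1); // -1 keeps trailing empty fields
        Map<String, String> event = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length && i < values.length; i++) {
            event.put(COLUMNS[i], values[i]);
        }
        return event;
    }

    public static void main(String[] args) {
        // Made-up sample row, used only to show the resulting fields.
        Map<String, String> event =
                toEvent("1,admin,ROLE_ADMIN,administrator,1,2019-07-01,2019-07-02");
        System.out.println(event);
        // The output section uses %{id} as the document ID for this event.
        System.out.println("document id = " + event.get("id"));
    }
}
```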

Create the file E:\soft\logstash-5.6.12\bin\logstash-mysql.conf. It imports the SQL data into Elasticsearch:

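input {
  # Also accept events typed on the console
  stdin {
  }
  jdbc {
    jdbc_connection_string => "jdbc:mysql://192.168.0.100:3306/zillion-wfm?characterEncoding=UTF-8&useSSL=false&autoReconnect=true"
    jdbc_user => "root"
    jdbc_password => "root"
    # Driver JAR; this relative path assumes the JAR sits in the directory Logstash is started from
    jdbc_driver_library => "mysql-connector-java-5.1.47.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    jdbc_default_timezone => "Asia/Shanghai"
    # The last "price" value is saved to my_info_last, but the statement below
    # does not reference :sql_last_value, so every run reloads the whole table
    record_last_run => true
    use_column_value => true
    tracking_column => "price"
    last_run_metadata_path => "my_info_last"
    #statement_filepath => "jdbc-sql.sql"
    statement => "SELECT * FROM kn_knowledge"
    # cron syntax: run once every minute
    schedule => "* * * * *"
    type => "knowledge"
  }
}

filter {
  # Parse JSON text found in the "message" field (events coming from stdin)
  json {
    source => "message"
    remove_field => ["message"]
  }
}

output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "knowledge"
    # Reuse the table's id as the document ID so repeated runs overwrite the same documents
    document_id => "%{id}"
  }
  stdout {
    codec => json_lines
  }
}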
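
Since this pipeline runs every minute and selects the whole table each time, the document_id => "%{id}" setting is what keeps the index from filling up with duplicates. The Java sketch below only imitates that idea with an in-memory map: DocumentIdSketch, its methods, and the sample row are all hypothetical, and the real substitution is done by Logstash itself.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of why a fixed document ID makes re-imports idempotent.
class DocumentIdSketch {
    private static final Pattern FIELD_REF = Pattern.compile("%\\{([^}]+)\\}");

    /** Replaces each %{field} with the event's value; unknown fields are left as written. */
    static String render(String template, Map<String, Object> event) {
        Matcher m = FIELD_REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = event.get(m.group(1));
            String replacement = value == null ? m.group(0) : String.valueOf(value);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Stores the event under its rendered ID, overwriting any earlier copy. */
    static void index(Map<String, Map<String, Object>> index, Map<String, Object> event) {
        index.put(render("%{id}", event), event);
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> knowledge = new LinkedHashMap<>();
        Map<String, Object> row = new HashMap<>();
        row.put("id", 42);            // made-up row standing in for kn_knowledge
        row.put("title", "example");  // made-up column
        index(knowledge, row);
        index(knowledge, row); // second scheduled run: same ID, so no duplicate
        System.out.println("documents stored: " + knowledge.size());
    }
}
```

Without document_id, Elasticsearch would generate a new ID for every event, so each scheduled run would add another full copy of the table.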

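Commands (Logstash is not on the PATH, so it has to be run from its bin directory):

Microsoft Windows [Version 10.0.17763.615]
(c) 2018 Microsoft Corporation. All rights reserved.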

C:\Users\lunmei>e:

E:\>logstash -f logstash-csv.conf
'logstash' is not recognized as an internal or external command,
operable program or batch file.

E:\>cd E:\soft\logstash-5.6.12\bin

E:\soft\logstash-5.6.12\bin>logstash -f logstash-csv.conf
Sending Logstash's logs to E:/soft/logstash-5.6.12/logs which is now configured via log4j2.properties
[2019-07-19T18:02:22,722][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"E:/soft/logstash-5.6.12/modules/fb_apache....
E:\soft\logstash-5.6.12\bin>logstash -f logstash-mysql.conf
Sending Logstash's logs to E:/soft/logstash-5.6.12/logs which is now configured via log4j2.properties
..........
[2019-07-19T18:02:48,618][FATAL][logstash.runner          ] SIGINT received. Terminating immediately..
Terminate batch job (Y/N)?
^CThe system cannot open the device or file specified.

E:\soft\logstash-5.6.12\bin>

The CSV file and the MySQL data are now imported.
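
One way to confirm the import is to ask Elasticsearch how many documents each index holds, using its _count endpoint. This is a small sketch under the assumption that the node is still at http://localhost:9200 and that the index names are the ones from the two config files above; the class IndexCountCheck and its helpers are invented for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: prints the document count of the "role" and "knowledge" indices.
class IndexCountCheck {
    private static final Pattern COUNT = Pattern.compile("\"count\"\\s*:\\s*(\\d+)");

    /** Builds the _count URL for an index, e.g. http://localhost:9200/role/_count. */
    static String countUrl(String baseUrl, String index) {
        return baseUrl + "/" + index + "/_count";
    }

    /** Extracts the "count" value from a _count response body, or -1 if it is missing. */
    static long parseCount(String json) {
        Matcher m = COUNT.matcher(json);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    /** Reads the whole response body of a GET request into a string. */
    static String httpGet(String url) throws IOException {
        try (InputStream in = new URL(url).openStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        for (String index : new String[] {"role", "knowledge"}) {
            try {
                long count = parseCount(httpGet(countUrl("http://localhost:9200", index)));
                System.out.println(index + ": " + count + " documents");
            } catch (IOException e) {
                // Thrown when the node is down or the index does not exist yet
                System.out.println(index + ": request failed (" + e.getMessage() + ")");
            }
        }
    }
}
```

The same numbers are also visible in elasticsearch-head at http://localhost:9100/.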

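4. Import PDF and Word files with fscrawler

See the fscrawler website (reference 6 above) for the details.

The first step is to create a fscrawlerRunner.bat file, then double-click it.

Start fscrawler: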

E:\soft\fscrawler-es5-2.6\bin>fscrawler test
17:00:23,187 WARN  [f.p.e.c.f.c.FsCrawlerCli] job [test] does not exist
17:00:23,189 INFO  [f.p.e.c.f.c.FsCrawlerCli] Do you want to create it (Y/N)?
y
17:00:30,241 INFO  [f.p.e.c.f.c.FsCrawlerCli] Settings have been created in [C:\Users\lunmei\.fscrawler\test\_settings.json]. Please review and edit before relaunch

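Note that test is a variable here: it is the job name. The first time a job is started, fscrawler creates a _settings.json for it, which holds the settings for the files to crawl and for Elasticsearch. In this case it is clearly at "C:\Users\lunmei\.fscrawler\test\_settings.json".

So whenever the documentation mentions _settings.json, this is where to find it. The file is created automatically. I wasted a lot of time on this at first.

After editing this file, run "fscrawler test" again to import the files into Elasticsearch.

I have a Word document whose content is "你好啊,你好啊" ("hello, hello"), and I would like to import it into Elasticsearch split into fields. That still needs more research. I'll keep at it.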

 
