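1, Running Solr
Official guide: https://lucene.apache.org/solr/guide/8_3/solr-tutorial.html
Note: as of today (2020-01-01) the latest release is 8.4.0, and the mirrors in China carry only the latest release. The 8.4 documentation is not available yet, so I followed the 8.3 documentation; in practice the two are essentially the same.
Step | Description |
---|---|
Download the package | http://mirror.bit.edu.cn/apache/lucene/solr/8.4.0/solr-8.4.0.tgz |
Unpack and start | ./solr start -e cloud (this enters an interactive setup: accept the defaults throughout, and at the end choose the config name "sample_techproducts_configs") |
//Create a collection | ./solr create -c gettingstarted -d sample_techproducts_configs -s 2 -rf 2 (equivalent to creating the collection in the interactive setup above; uses the configuration under server/solr/configsets/sample_techproducts_configs) |
Access the service from the web | Open (to verify the collection): http://localhost:8983/solr |
Add data to the collection | ./post -c gettingstarted …/example/exampledocs/* |
Delete the collection | ./solr delete -c gettingstarted |
Stop the Solr service | ./solr stop -all |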
- Deleting data from a collection
# Delete a single record by id
./post -c data1 -d "<delete><id>/en/45_2006</id></delete>"
# Delete all data
./post -c data1 -d "<delete><query>*:*</query></delete>"
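The two delete commands above differ only in the XML payload passed with -d. As a minimal sketch (the helper names delete_by_id and delete_by_query are hypothetical, not part of Solr), the payloads could be built in Python with the value escaped, so that an id or query containing < or & does not break the XML:

```python
from xml.sax.saxutils import escape


def delete_by_id(doc_id: str) -> str:
    """Build the XML payload that deletes one document by its id.

    Hypothetical helper: it only builds the string, it does not contact Solr.
    """
    return f"<delete><id>{escape(doc_id)}</id></delete>"


def delete_by_query(query: str) -> str:
    """Build the XML payload that deletes every document matching a query."""
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    # The same two payloads used with ./post -d above.
    print(delete_by_id("/en/45_2006"))
    print(delete_by_query("*:*"))
```

The resulting string is what would be handed to ./post -c data1 -d "...".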
- Solr configuration files
wang@wang-pc:~/unpack/solr-8.4.0/bin$ find ../ -name solrconfig.xml
../server/solr/configsets/_default/conf/solrconfig.xml
../server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml
../example/files/conf/solrconfig.xml
../example/example-DIH/solr/tika/conf/solrconfig.xml
../example/example-DIH/solr/solr/conf/solrconfig.xml
../example/example-DIH/solr/atom/conf/solrconfig.xml
../example/example-DIH/solr/db/conf/solrconfig.xml
../example/example-DIH/solr/mail/conf/solrconfig.xml
2, exercise 1: query syntax
a, Match-all query: q=*:*
curl "http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*"
b, Single-term search: q=foundation
curl "http://localhost:8983/solr/gettingstarted/select?q=foundation"
c, Field query (key:val): q=cat:electronics
curl "http://localhost:8983/solr/gettingstarted/select?q=cat:electronics"
d, Phrase query: q="CAS+latency"
Here + is the URL encoding of a space, joining the two words of the phrase
curl "http://localhost:8983/solr/gettingstarted/select?q=\"CAS+latency\""
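e, Combined query (and/or): +electronics +music
Here + marks a term as required, so it acts like and
The encoding for + is %2B
The encoding for a space is %20 (the two + terms must be separated by an encoded space, to distinguish them from a phrase query)
required: +
prohibited: -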
curl "http://localhost:8983/solr/gettingstarted/select?q=%2Belectronics%20%2Bmusic"
## In the +- below, the plus sign is still an encoded space, so the query reads +electronics -music
curl "http://localhost:8983/solr/gettingstarted/select?q=%2Belectronics+-music"
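The encodings above (%2B for a literal +, and %20 or + for a space) can be reproduced with Python's standard urllib.parse module. The following is a small sketch; the wrapper name encode_q is hypothetical and exists only for illustration:

```python
from urllib.parse import quote, quote_plus


def encode_q(query: str, spaces_as_plus: bool = False) -> str:
    """Percent-encode a Solr query string for use as the q= parameter.

    With spaces_as_plus=False a space becomes %20; with True it becomes +.
    In both cases a literal + in the query becomes %2B.
    """
    if spaces_as_plus:
        return quote_plus(query)
    return quote(query, safe="")


if __name__ == "__main__":
    print(encode_q("+electronics +music"))                       # %2Belectronics%20%2Bmusic
    print(encode_q("+electronics -music", spaces_as_plus=True))  # %2Belectronics+-music
    print(encode_q('"CAS latency"', spaces_as_plus=True))        # %22CAS+latency%22
```

The first two outputs match the q= values in the curl commands above; the third shows that the quotes of a phrase query are themselves encoded as %22 when the URL is built programmatically.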
3, exercise 2: faceting
Preparation: create a collection and configure schemaless indexing (automatic type inference)
# Restart the stopped Solr service
./solr start -c -p 8983 -s ../example/cloud/node1/solr
./solr start -c -p 7574 -s ../example/cloud/node2/solr -z localhost:9983
# Create the collection films, with no predefined schema
./solr create -c films -s 2 -rf 2
# Create a regular field: manually set the type of the field name to text_general
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
# Create a copy field (many to one): the data of every field is copied into a single field
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
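Both schema requests above send a small JSON document to the /schema endpoint. As a hedged sketch, the same bodies can be generated in Python instead of being typed by hand; the function names add_field and add_copy_field are hypothetical, and the code only builds the JSON text without sending it:

```python
import json


def add_field(name: str, field_type: str,
              multi_valued: bool = False, stored: bool = True) -> str:
    """Build the JSON body of an add-field schema request."""
    return json.dumps({
        "add-field": {
            "name": name,
            "type": field_type,
            "multiValued": multi_valued,
            "stored": stored,
        }
    })


def add_copy_field(source: str, dest: str) -> str:
    """Build the JSON body of an add-copy-field schema request."""
    return json.dumps({"add-copy-field": {"source": source, "dest": dest}})


if __name__ == "__main__":
    # The same two bodies that the curl commands above pass via --data-binary.
    print(add_field("name", "text_general"))
    print(add_copy_field("*", "_text_"))
```

Generating the body with json.dumps avoids quoting mistakes when field names come from variables.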
# Load the example/films/ data: films.json, or films.csv, or films.xml
./post -c films ../example/films/films.json
##### At this point the web UI can be used for free-text queries ####
a, Field Facets: equivalent to group by on a field
Query parameters:
- (q=*:*)
- (rows=0)
- (facet=true/on)
- (facet.field=genre_str)
curl "http://localhost:8983/solr/films/select?q=*:*&rows=0&facet=true&facet.field=genre_str"
#curl "http://localhost:8983/solr/films/select?=&q=*:*&facet.field=genre_str&facet.mincount=200&facet=on&rows=0"
Response: (equivalent to group by on a field)
"response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"genre_str":[
"Drama",552,
"Comedy",389,
"Romance Film",270,
"Thriller",259,
b, Range Facets: equivalent to group by over ranges
The web UI does not support this kind of query yet; enter the Range Facets URL directly in the browser address bar
url=$(echo "http://localhost:8983/solr/films/select?
facet.range=initial_release_date&
facet.range.start=NOW-20YEAR&
facet.range.end=NOW&
facet.range.gap=%2B1YEAR&
facet=true&
q=*%3A*&
rows=0" |xargs |sed "s/[[:space:]]//g" )
curl "$url"
Response
"response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_ranges":{
"initial_release_date":{
"counts":[
"1997-07-28T17:12:06.919Z",0,
"1998-07-28T17:12:06.919Z",0,
"1999-07-28T17:12:06.919Z",48,
"2000-07-28T17:12:06.919Z",82,
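c, Pivot Facets: equivalent to group by on a pivot field, then drilling down with group by on further fields
The web UI does not support this kind of query yet; enter the Pivot Facets URL directly in the browser address bar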
curl "http://localhost:8983/solr/films/select?q=*:*&rows=0&facet=on&facet.pivot=genre_str,directed_by_str"
Response
"response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_ranges":{},
"facet_intervals":{},
"facet_heatmaps":{},
"facet_pivot":{
"genre_str,directed_by_str":[{
"field":"genre_str",
"value":"Drama",
"count":552,
"pivot":[{
"field":"directed_by_str",
"value":"Ridley Scott",
"count":5},
{
"field":"directed_by_str",
"value":"Steven Soderbergh",
"count":5},