折騰了好幾天,終於把東西都搬到新住處了,累死我了.現在是光着膀子坐在電腦前碼字.
前幾天發表了一篇文章,寫的是關於lucene(見文章我封裝的全文檢索之lucene篇),對於這篇文章大傢什麼看法都有,有好有壞,不管好壞,都謝謝大家,我會繼續努力寫下去的,我也會參考你們的建議去修改一下,爭取寫出更好的!
今天準備寫的是關於solr的,solr相信大家有的已經很熟悉了,具體是什麼玩意,什麼怎麼用啊,我就不寫了.浪費oschina服務器硬盤空間.我就寫寫,我封裝的這套所謂的框架(好多人都說僅僅只是一個對索引的創建,更新,刪除以及查詢的幾個操作而已,不過確實是這樣的.名字起的有點大了.)
啥也不說,先浪費點oschina的硬盤再說(貼代碼):
package com.message.base.search.engine;
import com.message.base.pagination.PaginationSupport;
import com.message.base.pagination.PaginationUtils;
import com.message.base.search.SearchBean;
import com.message.base.search.SearchInitException;
import com.message.base.utils.StringUtils;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.BeanUtils;
import java.net.MalformedURLException;
import java.util.*;
/**
* 基於solr實現的搜索引擎.
*
* @author sunhao([email protected])
* @version V1.0
* @createTime 13-5-5 下午9:36
*/
public class SolrSearchEngine extends AbstractSearchEngine {
private static final Logger logger = LoggerFactory.getLogger(SolrSearchEngine.class);
private String server = "http://localhost:8080/solr";
private SolrServer getSolrServer(){
if(StringUtils.isEmpty(server)){
logger.error("null solr server path!");
throw new SearchInitException("Give a null solr server path");
}
try {
return new CommonsHttpSolrServer(server);
} catch (MalformedURLException e) {
throw new SearchInitException("Connect to solr server error use server '" + server + "'");
}
}
public synchronized void doIndex(List<SearchBean> searchBeans) throws Exception {
SolrServer solrServer = getSolrServer();
List<SolrInputDocument> sids = new ArrayList<SolrInputDocument>();
for(SearchBean sb : searchBeans){
if(sb == null){
logger.debug("give SearchBean is null!");
return;
}
//初始化一些字段
sb.initPublicFields();
SolrInputDocument sid = new SolrInputDocument();
//保證每個對象的唯一性,而且通過對象的主鍵可以明確的找到這個對象在solr中的索引
sid.addField("id", "uniqueKey-" + sb.getIndexType() + "-" + sb.getId());
if(StringUtils.isEmpty(sb.getId())){
throw new SearchInitException("you must give a id");
}
sid.addField("pkId", sb.getId());
if(StringUtils.isEmpty(sb.getKeyword())){
throw new SearchInitException("you must give a keyword");
}
sid.addField("keyword", sb.getKeyword());
if(StringUtils.isEmpty(sb.getOwerId())){
throw new SearchInitException("you must give a owerId");
}
sid.addField("owerId", sb.getOwerId());
if(StringUtils.isEmpty(sb.getOwerName())){
throw new SearchInitException("you must give a owerName");
}
sid.addField("owerName", sb.getOwerName());
if(StringUtils.isEmpty(sb.getLink())){
throw new SearchInitException("you must give a link");
}
sid.addField("link", sb.getLink());
if(StringUtils.isEmpty(sb.getCreateDate())){
throw new SearchInitException("you must give a createDate");
}
sid.addField("createDate", sb.getCreateDate());
sid.addField("indexType", getIndexType(sb));
String[] doIndexFields = sb.getDoIndexFields();
Map<String, String> values = sb.getIndexFieldValues();
if(doIndexFields != null && doIndexFields.length > 0){
for(String f : doIndexFields){
//匹配動態字段
sid.addField(f + "_message", values.get(f));
}
}
sids.add(sid);
}
solrServer.add(sids);
solrServer.commit();
}
public synchronized void deleteIndex(SearchBean bean) throws Exception {
if(bean == null){
logger.warn("Get search bean is empty!");
return;
}
String id = bean.getId();
if(StringUtils.isEmpty(id)){
logger.warn("get id and id value from bean is empty!");
return;
}
SolrServer server = getSolrServer();
UpdateResponse ur = server.deleteByQuery("pkId:" + id);
logger.debug("delete all indexs! UpdateResponse is '{}'! execute for '{}'ms!", ur, ur.getElapsedTime());
server.commit();
}
public synchronized void deleteIndexs(List<SearchBean> beans) throws Exception {
if(beans == null){
logger.warn("Get beans is empty!");
return;
}
for(SearchBean bean : beans){
this.deleteIndex(bean);
}
}
public PaginationSupport doSearch(List<SearchBean> beans, boolean isHighlighter, int start, int num) throws Exception {
if(beans == null || beans.isEmpty()){
logger.debug("given search beans is empty!");
return PaginationUtils.getNullPagination();
}
List queryResults = new ArrayList();
StringBuffer query_ = new StringBuffer();
for(SearchBean bean : beans){
//要進行檢索的字段
String[] doSearchFields = bean.getDoSearchFields();
if(doSearchFields == null || doSearchFields.length == 0)
continue;
for(int i = 0; i < doSearchFields.length; i++){
String f = doSearchFields[i];
query_.append("(").append(f).append("_message:*").append(bean.getKeyword()).append("*").append(")");
if(i + 1 != doSearchFields.length)
query_.append(" OR ");
}
}
if(StringUtils.isEmpty(query_.toString())){
logger.warn("query string is null!");
return PaginationUtils.getNullPagination();
}
SolrQuery query = new SolrQuery();
query.setQuery(query_.toString());
query.setStart(start == -1 ? 0 : start);
query.setRows(num == -1 ? 100000000 : num);
query.setFields("*", "score");
if(isHighlighter){
query.setHighlight(true).setHighlightSimplePre(getHtmlPrefix()).setHighlightSimplePost(getHtmlSuffix());
query.setHighlightSnippets(2);
query.setHighlightFragsize(1000);
query.setParam("hl.fl", "*");
}
QueryResponse response = getSolrServer().query(query);
SolrDocumentList sd = response.getResults();
for(Iterator it = sd.iterator(); it.hasNext(); ){
SolrDocument doc = (SolrDocument) it.next();
String indexType = doc.get("indexType").toString();
SearchBean result = super.getSearchBean(indexType, beans);
try {
result.setId(doc.getFieldValue("pkId").toString());
result.setLink(doc.getFieldValue("link").toString());
result.setOwerId(doc.getFieldValue("owerId").toString());
result.setOwerName(doc.getFieldValue("owerName").toString());
result.setCreateDate(doc.getFieldValue("createDate").toString());
result.setIndexType(doc.getFieldValue("indexType").toString());
String keyword = StringUtils.EMPTY;
if(isHighlighter){
String id = (String) doc.getFieldValue("id");
List temp = response.getHighlighting().get(id).get("keyword");
if(temp != null && !temp.isEmpty()){
keyword = temp.get(0).toString();
}
}
if(StringUtils.isEmpty(keyword))
keyword = doc.getFieldValue("keyword").toString();
result.setKeyword(keyword);
//要進行檢索的字段
String[] doSearchFields = result.getDoSearchFields();
if(doSearchFields == null || doSearchFields.length == 0)
continue;
Map<String, String> extendValues = new HashMap<String, String>();
for(String field : doSearchFields){
String value = doc.getFieldValue(field + "_message").toString();
if(isHighlighter){
String id = (String) doc.getFieldValue("id");
List temp = response.getHighlighting().get(id).get(field + "_message");
if(temp != null && !temp.isEmpty()){
value = temp.get(0).toString();
}
}
extendValues.put(field, value);
}
result.setSearchValues(extendValues);
} catch (Exception e) {
logger.error(e.getMessage(), e);
}
queryResults.add(result);
}
PaginationSupport paginationSupport = PaginationUtils.makePagination(queryResults, Long.valueOf(sd.getNumFound()).intValue(), num, start);
return paginationSupport;
}
public synchronized void deleteIndexsByIndexType(Class<? extends SearchBean> clazz) throws Exception {
String indexType = getIndexType(BeanUtils.instantiate(clazz));
this.deleteIndexsByIndexType(indexType);
}
public synchronized void deleteIndexsByIndexType(String indexType) throws Exception {
SolrServer server = getSolrServer();
UpdateResponse ur = server.deleteByQuery("indexType:" + indexType);
logger.debug("delete all indexs! UpdateResponse is '{}'! execute for '{}'ms!", ur, ur.getElapsedTime());
server.commit();
}
public synchronized void deleteAllIndexs() throws Exception {
SolrServer server = getSolrServer();
UpdateResponse ur = server.deleteByQuery("*:*");
logger.debug("delete all indexs! UpdateResponse is '{}'! execute for '{}'ms!", ur, ur.getElapsedTime());
server.commit();
}
public void updateIndex(SearchBean searchBean) throws Exception {
this.updateIndexs(Collections.singletonList(searchBean));
}
/**
* 更新索引<br/>
* 在solr中更新索引也就是創建索引(當有相同ID存在的時候,僅僅更新,否則新建)<br/>
* {@link SolrSearchEngine#doIndex(java.util.List)}
*
* @param searchBeans 需要更新的beans
* @throws Exception
*/
public void updateIndexs(List<SearchBean> searchBeans) throws Exception {
this.doIndex(searchBeans);
}
public void setServer(String server) {
this.server = server;
}
}
關於solr服務端的配置,我想說的就是那個schema.xml文件的配置:
1.這裏我配置了幾個共有的字段,如下:
<!-- start my solr -->
<field name="pkId" type="string" indexed="true" stored="true"/>
<field name="keyword" type="string" indexed="true" stored="true"/>
<field name="owerId" type="string" indexed="true" stored="true"/>
<field name="owerName" type="string" indexed="true" stored="true"/>
<field name="link" type="string" indexed="true" stored="true"/>
<field name="createDate" type="string" indexed="true" stored="true"/>
<field name="indexType" type="string" indexed="true" stored="true"/>
<!-- end my solr -->
這些是一些固定字段,也是每個對象都通用的.
<!-- a dynamic field, match all fields what end with _solr -->
<dynamicField name="*_message" type="paodingAnalyzer" indexed="true" stored="true"/>
這個是動態匹配字段,比如說我有一個對象其實一個字段是真實姓名(truename),那麼在solr索引中的字段名稱就叫(username_message).這樣就能匹配起來了,so easy!
再談談solr使用分詞,暫時我使用的是庖丁分詞(paoding),需要的可以去網上找找,osc上就有的.
需要在solr的schema.xml添加一個字段類型:
<!-- paoding -->
<fieldType name="paodingAnalyzer" class="solr.TextField">
<analyzer class="net.paoding.analysis.analyzer.PaodingAnalyzer"></analyzer>
</fieldType>
然後在你需要使用分詞的字段的配置上,修改type="paodingAnalyzer".跟上面的動態字段一致.
可以檢查一下是否配置正確:
訪問http://192.168.1.118/solr/admin/analysis.jsp?highlight=on
安裝以下圖片說明操作:
好了,over here.具體對索引的新增.刪除.更新.以及查詢的操作見上面的代碼,相信對於沉浸在oschina這麼多年的你們,這些都是小case了.