Lucene 2.0 to 3.3 Upgrade Development Notes, Part 1: Building the Index

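Today I had planned to write up the rest of the OSCache material.
But the project urgently needs its search engine upgraded to a newer version, so the priorities changed.
Why 3.3? Because 3.5 has been out for only two months, while 3.3 has been out for half a year.
So the project team considers 3.3 more stable than 3.5...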

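For what we use, the differences between 2.0 and 3.3 are small.
Jars used:
lucene-core-3.3.0.jar
IKAnalyzer3.2.5Stable.jar
lucene-highlighter-3.0.1.jar
Back when we built the 2.0 version we pulled in a lot of jars; now Lucene itself needs only the single core jar.
IK was likewise upgraded to the matching 3.2.5 stable release.
The highlighter was also upgraded to a 3.0+ version. Searching for the jar file names online turns up download links easily.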

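Building the index:

Objects used and basic steps:

Analyzer: the analyzer that splits text into terms.

IndexWriter: the object used to create and update the index.

Document: the document being written, the basic unit IndexWriter works with. (One alert record can be represented by one document.)

Field: a Document can hold multiple Fields; the Field is our basic unit of storage. (PCIP and the like can each be treated as a Field.) Note: field names are case-sensitive, so it is best to pick one convention and stick to it.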

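A. Create the IndexWriter. It depends on an Analyzer and a storage path, and its parameters can be set through IndexWriterConfig.

B. Create an empty document: Document doc = new Document();

C. Add some Fields to the empty document: doc.add(new Field("PCIP", fields[0], Field.Store.YES, Field.Index.ANALYZED_NO_NORMS));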

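Notes:

The Field parameter Store is independent of indexing. It controls whether the original text is additionally stored so that it can be read back from search results.

NO: not stored.

YES: stored.

The Field parameter Index:

NO: not indexed.

ANALYZED: tokenized, then indexed.

NOT_ANALYZED: indexed as a single term, without tokenizing.

ANALYZED_NO_NORMS: tokenized and indexed, norms not stored.

NOT_ANALYZED_NO_NORMS: indexed without tokenizing, norms not stored.

Every option except NO indexes the field, which makes it searchable. Norms hold the information needed for boosting (index-time boosts and length normalization); keeping norms may use more memory.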

D. Add the Document to the IndexWriter: writer.addDocument(doc);

E. Optimize the index (optional: optimizing is relatively slow, but it gives the best query speed afterwards): //writer.optimize();
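
To keep the Field.Index options from the note under step C straight, here is a small plain-Java model of them. It is only a sketch: the IndexOption enum and its three flags are names made up for this note and are not Lucene classes; the flag values simply restate the list above.

```java
// Plain-Java model of the Field.Index options described above.
// IndexOption and its flags are hypothetical names for this sketch, not Lucene API.
class IndexOptionDemo {

    enum IndexOption {
        NO(false, false, false),
        ANALYZED(true, true, true),
        NOT_ANALYZED(true, false, true),
        ANALYZED_NO_NORMS(true, true, false),
        NOT_ANALYZED_NO_NORMS(true, false, false);

        final boolean indexed;   // can the field be searched?
        final boolean tokenized; // is the value run through the Analyzer first?
        final boolean norms;     // are norms (boost information) kept?

        IndexOption(boolean indexed, boolean tokenized, boolean norms) {
            this.indexed = indexed;
            this.tokenized = tokenized;
            this.norms = norms;
        }
    }

    // Counts the options that make a field searchable.
    static int searchableCount() {
        int n = 0;
        for (IndexOption o : IndexOption.values()) {
            if (o.indexed) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        for (IndexOption o : IndexOption.values()) {
            System.out.println(o + ": indexed=" + o.indexed
                    + ", tokenized=" + o.tokenized + ", norms=" + o.norms);
        }
        System.out.println("searchable options: " + searchableCount());
    }
}
```

As a rule of thumb, a field that must be matched exactly, such as an ID, would normally use one of the NOT_ANALYZED options, while a field that is only displayed can use NO together with Store.YES.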


Imports needed:

import java.io.File;
import java.io.IOException;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer; // IK Analyzer 3.2.x


The main changes I ran into in my own usage are flagged by the inline comments in the code below.

   @SuppressWarnings("deprecation")
   public void BuildLawyerPublic(ResponseList<SearchLawyer> lawyerList,
           String path, boolean overwrite, Date start, Date end) {
       IndexWriter indexWriter = null;

       try {
           try {
               Analyzer analyzer = new IKAnalyzer(); // the analyzer class changes to the new IK one
               // new: IndexWriter settings now live in an IndexWriterConfig
               IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_33, analyzer);
               Directory dir = FSDirectory.open(new File(path)); // index location
               // always recreate the index (the overwrite parameter is not consulted)
               indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

               indexWriter = new IndexWriter(dir, indexWriterConfig); // new constructor
           } catch (IOException e) {
               e.printStackTrace();
               this.BuildAll();
               return;
           }

 

           long count = 0;
           long startTime = new Date().getTime();
           for (int i = 0; i < lawyerList.size(); i++) {
               // Hourly or daily index build (compare paths by value, not by reference)
               if (path.equals(hoursDir) || path.equals(dailyDir)) {
                   SearchLawyer lawyerBean = new SearchLawyer();
                   BeanUtils.copyProperties(lawyerList.get(i), lawyerBean);
                   // Joined before the window start: update the existing entry
                   if (((SearchLawyer) lawyerList.get(i)).getJoinDate()
                           .before(start)) {
                       // updateDocument deletes every document matching the term, then adds
                       // the new one; the term's field must be an indexed field of those
                       // documents for the delete to match
                       Term term = new Term("consultID", String
                               .valueOf(((SearchLawyer) lawyerList.get(i))
                                       .getLawyerID()));
                       indexWriter.updateDocument(term,
                               convert2Doc(lawyerBean));
                       update_count++;
                   } else { // otherwise add a new entry
                       indexWriter.addDocument(convert2Doc(lawyerBean));
                       add_count++;
                   }
               } else {
                   // First-time (full) index build
                   SearchLawyer lawyerBean = (SearchLawyer) lawyerList.get(i);
                   if (lawyerBean != null) {
                       Document hdoc = convert2Doc(lawyerBean);
                       indexWriter.addDocument(hdoc);
                   }
                   count++;
               }
           }

           indexWriter.optimize();
           indexWriter.close();
           long endTime = new Date().getTime();
           logger.debug("It takes " + (endTime - startTime)
                   + "ms index count :" + count
                   + " consultpublic count:==============="
                   + lawyerList.size());

       } catch (CorruptIndexException e) {
           logger.error("{}", e);
           e.printStackTrace();
       } catch (LockObtainFailedException e) {
           logger.error("{}", e);
           e.printStackTrace();
       } catch (IOException e) {
           logger.error("{}", e);
           e.printStackTrace();
       }
   }
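
The build method above chooses between an incremental and a full build by comparing path with the configured index directories. That comparison needs to be by value: on String, == compares object identity, so two equal paths held in different objects do not match. A minimal sketch, with a made-up HOURS_DIR constant standing in for the class's hoursDir field:

```java
// Sketch of comparing index paths by value.
// PathCompareDemo, HOURS_DIR and isHourly are hypothetical names for this note.
class PathCompareDemo {

    static final String HOURS_DIR = "index/hours"; // stand-in value, not from the project

    // Calling equals on the constant keeps this safe when path is null.
    static boolean isHourly(String path) {
        return HOURS_DIR.equals(path);
    }

    public static void main(String[] args) {
        // A distinct String object with the same characters,
        // as you would get from a config file or string concatenation.
        String fromConfig = new String("index/hours");

        System.out.println(fromConfig == HOURS_DIR); // false: different objects
        System.out.println(isHourly(fromConfig));    // true: same characters
    }
}
```

With == the code can appear to work as long as every caller passes the very same constant object, and then silently fall into the full-build branch once a path arrives from anywhere else.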

 

 

 

   // Convert a bean to a Lucene Document
   public Document convert2Doc(SearchLawyer lawyer) {
       Document doc = new Document();
       try {
           // Lawyer primary key
           doc.add(new Field("lawyerID", String.valueOf(lawyer.getLawyerID()),
                   Field.Store.YES, Field.Index.NO)); // NO: stored only, not indexed
           // Lawyer name
           doc.add(new Field("fullName", String.valueOf(lawyer.getFullName()),
                   Field.Store.YES, Field.Index.ANALYZED)); // ANALYZED: tokenized, then indexed
           // Lawyer phone
           doc.add(new Field("phone", String.valueOf(lawyer.getPhone()),
                   Field.Store.YES, Field.Index.NO));
           // Lawyer photo
           doc.add(new Field("photo", String.valueOf(lawyer
                   .getPhoto()), Field.Store.YES, Field.Index.NO));
           // Lawyer introduction
           doc.add(new Field("lawyer_Intro", String.valueOf(lawyer
                   .getLawyer_Intro()), Field.Store.YES,
                   Field.Index.ANALYZED));
           // Law firm name
           doc.add(new Field("officeName", String
                   .valueOf(lawyer.getOfficeName()), Field.Store.YES,
                   Field.Index.NOT_ANALYZED)); // NOT_ANALYZED: indexed without tokenizing
           // Lawyer's province
           doc.add(new Field("provinceName", String.valueOf(lawyer
                   .getProvinceName()), Field.Store.YES,
                   Field.Index.NOT_ANALYZED));
           // Lawyer's city
           doc.add(new Field("cityName", String.valueOf(lawyer.getCityName()),
                   Field.Store.YES, Field.Index.NOT_ANALYZED));
           // Specialty
           doc.add(new Field("specialtyName", String
                   .valueOf(lawyer.getSpecialtyName()), Field.Store.YES,
                   Field.Index.ANALYZED));
           // Time the lawyer was added (HH: 24-hour clock)
           doc.add(new Field("joinDate", DateUtil.getDateTime(
                   "yyyy-MM-dd HH:mm:ss", lawyer.getJoinDate()),
                   Field.Store.YES, Field.Index.NO));

       } catch (Exception e) {
           logger.error("{}", e);
           e.printStackTrace();
       }
       return doc;
   }
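
Two details of the field values in convert2Doc are easy to miss. Assuming DateUtil.getDateTime follows SimpleDateFormat pattern letters (DateUtil is a project helper, so this is an assumption), hh is the 12-hour clock and HH the 24-hour clock. And String.valueOf on a null reference yields the text "null", which is then stored in the index like any other value. The sketch below uses only the JDK; FieldValueDemo and its helper methods are hypothetical names.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// JDK-only sketch of two field-value pitfalls; all names here are hypothetical.
class FieldValueDemo {

    // Formats a date in UTC so the result does not depend on the machine's time zone.
    static String format(String pattern, Date date) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(date);
    }

    // Returns an empty string instead of the literal text "null" for missing values.
    static String safeValue(Object value) {
        return value == null ? "" : String.valueOf(value);
    }

    public static void main(String[] args) {
        Date afternoon = new Date(55800000L); // 1970-01-01 15:30:00 UTC

        System.out.println(format("yyyy-MM-dd hh:mm:ss", afternoon)); // 1970-01-01 03:30:00
        System.out.println(format("yyyy-MM-dd HH:mm:ss", afternoon)); // 1970-01-01 15:30:00

        String missing = null;
        System.out.println(String.valueOf(missing));      // null (four characters of text)
        System.out.println(safeValue(missing).isEmpty()); // true
    }
}
```

Whether an empty string or skipping the field entirely is the better choice for a missing value depends on how the search results are displayed; the sketch only shows what ends up in the stored value.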
