Lucene Hello World

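First, decide which files to index. Put all the files to be indexed under E:/lucene/test.

The directory contains a.txt, b.txt, c.txt and d.txt; their contents are shown in the figure: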

Choosing the development tool and libraries

Development tool

Eclipse 3.2

Libraries

lucene-demos-1.9-final.jar

lucene-core-1.9-final.jar

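4.6 Developing the Lucene example

Open Eclipse and create a new Java project. The project contains three classes:

Constants.java holds constants such as the path of the files to be indexed and the location where the index is stored;

LuceneIndex.java builds the index over the files;

LuceneSearch.java searches the index.

The project also references the libraries lucene-demos-1.9-final.jar and lucene-core-1.9-final.jar.

4.6.1 Building the index: LuceneIndex.java

Creating Constants.java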

package test;

public class Constants {

    // Directory that holds the files to be indexed
    public final static String INDEX_FILE_PATH = "e:/lucene/test";

    // Directory where the index is stored
    public final static String INDEX_STORE_PATH = "e:/lucene/index";

}
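Because both paths are hard-coded, it can help to check the source directory before indexing: writeToIndex() in LuceneIndex.java does nothing at all when INDEX_FILE_PATH is not a directory. The following is a minimal sketch of such a check using only java.io.File. The class PathCheck and its method problemWith are hypothetical helpers and are not part of the original project.

```java
import java.io.File;

class PathCheck {
    // Returns a short description of what is wrong with the source directory,
    // or null when it exists and is a directory.
    static String problemWith(File sourceDir) {
        if (!sourceDir.exists()) {
            return "does not exist: " + sourceDir;
        }
        if (!sourceDir.isDirectory()) {
            return "is not a directory: " + sourceDir;
        }
        return null;
    }

    public static void main(String[] args) {
        // Same value as Constants.INDEX_FILE_PATH
        File source = new File("e:/lucene/test");
        String problem = problemWith(source);
        System.out.println(problem == null
                ? "Source directory looks usable"
                : "Source directory " + problem);
    }
}
```

Calling a check like this at the start of main makes a mistyped path visible immediately instead of producing an empty index.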

Creating LuceneIndex.java

package test;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class LuceneIndex {

    // The index writer
    private IndexWriter writer = null;

    public static void main(String[] args) throws Exception {
        // Create the indexer
        LuceneIndex indexer = new LuceneIndex();
        // Build the index and measure how long it takes
        Date start = new Date();
        indexer.writeToIndex();
        Date end = new Date();
        System.out.println("Indexing took " + (end.getTime() - start.getTime()) + " ms");
        indexer.close();
    }

    public LuceneIndex() {
        try {
            // true: create a new index, replacing any existing one
            writer = new IndexWriter(Constants.INDEX_STORE_PATH,
                    new StandardAnalyzer(), true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Wrap a file in a Document with a "contents" field and a "path" field
    private Document getDocument(File f) throws Exception {
        Document doc = new Document();
        FileInputStream is = new FileInputStream(f);
        Reader reader = new BufferedReader(new InputStreamReader(is));
        // Field.Text and Field.Keyword are marked deprecated in Lucene 1.9
        doc.add(Field.Text("contents", reader));
        doc.add(Field.Keyword("path", f.getAbsolutePath()));
        return doc;
    }

    // Add every file in the source directory to the index
    public void writeToIndex() throws Exception {
        File folder = new File(Constants.INDEX_FILE_PATH);
        if (folder.isDirectory()) {
            String[] files = folder.list();
            for (int i = 0; i < files.length; i++) {
                File file = new File(folder, files[i]);
                Document doc = getDocument(file);
                System.out.println("Indexing: " + file);
                writer.addDocument(doc);
            }
        }
    }

    public void close() throws Exception {
        writer.close();
    }
}
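Conceptually, indexing turns each file into a set of terms and records which files contain each term. The sketch below illustrates that idea with a tiny in-memory inverted index in plain Java. It is not how Lucene implements its index: the class TinyInvertedIndex, its methods, the whitespace tokenizer and the sample texts are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class TinyInvertedIndex {
    // term -> paths of the documents that contain the term
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Splits the text on whitespace and records the path under every term
    void addDocument(String path, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(path);
        }
    }

    // Returns the paths of all documents containing the term, in insertion order
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        // Invented sample texts, not the content of the original files
        index.addDocument("a.txt", "hello lucene");
        index.addDocument("b.txt", "hello world");
        System.out.println(index.search("hello")); // [a.txt, b.txt]
        System.out.println(index.search("world")); // [b.txt]
    }
}
```

A lookup only touches the entry for the requested term, which is why searching an index is much cheaper than scanning every file.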

4.6.2 Searching the index: LuceneSearch.java

Creating LuceneSearch.java

package test;

import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class LuceneSearch {

    // The searcher that runs queries against the index
    private IndexSearcher searcher = null;

    // The most recently parsed query
    private Query query = null;

    public static void main(String[] args) throws Exception {
        LuceneSearch test = new LuceneSearch();
        Hits h = null;
        h = test.search("中國");
        test.printResult(h);
        h = test.search("人民");
        test.printResult(h);
        h = test.search("共和國");
        test.printResult(h);
    }

    public LuceneSearch() {
        try {
            searcher = new IndexSearcher(IndexReader
                    .open(Constants.INDEX_STORE_PATH));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public final Hits search(String keyword) {
        System.out.println("Searching for keyword: " + keyword);
        try {
            // Parse the keyword into a Query on the "contents" field
            query = QueryParser.parse(keyword, "contents",
                    new StandardAnalyzer());
            Date start = new Date();
            Hits hits = searcher.search(query);
            Date end = new Date();
            System.out.println("Search finished in " + (end.getTime() - start.getTime())
                    + " ms");
            return hits;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    public void printResult(Hits h) {
        // h is null when search() failed
        if (h == null || h.length() == 0) {
            System.out.println("Sorry, no results were found.");
        } else {
            for (int i = 0; i < h.length(); i++) {
                try {
                    Document doc = h.doc(i);
                    System.out.print("Result " + i + ", file: ");
                    System.out.println(doc.get("path"));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            System.out.println("--------------------------");
        }
    }
}
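The search time is computed from two Date objects, which have millisecond precision at best. The granularity of the underlying clock depends on the operating system and can be several milliseconds, so a very fast search may be reported as 0 ms. A minimal sketch of an alternative based on System.nanoTime() follows. The class Stopwatch and its method elapsedMillis are hypothetical helpers, not part of the original code, and the timed task is only a stand-in for searcher.search(query).

```java
class Stopwatch {
    // Runs the task and returns the elapsed time in milliseconds,
    // measured with System.nanoTime(), which is meant for elapsed-time measurement.
    static double elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        double ms = elapsedMillis(() -> {
            // Stand-in workload; in LuceneSearch this would be the search call
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("Elapsed: " + ms + " ms");
    }
}
```

For a single run the number is still noisy; repeating the search several times and averaging gives a more stable figure.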

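4.6.3 Analyzing the results

Run LuceneIndex.java.

The console prints the following:

Indexing: e:/lucene/test/a.txt
Indexing: e:/lucene/test/b.txt
Indexing: e:/lucene/test/c.txt
Indexing: e:/lucene/test/d.txt
Indexing took 94 ms

Open the E:/lucene/index directory to see the index that was just built, as shown in the figure:

Running the search

The index has been built successfully. Now search it with the keywords "中國", "人民" and "共和國" in turn.

Run LuceneSearch.java in Eclipse.

The console shows the following search results: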

Searching for keyword: 中國
Search finished in 16 ms
Result 0, file: e:/lucene/test/b.txt
--------------------------
Searching for keyword: 人民
Search finished in 0 ms
Result 0, file: e:/lucene/test/a.txt
Result 1, file: e:/lucene/test/c.txt
Result 2, file: e:/lucene/test/b.txt
--------------------------
Searching for keyword: 人
Search finished in 15 ms
Result 0, file: e:/lucene/test/a.txt
Result 1, file: e:/lucene/test/c.txt
Result 2, file: e:/lucene/test/b.txt
--------------------------
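In the output above, the keywords 人民 and 人 return the same three files. One plausible explanation, stated here as an assumption rather than something the original text confirms, is that StandardAnalyzer in this Lucene version emits each Chinese character as a separate token, and that a multi-character keyword is then matched as a phrase of consecutive characters. The sketch below imitates that behaviour in plain Java. The class CharTokenSketch, its methods and the sample text are invented for illustration and do not use the Lucene API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CharTokenSketch {
    // Assumed analyzer behaviour: every character becomes its own token
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(String.valueOf(text.charAt(i)));
        }
        return tokens;
    }

    // Phrase match: the keyword's tokens must appear consecutively in the document
    static boolean matches(String document, String keyword) {
        List<String> doc = tokenize(document);
        List<String> query = tokenize(keyword);
        return !query.isEmpty() && Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        // Invented sample text, not the content of the original files
        String sample = "中華人民共和國";
        System.out.println(matches(sample, "人民")); // true
        System.out.println(matches(sample, "人"));   // true
        System.out.println(matches(sample, "民人")); // false
    }
}
```

Under this assumption, any file that contains 人民 also contains 人, so a search for 人 finds at least the same files.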

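First, search is a service. This example only uses a short piece of code to demonstrate the API, which is still far from a real search service. User-interface friendliness, the presentation of results, response time and the ability to analyze keywords are all criteria by which a search engine is judged.

Second, for a simple search engine it is enough to store the index on a particular disk; in this example a single directory holds the index. A large clustered search engine, however, can produce hundreds of gigabytes of logs alone every day, to say nothing of the index files. A single directory clearly cannot hold that, so distributed storage connected through storage networking should be used instead.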
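One simple way to spread an index over several storage nodes is to assign each document to a shard by hashing its path. The sketch below shows only that routing step. The class ShardRouter, the shard count and the choice of hashing the file path are assumptions for illustration; a real distributed search system also has to handle replication, merging results from every shard, and more.

```java
class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Maps a document path to a shard number in the range [0, shardCount)
    int shardFor(String path) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(path.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(4); // hypothetical cluster of 4 index shards
        String[] paths = {"a.txt", "b.txt", "c.txt", "d.txt"};
        for (String path : paths) {
            System.out.println(path + " -> shard " + router.shardFor(path));
        }
    }
}
```

Because the mapping depends only on the path, the same document always lands on the same shard, while a query has to be sent to every shard and the results combined.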

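Of course, for an ordinary e-commerce website that is not a specialist one, search is only one of the features it offers, and building a large clustered search engine is not necessarily required.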
