4. Getting Started with Lucene

Original

2018-12-07 23:52

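Apache Lucene is a high-performance, scalable full-text search engine toolkit written in Java. It can easily be embedded in an application to give that application full-text indexing and search. Lucene's goal is to add full-text search to small and medium-sized applications.
Lucene's principal author, Doug Cutting, is a veteran expert in full-text indexing and search.
Release history: first release in March 2000; joined Apache in September 2001; 1.4 final in July 2004; 2.9.1 (JDK 1.4) and 3.0 (JDK 1.5) in November 2009; 4.10.4 in March 2015; 5.5.0 in February 2016.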

Importing the libraries

1. Convert the text content into a Document object
   The text is stored as a field of the Document object
2. Prepare an IndexWriter
3. Use the IndexWriter to add the Document to the buffer and commit
   addDocument
   commit
   close
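import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test; // assumes JUnit 4

// The class name is a placeholder; the original shows only the fields and methods.
public class LuceneHelloTest {

    // The data to index is hard-coded for now; a real application would supply it.
    String doc1 = "hello world";
    String doc2 = "hello java world";
    String doc3 = "hello lucene world";
    // Location of the index on disk
    private String path = "F:/eclipse/workspace/lucene/index/hello";

    @Test
    public void testCreate() throws IOException {
        // 2. Prepare the IndexWriter
        // Analyzer that splits the text into terms
        Analyzer analyzer = new StandardAnalyzer();
        // Configuration object for the index writer
        IndexWriterConfig conf = new IndexWriterConfig(analyzer);
        // FSDirectory keeps the index on the file system (FS = file system).
        // try-with-resources closes the writer and the directory even if indexing fails.
        try (Directory d = FSDirectory.open(Paths.get(path));
             IndexWriter indexWriter = new IndexWriter(d, conf)) {
            // Debug output: confirms the writer was created
            System.out.println(indexWriter);

            // 1. Convert the text content into Document objects,
            //    each with a title field and a content field
            Document document1 = new Document();
            document1.add(new TextField("title", "doc1", Store.YES));
            document1.add(new TextField("content", doc1, Store.YES));

            Document document2 = new Document();
            document2.add(new TextField("title", "doc2", Store.YES));
            document2.add(new TextField("content", doc2, Store.YES));

            Document document3 = new Document();
            document3.add(new TextField("title", "doc3", Store.YES));
            document3.add(new TextField("content", doc3, Store.YES));

            // 3. Add the Documents to the writer's buffer and commit
            indexWriter.addDocument(document1);
            indexWriter.addDocument(document2);
            indexWriter.addDocument(document3);
            indexWriter.commit();
            // close() is called automatically at the end of the try block
        }
    }
}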

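// OpenMode.CREATE wipes the existing index each time the writer is opened,
// so every run starts from an empty index and re-adds the documents.
// The default, OpenMode.CREATE_OR_APPEND, keeps the existing documents and appends the new ones.
// Set this on conf before passing it to new IndexWriter(d, conf); it requires
// import org.apache.lucene.index.IndexWriterConfig.OpenMode;
conf.setOpenMode(OpenMode.CREATE);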

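Using a GUI client
4.2.2. Searching the index
1. Wrap the submitted search text in a query object
2. Prepare an IndexSearcher
3. Pass the query object to the IndexSearcher; the search returns only document IDs (DocID)
4. Pass each DocID to the IndexSearcher to fetch the document
5. Convert the document into the object the front end needs: Document -> Article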
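As a conceptual aid for steps 3 and 4, here is a minimal pure-Java sketch of why a search first yields document IDs and the stored text is fetched in a second step. It is not Lucene code and does not reflect Lucene's internal data structures; `ToySearch`, `add`, `search`, and `doc` are hypothetical names chosen for this illustration, and the "analysis" is a simple lower-case and whitespace split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: a term -> DocID map plus a separate list of stored texts.
final class ToySearch {
    // Stored documents; the list position plays the role of the DocID.
    private final List<String> storedDocs = new ArrayList<>();
    // Inverted index: term -> IDs of the documents that contain it.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Stores the text, indexes its terms, and returns the new DocID. */
    int add(String text) {
        int docId = storedDocs.size();
        storedDocs.add(text);
        // Crude stand-in for an analyzer: lower-case, then split on whitespace.
        for (String term : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            List<Integer> ids = postings.computeIfAbsent(term, k -> new ArrayList<>());
            // Record each DocID once per term, even if the term repeats in the text.
            if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                ids.add(docId);
            }
        }
        return docId;
    }

    /** Step 3: a search returns DocIDs only, not the documents themselves. */
    List<Integer> search(String term) {
        List<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null
                ? Collections.<Integer>emptyList()
                : Collections.unmodifiableList(ids);
    }

    /** Step 4: the stored text is fetched separately, by DocID. */
    String doc(int docId) {
        return storedDocs.get(docId);
    }

    public static void main(String[] args) {
        ToySearch index = new ToySearch();
        index.add("hello world");
        index.add("hello java world");
        index.add("hello lucene world");
        for (int docId : index.search("lucene")) {
            System.out.println(docId + " -> " + index.doc(docId)); // 2 -> hello lucene world
        }
    }
}
```

Keeping the term lookup separate from the stored text is what lets a search collect many matching IDs cheaply and load only the documents that are actually displayed.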

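// Add this method to the same test class as testCreate(); it reuses the path field.
// Additional imports to place at the top of the file:
// import org.apache.lucene.index.DirectoryReader;
// import org.apache.lucene.index.IndexReader;
// import org.apache.lucene.queryparser.classic.QueryParser;
// import org.apache.lucene.search.IndexSearcher;
// import org.apache.lucene.search.Query;
// import org.apache.lucene.search.ScoreDoc;
// import org.apache.lucene.search.TopDocs;
@Test
public void testSearch() throws Exception {
    String keyWord = "lucene";

    // 1. Wrap the submitted search text in a query object.
    //    The query parser turns a string into a Query.
    String f = "content"; // default field to search
    Analyzer a = new StandardAnalyzer(); // the search text must be tokenized too, so an analyzer is required
    QueryParser parser = new QueryParser(f, a);
    // The explicit "content:" prefix names the field; it matches the default field here.
    Query query = parser.parse("content:" + keyWord);

    // 2. Prepare the IndexSearcher.
    //    try-with-resources closes the reader and the directory when the search is done.
    try (Directory d = FSDirectory.open(Paths.get(path));
         IndexReader r = DirectoryReader.open(d)) {
        IndexSearcher searcher = new IndexSearcher(r);

        // 3. Pass the query to the IndexSearcher; the result holds only document IDs (DocID).
        TopDocs topDocs = searcher.search(query, 1000); // return at most the top 1000 hits
        System.out.println("Total hits: " + topDocs.totalHits);
        ScoreDoc[] scoreDocs = topDocs.scoreDocs; // one entry (with its DocID) per returned hit

        // 4. Pass each DocID to the IndexSearcher to fetch the document.
        for (ScoreDoc scoreDoc : scoreDocs) {
            int docId = scoreDoc.doc;
            Document document = searcher.doc(docId);
            // 5. Convert the document into the object the front end needs: Document -> Article
            //    (here the stored fields are simply printed)
            System.out.println("=======================================");
            System.out.println("title:" + document.get("title")
                    + ",content:" + document.get("content"));
        }
    }
}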
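Step 5 is only stubbed out with `println` in the search test. One possible shape for the Document -> Article conversion is sketched below. Only the name `Article` comes from the text; its fields, the `ArticleMapper` class, and the `fromFields` method are assumptions made for this illustration. The mapper takes a generic "field name -> stored value" lookup so that the sketch runs without Lucene on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: maps stored index fields to a front-end Article object.
final class ArticleMapper {
    private ArticleMapper() {}

    /** Assumed front-end object; its fields mirror the indexed "title" and "content" fields. */
    static final class Article {
        private final String title;
        private final String content;

        Article(String title, String content) {
            this.title = title;
            this.content = content;
        }

        public String getTitle() { return title; }
        public String getContent() { return content; }

        @Override
        public String toString() {
            return "Article{title='" + title + "', content='" + content + "'}";
        }
    }

    /**
     * Builds an Article from any lookup that returns the stored value of a field,
     * or null when the field is absent.
     */
    static Article fromFields(Function<String, String> fieldLookup) {
        String title = fieldLookup.apply("title");
        String content = fieldLookup.apply("content");
        // Replace missing fields with empty strings so callers never see null.
        return new Article(title == null ? "" : title, content == null ? "" : content);
    }

    public static void main(String[] args) {
        // Stand-in for a fetched document: the stored fields of doc3 from the indexing test.
        Map<String, String> stored = new HashMap<>();
        stored.put("title", "doc3");
        stored.put("content", "hello lucene world");

        Article article = fromFields(stored::get);
        System.out.println(article); // Article{title='doc3', content='hello lucene world'}
    }
}
```

Inside the search loop, the same mapper could be called as `ArticleMapper.fromFields(document::get)`, since Lucene's `Document.get(String)` returns the stored string value of a field, or null if the document has no such stored field.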
