Lucene (3): Index Field Options

Reposted from: http://blog.csdn.net/ayi_5788/article/details/52121434

After the previous two posts you should have the basics down. Today we look at the options worth setting on an index field.
Start with one of Field's constructors:


  /**
   * Create a field by specifying its name, value and how it will
   * be saved in the index. Term vectors will not be stored in the index.
   * 
   * @param name The name of the field
   * @param value The string to process
   * @param store Whether <code>value</code> should be stored in the index
   * @param index Whether the field should be indexed, and if so, if it should
   *  be tokenized before indexing 
   * @throws NullPointerException if name or value is <code>null</code>
   * @throws IllegalArgumentException if the field is neither stored nor indexed 
   */
  public Field(String name, String value, Store store, Index index) {
    this(name, value, store, index, TermVector.NO);
  }

We have several settings available when adding a Field to a Document; what do they all mean?
name: the field name, which is self-explanatory
value: the field value, also self-explanatory
store and index need more explanation. Their possible values are:
Field.Store.YES or NO (the store option)
YES means the field's content is stored in the index in full, so the original text can be reproduced from a search hit.
NO means the content is not stored in the index; the field can still be indexed, but its original text cannot be fully recovered.
Field.Index (the index option)
Index.ANALYZED: tokenize and index the value; suitable for titles, body text, and so on.
Index.NOT_ANALYZED: index the value without tokenizing it; suitable for exact-match fields such as ID card numbers, names, or IDs.
Index.ANALYZED_NO_NORMS: tokenize but do not store norms; norms hold index-time boost and field-length normalization information.
Index.NOT_ANALYZED_NO_NORMS: neither tokenize nor store norms.
Index.NO: do not index the field at all, so it cannot be searched.
A minimal sketch showing all five Index options together follows below.
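As a quick illustration, here is a small 3.5-style sketch that exercises all five Index options on one document. This is my own sketch, not part of the original post; the field names "summary" and "rawSource" and all values are purely illustrative.

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class FieldOptionsSketch {
    public static Document sampleDocument() {
        Document doc = new Document();
        // Exact-match key: indexed as a single term, norms dropped
        doc.add(new Field("id", "1", Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        // Exact-match value, but norms are kept
        doc.add(new Field("author", "Tony", Field.Store.YES, Field.Index.NOT_ANALYZED));
        // Full-text field: tokenized and indexed
        doc.add(new Field("title", "Hello Lucene", Field.Store.YES, Field.Index.ANALYZED));
        // Tokenized, but norms are dropped to save memory
        doc.add(new Field("summary", "a short summary", Field.Store.NO, Field.Index.ANALYZED_NO_NORMS));
        // Stored only: retrievable from a hit, but never searchable
        doc.add(new Field("rawSource", "<xml>original</xml>", Field.Store.YES, Field.Index.NO));
        return doc;
    }
}

The full example that follows exercises a subset of these options against a real index.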
Let's write an example. The pom file is the same as in the previous posts, so it is not repeated; here is the example code:
Lucene 3.5:


package com.darren.lucene35;

import java.io.File;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexUtil {
    private static final String[] ids = { "1", "2", "3" };
    private static final String[] authors = { "Darren", "Tony", "Grylls" };
    private static final String[] titles = { "Hello World", "Hello Lucene", "Hello Java" };
    private static final String[] contents = { "Hello World, I am on my way", "Today is my first day to study Lucene",
            "I like Java" };

    /**
     * Build the index
     */
    public static void index() {
        IndexWriter indexWriter = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(new File("F:/test/lucene/index"));

            // 2. Create the IndexWriter
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
            indexWriter = new IndexWriter(directory, config);

            int size = ids.length;
            for (int i = 0; i < size; i++) {
                // 3. Create the Document object
                Document document = new Document();
                // Note what the four constructor parameters mean
                /**
                 * Create a field by specifying its name, value and how it will be saved in the index. Term vectors will
                 * not be stored in the index.
                 * 
                 * @param name
                 *            The name of the field
                 * @param value
                 *            The string to process
                 * @param store
                 *            Whether <code>value</code> should be stored in the index
                 * @param index
                 *            Whether the field should be indexed, and if so, if it should be tokenized before indexing
                 * 
                 *            public Field(String name, String value, Store store, Index index) { this(name, value,
                 *            store, index, TermVector.NO); }
                 */

                // 4. Add Fields to the Document

                // Store the ID, but neither tokenize it nor store norms (norms hold index-time boost and length-normalization data)
                document.add(new Field("id", ids[i], Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
                // Store the Author without tokenizing it (norms are kept here)
                document.add(new Field("author", authors[i], Field.Store.YES, Field.Index.NOT_ANALYZED));
                // Store the Title and tokenize it
                document.add(new Field("title", titles[i], Field.Store.YES, Field.Index.ANALYZED));
                // Do not store the Content, but tokenize it
                /**
                 * Note: content added from a Reader is not stored by default, which the search below demonstrates:
                 * 
                 * new Field(name, reader)
                 * 
                 * So what if you want to store file content anyway?
                 * 
                 * Read the file out yourself, e.g. into a String, and store it as a string value.
                 */
                document.add(new Field("content", contents[i], Field.Store.NO, Field.Index.ANALYZED));

                // 5. Add the document to the index through the IndexWriter
                indexWriter.addDocument(document);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexWriter != null) {
                    indexWriter.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Search
     */
    public static void search() {
        IndexReader indexReader = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(new File("F:/test/lucene/index"));
            // 2. Create the IndexReader
            indexReader = IndexReader.open(directory);
            // 3. Create the IndexSearcher from the IndexReader
            IndexSearcher indexSearcher = new IndexSearcher(indexReader);
            // 4. Create the search Query
            // Use the default StandardAnalyzer
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);

            // Search for "Lucene" in the content field
            // Create the parser for the query text; the second argument is the field to search
            QueryParser queryParser = new QueryParser(Version.LUCENE_35, "content", analyzer);
            // Create the Query: documents whose content field contains "Lucene"
            Query query = queryParser.parse("Lucene");

            // 5. Search via the searcher and return TopDocs
            TopDocs topDocs = indexSearcher.search(query, 10);
            // 6. Get the ScoreDoc objects from the TopDocs
            ScoreDoc[] scoreDocs = topDocs.scoreDocs;
            for (ScoreDoc scoreDoc : scoreDocs) {
                // 7. Get the concrete Document from the searcher and the ScoreDoc
                Document document = indexSearcher.doc(scoreDoc.doc);
                // 8. Read the values we need from the Document
                System.out.println("id : " + document.get("id"));
                System.out.println("author : " + document.get("author"));
                System.out.println("title : " + document.get("title"));
                /**
                 * Can the content be printed here? Why or why not?
                 */
                System.out.println("content : " + document.get("content"));
            }

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexReader != null) {
                    indexReader.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }
}


I left a question in the code comments. At the moment we add the content field like this:


document.add(new Field("content", contents[i], Field.Store.NO, Field.Index.ANALYZED));

The test code is as follows:


package com.darren.lucene35;

import org.junit.Test;

public class IndexUtilTest {
    @Test
    public void testIndex() {
        IndexUtil.index();
    }

    @Test
    public void testSearch() {
        IndexUtil.search();
    }
}


Now run the tests and look at the output:

id : 2
author : Tony
title : Hello Lucene
content : null

Why is content null? Simply because it was not stored. Let's store it and see:

document.add(new Field("content", contents[i], Field.Store.YES, Field.Index.ANALYZED));

Run the tests again; note that you have to run the index test first, then the search test.

id : 2  
author : Tony  
title : Hello Lucene  
content : Today is my first day to study Lucene  

Now content has a value.
The index options behave analogously and are not walked through one by one here; the sketch below shows the one case worth calling out, querying a NOT_ANALYZED field.
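This is a minimal sketch of my own, not from the original post, and it assumes the index built by the example above at the same path. A NOT_ANALYZED field is indexed as one unmodified term, so a QueryParser query run through StandardAnalyzer would be lowercased and miss a value like "Tony"; a TermQuery with the exact stored term is the reliable way to match it.

import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class ExactMatchSketch {
    public static void searchByAuthor() throws Exception {
        Directory directory = FSDirectory.open(new File("F:/test/lucene/index"));
        IndexReader reader = IndexReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
            // The author field was indexed NOT_ANALYZED, so the whole value "Tony" is a single term
            TopDocs topDocs = searcher.search(new TermQuery(new Term("author", "Tony")), 10);
            for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
                Document document = searcher.doc(scoreDoc.doc);
                System.out.println("author : " + document.get("author"));
            }
        } finally {
            searcher.close();
            reader.close();
        }
    }
}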
Lucene 4.5:
Before moving on, let's see what the 3.5 Store and Index options actually set. Inside the Field constructor they boil down to:

this.isStored = store.isStored();  

this.isIndexed = index.isIndexed();  
this.isTokenized = index.isAnalyzed();  
this.omitNorms = index.omitNorms();  

These are the properties being set. So what values do Store and Index give them?


  public static enum Store {

    /** Store the original field value in the index. This is useful for short texts
     * like a document's title which should be displayed with the results. The
     * value is stored in its original form, i.e. no analyzer is used before it is
     * stored.
     */
    YES {
      @Override
      public boolean isStored() { return true; }
    },

    /** Do not store the field value in the index. */
    NO {
      @Override
      public boolean isStored() { return false; }
    };

    public abstract boolean isStored();
  }

  /** Specifies whether and how a field should be indexed. */
  public static enum Index {

    /** Do not index the field value. This field can thus not be searched,
     * but one can still access its contents provided it is
     * {@link Field.Store stored}. */
    NO {
      @Override
      public boolean isIndexed()  { return false; }
      @Override
      public boolean isAnalyzed() { return false; }
      @Override
      public boolean omitNorms()  { return true;  }   
    },

    /** Index the tokens produced by running the field's
     * value through an Analyzer.  This is useful for
     * common text. */
    ANALYZED {
      @Override
      public boolean isIndexed()  { return true;  }
      @Override
      public boolean isAnalyzed() { return true;  }
      @Override
      public boolean omitNorms()  { return false; }     
    },

    /** Index the field's value without using an Analyzer, so it can be searched.
     * As no analyzer is used the value will be stored as a single term. This is
     * useful for unique Ids like product numbers.
     */
    NOT_ANALYZED {
      @Override
      public boolean isIndexed()  { return true;  }
      @Override
      public boolean isAnalyzed() { return false; }
      @Override
      public boolean omitNorms()  { return false; }     
    },

    /** Expert: Index the field's value without an Analyzer,
     * and also disable the indexing of norms.  Note that you
     * can also separately enable/disable norms by calling
     * {@link Field#setOmitNorms}.  No norms means that
     * index-time field and document boosting and field
     * length normalization are disabled.  The benefit is
     * less memory usage as norms take up one byte of RAM
     * per indexed field for every document in the index,
     * during searching.  Note that once you index a given
     * field <i>with</i> norms enabled, disabling norms will
     * have no effect.  In other words, for this to have the
     * above described effect on a field, all instances of
     * that field must be indexed with NOT_ANALYZED_NO_NORMS
     * from the beginning. */
    NOT_ANALYZED_NO_NORMS {
      @Override
      public boolean isIndexed()  { return true;  }
      @Override
      public boolean isAnalyzed() { return false; }
      @Override
      public boolean omitNorms()  { return true;  }     
    },

    /** Expert: Index the tokens produced by running the
     *  field's value through an Analyzer, and also
     *  separately disable the storing of norms.  See
     *  {@link #NOT_ANALYZED_NO_NORMS} for what norms are
     *  and why you may want to disable them. */
    ANALYZED_NO_NORMS {
      @Override
      public boolean isIndexed()  { return true;  }
      @Override
      public boolean isAnalyzed() { return true;  }
      @Override
      public boolean omitNorms()  { return true;  }     
    };

Plain and clear: the values are predefined and the mapping is obvious. Now let's see how version 4.5 does it:


package com.darren.lucene45;

import java.io.File;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexUtil {
    private static final String[] ids = { "1", "2", "3" };
    private static final String[] authors = { "Darren", "Tony", "Grylls" };
    private static final String[] titles = { "Hello World", "Hello Lucene", "Hello Java" };
    private static final String[] contents = { "Hello World, I am on my way", "Today is my first day to study Lucene",
            "I like Java" };

    /**
     * Build the index
     */
    public static void index() {
        IndexWriter indexWriter = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(new File("F:/test/lucene/index"));

            // 2. Create the IndexWriter
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_45, analyzer);
            indexWriter = new IndexWriter(directory, config);

            int size = ids.length;
            for (int i = 0; i < size; i++) {
                // 3. Create the Document object
                Document document = new Document();
                // Note what the constructor parameters mean

                // 4. Add Fields to the Document
                /**
                 * Create field with String value.
                 * 
                 * @param name
                 *            field name
                 * @param value
                 *            string value
                 * @param type
                 *            field type
                 * @throws IllegalArgumentException
                 *             if either the name or value is null, or if the field's type is neither indexed() nor
                 *             stored(), or if indexed() is false but storeTermVectors() is true.
                 * @throws NullPointerException
                 *             if the type is null
                 * 
                 *             public Field(String name, String value, FieldType type)
                 */

                /**
                 * Note: this differs from 3.5; the old constructor is deprecated
                 */

                /**
                 * Note: in 4.5 FieldType replaces the old Store and Index; the various Field subclasses predefine some FieldTypes
                 */
                // Store the ID, but do not tokenize it and do not store norms
                FieldType idType = TextField.TYPE_STORED;
                idType.setIndexed(false);
                idType.setOmitNorms(false);
                document.add(new Field("id", ids[i], idType));

                // Store the Author, but do not tokenize it
                FieldType authorType = TextField.TYPE_STORED;
                authorType.setIndexed(false);
                document.add(new Field("author", authors[i], authorType));

                // Store the Title and tokenize it
                document.add(new Field("title", titles[i], StringField.TYPE_STORED));

                // Do not store the Content, but tokenize it
                document.add(new Field("content", contents[i], TextField.TYPE_NOT_STORED));

                // 5. Add the document to the index through the IndexWriter
                indexWriter.addDocument(document);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexWriter != null) {
                    indexWriter.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Search
     */
    public static void search() {
        DirectoryReader indexReader = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(new File("F:/test/lucene/index"));
            // 2. Create the IndexReader
            /**
             * Note: the Reader differs from 3.5; the old factory method is deprecated,
             * 
             * so DirectoryReader is used instead:
             * 
             * @Deprecated public static DirectoryReader open(final Directory directory) throws IOException { return
             *             DirectoryReader.open(directory); }
             */
            indexReader = DirectoryReader.open(directory);
            // 3. Create the IndexSearcher from the IndexReader
            IndexSearcher indexSearcher = new IndexSearcher(indexReader);
            // 4. Create the search Query
            // Use the default StandardAnalyzer
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);

            // Search for "Lucene" in the content field
            // Create the parser for the query text; the second argument is the field to search
            QueryParser queryParser = new QueryParser(Version.LUCENE_45, "content", analyzer);
            // Create the Query: documents whose content field contains "Lucene"
            Query query = queryParser.parse("Lucene");

            // 5. Search via the searcher and return TopDocs
            TopDocs topDocs = indexSearcher.search(query, 10);
            // 6. Get the ScoreDoc objects from the TopDocs
            ScoreDoc[] scoreDocs = topDocs.scoreDocs;
            for (ScoreDoc scoreDoc : scoreDocs) {
                // 7. Get the concrete Document from the searcher and the ScoreDoc
                Document document = indexSearcher.doc(scoreDoc.doc);
                // 8. Read the values we need from the Document
                System.out.println("id : " + document.get("id"));
                System.out.println("author : " + document.get("author"));
                System.out.println("title : " + document.get("title"));
                /**
                 * Can the content be printed here? Why or why not?
                 */
                System.out.println("content : " + document.get("content"));
            }

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexReader != null) {
                    indexReader.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }
}

Version 4.5 uses FieldType in place of Store and Index. If you look at what FieldType really is, it is just a handful of predefined settings. StringField, for example:


package org.apache.lucene.document;

import org.apache.lucene.index.FieldInfo.IndexOptions;

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/** A field that is indexed but not tokenized: the entire
 *  String value is indexed as a single token.  For example
 *  this might be used for a 'country' field or an 'id'
 *  field, or any field that you intend to use for sorting
 *  or access through the field cache. */

public final class StringField extends Field {

  /** Indexed, not tokenized, omits norms, indexes
   *  DOCS_ONLY, not stored. */
  public static final FieldType TYPE_NOT_STORED = new FieldType();

  /** Indexed, not tokenized, omits norms, indexes
   *  DOCS_ONLY, stored */
  public static final FieldType TYPE_STORED = new FieldType();

  static {
    TYPE_NOT_STORED.setIndexed(true);
    TYPE_NOT_STORED.setOmitNorms(true);
    TYPE_NOT_STORED.setIndexOptions(IndexOptions.DOCS_ONLY);
    TYPE_NOT_STORED.setTokenized(false);
    TYPE_NOT_STORED.freeze();

    TYPE_STORED.setIndexed(true);
    TYPE_STORED.setOmitNorms(true);
    TYPE_STORED.setIndexOptions(IndexOptions.DOCS_ONLY);
    TYPE_STORED.setStored(true);
    TYPE_STORED.setTokenized(false);
    TYPE_STORED.freeze();
  }

  /** Creates a new StringField. 
   *  @param name field name
   *  @param value String value
   *  @param stored Store.YES if the content should also be stored
   *  @throws IllegalArgumentException if the field name or value is null.
   */
  public StringField(String name, String value, Store stored) {
    super(name, value, stored == Store.YES ? TYPE_STORED : TYPE_NOT_STORED);
  }
}


The class predefines two FieldTypes, TYPE_NOT_STORED and TYPE_STORED, and the concrete settings are plain to see, essentially the same as in 3.5. There are other FieldTypes too; TextField, for instance, predefines another two:


  /** Indexed, tokenized, not stored. */
  public static final FieldType TYPE_NOT_STORED = new FieldType();

  /** Indexed, tokenized, stored. */
  public static final FieldType TYPE_STORED = new FieldType();

  static {
    TYPE_NOT_STORED.setIndexed(true);
    TYPE_NOT_STORED.setTokenized(true);
    TYPE_NOT_STORED.freeze();

    TYPE_STORED.setIndexed(true);
    TYPE_STORED.setTokenized(true);
    TYPE_STORED.setStored(true);
    TYPE_STORED.freeze();
  }

Of course there are still other predefined FieldTypes, which are not listed one by one here. Let's try it out:


package com.darren.lucene45;

import org.junit.Test;

public class IndexUtilTest {
    @Test
    public void testIndex() {
        IndexUtil.index();
    }

    @Test
    public void testSearch() {
        IndexUtil.search();
    }
}


Now run the test's testIndex() method and see what happens:


java.lang.IllegalStateException: this FieldType is already frozen and cannot be changed
    at org.apache.lucene.document.FieldType.checkIfFrozen(FieldType.java:86)
    at org.apache.lucene.document.FieldType.setIndexed(FieldType.java:118)
    at com.darren.lucene45.IndexUtil.index(IndexUtil.java:80)
    at com.darren.lucene45.IndexUtilTest.testIndex(IndexUtilTest.java:8)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

It blows up. Why? Because the predefined values all call the freeze() method, which sets

public void freeze() {  
  this.frozen = true;  
}  

frozen to true, and FieldType contains this check:

private void checkIfFrozen() {  
  if (frozen) {  
    throw new IllegalStateException("this FieldType is already frozen and cannot be changed");  
  }  
}  

If frozen is true an exception is thrown: the predefined values simply cannot be modified. So we have to configure our own FieldType, and the index method becomes the following (a note on an alternative follows the code):


    /**
     * Build the index
     */
    public static void index() {
        IndexWriter indexWriter = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(new File("F:/test/lucene/index"));

            // 2. Create the IndexWriter
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_45, analyzer);
            indexWriter = new IndexWriter(directory, config);

            int size = ids.length;
            for (int i = 0; i < size; i++) {
                // 3. Create the Document object
                Document document = new Document();
                // Note what the constructor parameters mean

                // 4. Add Fields to the Document
                /**
                 * Create field with String value.
                 * 
                 * @param name
                 *            field name
                 * @param value
                 *            string value
                 * @param type
                 *            field type
                 * @throws IllegalArgumentException
                 *             if either the name or value is null, or if the field's type is neither indexed() nor
                 *             stored(), or if indexed() is false but storeTermVectors() is true.
                 * @throws NullPointerException
                 *             if the type is null
                 * 
                 *             public Field(String name, String value, FieldType type)
                 */

                /**
                 * Note: this differs from 3.5; the old constructor is deprecated
                 */

                /**
                 * Note: in 4.5 FieldType replaces the old Store and Index; the various Field subclasses predefine some FieldTypes
                 */
                // Store the ID, but do not tokenize it and do not store norms
                FieldType idType = new FieldType();
                idType.setStored(true);
                idType.setIndexed(false);
                idType.setOmitNorms(false);
                document.add(new Field("id", ids[i], idType));

                // Store the Author, but do not tokenize it
                FieldType authorType = new FieldType();
                authorType.setStored(true);
                authorType.setIndexed(false);
                document.add(new Field("author", authors[i], authorType));

                // Store the Title and tokenize it
                document.add(new Field("title", titles[i], StringField.TYPE_STORED));

                // Do not store the Content, but tokenize it
                document.add(new Field("content", contents[i], TextField.TYPE_NOT_STORED));

                // 5. Add the document to the index through the IndexWriter
                indexWriter.addDocument(document);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexWriter != null) {
                    indexWriter.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
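As an aside (my own note, not from the original post): instead of building a FieldType from scratch, Lucene 4.x also provides a copy constructor, FieldType(FieldType ref), which yields an unfrozen copy of a predefined type that you can then adjust. A minimal sketch:

import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.StringField;

public class FieldTypeCopySketch {
    /** Builds an id FieldType by copying a predefined template instead of configuring one from scratch. */
    public static FieldType idFieldType() {
        // The copy constructor returns a mutable FieldType with the same settings as the template
        FieldType idType = new FieldType(StringField.TYPE_STORED);
        idType.setOmitNorms(false); // keep norms for this field, unlike the template
        idType.freeze();            // optionally lock it again once configured
        return idType;
    }
}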


Run the testIndex() method again: no error this time, and a normal index is produced. Then run testSearch() and look at the result:
id : 2
author : Tony
title : Hello Lucene
content : null

content is still null here, so change the setting for the content field. Replace the line marked "// Do not store the Content, but tokenize it" with:
document.add(new Field("content", contents[i], TextField.TYPE_STORED));

Run the tests again (remember to index first, then search). The result is:

id : 2  
author : Tony  
title : Hello Lucene  
content : Today is my first day to study Lucene  

Now we get the same test result as with 3.5, so the 4.5 version is done.
Lucene 5.0:
Version 5.0 does not change much compared with 4.5. Look at the code first:


package com.darren.lucene50;

import java.nio.file.FileSystems;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexOptions;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexUtil {
    private static final String[] ids = { "1", "2", "3" };
    private static final String[] authors = { "Darren", "Tony", "Grylls" };
    private static final String[] titles = { "Hello World", "Hello Lucene", "Hello Java" };
    private static final String[] contents = { "Hello World, I am on my way", "Today is my first day to study Lucene",
            "I like Java" };

    /**
     * Build the index
     */
    public static void index() {
        IndexWriter indexWriter = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(FileSystems.getDefault().getPath("F:/test/lucene/index"));

            // 2. Create the IndexWriter
            Analyzer analyzer = new StandardAnalyzer();
            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
            indexWriter = new IndexWriter(directory, indexWriterConfig);

            int size = ids.length;
            for (int i = 0; i < size; i++) {
                // 3. Create the Document object
                Document document = new Document();
                // Note what the constructor parameters mean

                // 4. Add Fields to the Document
                /**
                 * Create field with String value.
                 * 
                 * @param name
                 *            field name
                 * @param value
                 *            string value
                 * @param type
                 *            field type
                 * @throws IllegalArgumentException
                 *             if either the name or value is null, or if the field's type is neither indexed() nor
                 *             stored(), or if indexed() is false but storeTermVectors() is true.
                 * @throws NullPointerException
                 *             if the type is null
                 * 
                 *             public Field(String name, String value, FieldType type)
                 */

                /**
                 * Note: this differs from 3.5; the old constructor is deprecated
                 */

                /**
                 * Note: as in 4.5, FieldType replaces the old Store and Index; the difference is that Index has become IndexOptions
                 */
                // Store the ID, but do not tokenize it and do not store norms
                FieldType idType = new FieldType();
                idType.setStored(true);
                idType.setIndexOptions(IndexOptions.DOCS);
                idType.setOmitNorms(false);
                document.add(new Field("id", ids[i], idType));

                // Store the Author, but do not tokenize it
                FieldType authorType = new FieldType();
                authorType.setStored(true);
                authorType.setIndexOptions(IndexOptions.DOCS);
                document.add(new Field("author", authors[i], authorType));

                // Store the Title and tokenize it
                document.add(new Field("title", titles[i], StringField.TYPE_STORED));

                // Do not store the Content, but tokenize it
                document.add(new Field("content", contents[i], TextField.TYPE_NOT_STORED));

                // 5. Add the document to the index through the IndexWriter
                indexWriter.addDocument(document);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexWriter != null) {
                    indexWriter.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Search
     */
    public static void search() {
        DirectoryReader indexReader = null;
        try {
            // 1. Create the Directory
            Directory directory = FSDirectory.open(FileSystems.getDefault().getPath("F:/test/lucene/index"));
            // 2. Create the IndexReader
            indexReader = DirectoryReader.open(directory);
            // 3. Create the IndexSearcher from the IndexReader
            IndexSearcher indexSearcher = new IndexSearcher(indexReader);
            // 4. Create the search Query
            // Use the default StandardAnalyzer
            Analyzer analyzer = new StandardAnalyzer();

            // Search for "Lucene" in the content field
            // Create the parser for the query text; the second argument is the field to search
            QueryParser queryParser = new QueryParser("content", analyzer);
            // Create the Query: documents whose content field contains "Lucene"
            Query query = queryParser.parse("Lucene");

            // 5. Search via the searcher and return TopDocs
            TopDocs topDocs = indexSearcher.search(query, 10);
            // 6. Get the ScoreDoc objects from the TopDocs
            ScoreDoc[] scoreDocs = topDocs.scoreDocs;
            for (ScoreDoc scoreDoc : scoreDocs) {
                // 7. Get the concrete Document from the searcher and the ScoreDoc
                Document document = indexSearcher.doc(scoreDoc.doc);
                // 8. Read the values we need from the Document
                System.out.println("id : " + document.get("id"));
                System.out.println("author : " + document.get("author"));
                System.out.println("title : " + document.get("title"));
                /**
                 * Can the content be printed here? Why or why not?
                 */
                System.out.println("content : " + document.get("content"));
            }

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexReader != null) {
                    indexReader.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }
}


FieldType is used a little differently: there is no indexed flag any more, IndexOptions is used instead. The values TextField predefines now look like this:


  /** Indexed, tokenized, not stored. */
  public static final FieldType TYPE_NOT_STORED = new FieldType();

  /** Indexed, tokenized, stored. */
  public static final FieldType TYPE_STORED = new FieldType();

  static {
    TYPE_NOT_STORED.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
    TYPE_NOT_STORED.setTokenized(true);
    TYPE_NOT_STORED.freeze();

    TYPE_STORED.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
    TYPE_STORED.setTokenized(true);
    TYPE_STORED.setStored(true);
    TYPE_STORED.freeze();
  }

Let's see how IndexOptions differs from the old Index:


/**
 * Controls how much information is stored in the postings lists.
 * @lucene.experimental
 */

public enum IndexOptions { 
  // NOTE: order is important here; FieldInfo uses this
  // order to merge two conflicting IndexOptions (always
  // "downgrades" by picking the lowest).
  /** Not indexed */
  NONE,
  /** 
   * Only documents are indexed: term frequencies and positions are omitted.
   * Phrase and other positional queries on the field will throw an exception, and scoring
   * will behave as if any term in the document appears only once.
   */
  DOCS,
  /** 
   * Only documents and term frequencies are indexed: positions are omitted. 
   * This enables normal scoring, except Phrase and other positional queries
   * will throw an exception.
   */  
  DOCS_AND_FREQS,
  /** 
   * Indexes documents, frequencies and positions.
   * This is a typical default for full-text search: full scoring is enabled
   * and positional queries are supported.
   */
  DOCS_AND_FREQS_AND_POSITIONS,
  /** 
   * Indexes documents, frequencies, positions and offsets.
   * Character offsets are encoded alongside the positions. 
   */
  DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS,
}

Judging by its values, it offers finer-grained control than before: you can choose to index term frequencies, positions, and even character offsets, which the old Index enum did not expose this way. Everything else is essentially the same as in 4.5; a minimal sketch of enabling offsets follows below.
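For instance, to index character offsets alongside positions in 5.0, you can copy a predefined type and raise its IndexOptions. This is my own sketch, not from the original post; the field name "body" is illustrative.

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexOptions;

public class OffsetsSketch {
    /** Returns a document whose body field indexes docs, freqs, positions and offsets. */
    public static Document withOffsets(String text) {
        FieldType bodyType = new FieldType(TextField.TYPE_STORED); // mutable copy of the predefined type
        bodyType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        bodyType.freeze();

        Document document = new Document();
        document.add(new Field("body", text, bodyType));
        return document;
    }
}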
