Lucene
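Lucene is a powerful full-text search library. Hibernate Search combines Lucene core with JPA/Hibernate ORM: when we add or modify data through JPA, the entity is indexed in Lucene automatically, and at query time the Lucene core search engine runs the search and returns JPA entity objects.
The Maven dependency: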
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-search-orm</artifactId>
<version>5.9.1.Final</version>
</dependency>
Configuring Hibernate Search
In the context configuration:
@Bean
public LocalContainerEntityManagerFactoryBean entityManagerFactory() throws PropertyVetoException{
Map<String, Object> properties = new Hashtable<>();
properties.put("javax.persistence.schema-generation.database.action","none");
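/* Lets Hibernate ORM use Hibernate Search, with standalone Lucene (no Solr) and the index stored on the local file system.
 * Whenever Hibernate ORM in this war writes to the database, Hibernate Search indexes the affected content and writes it to the index files.
 * This setup does not suit a Tomcat cluster, because each node would keep its own local index; a cluster needs a central Solr server instead. */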
properties.put("hibernate.search.default.directory_provider", "filesystem");
/* This example stores the index in ../searchIndexes, which in the development environment is at the top level of the Eclipse directory */
properties.put("hibernate.search.default.indexBase", "../searchIndexes");
properties.put("hibernate.show_sql", "true");
properties.put("hibernate.dialect", "org.hibernate.dialect.MySQL5InnoDBDialect");
LocalContainerEntityManagerFactoryBean factory = new LocalContainerEntityManagerFactoryBean();
factory.setJpaVendorAdapter(new HibernateJpaVendorAdapter());
factory.setDataSource(this.springJpaDataSource());
factory.setPackagesToScan("cn.wei.flowingflying.chapter23.entities");
factory.setSharedCacheMode(SharedCacheMode.ENABLE_SELECTIVE);
factory.setValidationMode(ValidationMode.NONE);
factory.setJpaPropertyMap(properties);
return factory;
}
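Because indexBase is a relative path, where the index ends up depends on the directory the server was started from. The following is a minimal sketch for checking that location, assuming the relative path is resolved against the JVM working directory in the same way java.io.File resolves it. The class name IndexBaseLocation is made up for this illustration and is not part of Hibernate Search.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hibernate Search.
class IndexBaseLocation {

    // Resolves a (possibly relative) indexBase value against the JVM's
    // current working directory and removes "." and ".." segments.
    static Path resolve(String indexBase) {
        return Paths.get(indexBase).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        System.out.println("working directory: " + Paths.get("").toAbsolutePath());
        System.out.println("index directory:   " + resolve("../searchIndexes"));
    }
}
```

Running it from the same directory as the server shows which directory "../searchIndexes" really points to. In production an absolute path is usually the safer choice.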
The example's entity data and goal
The database table UserPrincipal_23 is mapped to the entity User:
mysql> select * from UserPrincipal_23;
+--------+----------+
| UserId | Username |
+--------+----------+
| 4 | John |
| 3 | Mike |
| 1 | Nicholas |
| 2 | Sarah |
+--------+----------+
The table Post_23 is mapped to the entity ForumPost:
mysql> select * from Post_23;
+--------+--------+----------------+--------------------------------------+------------+
| PostId | UserId | Title | Body | Keywords |
+--------+--------+----------------+--------------------------------------+------------+
| 1 | 3 | Title One | Test One. Hello world! Java! | one java |
| 2 | 1 | Title Two | Hello, my friend! This is title two. | two friend |
| 3 | 1 | Hello Nicholas | My name is Nicholas! Hi, Nicholas | Nicholas |
+--------+--------+----------------+--------------------------------------+------------+
mysql> select Post_23.*,UserPrincipal_23.Username from Post_23 left join UserPrincipal_23 on Post_23.UserId = UserPrincipal_23.UserId;
+--------+--------+----------------+--------------------------------------+------------+----------+
| PostId | UserId | Title | Body | Keywords | Username |
+--------+--------+----------------+--------------------------------------+------------+----------+
| 1 | 3 | Title One | Test One. Hello world! Java! | one java | Mike |
| 2 | 1 | Title Two | Hello, my friend! This is title two. | two friend | Nicholas |
| 3 | 1 | Hello Nicholas | My name is Nicholas! Hi, Nicholas | Nicholas | Nicholas |
+--------+--------+----------------+--------------------------------------+------------+----------+
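The example searches title, body, keywords and username.
The search spans a join of two tables. On the ORM side the join is mapped with @ManyToOne, which is covered later; on the Lucene side it is handled with Hibernate Search annotations.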
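To make the goal concrete, here is a toy sketch of which posts a one-word query should hit when all four fields are searched. It is plain Java over the sample rows shown above, not Hibernate Search or Lucene code: the class SampleSearchSketch, its whole-word matching rule and the field name "user.username" for the joined column are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for the index: NOT Hibernate Search or Lucene code.
class SampleSearchSketch {

    // PostId -> (field name -> text), copied from the joined result set above.
    static final Map<Integer, Map<String, String>> POSTS = new LinkedHashMap<>();
    static {
        POSTS.put(1, doc("Title One", "Test One. Hello world! Java!", "one java", "Mike"));
        POSTS.put(2, doc("Title Two", "Hello, my friend! This is title two.", "two friend", "Nicholas"));
        POSTS.put(3, doc("Hello Nicholas", "My name is Nicholas! Hi, Nicholas", "Nicholas", "Nicholas"));
    }

    // Builds one flat "document"; the joined column is stored under a dotted path.
    static Map<String, String> doc(String title, String body, String keywords, String username) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("title", title);
        d.put("body", body);
        d.put("keywords", keywords);
        d.put("user.username", username);
        return d;
    }

    // True if any field contains the term as a whole word, ignoring case.
    static boolean matches(Map<String, String> doc, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        for (String text : doc.values()) {
            String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            if (Arrays.asList(tokens).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    // Returns the PostIds of all matching posts, in table order.
    static List<Integer> search(String term) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Map<String, String>> e : POSTS.entrySet()) {
            if (matches(e.getValue(), term)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(search("nicholas")); // [2, 3]: post 2 matches only through the username
        System.out.println(search("java"));     // [1]
    }
}
```

Post 2 never mentions Nicholas in its own columns; it is found only because the author's name is searched along with the post, which is exactly what the join is for.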
The indexed associated entity
@Entity
@Table(name="UserPrincipal_23")
public class User {
private long id;
private String username;
@Basic
@Field // marks this property as an indexed (searchable) field in Lucene
public String getUsername() {
return username;
}
... ...
}
The main entity
@Entity
@Table(name="Post_23")
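/*【1】@Indexed: marks this class for full-text search, so Hibernate Search automatically creates or updates a Lucene document for the entity.
 * The document ID is marked with @DocumentId; if that annotation is omitted, the entity's @Id is used. */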
@Indexed
public class ForumPost {
private long id;
private User user; // the tables are linked through the foreign key UserId
private String title;
private String body;
private String keywords;
@Id
@Column(name="PostId")
@GeneratedValue(strategy=GenerationType.IDENTITY)
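/*【2】Setting the document ID: Hibernate Search creates and updates a document for this entity automatically, and @DocumentId marks the property used as the document ID.
 * Here it is placed on the @Id property as the unique identifier; without it, the @Id property is used anyway. */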
@DocumentId
public long getId() { ... }
@ManyToOne(fetch=FetchType.EAGER,optional=false)
@JoinColumn(name="UserId")
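/*【4】Indexing an association into the root entity's document (here User is indexed into ForumPost): tells Hibernate Search that this property is another entity.
 * The @Field properties of the associated object are then indexed too, in this example User's @Field String username. This is somewhat like a cascade setting. */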
@IndexedEmbedded
public User getUser() { ... }
@Basic
@Field //【3】this property is included in full-text search
public String getTitle() { ... }
@Lob
@Field //【3】this property is included in full-text search
public String getBody() { ... }
@SuppressWarnings("deprecation")
@Basic
/* Deprecated. Index-time boosting will not be possible anymore starting from Lucene 7.
* You should use query-time boosting instead, for instance by calling boostedTo(float)
* when building queries with the Hibernate Search query DSL.
 * @Boost: relevance weighting */
@Field(boost = @Boost(2.0F))
public String getKeywords() { ... }
... ...
}
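What a 2.0 boost on keywords means for ranking can be shown with a toy scorer. The sketch below simply multiplies the number of term occurrences in each field by that field's boost. This is an assumption made for illustration and not Lucene's real scoring formula; the class FieldBoostSketch and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Toy scorer: NOT Lucene's scoring formula.
class FieldBoostSketch {

    // Counts whole-word, case-insensitive occurrences of term in text.
    static int occurrences(String text, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        int n = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(needle)) {
                n++;
            }
        }
        return n;
    }

    // score = sum over fields of (boost of the field * occurrences in the field);
    // fields without an explicit boost count with weight 1.0.
    static double score(Map<String, String> doc, Map<String, Float> boosts, String term) {
        double total = 0;
        for (Map.Entry<String, String> field : doc.entrySet()) {
            float boost = boosts.getOrDefault(field.getKey(), 1.0f);
            total += boost * occurrences(field.getValue(), term);
        }
        return total;
    }

    // Post 1 from the sample data.
    static Map<String, String> postOne() {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("title", "Title One");
        d.put("body", "Test One. Hello world! Java!");
        d.put("keywords", "one java");
        return d;
    }

    public static void main(String[] args) {
        Map<String, Float> none = new LinkedHashMap<>();
        Map<String, Float> boosted = new LinkedHashMap<>();
        boosted.put("keywords", 2.0f);
        System.out.println(score(postOne(), none, "java"));    // 2.0 (body 1 + keywords 1)
        System.out.println(score(postOne(), boosted, "java")); // 3.0 (body 1 + keywords 2)
    }
}
```

A hit in keywords now weighs twice as much as a hit in body. As the deprecation note in the comment says, with newer Lucene versions this kind of weight belongs on the query (boostedTo(float) in the query DSL) rather than on the index.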
Search code
As before, we provide a SearchResult class to hold an entity together with its relevance score.
public class SearchResult<T> {
private final T entity;
private final double relevance;
......
}
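The elided part is presumably just a constructor and getters. A minimal sketch under that assumption follows; the constructor signature and the getter names are guesses, not the original code. It is declared without the public modifier only so it can be pasted into any file.

```java
// Sketch of the value holder; constructor and getter names are assumptions.
class SearchResult<T> {
    private final T entity;
    private final double relevance;

    SearchResult(T entity, double relevance) {
        this.entity = entity;
        this.relevance = relevance;
    }

    T getEntity() {
        return entity;
    }

    double getRelevance() {
        return relevance;
    }
}
```

Declaring relevance as double still lets callers pass the Float score returned by a projection, since Java unboxes and widens it automatically.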
The repository interfaces
public interface SearchableRepository<T>{
Page<SearchResult<T>> search(String query, Pageable pageable);
}
public interface ForumPostRepository extends JpaRepository<ForumPost, Long>,SearchableRepository<ForumPost>{
}
Hibernate Search works with Lucene documents through an API that resembles the JPA API, although the Lucene API can also be used directly. Here is the actual code:
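public class ForumPostRepositoryImpl implements SearchableRepository<ForumPost>{
//【1】Obtain Hibernate Search's full-text entity manager. It plays the same role as the JPA EntityManager, and all the
// Lucene full-text search methods go through this FullTextEntityManager. Note that the entityManagerFactory configured in the root context already includes the Hibernate Search settings.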
@PersistenceContext EntityManager entityManager;
EntityManagerProxy entityManagerProxy;
// 1.1) The @PersistenceContext EntityManager injected by Spring is actually an EntityManager proxy
// (it supplies a separate EntityManager for each transaction); initialize() obtains that proxy.
@PostConstruct
public void initialize(){
if(!(this.entityManager instanceof EntityManagerProxy))
throw new FatalBeanException("Entity manager " + this.entityManager + " was not a proxy.");
this.entityManagerProxy = (EntityManagerProxy) entityManager;
}
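// 1.2) FullTextEntityManager is a real object, not a proxy, so we have to create a new one for every search
// (Spring does not do this transparently as it does for the injected EntityManager proxy). It must also be
// obtained from a real Hibernate ORM EntityManager implementation, not from the proxy.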
private FullTextEntityManager getFullTextEntityManager(){
return Search.getFullTextEntityManager(this.entityManagerProxy.getTargetEntityManager());
}
//【2】The full-text search implementation with Hibernate Search.
@Override
public Page<SearchResult<ForumPost>> search(String query, Pageable pageable) {
// 2.1) Obtain the FullTextEntityManager inside the transaction
FullTextEntityManager manager = getFullTextEntityManager();
// 2.2) Run the search. The Hibernate Search API resembles the JPA API, since both belong to the Hibernate family.
QueryBuilder builder = manager.getSearchFactory().buildQueryBuilder().forEntity(ForumPost.class).get();
Query lucene = builder.keyword()
.onFields("title", "body", "keywords", "user.username") // the fields to search; note the path user.username
.matching(query) // the text to search for
.createQuery();
FullTextQuery q = manager.createFullTextQuery(lucene, ForumPost.class);
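q.setProjection(FullTextQuery.THIS,FullTextQuery.SCORE); // return the ForumPost and its relevance score
// 2.3) Get the total number of hits
long total = q.getResultSize();
// 2.4) Fetch the requested page. setFirstResult takes an int, while Pageable.getOffset()
// returns long in Spring Data 2.x, hence the cast.
@SuppressWarnings("unchecked")
List<Object[]> results = q.setFirstResult((int) pageable.getOffset())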
.setMaxResults(pageable.getPageSize())
.getResultList();
List<SearchResult<ForumPost>> list = new ArrayList<>();
results.forEach(o -> list.add(
new SearchResult<>((ForumPost)o[0], (Float)o[1])) );
return new PageImpl<>(list, pageable, total);
}
}
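Step 2.4 has two details that are easy to trip over: each row of the projection is an Object[] holding the entity and a Float score, and the page offset has to fit into an int. The standalone sketch below isolates both. ProjectionMappingSketch, Scored and the two method names are hypothetical, and strings stand in for ForumPost so that it runs without Hibernate.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of the row mapping in step 2.4; no Hibernate classes involved.
class ProjectionMappingSketch {

    // Minimal stand-in for SearchResult<T>.
    static final class Scored<T> {
        final T entity;
        final double relevance;

        Scored(T entity, double relevance) {
            this.entity = entity;
            this.relevance = relevance;
        }
    }

    // Converts projection rows of the shape {entity, Float score} into typed results.
    // type.cast fails fast with a ClassCastException if a row holds the wrong entity type.
    static <T> List<Scored<T>> toResults(List<Object[]> rows, Class<T> type) {
        List<Scored<T>> out = new ArrayList<>();
        for (Object[] row : rows) {
            out.add(new Scored<>(type.cast(row[0]), ((Float) row[1]).doubleValue()));
        }
        return out;
    }

    // Narrows a long page offset to int, throwing ArithmeticException on overflow
    // instead of silently wrapping around as a plain (int) cast would.
    static int firstResult(long offset) {
        return Math.toIntExact(offset);
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"first post", 1.5f});
        rows.add(new Object[] {"second post", 0.25f});
        for (Scored<String> s : toResults(rows, String.class)) {
            System.out.println(s.entity + " -> " + s.relevance);
        }
        System.out.println(firstResult(20L)); // 20
    }
}
```

Whether to use Math.toIntExact or a plain cast is a matter of taste; offsets beyond the int range are unlikely in practice, but the explicit check makes the failure visible.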