Spring Data Solr

This post is a quick-start for setting up Solr with Spring Data. For details, see the official documentation below.

http://docs.spring.io/spring-data/solr/docs/current/reference/html/

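First install Solr; installation guides are easy to find online.

In the Solr directory (server/solr, matching the path used in solrconfig.xml below), create a folder named after the core you will add under Core Admin later. Here it is called caa-new. Inside it, create two folders: conf and data.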

The conf folder is structured as follows:

Download the Chinese word-segmentation JAR, lucene-analyzers-smartcn-5.2.1.jar, and place it in solr/server/solr-webapp/webapp/WEB-INF/lib.

Note: afterwards, rename schema.xml.bak to schema.xml and delete managed-schema. Only then can the new index be built.

Edit data-config.xml:

<?xml version="1.0" encoding="UTF-8"?> 
<dataConfig>
    <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://10.23.203.34:3306/caa?autoReconnect=true&amp;useUnicode=true&amp;characterEncoding=UTF8&amp;characterSetResults=UTF8"
              user="root"
              password="root" />
    <document name="notice">
       <entity name="notice" query="select * from notice where title like '%${dataimporter.request.title}%' and content like '%${dataimporter.request.content}%'">
            <field column="id" name="id" />
            <field column="title" name="title" />
            <field column="content_solr" name="content" />
        </entity>
    </document>
</dataConfig>

Edit schema.xml.

Add the field type for Chinese word segmentation:

<fieldType name="text_smart" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
        <!--charFilter class="solr.HTMLStripCharFilterFactory" /-->
        <filter class="solr.SmartChineseWordTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
        <filter class="solr.SmartChineseWordTokenFilterFactory"/>
    </analyzer>
</fieldType>

Add the fields to be searched:

   <field name="content" type="text_smart" indexed="true" stored="true"/>

   <field name="title" type="text_smart" indexed="true" stored="true"/>

Edit solrconfig.xml:

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
       <lst name="defaults">
           <str name="config">/opt/solr/server/solr/caa-new/conf/data-config.xml</str>
       </lst>
    </requestHandler>

Next, add the core on the Core Admin page (screenshot in the original post). The name must match the folder created above, here caa-new.

Then test whether Chinese word segmentation works correctly (screenshot in the original post).

After that, the index has to be built manually.
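One way to trigger the build is to call the /dataimport handler registered in solrconfig.xml with command=full-import. The sketch below only assembles that URL with the plain JDK. It assumes the Solr base URL from solr.properties and the caa-new core; the class and method names are hypothetical. The title and content request parameters are the ones that data-config.xml reads through ${dataimporter.request.title} and ${dataimporter.request.content}.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class DataImportUrl {
    // Builds a DataImportHandler full-import URL for one core.
    // title and content fill ${dataimporter.request.title} and
    // ${dataimporter.request.content} in the entity query of data-config.xml.
    static String fullImport(String solrBase, String core, String title, String content) {
        return solrBase + "/" + core + "/dataimport"
                + "?command=full-import&clean=true&commit=true"
                + "&title=" + encode(title)
                + "&content=" + encode(content);
    }

    // URL-encodes a parameter value as UTF-8 (spaces become '+').
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Empty strings turn both LIKE patterns into '%%', so no row is filtered out by text.
        System.out.println(fullImport("http://10.23.101.187:8983/solr", "caa-new", "", ""));
    }
}
```

Opening the printed URL in a browser or with any HTTP client starts the import. The Dataimport page of the Solr admin UI offers the same command.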

Then you can search away happily. (No results are shown here because there is no data.)

Next comes the Spring Data Solr configuration.

First, the dependencies in pom.xml:

		<!-- solr -->
		<dependency>
			<groupId>org.springframework.data</groupId>
			<artifactId>spring-data-solr</artifactId>
			<version>1.3.1.RELEASE</version>
		</dependency>
		<!-- Spring and Transactions -->
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-context</artifactId>
			<version>${spring-framework.version}</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-core</artifactId>
			<version>${spring-framework.version}</version>
		</dependency>

Note: make sure the Spring version is set to <spring-framework.version>4.1.4.RELEASE</spring-framework.version>

Configuration file: solr.properties

solr.host=http://10.23.101.187:8983/solr

Next, read the configuration and create the SolrServer bean:

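import javax.annotation.Resource;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;
import org.springframework.core.env.Environment;
import org.springframework.data.solr.repository.config.EnableSolrRepositories;

@Configuration
@PropertySource("classpath:solr.properties")
@EnableSolrRepositories(basePackages = {"com.github.tmxin.solr"}, multicoreSupport = true)
public class SolrContext {
	static final String SOLR_HOST = "solr.host";

	@Resource
	private Environment environment;

	// Client for the base Solr URL; the core name is not part of solr.host
	@Bean
	public SolrServer solrServer() {
		String solrHost = environment.getProperty(SOLR_HOST);
		return new HttpSolrServer(solrHost);
	}
}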

The entity class for full-text search:

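import org.springframework.data.annotation.Id;
import org.springframework.data.solr.core.mapping.Indexed;
import org.springframework.data.solr.core.mapping.SolrDocument;

// Maps to documents in the caa-new core
@SolrDocument(solrCoreName = "caa-new")
public class NoticeSolr {

	@Id
	@Indexed
	public String id;

	@Indexed
	public String title;

	@Indexed
	public String content;

	@Override
	public String toString() {
		// content can be null, so guard before stripping newlines
		String flatContent = content == null ? null : content.replaceAll("\n", "");
		return "NoticeSolr [title=" + title + " content=" + flatContent + ", ID=" + id + "]";
	}

}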
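With multicoreSupport=true, Spring Data Solr is expected to address the core named in @SolrDocument relative to the base URL from solr.host. The plain-JDK sketch below shows the URL that results under that assumption. CoreUrlSketch and coreUrl are hypothetical names for illustration, not part of Spring Data Solr or SolrJ.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class CoreUrlSketch {
    // Hypothetical helper: joins the base URL and the core name with exactly one slash.
    static String coreUrl(String baseUrl, String coreName) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + "/" + coreName;
    }

    public static void main(String[] args) throws IOException {
        // Same key and value as solr.properties, loaded from a string for the demo.
        Properties props = new Properties();
        props.load(new StringReader("solr.host=http://10.23.101.187:8983/solr\n"));
        System.out.println(coreUrl(props.getProperty("solr.host"), "caa-new"));
    }
}
```

If requests fail with a 404, comparing this URL with the core name shown in the Solr admin UI is a quick first check.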

Finally, the repository interface layer:

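import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.solr.core.query.result.HighlightPage;
import org.springframework.data.solr.repository.Highlight;
import org.springframework.data.solr.repository.Query;
import org.springframework.data.solr.repository.SolrCrudRepository;

public interface NoticeSolrRepository extends SolrCrudRepository<NoticeSolr, String> {

	// Explicit query that matches every document
	@Query(value = "*:*") // optional filter left disabled: filters = {"title北京市"}
	Page<NoticeSolr> findByID(Pageable page);

	// Matches in content are wrapped in <b>...</b>
	@Highlight(prefix = "<b>", postfix = "</b>")
	HighlightPage<NoticeSolr> findByContent(String content, Pageable pageable);

	// Prefix and postfix here are the literal text "null", not an absent value
	@Highlight(prefix = "null", postfix = "null")
	HighlightPage<NoticeSolr> findByTitle(String title, Pageable pageable);

	// Any tag can be used; <basn> is a custom one
	@Highlight(prefix = "<basn>", postfix = "</basn>")
	HighlightPage<NoticeSolr> findByTitleOrContentLike(String title, String content, Pageable pageable);

}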

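Spring Data derives the implementation of these methods from their declarations, so you do not have to write it yourself. That is the convenience of Spring Data.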

@Highlight highlights the matched terms in the results. You can define the wrapping tags yourself.
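To make the effect of prefix and postfix concrete, here is a plain-JDK illustration of what a returned snippet looks like. It is only a sketch: Solr does the real highlighting on analyzed tokens, while this hypothetical wrap helper does a literal string replacement.

```java
final class HighlightSketch {
    // Illustration only: wraps each literal occurrence of term in prefix/postfix,
    // mimicking how a highlighted snippet looks once Solr has applied the tags.
    static String wrap(String text, String term, String prefix, String postfix) {
        return text.replace(term, prefix + term + postfix);
    }

    public static void main(String[] args) {
        System.out.println(wrap("notice of auction", "auction", "<b>", "</b>"));
        System.out.println(wrap("notice of auction", "auction", "<basn>", "</basn>"));
    }
}
```

The first call yields `notice of <b>auction</b>`. A custom tag such as `<basn>` is convenient when the front end wants to style matches itself.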

Finally, let's test it.

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.solr.core.query.result.HighlightEntry;
import org.springframework.data.solr.core.query.result.HighlightEntry.Highlight;
import org.springframework.data.solr.core.query.result.HighlightPage;

public class Test {
	private static final Log LOG = LogFactory.getLog(Test.class);

	public static void main(String[] args) {
		// try-with-resources closes the Spring context when the test finishes
		try (AnnotationConfigApplicationContext ctx =
				new AnnotationConfigApplicationContext(SolrContext.class)) {
			NoticeSolrRepository repository = ctx.getBean(NoticeSolrRepository.class);
			// Alternative queries; swap one in to try it:
			// Page<NoticeSolr> page = repository.findByID(new PageRequest(0, 2));
			// HighlightPage<NoticeSolr> page = repository.findByContent("拍卖", new PageRequest(0, 2));
			// HighlightPage<NoticeSolr> page = repository.findByTitle("拍卖", new PageRequest(0, 2));
			HighlightPage<NoticeSolr> page =
					repository.findByTitleOrContentLike("s", "s", new PageRequest(0, 2));
			LOG.info(page.getContent());

			LOG.info("_____ Highlighted snippets follow; page also holds the non-highlighted data if you need it. _____");

			for (HighlightEntry<NoticeSolr> he : page.getHighlighted()) {
				// A HighlightEntry belongs to one entity (NoticeSolr) and may have several highlighted fields
				for (Highlight highlight : he.getHighlights()) {
					// Each highlighted field may contain several snippets
					for (String snipplet : highlight.getSnipplets()) {
						LOG.info(snipplet.replaceAll("\n", ""));
					}
				}
			}
			LOG.info("************************************************");
			for (NoticeSolr notice : page) {
				LOG.info(notice);
			}
		}
	}
}

The full code can be downloaded here:

http://download.csdn.net/detail/tengmuxin/9707920
