﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-Rising Sun - Post Category - Lucene</title><link>http://www.blogjava.net/brock/category/54662.html</link><description /><language>en</language><lastBuildDate>Wed, 07 Jan 2015 11:41:29 GMT</lastBuildDate><pubDate>Wed, 07 Jan 2015 11:41:29 GMT</pubDate><ttl>60</ttl><item><title>Lucene Basics 3: Analyzer</title><link>http://www.blogjava.net/brock/archive/2015/01/07/422100.html</link><dc:creator>brock</dc:creator><author>brock</author><pubDate>Wed, 07 Jan 2015 02:11:00 GMT</pubDate><guid>http://www.blogjava.net/brock/archive/2015/01/07/422100.html</guid><wfw:comment>http://www.blogjava.net/brock/comments/422100.html</wfw:comment><comments>http://www.blogjava.net/brock/archive/2015/01/07/422100.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/brock/comments/commentRss/422100.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/brock/services/trackbacks/422100.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: Many of the articles online about Lucene analysis left me with only a partial understanding, and their code is dated. To understand the tokenization process thoroughly and systematically, this post builds a custom analyzer and walks through how it works. The tokenization itself is delegated to the ICTCLAS4j interface. The structure is shown below:    &nbsp;  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; To implement the custom ICTCLAS4jAnalyzer, you must extend Analy...&nbsp;&nbsp;<a href='http://www.blogjava.net/brock/archive/2015/01/07/422100.html'>Read the full post</a><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/brock/" target="_blank">brock</a> 2015-01-07 10:11 <a href="http://www.blogjava.net/brock/archive/2015/01/07/422100.html#Feedback" target="_blank" 
style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Lucene Basics 2: Directory</title><link>http://www.blogjava.net/brock/archive/2015/01/07/422099.html</link><dc:creator>brock</dc:creator><author>brock</author><pubDate>Wed, 07 Jan 2015 02:09:00 GMT</pubDate><guid>http://www.blogjava.net/brock/archive/2015/01/07/422099.html</guid><wfw:comment>http://www.blogjava.net/brock/comments/422099.html</wfw:comment><comments>http://www.blogjava.net/brock/archive/2015/01/07/422099.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/brock/comments/commentRss/422099.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/brock/services/trackbacks/422099.html</trackback:ping><description><![CDATA[<p>As its name suggests, Lucene's Directory class models a &#8220;directory&#8221;: a flat list of files. If the directory does not exist, it is created on first use. A file is written once, when it is created; after that it can only be opened for reading or deleted. Random access is permitted for both reading and writing. The Java I/O API is not used directly; all I/O goes through this API. Directory implementations fall into three groups: file-system directories, in-memory directories, and proxy and utility classes, as shown in the figure below:</p>  <p align="center" style="text-align:center"><img src="http://www.blogjava.net/images/blogjava_net/brock/1.jpg" width="799" height="518" alt="" /><br /></p>  <h3>1. File-system directories</h3>  <p>SimpleFSDirectory: a straightforward FSDirectory implementation. Its concurrency is limited: multiple threads reading the same file hit a bottleneck, so NIOFSDirectory or MMapDirectory is usually used instead.</p>  <p>NIOFSDirectory: uses java.nio's FileChannel for positional reads, which supports concurrent reads from multiple threads (it is thread-safe by default). The class uses FileChannel only for reading; writes go through FSIndexOutput.</p>  <p><span style="font-family:宋体; color:red">Note: </span><span style="color:red">NIOFSDirectory </span><span 
style="font-family:宋体;color:red">is not recommended on </span><span style="color:red">Windows</span><span style="font-family:宋体;color:red"> systems</span>. In addition, if a thread using this class is interrupted or cancelled while blocked on I/O, the underlying file descriptor is closed, and any thread that later accesses NIOFSDirectory gets a ClosedChannelException. In that case, use SimpleFSDirectory instead.</p>  <p>MMapDirectory: an FSDirectory implementation that reads through memory mapping and writes through FSIndexOutput. Make sure enough virtual address space is available when using it. Also, on JVMs where the unmap workaround is unavailable, calling close on an IndexInput does not immediately close the underlying file handle; it is closed only when the GC reclaims the resources.</p>  <p>&nbsp;</p>  <p>To pick the best Directory implementation for each operating system, Lucene provides the static method open() on FSDirectory, which chooses one automatically.</p>  <p align="left" style="line-height: 12pt;">&nbsp;<strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">public</span></strong> <strong><span style="font-size: 7.5pt;font-family:Consolas;color:#7F0055;">static</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> FSDirectory open(File path, LockFactory lockFactory) </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">throws</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> IOException {</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp; </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">if</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> 
((Constants.</span><em><span style="font-size:7.5pt;font-family:Consolas; color:#0000C0;">WINDOWS</span></em><span style="font-size: 7.5pt; font-family: Consolas;"> || Constants.</span><em><span style="font-size:7.5pt;font-family:Consolas; color:#0000C0;">SUN_OS</span></em><span style="font-size: 7.5pt; font-family: Consolas;"> || Constants.</span><em><span style="font-size:7.5pt;font-family:Consolas; color:#0000C0;">LINUX</span></em><span style="font-size: 7.5pt; font-family: Consolas;">)</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &amp;&amp; Constants.</span><em><span style="font-size:7.5pt;font-family:Consolas;color:#0000C0;">JRE_IS_64BIT</span></em><span style="font-size: 7.5pt; font-family: Consolas;"> &amp;&amp; MMapDirectory.</span><em><span style="font-size:7.5pt;font-family:Consolas;color:#0000C0;">UNMAP_SUPPORTED</span></em><span style="font-size: 7.5pt; font-family: Consolas;">) {</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">return</span></strong> <strong><span style="font-size: 7.5pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> MMapDirectory(path, lockFactory);</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp; } </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">else</span></strong> <strong><span style="font-size: 7.5pt;font-family:Consolas;color:#7F0055;">if</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> (Constants.</span><em><span style="font-size:7.5pt;font-family:Consolas; color:#0000C0;">WINDOWS</span></em><span style="font-size: 7.5pt; font-family: 
Consolas;">) {</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">return</span></strong> <strong><span style="font-size: 7.5pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> SimpleFSDirectory(path, lockFactory);</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp; } </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">else</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> {</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><strong><span style="font-size:7.5pt;font-family:Consolas; color:#7F0055;">return</span></strong> <strong><span style="font-size: 7.5pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 7.5pt; font-family: Consolas;"> NIOFSDirectory(path, lockFactory);</span></p>  <p align="left" style="line-height: 12pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp; }</span></p>  <p style="line-height:12.0pt;"><span style="font-size: 7.5pt; font-family: Consolas;">&nbsp; }</span></p>  <h3>2. In-memory directories</h3>  <p>RAMDirectory: a memory-resident Directory implementation. By default, locking is handled by SingleInstanceLockFactory (a single-instance lock factory). <span style="color:red">This class is not suited to large indexes.</span> <span style="color:red">It also has poor concurrency in multithreaded environments.</span> For large indexes, MMapDirectory is recommended instead. RAMDirectory is the Directory<span 
style="font-family:宋体;"> implementation that uses memory as its file store: it keeps every index file in memory, which makes access faster. If the index grows too large, however, memory runs out. It is therefore recommended for small systems; for large ones, where the index files reach the gigabyte range, FSDirectory is recommended.</span></p>  <p>NRTCachingDirectory: wraps a RAMDirectory around a delegate directory; it is intended for near-real-time (NRT) environments.</p>  <h3>3. Directory proxy and utility classes</h3>  <p>FileSwitchDirectory: a Directory implementation that switches between directories, so that different kinds of Lucene index files (selected by extension) can use different Directory implementations. It lets you combine the strengths of several implementations in one.<br /> Take MMapDirectory: memory-mapped files improve performance, but when the index is too large the mapped memory has to be swapped constantly, and the advantage can turn into a drawback. NIOFSDirectory, in turn, uses Java NIO to improve performance under high concurrency, but high concurrency can also cause excessive I/O. FileSwitchDirectory lets you use each of the two where it works best.</p>  <p>RateLimitedDirectoryWrapper: a Directory wrapper that rate-limits writes (IndexOutput) according to the IOContext.</p>  <p>CompoundFileDirectory: used to access a compound stream. It supports read operations only. All files of a segment, which share the same name but have different extensions, are merged into a single .cfs file and a corresponding .cfe file.<br /> The .cfs file consists of a Header followed by FileCount blocks of FileData. The .cfe file consists of a Header<span 
style="font-family:宋体;">, </span>FileCount, and FileCount entries of FileName, DataOffset, DataLength. FileCount is the number of files packed into the compound file. The .cfs file holds the raw data of those files, and the .cfe file holds the entry table: each entry records the start offset of a file's data section, the file's length, and the file's name.</p>  <p>TrackingDirectoryWrapper: a Directory proxy that records which files are written and deleted.</p>  <h3>4. Class diagram of the Directory read/write objects</h3>  <p><br /><img src="http://www.blogjava.net/images/blogjava_net/brock/2.jpg" width="825" height="458" alt="" /><br /></p>  <p>&nbsp;This article is reposted from another source.</p><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/brock/" target="_blank">brock</a> 2015-01-07 10:09 <a href="http://www.blogjava.net/brock/archive/2015/01/07/422099.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Lucene Basics 1: Overview</title><link>http://www.blogjava.net/brock/archive/2014/12/31/421991.html</link><dc:creator>brock</dc:creator><author>brock</author><pubDate>Wed, 31 Dec 2014 09:07:00 
GMT</pubDate><guid>http://www.blogjava.net/brock/archive/2014/12/31/421991.html</guid><wfw:comment>http://www.blogjava.net/brock/comments/421991.html</wfw:comment><comments>http://www.blogjava.net/brock/archive/2014/12/31/421991.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/brock/comments/commentRss/421991.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/brock/services/trackbacks/421991.html</trackback:ping><description><![CDATA[<p>When I first started learning Lucene I skimmed through a lot of books. With no project driving me, getting a demo to run did not feel hard, and my first reaction was &#8220;I've got this.&#8221;</p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; At the end of 2013 my company took on a project that used Lucene, and that was my first real contact with it. The existing code was on the old 3.6 release, which could not meet the new project's requirement for spatial queries. So I downloaded what was then the latest release, 4.5.1, which ships with a &#8220;spatial query&#8221; module. None of the major search engines turned up a decent example, so I went to the test cases under the trunk directory of Lucene's SVN repository, found examples there, and started my Lucene journey.</p><p>&nbsp;</p><p style="text-indent:21.0pt">Writing data: create an IndexWriter. Its constructor takes an index directory (Directory) and a writer configuration (IndexWriterConfig). Here is the code:</p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">// Set the directory to write to (several implementations exist)</span></p><p align="left"><span style="font-size: 9pt; 
font-family: Consolas;">Directory d = FSDirectory.<em>open</em>(</span><strong><span style="font-size:9.0pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> File(</span><span style="font-size:9.0pt;font-family:Consolas; color:#2A00FF;">"D:/luceneTest"</span><span style="font-size: 9pt; font-family: Consolas;">));</span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">// Set the analyzer: StandardAnalyzer (splits Chinese text into single characters)</span></p><p align="left"><span style="font-size: 9pt; font-family: Consolas;">Analyzer analyzer = </span><strong><span style="font-size:9.0pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> StandardAnalyzer(Version.</span><em><span style="font-size:9.0pt;font-family:Consolas; color:#0000C0;">LUCENE_45</span></em><span style="font-size: 9pt; font-family: Consolas;">);</span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">// Set the index writer configuration</span></p><p align="left"><span style="font-size: 9pt; font-family: Consolas;">IndexWriterConfig config = </span><strong><span style="font-size:9.0pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> IndexWriterConfig(Version.</span><em><span style="font-size:9.0pt;font-family:Consolas; color:#0000C0;">LUCENE_45</span></em><span style="font-size: 9pt; font-family: Consolas;">, analyzer);</span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">//</span><span 
style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;"> Set the open mode</span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">//config.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);</span></p><p><span style="font-size: 9pt; font-family: Consolas;">IndexWriter indexwriter = </span><strong><span style="font-size:9.0pt; font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> IndexWriter(d, config);</span></p><p>&nbsp;</p><p><span style="font-size: 9pt;">&nbsp;&nbsp;&nbsp; The four lines of code above create the </span><span style="font-size: 9pt; font-family: Consolas;">indexwriter</span><span style="font-size: 9pt;">; all that remains is to feed it data. There are several ways to write, as shown in the figure below:</span></p><p><img src="http://www.blogjava.net/images/blogjava_net/brock/1.png" border="0" alt="" width="668" height="128" /><br /></p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Using addDocument as an example, the code is:</p><p align="left"><span style="font-size: 9pt; font-family: Consolas;">Document doc = </span><strong><span style="font-size:9.0pt;font-family:Consolas; color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> Document();&nbsp; </span></p><p align="left"><span style="font-size: 9pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;doc.add(</span><strong><span style="font-size:9.0pt;font-family:Consolas; color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> StringField(</span><span style="font-size:9.0pt;font-family:Consolas; color:#2A00FF;">"id"</span><span style="font-size: 9pt; font-family: Consolas;">, </span><span style="font-size: 9.0pt;font-family:Consolas;color:#2A00FF;">"1"</span><span style="font-size: 9pt; font-family: Consolas;">, 
Store.</span><em><span style="font-size:9.0pt;font-family:Consolas; color:#0000C0;">YES</span></em><span style="font-size: 9pt; font-family: Consolas;">));</span></p><p align="left"><span style="font-size: 9pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; doc.add(</span><strong><span style="font-size:9.0pt;font-family:Consolas; color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> StringField(</span><span style="font-size:9.0pt;font-family:Consolas; color:#2A00FF;">"name"</span><span style="font-size: 9pt; font-family: Consolas;">, </span><span style="font-size: 9.0pt;font-family:Consolas;color:#2A00FF;">"brockhong"</span><span style="font-size: 9pt; font-family: Consolas;">, Store.</span><em><span style="font-size:9.0pt;font-family:Consolas;color:#0000C0;">YES</span></em><span style="font-size: 9pt; font-family: Consolas;">));</span></p><p align="left"><span style="font-size: 9pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; doc.add(</span><strong><span style="font-size:9.0pt;font-family:Consolas; color:#7F0055;">new</span></strong><span style="font-size: 9pt; font-family: Consolas;"> TextField(</span><span style="font-size:9.0pt;font-family:Consolas; color:#2A00FF;">"content"</span><span style="font-size: 9pt; font-family: Consolas;">, </span><span style="font-size: 9.0pt;font-family:Consolas;color:#2A00FF;">"lucene </span><span style="font-size:9.0pt; font-family:宋体;color:#2A00FF;">文档第一次写看着给分吧</span><span style="font-size:9.0pt;font-family:Consolas; color:#2A00FF;">"</span><span style="font-size: 9pt; font-family: Consolas;">, Store.</span><em><span style="font-size:9.0pt;font-family:Consolas; color:#0000C0;">YES</span></em><span style="font-size: 9pt; font-family: Consolas;">));&nbsp; </span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">// Write the document</span></p><p align="left"><span 
style="font-size: 9pt; font-family: Consolas;">indexwriter.addDocument(doc);</span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">// Commit</span></p><p><span style="font-size: 9pt; font-family: Consolas;">indexwriter.commit();</span></p><p><span style="font-size: 9pt;">Check the Text column in the Luke tool: the content was split into single characters, which is the standard analyzer's doing. The write succeeded.</span></p><p><img src="http://www.blogjava.net/images/blogjava_net/brock/2.png" border="0" alt="" width="876" height="686" /><br /></p><p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Reading data (querying): create an IndexSearcher, passing an indexReader to its constructor, then supply the query. Because the content field above was analyzed, the query must be built with the query parsing class QueryParser, which is configured with the field, the version, and the analyzer; its parse method returns the query. The code is:</p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">&nbsp;// Read data</span></p><p align="left"><span style="font-size:9.0pt;font-family:Consolas;color:#3F7F5F;">&nbsp;// Create the indexReader. IndexReader.open(d) is deprecated; it runs the same code and was probably kept for compatibility with older versions</span></p><p><span style="font-size: 9pt; font-family: Consolas;">&nbsp;IndexReader 
indexReader = DirectoryReader.<em>open</em>(d);</span></p><p><span style="font-size: 8pt; font-family: Consolas;">&nbsp;IndexSearcher indexSearcher = </span><strong><span style="font-size:8.0pt;font-family:Consolas; color:#7F0055;">new</span></strong><span style="font-size: 8pt; font-family: Consolas;"> IndexSearcher(indexReader);</span> </p><p align="left"><span style="font-size:7.5pt;font-family:Consolas;color:#3F7F5F;">// Query: set the analyzed field</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">QueryParser queryParser = </span><strong><span style="font-size:8.0pt;font-family:Consolas;color:#7F0055;">new</span></strong><span style="font-size: 8pt; font-family: Consolas;"> QueryParser(Version.</span><em><span style="font-size:8.0pt;font-family:Consolas; color:#0000C0;">LUCENE_45</span></em><span style="font-size: 8pt; font-family: Consolas;">, </span><span style="font-size: 8.0pt;font-family:Consolas;color:#2A00FF;">"content"</span><span style="font-size: 8pt; font-family: Consolas;">,</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span><strong><span style="font-size:8.0pt;font-family:Consolas; color:#7F0055;">new</span></strong><span style="font-size: 8pt; font-family: Consolas;"> StandardAnalyzer(Version.</span><em><span style="font-size:8.0pt;font-family:Consolas; color:#0000C0;">LUCENE_45</span></em><span style="font-size: 8pt; font-family: Consolas;">));</span></p><p align="left">&nbsp;<span style="font-size:7.5pt;font-family:Consolas;color:#3F7F5F;">// OR relation between the single-character terms </span><span style="font-size:7.5pt;font-family: 宋体; color:#3F7F5F;">&#8220;给&#8221;</span><span style="font-size:7.5pt;font-family:Consolas;color:#3F7F5F;"> and </span><span style="font-size:7.5pt;font-family: 宋体; color:#3F7F5F;">&#8220;分&#8221;</span></p><p align="left"><span 
style="font-size: 8pt; font-family: Consolas;">queryParser.setDefaultOperator(QueryParser.</span><em><span style="font-size:8.0pt;font-family:Consolas; color:#0000C0;">OR_OPERATOR</span></em><span style="font-size: 8pt; font-family: Consolas;">);</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">Query query = queryParser.parse(</span><span style="font-size:8.0pt;font-family:Consolas;color:#2A00FF;">"</span><span style="font-size:8.0pt;font-family:宋体;color:#2A00FF;">给分</span><span style="font-size:8.0pt;font-family:Consolas;color:#2A00FF;">"</span><span style="font-size: 8pt; font-family: Consolas;">);</span></p><p align="left">&nbsp;</p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">TopDocs results = indexSearcher.search(query, 100);</span></p><p align="left"><strong><span style="font-size:8.0pt;font-family:Consolas;color:#7F0055;">int</span></strong><span style="font-size: 8pt; font-family: Consolas;"> numTotalHits = results.</span><span style="font-size:8.0pt; font-family:Consolas;color:#0000C0;">totalHits</span><span style="font-size: 8pt; font-family: Consolas;">;</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">System.</span><em><span style="font-size:8.0pt;font-family:Consolas; color:#0000C0;">out</span></em><span style="font-size: 8pt; font-family: Consolas;">.println(</span><span style="font-size:8.0pt;font-family:Consolas; color:#2A00FF;">"Found "</span><span style="font-size: 8pt; font-family: Consolas;"> + numTotalHits + </span><span style="font-size:8.0pt;font-family:Consolas; color:#2A00FF;">" matching documents"</span><span style="font-size: 8pt; font-family: 
Consolas;">);</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">ScoreDoc[] hits = results.</span><span style="font-size:8.0pt; font-family:Consolas;color:#0000C0;">scoreDocs</span><span style="font-size: 8pt; font-family: Consolas;">;</span> </p><p align="left"><strong><span style="font-size:8.0pt;font-family:Consolas;color:#7F0055;">for</span></strong><span style="font-size: 8pt; font-family: Consolas;"> (</span><strong><span style="font-size:8.0pt;font-family:Consolas; color:#7F0055;">int</span></strong><span style="font-size: 8pt; font-family: Consolas;"> i = 0; i &lt; hits.</span><span style="font-size:8.0pt;font-family:Consolas; color:#0000C0;">length</span><span style="font-size: 8pt; font-family: Consolas;">; i++) {</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp; Document document = indexSearcher.doc(hits[i].</span><span style="font-size:8.0pt;font-family:Consolas;color:#0000C0;">doc</span><span style="font-size: 8pt; font-family: Consolas;">);</span></p><p align="left"><span style="font-size: 8pt; font-family: Consolas;">&nbsp;&nbsp;&nbsp;&nbsp; System.</span><em><span style="font-size:8.0pt;font-family:Consolas; color:#0000C0;">out</span></em><span style="font-size: 8pt; font-family: Consolas;">.println(</span><span style="font-size:8.0pt;font-family:Consolas; color:#2A00FF;">"content:"</span><span style="font-size: 8pt; font-family: Consolas;"> + document.get(</span><span style="font-size:8.0pt;font-family:Consolas; color:#2A00FF;">"content"</span><span style="font-size: 8pt; font-family: Consolas;">));</span></p><p class="MsoNormal" align="left"><span lang="EN-US" style="font-size: 8pt; font-family: Consolas;">}</span></p><p><br /></p><br><br><div 
align=right><a style="text-decoration:none;" href="http://www.blogjava.net/brock/" target="_blank">brock</a> 2014-12-31 17:07 <a href="http://www.blogjava.net/brock/archive/2014/12/31/421991.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>