﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-Change Dir-Post Category-Machine Learning</title><link>http://www.blogjava.net/changedi/category/43772.html</link><description>Prophet cd: loving life is the beginning of all art</description><language>zh-cn</language><lastBuildDate>Tue, 28 May 2013 07:23:38 GMT</lastBuildDate><pubDate>Tue, 28 May 2013 07:23:38 GMT</pubDate><ttl>60</ttl><item><title>weka customization project added to GitHub</title><link>http://www.blogjava.net/changedi/archive/2013/05/28/399860.html</link><dc:creator>changedi</dc:creator><author>changedi</author><pubDate>Tue, 28 May 2013 03:46:00 GMT</pubDate><guid>http://www.blogjava.net/changedi/archive/2013/05/28/399860.html</guid><wfw:comment>http://www.blogjava.net/changedi/comments/399860.html</wfw:comment><comments>http://www.blogjava.net/changedi/archive/2013/05/28/399860.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/changedi/comments/commentRss/399860.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/changedi/services/trackbacks/399860.html</trackback:ping><description><![CDATA[Today I added the official weka 3.7.0 developer version to GitHub; anyone who needs it can download and use it. I have already configured libsvm and liblinear in it, and I customized the clustering ClusterEvaluation to output some extra information, such as a comparison between the original class label and the cluster label of wrongly clustered instances (this helps locate which instances received which label in the results of clustering algorithms such as EM or KMeans).<br />Friends interested in weka are also welcome to contribute code, ideas, and feature requests, and I can help implement them. In the future I will add more weka customizations from time to time, and add some Chinese comments at the source-code level to help users.<br /><br />My weka GitHub repository: <a href="https://github.com/changedi/weka">https://github.com/changedi/weka</a>; read-only git path: git://github.com/changedi/weka.git. Forks are welcome.<img src ="http://www.blogjava.net/changedi/aggbug/399860.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/changedi/" target="_blank">changedi</a> 2013-05-28 11:46 <a 
href="http://www.blogjava.net/changedi/archive/2013/05/28/399860.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Some tips on feature preprocessing in weka</title><link>http://www.blogjava.net/changedi/archive/2012/04/24/376482.html</link><dc:creator>changedi</dc:creator><author>changedi</author><pubDate>Tue, 24 Apr 2012 08:09:00 GMT</pubDate><guid>http://www.blogjava.net/changedi/archive/2012/04/24/376482.html</guid><wfw:comment>http://www.blogjava.net/changedi/comments/376482.html</wfw:comment><comments>http://www.blogjava.net/changedi/archive/2012/04/24/376482.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/changedi/comments/commentRss/376482.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/changedi/services/trackbacks/376482.html</trackback:ping><description><![CDATA[First, here are two links that contain the full original material:<br /><a href="http://weka.wikispaces.com/Text+categorization+with+Weka">http://weka.wikispaces.com/Text+categorization+with+Weka</a><br /><a href="http://weka.wikispaces.com/ARFF+files+from+Text+Collections">http://weka.wikispaces.com/ARFF+files+from+Text+Collections</a><br /><br /><strong>weka can read in data organized as a directory</strong>.<br />Next, a few brief notes on what to watch for when using weka to process text features.<br />One caveat: reading data from a directory is not available in the weka GUI.<br />First, arrange the data to be processed in a directory structure like this:<br /><pre class="text">...
|
+- text_example
|
+- class1
|  |
|  + file1.txt
|  |
|  + file2.txt
|  |
|  ...
|
+- class2
|  |
|  + another_file1.txt
|  |
|  + another_file2.txt
|  |
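|  ...</pre><br />Then, from the source package directory, run on the command line: java weka.core.converters.TextDirectoryLoader <span class="re5">-dir</span> text_example <span class="sy0">&gt;</span> text_example.arff<br />Here text_example is the directory holding the data, and text_example.arff is the generated ARFF file. Note that in the resulting ARFF the text content is stored as a single string attribute; that is, the generated ARFF has one feature plus one class label, and the class label is the name of the classX subdirectory under text_example. To make this easier to use, weka provides an unsupervised attribute filter, StringToWordVector, which handles tokenization (here meaning splitting English text into words) and can also compute TF/IDF weights.<br />The simple code below trains a classifier: 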
<div style="border-bottom: #cccccc 1px solid; border-left: #cccccc 1px solid; padding-bottom: 4px; background-color: #eeeeee; padding-left: 4px; width: 98%; padding-right: 5px; font-size: 13px; word-break: break-all; border-top: #cccccc 1px solid; border-right: #cccccc 1px solid; padding-top: 4px"><pre>import weka.core.*;
import weka.core.converters.*;
import weka.classifiers.trees.*;
import weka.filters.*;
import weka.filters.unsupervised.attribute.*;

import java.io.*;

/**
 * Example class that converts text files stored in a directory structure into
 * an ARFF dataset using the TextDirectoryLoader converter. It then applies the
 * StringToWordVector to the data and feeds a J48 classifier with it.
 *
 * @author FracPete (fracpete at waikato dot ac dot nz)
 */
public class TextCategorizationTest {

  /**
   * Expects the first parameter to point to the directory with the text files
   * and falls back to ./text_example when no parameter is given.
   * In that directory, each sub-directory represents a class and the text
   * files in these sub-directories will be labeled as such.
   *
   * @param args        the commandline arguments
   * @throws Exception  if something goes wrong
   */
  public static void main(String[] args) throws Exception {
    // convert the directory into a dataset
    String dir = args.length == 0 ? "./text_example" : args[0];
    TextDirectoryLoader loader = new TextDirectoryLoader();
    loader.setDirectory(new File(dir));
    Instances dataRaw = loader.getDataSet();
    System.out.println("\n\nNumber of classes:\n\n" + dataRaw.numClasses());

    // apply the StringToWordVector
    // (see the source code of setOptions(String[]) method of the filter
    // if you want to know which command-line option corresponds to which
    // bean property)
    StringToWordVector filter = new StringToWordVector();
    filter.setInputFormat(dataRaw);
    Instances dataFiltered = Filter.useFilter(dataRaw, filter);
    System.out.println("\n\nFiltered data:\n\n" + dataFiltered);

    // train J48 and output model
    J48 classifier = new J48();
    classifier.buildClassifier(dataFiltered);
    System.out.println("\n\nClassifier model:\n\n" + classifier);
  }
}
</pre></div><br />Finally, I still recommend writing your own programs for data modeling and generation; data preparation can usually only be controlled precisely with your own code, and weka at most helps us with selection and classification.<br />One more point: many friends have asked how to do text classification. If you would rather not read the papers, here is the basic idea: whatever the classification task, classifiers are basically interchangeable (note: basically). What matters is how the model is built and how the features are generated. As for the features used in text classification, there is TF*IDF and others such as mutual information, the chi-square statistic, and expected cross entropy; the formulas are well defined, and computing them is really not hard. Of the classification problems I have worked on, feature computation for text classification is among the easiest.<br /><br /><img src ="http://www.blogjava.net/changedi/aggbug/376482.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/changedi/" target="_blank">changedi</a> 2012-04-24 16:09 <a href="http://www.blogjava.net/changedi/archive/2012/04/24/376482.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Using weka from Java (3): Feature Selection</title><link>http://www.blogjava.net/changedi/archive/2010/11/23/338756.html</link><dc:creator>changedi</dc:creator><author>changedi</author><pubDate>Tue, 23 Nov 2010 02:06:00 GMT</pubDate><guid>http://www.blogjava.net/changedi/archive/2010/11/23/338756.html</guid><wfw:comment>http://www.blogjava.net/changedi/comments/338756.html</wfw:comment><comments>http://www.blogjava.net/changedi/archive/2010/11/23/338756.html#Feedback</comments><slash:comments>16</slash:comments><wfw:commentRss>http://www.blogjava.net/changedi/comments/commentRss/338756.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/changedi/services/trackbacks/338756.html</trackback:ping><description><![CDATA[Continuing the weka programming series. An important step in data mining is feature selection, whose main purpose is to reduce dimensionality, lower computational complexity, and discard potential noise. Both my paper and my master's thesis used the CFS feature subset selection method combined with best-first or greedy search, which reduces and simplifies a fairly high-dimensional training feature set; with CFS + Best First, the 145 features in my training samples drop to roughly 40 to 50.<br />
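As a rough illustration of why CFS tends to keep small, non-redundant subsets: its merit heuristic rewards correlation between features and the class and penalizes correlation among the features themselves. The sketch below is not weka's CfsSubsetEval; the class name CfsMeritSketch, the method merit, and the sample correlation values are all hypothetical. It only evaluates the standard CFS merit formula, k*rcf / sqrt(k + k*(k-1)*rff), for a subset of k features, assuming the mean correlations are already known.<br />
<pre class="text">
```java
// Hypothetical illustration of the CFS merit heuristic; this is not weka code.
class CfsMeritSketch {

  /**
   * Merit of a feature subset under the CFS heuristic.
   *
   * @param k    number of features in the subset (assumed to be at least 1)
   * @param rcf  mean feature-class correlation of the subset
   * @param rff  mean feature-feature correlation within the subset
   * @return     the merit value; higher means a better subset
   */
  static double merit(int k, double rcf, double rff) {
    // the numerator grows with relevance, the denominator with redundancy
    return (k * rcf) / Math.sqrt(k + k * (k - 1) * rff);
  }

  public static void main(String[] args) {
    // the correlation values below are made up for illustration only
    System.out.println("low redundancy:  " + merit(4, 0.5, 0.1));
    System.out.println("high redundancy: " + merit(4, 0.5, 0.9));
  }
}
```
</pre>
With the feature-class correlation held fixed, a higher feature-feature correlation enlarges the denominator and lowers the merit, so a search strategy guided by this score has a reason to drop redundant features.<br />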
The concrete implementation is shown in the test code below (for demonstration only):<br />
<div style="border-right: #cccccc 1px solid; padding-right: 5px; border-top: #cccccc 1px solid; padding-left: 4px; font-size: 13px; padding-bottom: 4px; border-left: #cccccc 1px solid; width: 98%; word-break: break-all; padding-top: 4px; border-bottom: #cccccc 1px solid; background-color: #eeeeee"><span style="color: #008080">&nbsp;1</span><img id="Codehighlighter1_0_10_Open_Image" onclick="this.style.display='none'; Codehighlighter1_0_10_Open_Text.style.display='none'; Codehighlighter1_0_10_Closed_Image.style.display='inline'; Codehighlighter1_0_10_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_0_10_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_0_10_Closed_Text.style.display='none'; Codehighlighter1_0_10_Open_Image.style.display='inline'; Codehighlighter1_0_10_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedBlock.gif" align="top"  alt="" /><span id="Codehighlighter1_0_10_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff">/**&nbsp;*/</span><span id="Codehighlighter1_0_10_Open_Text"><span style="color: #008000">/**</span><span style="color: #008000"><br />
</span><span style="color: #008080">&nbsp;2</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;*&nbsp;<br />
</span><span style="color: #008080">&nbsp;3</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedBlockEnd.gif" align="top"  alt="" />&nbsp;</span><span style="color: #008000">*/</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">&nbsp;4</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">package</span><span style="color: #000000">&nbsp;edu.tju.ikse.mi.util;<br />
</span><span style="color: #008080">&nbsp;5</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">&nbsp;6</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;java.io.File;<br />
</span><span style="color: #008080">&nbsp;7</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;java.io.IOException;<br />
</span><span style="color: #008080">&nbsp;8</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;java.util.Random;<br />
</span><span style="color: #008080">&nbsp;9</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">10</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.attributeSelection.ASEvaluation;<br />
</span><span style="color: #008080">11</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.attributeSelection.ASSearch;<br />
</span><span style="color: #008080">12</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.attributeSelection.AttributeSelection;<br />
</span><span style="color: #008080">13</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.attributeSelection.BestFirst;<br />
</span><span style="color: #008080">14</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.attributeSelection.CfsSubsetEval;<br />
</span><span style="color: #008080">15</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.core.Instances;<br />
</span><span style="color: #008080">16</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span><span style="color: #0000ff">import</span><span style="color: #000000">&nbsp;weka.core.converters.ArffLoader;<br />
</span><span style="color: #008080">17</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">18</span><span style="color: #000000"><img id="Codehighlighter1_412_456_Open_Image" onclick="this.style.display='none'; Codehighlighter1_412_456_Open_Text.style.display='none'; Codehighlighter1_412_456_Closed_Image.style.display='inline'; Codehighlighter1_412_456_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_412_456_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_412_456_Closed_Text.style.display='none'; Codehighlighter1_412_456_Open_Image.style.display='inline'; Codehighlighter1_412_456_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedBlock.gif" align="top"  alt="" /></span><span id="Codehighlighter1_412_456_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff">/**&nbsp;*/</span><span id="Codehighlighter1_412_456_Open_Text"><span style="color: #008000">/**</span><span style="color: #008000"><br />
</span><span style="color: #008080">19</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;*&nbsp;</span><span style="color: #808080">@author</span><span style="color: #008000">&nbsp;Jia&nbsp;Yu<br />
</span><span style="color: #008080">20</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;*&nbsp;@date&nbsp;2010-11-23<br />
</span><span style="color: #008080">21</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedBlockEnd.gif" align="top"  alt="" />&nbsp;</span><span style="color: #008000">*/</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">22</span><span style="color: #000000"><img id="Codehighlighter1_484_2532_Open_Image" onclick="this.style.display='none'; Codehighlighter1_484_2532_Open_Text.style.display='none'; Codehighlighter1_484_2532_Closed_Image.style.display='inline'; Codehighlighter1_484_2532_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_484_2532_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_484_2532_Closed_Text.style.display='none'; Codehighlighter1_484_2532_Open_Image.style.display='inline'; Codehighlighter1_484_2532_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedBlock.gif" align="top"  alt="" /></span><span style="color: #0000ff">public</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">class</span><span style="color: #000000">&nbsp;WekaSelector&nbsp;</span><span id="Codehighlighter1_484_2532_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_484_2532_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">23</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">24</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;ArffLoader&nbsp;loader;<br />
</span><span style="color: #008080">25</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;Instances&nbsp;dataSet;<br />
</span><span style="color: #008080">26</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;File&nbsp;arffFile;<br />
</span><span style="color: #008080">27</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">int</span><span style="color: #000000">&nbsp;sizeOfDataset;<br />
</span><span style="color: #008080">28</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">int</span><span style="color: #000000">&nbsp;numOfOldAttributes;<br />
</span><span style="color: #008080">29</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">int</span><span style="color: #000000">&nbsp;numOfNewAttributes;<br />
</span><span style="color: #008080">30</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">int</span><span style="color: #000000">&nbsp;classIndex;<br />
</span><span style="color: #008080">31</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">private</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">int</span><span style="color: #000000">[]&nbsp;selectedAttributes;<br />
</span><span style="color: #008080">32</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">33</span><span style="color: #000000"><img id="Codehighlighter1_773_1051_Open_Image" onclick="this.style.display='none'; Codehighlighter1_773_1051_Open_Text.style.display='none'; Codehighlighter1_773_1051_Closed_Image.style.display='inline'; Codehighlighter1_773_1051_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_773_1051_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_773_1051_Closed_Text.style.display='none'; Codehighlighter1_773_1051_Open_Image.style.display='inline'; Codehighlighter1_773_1051_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">public</span><span style="color: #000000">&nbsp;WekaSelector(File&nbsp;file)&nbsp;</span><span style="color: #0000ff">throws</span><span style="color: #000000">&nbsp;IOException&nbsp;</span><span id="Codehighlighter1_773_1051_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_773_1051_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">34</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;loader&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;ArffLoader();<br />
</span><span style="color: #008080">35</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;arffFile&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;file;<br />
</span><span style="color: #008080">36</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;loader.setFile(arffFile);<br />
</span><span style="color: #008080">37</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dataSet&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;loader.getDataSet();<br />
</span><span style="color: #008080">38</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sizeOfDataset&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;dataSet.numInstances();<br />
</span><span style="color: #008080">39</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;numOfOldAttributes&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;dataSet.numAttributes();<br />
</span><span style="color: #008080">40</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;classIndex&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;numOfOldAttributes&nbsp;</span><span style="color: #000000">-</span><span style="color: #000000">&nbsp;</span><span style="color: #000000">1</span><span style="color: #000000">;<br />
</span><span style="color: #008080">41</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dataSet.setClassIndex(classIndex);<br />
</span><span style="color: #008080">42</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">43</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">44</span><span style="color: #000000"><img id="Codehighlighter1_1093_2127_Open_Image" onclick="this.style.display='none'; Codehighlighter1_1093_2127_Open_Text.style.display='none'; Codehighlighter1_1093_2127_Closed_Image.style.display='inline'; Codehighlighter1_1093_2127_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_1093_2127_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_1093_2127_Closed_Text.style.display='none'; Codehighlighter1_1093_2127_Open_Image.style.display='inline'; Codehighlighter1_1093_2127_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">public</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">void</span><span style="color: #000000">&nbsp;select()&nbsp;</span><span style="color: #0000ff">throws</span><span style="color: #000000">&nbsp;Exception&nbsp;</span><span id="Codehighlighter1_1093_2127_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_1093_2127_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">45</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ASEvaluation&nbsp;evaluator&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;CfsSubsetEval();<br />
</span><span style="color: #008080">46</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ASSearch&nbsp;search&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;BestFirst();<br />
</span><span style="color: #008080">47</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AttributeSelection&nbsp;eval&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;AttributeSelection();<br />
</span><span style="color: #008080">48</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">49</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">//</span><span style="color: #008000">&nbsp;Wire&nbsp;the&nbsp;evaluator&nbsp;and&nbsp;the&nbsp;search&nbsp;strategy&nbsp;into&nbsp;the&nbsp;selector</span><span style="color: #000000"><br />
</span><span style="color: #008080">50</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;eval.setEvaluator(evaluator);<br />
</span><span style="color: #008080">51</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;eval.setSearch(search);<br />
</span><span style="color: #008080">52</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">53</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;eval.SelectAttributes(dataSet);<br />
</span><span style="color: #008080">54</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;numOfNewAttributes&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;eval.numberAttributesSelected();<br />
</span><span style="color: #008080">55</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;selectedAttributes&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;eval.selectedAttributes();<br />
</span><span style="color: #008080">56</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(</span><span style="color: #000000">"</span><span style="color: #000000">result&nbsp;is&nbsp;</span><span style="color: #000000">"</span><span style="color: #000000">+</span><span style="color: #000000">eval.toResultsString());<br />
</span><span style="color: #008080">57</span><span style="color: #000000"><img id="Codehighlighter1_1510_1880_Open_Image" onclick="this.style.display='none'; Codehighlighter1_1510_1880_Open_Text.style.display='none'; Codehighlighter1_1510_1880_Closed_Image.style.display='inline'; Codehighlighter1_1510_1880_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_1510_1880_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_1510_1880_Closed_Text.style.display='none'; Codehighlighter1_1510_1880_Open_Image.style.display='inline'; Codehighlighter1_1510_1880_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span id="Codehighlighter1_1510_1880_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff">/**/</span><span id="Codehighlighter1_1510_1880_Open_Text"><span style="color: #008000">/*</span><span style="color: #008000"><br />
</span><span style="color: #008080">58</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Random&nbsp;random&nbsp;=&nbsp;new&nbsp;Random(seed);<br />
</span><span style="color: #008080">59</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dataSet.randomize(random);<br />
</span><span style="color: #008080">60</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(dataSet.attribute(classIndex).isNominal())&nbsp;{<br />
</span><span style="color: #008080">61</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dataSet.stratify(numFolds);<br />
</span><span style="color: #008080">62</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />
</span><span style="color: #008080">63</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for&nbsp;(int&nbsp;fold&nbsp;=&nbsp;0;&nbsp;fold&nbsp;&lt;&nbsp;numFolds;&nbsp;fold++)&nbsp;{<br />
</span><span style="color: #008080">64</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Instances&nbsp;train&nbsp;=&nbsp;dataSet.trainCV(numFolds,&nbsp;fold,&nbsp;random);<br />
</span><span style="color: #008080">65</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;eval.selectAttributesCVSplit(train);<br />
</span><span style="color: #008080">66</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />
</span><span style="color: #008080">67</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println("result&nbsp;is&nbsp;"+eval.CVResultsString());<br />
</span><span style="color: #008080">68</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">*/</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">69</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(</span><span style="color: #000000">"</span><span style="color: #000000">old&nbsp;number&nbsp;of&nbsp;Attributes&nbsp;is&nbsp;</span><span style="color: #000000">"</span><span style="color: #000000">+</span><span style="color: #000000">numOfOldAttributes);<br />
</span><span style="color: #008080">70</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(</span><span style="color: #000000">"</span><span style="color: #000000">new&nbsp;number&nbsp;of&nbsp;Attributes&nbsp;is&nbsp;</span><span style="color: #000000">"</span><span style="color: #000000">+</span><span style="color: #000000">numOfNewAttributes);<br />
</span><span style="color: #008080">71</span><span style="color: #000000"><img id="Codehighlighter1_2074_2124_Open_Image" onclick="this.style.display='none'; Codehighlighter1_2074_2124_Open_Text.style.display='none'; Codehighlighter1_2074_2124_Closed_Image.style.display='inline'; Codehighlighter1_2074_2124_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_2074_2124_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_2074_2124_Closed_Text.style.display='none'; Codehighlighter1_2074_2124_Open_Image.style.display='inline'; Codehighlighter1_2074_2124_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">for</span><span style="color: #000000">(</span><span style="color: #0000ff">int</span><span style="color: #000000">&nbsp;i</span><span style="color: #000000">=</span><span style="color: #000000">0</span><span style="color: #000000">;i</span><span style="color: #000000">&lt;</span><span style="color: #000000">selectedAttributes.length;i</span><span style="color: #000000">++</span><span style="color: #000000">)</span><span id="Codehighlighter1_2074_2124_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_2074_2124_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">72</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(selectedAttributes[i]);<br />
</span><span style="color: #008080">73</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">74</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">75</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">76</span><span style="color: #000000"><img id="Codehighlighter1_2131_2154_Open_Image" onclick="this.style.display='none'; Codehighlighter1_2131_2154_Open_Text.style.display='none'; Codehighlighter1_2131_2154_Closed_Image.style.display='inline'; Codehighlighter1_2131_2154_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_2131_2154_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_2131_2154_Closed_Text.style.display='none'; Codehighlighter1_2131_2154_Open_Image.style.display='inline'; Codehighlighter1_2131_2154_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span id="Codehighlighter1_2131_2154_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff">/**&nbsp;*/</span><span id="Codehighlighter1_2131_2154_Open_Text"><span style="color: #008000">/**</span><span style="color: #008000"><br />
</span><span style="color: #008080">77</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*&nbsp;</span><span style="color: #808080">@param</span><span style="color: #008000">&nbsp;args<br />
</span><span style="color: #008080">78</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">*/</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">79</span><span style="color: #000000"><img id="Codehighlighter1_2196_2529_Open_Image" onclick="this.style.display='none'; Codehighlighter1_2196_2529_Open_Text.style.display='none'; Codehighlighter1_2196_2529_Closed_Image.style.display='inline'; Codehighlighter1_2196_2529_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_2196_2529_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_2196_2529_Closed_Text.style.display='none'; Codehighlighter1_2196_2529_Open_Image.style.display='inline'; Codehighlighter1_2196_2529_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">public</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">static</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">void</span><span style="color: #000000">&nbsp;main(String[]&nbsp;args)&nbsp;</span><span id="Codehighlighter1_2196_2529_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_2196_2529_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">80</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">//</span><span style="color: #008000">&nbsp;Load&nbsp;iris.arff&nbsp;from&nbsp;the&nbsp;working&nbsp;directory&nbsp;and&nbsp;run&nbsp;attribute&nbsp;selection</span><span style="color: #008000"><br />
</span><span style="color: #008080">81</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /></span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File&nbsp;file&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;File(</span><span style="color: #000000">"</span><span style="color: #000000">iris.arff</span><span style="color: #000000">"</span><span style="color: #000000">);<br />
</span><span style="color: #008080">82</span><span style="color: #000000"><img id="Codehighlighter1_2278_2347_Open_Image" onclick="this.style.display='none'; Codehighlighter1_2278_2347_Open_Text.style.display='none'; Codehighlighter1_2278_2347_Closed_Image.style.display='inline'; Codehighlighter1_2278_2347_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_2278_2347_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_2278_2347_Closed_Text.style.display='none'; Codehighlighter1_2278_2347_Open_Image.style.display='inline'; Codehighlighter1_2278_2347_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000ff">try</span><span style="color: #000000">&nbsp;</span><span id="Codehighlighter1_2278_2347_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_2278_2347_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">83</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;WekaSelector&nbsp;ws&nbsp;</span><span style="color: #000000">=</span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">new</span><span style="color: #000000">&nbsp;WekaSelector(file);<br />
</span><span style="color: #008080">84</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ws.select();<br />
</span><span style="color: #008080">85</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />
</span><span style="color: #008080">86</span><span style="color: #000000"><img id="Codehighlighter1_2371_2437_Open_Image" onclick="this.style.display='none'; Codehighlighter1_2371_2437_Open_Text.style.display='none'; Codehighlighter1_2371_2437_Closed_Image.style.display='inline'; Codehighlighter1_2371_2437_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_2371_2437_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_2371_2437_Closed_Text.style.display='none'; Codehighlighter1_2371_2437_Open_Image.style.display='inline'; Codehighlighter1_2371_2437_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">catch</span><span style="color: #000000">&nbsp;(IOException&nbsp;e)&nbsp;</span><span id="Codehighlighter1_2371_2437_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_2371_2437_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">87</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">//</span><span style="color: #008000">&nbsp;The&nbsp;ARFF&nbsp;file&nbsp;could&nbsp;not&nbsp;be&nbsp;opened&nbsp;or&nbsp;read</span><span style="color: #008000"><br />
</span><span style="color: #008080">88</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /></span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
</span><span style="color: #008080">89</span><span style="color: #000000"><img id="Codehighlighter1_2459_2525_Open_Image" onclick="this.style.display='none'; Codehighlighter1_2459_2525_Open_Text.style.display='none'; Codehighlighter1_2459_2525_Closed_Image.style.display='inline'; Codehighlighter1_2459_2525_Closed_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockStart.gif" align="top"  alt="" /><img id="Codehighlighter1_2459_2525_Closed_Image" style="display: none" onclick="this.style.display='none'; Codehighlighter1_2459_2525_Closed_Text.style.display='none'; Codehighlighter1_2459_2525_Open_Image.style.display='inline'; Codehighlighter1_2459_2525_Open_Text.style.display='inline';" src="http://www.blogjava.net/images/OutliningIndicators/ContractedSubBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000">&nbsp;</span><span style="color: #0000ff">catch</span><span style="color: #000000">&nbsp;(Exception&nbsp;e)&nbsp;</span><span id="Codehighlighter1_2459_2525_Closed_Text" style="border-right: #808080 1px solid; border-top: #808080 1px solid; display: none; border-left: #808080 1px solid; border-bottom: #808080 1px solid; background-color: #ffffff"><img src="http://www.blogjava.net/Images/dot.gif"  alt="" /></span><span id="Codehighlighter1_2459_2525_Open_Text"><span style="color: #000000">{<br />
</span><span style="color: #008080">90</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000">//</span><span style="color: #008000">&nbsp;TODO&nbsp;Auto-generated&nbsp;catch&nbsp;block</span><span style="color: #008000"><br />
</span><span style="color: #008080">91</span><span style="color: #008000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /></span><span style="color: #000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;e.printStackTrace();<br />
</span><span style="color: #008080">92</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">93</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">94</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedSubBlockEnd.gif" align="top"  alt="" />&nbsp;&nbsp;&nbsp;&nbsp;}</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">95</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/InBlock.gif" align="top"  alt="" /><br />
</span><span style="color: #008080">96</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/ExpandedBlockEnd.gif" align="top"  alt="" />}</span></span><span style="color: #000000"><br />
</span><span style="color: #008080">97</span><span style="color: #000000"><img src="http://www.blogjava.net/images/OutliningIndicators/None.gif" align="top"  alt="" /></span></div>
<br />
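The commented-out part of the code is the cross-validation variant. It defaults to ten-fold cross-validation, which can be changed through the setter methods. For the details of usage, or for the reduce dimensionality methods, see the source code; weka being open source makes this convenient. The main class to read is weka.attributeSelection.AttributeSelection. To see how it is called and how the options are chosen, look at weka.gui.explorer.AttributeSelectionPanel.<br />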
<br />
The code above produces the following output:<br />
<p>result is </p>
<p>=== Attribute Selection on all input data ===</p>
<p>Search Method:<br />
&nbsp;Best first.<br />
&nbsp;Start set: no attributes<br />
&nbsp;Search direction: forward<br />
&nbsp;Stale search after 5 node expansions<br />
&nbsp;Total number of subsets evaluated: 12<br />
&nbsp;Merit of best subset found:&nbsp;&nbsp;&nbsp; 0.887</p>
<p>Attribute Subset Evaluator (supervised, Class (nominal): 5 class):<br />
&nbsp;CFS Subset Evaluator<br />
&nbsp;Including locally predictive attributes</p>
<p>Selected attributes: 3,4 : 2<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; petallength<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; petalwidth</p>
<p>old number of Attributes is 5<br />
new number of Attributes is 2<br />
2<br />
3<br />
4<br />
<br />
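The original iris dataset has 4 feature attributes (plus one class label, so 5 dimensions in total). After feature selection only the 3rd and 4th features are kept, so the new feature subset has two dimensions (not counting the class label; a bit convoluted, sorry, I always do this).<br />
The final 2, 3, 4 are zero-based indices into the attribute array: they refer to the 3rd, 4th and 5th attributes, that is, the two retained features plus the class label, which is the 5th attribute.<br />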
</p>
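<p>A minimal sketch of how the printed indices map back to attributes. It does not call weka; the class name SelectedIndexSketch and the describe helper are hypothetical, the attribute names are the usual iris.arff names, and it assumes, as the output above suggests, that the index array is zero-based and ends with the class index.</p>

```java
import java.util.ArrayList;
import java.util.List;

public class SelectedIndexSketch {

    /**
     * Turns zero-based attribute indices into readable lines.
     * Assumption: the class index appears in the array alongside the features.
     */
    public static List<String> describe(int[] selected, String[] names, int classIndex) {
        List<String> lines = new ArrayList<>();
        for (int idx : selected) {
            String role = (idx == classIndex) ? "class label" : "feature";
            // idx + 1 is the one-based position used in the "Selected attributes" line
            lines.add("index " + idx + " -> attribute #" + (idx + 1)
                    + " " + names[idx] + " (" + role + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        String[] iris = {"sepallength", "sepalwidth", "petallength", "petalwidth", "class"};
        int[] selected = {2, 3, 4}; // the three numbers printed at the end of the run
        for (String line : describe(selected, iris, iris.length - 1)) {
            System.out.println(line);
        }
    }
}
```

<p>Read this way, the "3,4" in the report and the "2, 3" in the index array name the same two features; only the numbering base differs.</p>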
<img src ="http://www.blogjava.net/changedi/aggbug/338756.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/changedi/" target="_blank">changedi</a> 2010-11-23 10:06 <a href="http://www.blogjava.net/changedi/archive/2010/11/23/338756.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Using weka from Java (2): classification</title><link>http://www.blogjava.net/changedi/archive/2010/11/04/337197.html</link><dc:creator>changedi</dc:creator><author>changedi</author><pubDate>Thu, 04 Nov 2010 01:51:00 GMT</pubDate><guid>http://www.blogjava.net/changedi/archive/2010/11/04/337197.html</guid><wfw:comment>http://www.blogjava.net/changedi/comments/337197.html</wfw:comment><comments>http://www.blogjava.net/changedi/archive/2010/11/04/337197.html#Feedback</comments><slash:comments>2</slash:comments><wfw:commentRss>http://www.blogjava.net/changedi/comments/commentRss/337197.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/changedi/services/trackbacks/337197.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: Following on from the previous post: having written about clustering, here is the classification code I used as well.&nbsp;&nbsp;/**&nbsp;*&nbsp;*/&nbsp;package&nbsp;edu.tju.ikse.mi.util;&nbsp;import&nbsp;j...&nbsp;&nbsp;<a href='http://www.blogjava.net/changedi/archive/2010/11/04/337197.html'>Read more</a><img src ="http://www.blogjava.net/changedi/aggbug/337197.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/changedi/" target="_blank">changedi</a> 2010-11-04 09:51 <a href="http://www.blogjava.net/changedi/archive/2010/11/04/337197.html#Feedback" target="_blank" 
style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Using weka from Java (1): clustering</title><link>http://www.blogjava.net/changedi/archive/2010/11/04/337190.html</link><dc:creator>changedi</dc:creator><author>changedi</author><pubDate>Thu, 04 Nov 2010 01:24:00 GMT</pubDate><guid>http://www.blogjava.net/changedi/archive/2010/11/04/337190.html</guid><wfw:comment>http://www.blogjava.net/changedi/comments/337190.html</wfw:comment><comments>http://www.blogjava.net/changedi/archive/2010/11/04/337190.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/changedi/comments/commentRss/337190.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/changedi/services/trackbacks/337190.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: weka is a well-known data mining tool; there is a detailed introduction here, and teacher IDMer's blog also describes its usage in some detail. Using the weka tool directly is of course no problem, but what if you want to use weka's functionality inside your own platform or framework? Here I am sharing notes from when I studied the weka source code, mainly about how to call the weka API. It is for reference only; if you find problems in the code, feel free to contact me by email. A brief walk-through of the flow: the constructor first loads an arff file, then calls doCluster() ...&nbsp;&nbsp;<a href='http://www.blogjava.net/changedi/archive/2010/11/04/337190.html'>Read more</a><img src ="http://www.blogjava.net/changedi/aggbug/337190.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/changedi/" target="_blank">changedi</a> 2010-11-04 09:24 <a href="http://www.blogjava.net/changedi/archive/2010/11/04/337190.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Bayesian decision theory: summary notes</title><link>http://www.blogjava.net/changedi/archive/2010/09/15/332059.html</link><dc:creator>changedi</dc:creator><author>changedi</author><pubDate>Wed, 15 Sep 2010 03:23:00 
GMT</pubDate><guid>http://www.blogjava.net/changedi/archive/2010/09/15/332059.html</guid><wfw:comment>http://www.blogjava.net/changedi/comments/332059.html</wfw:comment><comments>http://www.blogjava.net/changedi/archive/2010/09/15/332059.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/changedi/comments/commentRss/332059.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/changedi/services/trackbacks/332059.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The basic idea of Bayesian decision theory is very simple. To minimize the overall risk, always choose the action that minimizes the conditional risk R(α|x). In particular, to minimize the probability of error in a classification problem, always choose the category that maximizes the posterior probability P(ωj|x). Bayes' formula lets us compute the posterior probability from the prior probability P(ωj) and the conditional density p(x|ωj). If the penalty for misclassifying a pattern from ωi as ωj differs from one pair of categories to another, the posteriors must first be weighted by those penalties before a decision is taken.<br />
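A minimal sketch of the decision rule just described, not code from any library. The class name BayesDecisionSketch, the two-category priors, the likelihood values and both loss matrices are made-up illustration values. It computes the posteriors with Bayes' formula, weights them by a loss matrix to get the conditional risk of each action, and picks the action with the smallest risk.<br />

```java
public class BayesDecisionSketch {

    /** Posterior P(w_j|x) = p(x|w_j) * P(w_j) / sum_k p(x|w_k) * P(w_k). */
    public static double[] posteriors(double[] priors, double[] likelihoods) {
        double[] post = new double[priors.length];
        double evidence = 0.0;
        for (int j = 0; j < priors.length; j++) {
            post[j] = likelihoods[j] * priors[j];
            evidence += post[j];
        }
        for (int j = 0; j < post.length; j++) {
            post[j] /= evidence;
        }
        return post;
    }

    /** Conditional risk R(a_i|x) = sum_j loss[i][j] * P(w_j|x). */
    public static double[] conditionalRisks(double[][] loss, double[] post) {
        double[] risk = new double[loss.length];
        for (int i = 0; i < loss.length; i++) {
            for (int j = 0; j < post.length; j++) {
                risk[i] += loss[i][j] * post[j];
            }
        }
        return risk;
    }

    /** Index of the action with the smallest conditional risk. */
    public static int minimumRiskAction(double[][] loss, double[] post) {
        double[] risk = conditionalRisks(loss, post);
        int best = 0;
        for (int i = 1; i < risk.length; i++) {
            if (risk[i] < risk[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] priors = {0.7, 0.3};       // hypothetical P(w_0), P(w_1)
        double[] likelihoods = {0.2, 0.6};  // hypothetical p(x|w_0), p(x|w_1) at one x
        double[] post = posteriors(priors, likelihoods);

        // Zero-one loss: minimum risk is the same as maximum posterior.
        double[][] zeroOne = {{0, 1}, {1, 0}};
        // Asymmetric loss: deciding w_1 when the truth is w_0 costs 5.
        double[][] asymmetric = {{0, 1}, {5, 0}};

        System.out.println("zero-one loss   -> action " + minimumRiskAction(zeroOne, post));
        System.out.println("asymmetric loss -> action " + minimumRiskAction(asymmetric, post));
    }
}
```

With these illustration values the posteriors are roughly 0.44 and 0.56, so zero-one loss picks the second category, while the asymmetric loss flips the decision to the first one. That flip is the effect of weighting the posteriors by the penalties.<br />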
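&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;If the underlying distributions are multivariate Gaussian, the decision boundaries are hyperquadrics whose shape and position depend on the prior probabilities and on the means and covariances of the distributions. The true expected error rate can be bounded from above by the Chernoff bound and by the computationally simpler Bhattacharyya bound. If an input test pattern has missing or corrupted features, we must integrate over those features to form the marginal distribution and then apply the Bayes decision procedure to the resulting distribution.<br />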
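&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;In practice, what we usually have is feature data made up of all sorts of attributes, and defining the risk function, the prior probabilities and the conditional probabilities from that data is usually an essential first step. Given only a finite amount of data, obtaining these probabilities becomes a matter of statistics. The next question is how to estimate them, and the commonly used methods are maximum likelihood estimation and Bayesian parameter estimation. 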
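<br />A minimal sketch of the maximum likelihood route, assuming a single feature whose class-conditional density is a univariate Gaussian; the class name GaussianMleSketch and the sample values are hypothetical. For this model the maximum likelihood estimates are the sample mean and the variance with divisor n (not n - 1), and the fitted density can then stand in for p(x|ωj) in the decision rule.<br />

```java
public class GaussianMleSketch {

    /** Maximum likelihood estimate of the mean: the sample average. */
    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Maximum likelihood estimate of the variance: divides by n, not n - 1. */
    public static double variance(double[] x) {
        double mu = mean(x);
        double sum = 0.0;
        for (double v : x) {
            sum += (v - mu) * (v - mu);
        }
        return sum / x.length;
    }

    /** Univariate Gaussian density with mean mu and variance var, evaluated at x. */
    public static double density(double x, double mu, double var) {
        double d = x - mu;
        return Math.exp(-d * d / (2.0 * var)) / Math.sqrt(2.0 * Math.PI * var);
    }

    public static void main(String[] args) {
        // Hypothetical training values of one feature for one category.
        double[] samples = {1.0, 2.0, 3.0, 4.0};
        double mu = mean(samples);
        double var = variance(samples);
        System.out.println("mean = " + mu + ", variance = " + var);
        System.out.println("estimated p(x|w) at x = 2.5: " + density(2.5, mu, var));
    }
}
```

The prior P(ωj) can be estimated in the same spirit as the fraction of training samples that belong to category ωj.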
<img src ="http://www.blogjava.net/changedi/aggbug/332059.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/changedi/" target="_blank">changedi</a> 2010-09-15 11:23 <a href="http://www.blogjava.net/changedi/archive/2010/09/15/332059.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>