﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-全世界的屋顶-文章分类-Web Data Mining</title><link>http://www.blogjava.net/honeybee/category/30856.html</link><description /><language>zh-cn</language><lastBuildDate>Fri, 18 Apr 2008 11:00:26 GMT</lastBuildDate><pubDate>Fri, 18 Apr 2008 11:00:26 GMT</pubDate><ttl>60</ttl><item><title>weka学习（安装和部署）</title><link>http://www.blogjava.net/honeybee/articles/193605.html</link><dc:creator>sun</dc:creator><author>sun</author><pubDate>Wed, 16 Apr 2008 16:15:00 GMT</pubDate><guid>http://www.blogjava.net/honeybee/articles/193605.html</guid><wfw:comment>http://www.blogjava.net/honeybee/comments/193605.html</wfw:comment><comments>http://www.blogjava.net/honeybee/articles/193605.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/honeybee/comments/commentRss/193605.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/honeybee/services/trackbacks/193605.html</trackback:ping><description><![CDATA[<p><span style="font-family: 宋体"><span style="font-size: 14pt"><span style="font-size: 12pt">&nbsp;</p>
<p style="background: white; margin: 6pt 0cm; text-indent: 21pt; line-height: 150%; text-align: left" align="left"><span style="line-height: 150%; font-family: 宋体">My recent work has focused on Web Data Mining. After about a week of reading papers, I have some ideas about mining Web logs. The next step is to put them into practice as soon as possible.</span></p>
<p style="background: white; margin: 6pt 0cm; text-indent: 21pt; line-height: 150%; text-align: left" align="left"><span style="line-height: 150%; font-family: 宋体">So this evening I installed Weka (version 3.4.12). Because Weka is data mining software, it needs to connect to databases, which means downloading JDBC drivers. The commonly used drivers it supports are MySQL, HSQL Database, Mckoi SQL Database, and RmiJdbc. A few points to note:</span></p>
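<p style="background: white; margin: 6pt 0cm; text-indent: 21pt; line-height: 150%; text-align: left" align="left"><span style="line-height: 150%; font-family: 宋体">1. Normally you would add the downloaded driver jar files to CLASSPATH. The problem is that even when they are added correctly, Weka still prints messages like &#8220;Trying to add JDBC driver: ***Driver - Error, not in CLASSPATH?&#8221; (I am on Windows; I have not yet confirmed this on Linux). I therefore suggest passing the paths directly on the command line, for example: java -Xmx128m -classpath "hsqldb.jar;mysql-connector-java-5.15.bin.jar;RmiJdbc.jar;mkjdbc.jar;weka.jar" weka.gui.GUIChooser (Note: I put these driver jar files in the Weka installation directory, so the relative paths above resolve when the command is run from there.)</span></p>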
<p style="background: white; margin: 6pt 0cm; text-indent: 21pt; line-height: 150%; text-align: left" align="left"><span style="line-height: 150%; font-family: 宋体">2. With Weka (version 3.4.12), be sure to use RmiJdbc version 2.5 (with versions 3.3, 3.2, and 3.05, I still got the error Trying to add JDBC driver: </span><span style="line-height: 150%; font-family: 宋体">RmiJdbc.RJDriver - Error, not in CLASSPATH?</span><span style="line-height: 150%; font-family: 宋体"> after adding the jar, and version 1.0 did not work either). With Weka (version 3.5.5), use RmiJdbc version 3.05 or 2.5.</span></p>
</span></span></span>
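<p>The classpath problem in point 1 can be checked outside Weka with a small probe. This is only a sketch under stated assumptions: the class name JdbcDriverCheck and the helper isOnClasspath are hypothetical, and the driver class names for HSQLDB, MySQL, and Mckoi are the ones I would expect for jars of that era (only RmiJdbc.RJDriver appears in the error message quoted above). It simply tries to load each class, which is presumably close to what Weka does before printing "not in CLASSPATH?".</p>
<pre>
```java
public class JdbcDriverCheck {

    // Driver class names to probe. Only RmiJdbc.RJDriver comes from the
    // post itself; the other three are assumptions about the jars used.
    static final String[] DRIVERS = {
        "org.hsqldb.jdbcDriver",
        "com.mysql.jdbc.Driver",
        "RmiJdbc.RJDriver",
        "com.mckoi.JDBCDriver"
    };

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class that is present but fails to link or initialize.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String driver : DRIVERS) {
            String status = isOnClasspath(driver) ? "found" : "NOT on classpath";
            System.out.println(driver + ": " + status);
        }
    }
}
```
</pre>
<p>To try it, compile with javac JdbcDriverCheck.java in the Weka installation directory, then run it with the same -classpath value as the command above, adding ";." at the end so the compiled class itself can be found. A driver reported as missing here would most likely be reported as missing by Weka too.</p>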
<p style="margin: 6pt 0cm 6pt 18pt; text-indent: -18pt; line-height: 150%; tab-stops: list 18.0pt"><span style="font-size: 10.5pt; line-height: 150%"><br />
</span><span style="font-size: 10.5pt; line-height: 150%"><span style="font-family: 宋体"><span style="font-size: 14pt"><span style="font-size: 12pt">Below is my schedule for learning Weka, noted here as a reminder:<br />
1. Download and install Weka (April 16 to 21)<br />
2. Following the reference <a title="ppt" href="http://prdownloads.sourceforge.net/weka/Weka_a_tool_for_exploratory_data_mining.ppt">ppt</a></span></span></span><span style="font-family: 宋体"><span style="font-size: 14pt"><span style="font-size: 12pt">, run its clustering example end to end and understand what each item in it means (April 21 to 30)&nbsp;<br />
3. Find a more complex example (download a dataset from <span lang="EN-US" style="font-size: 12pt; font-family: 'Times New Roman'; mso-fareast-font-family: 宋体; mso-font-kerning: 1.0pt; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA"><a href="http://www.cs.waikato.ac.nz/ml/weka/index_datasets.html">http://www.cs.waikato.ac.nz/ml/weka/index_datasets.html</a></span></span></span></span></a></a><span style="font-family: 宋体"><span style="font-size: 14pt"><span style="font-size: 12pt">), run it end to end, and interpret what the data means (May 1 to 6)<br />
4. Rewrite a clustering algorithm as Hadoop code and run it on a server (May 6 to 20)</span></span></span></span></p>
<img src ="http://www.blogjava.net/honeybee/aggbug/193605.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/honeybee/" target="_blank">sun</a> 2008-04-17 00:15 <a href="http://www.blogjava.net/honeybee/articles/193605.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>