<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-SIMONE-Post category: hbase</title><link>http://www.blogjava.net/wangxinsh55/category/53334.html</link><description /><language>zh-cn</language><lastBuildDate>Tue, 16 Sep 2014 21:04:32 GMT</lastBuildDate><pubDate>Tue, 16 Sep 2014 21:04:32 GMT</pubDate><ttl>60</ttl><item><title>(Repost) Hadoop: clearing "Name node is in safe mode"</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397110.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Thu, 28 Mar 2013 08:55:00 GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397110.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/397110.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397110.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/397110.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/397110.html</trackback:ping><description><![CDATA[<div>http://39382728.blog.163.com/blog/static/35360069201182710565420/</div><div><p>I killed a running Hadoop job partway through; afterwards, every attempt to add or delete files on HDFS failed with a "Name node is in safe mode" error:<br />rmr: org.apache.hadoop.dfs.SafeModeException: Cannot delete /user/hadoop/input. 
Name node is in safe mode</p>  <p>The fix:</p>  <p>bin/hadoop dfsadmin -safemode leave  # leave safe mode</p>  <p>Source:&nbsp;<a rel="nofollow" href="http://shutiao2008.iteye.com/blog/318950">http://shutiao2008.iteye.com/blog/318950</a></p>  <p>Appendix: notes on safe mode</p>  Safe mode<br />On startup the NameNode first enters safe mode. If the fraction of blocks missing from the DataNodes exceeds 1 - dfs.safemode.threshold.pct, the system stays in safe mode, i.e. read-only.<br />dfs.safemode.threshold.pct (default 0.999f) means that at HDFS startup the NameNode leaves safe mode only once the DataNodes have reported at least 0.999 of the block count recorded in the metadata; until then the filesystem remains read-only. Setting it to 1 keeps HDFS in safe mode permanently.<br />The following line, taken from a NameNode startup log, shows the reported-block ratio of 1.0 reaching the 0.9990 threshold:<br />The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 18 seconds.<br />There are two ways to leave safe mode:<br />1. Lower dfs.safemode.threshold.pct to a smaller value (the default is 0.999).<br />2. Force an exit with hadoop dfsadmin -safemode leave.<br />http://bbs.hadoopor.com/viewthread.php?tid=61&amp;extra=page%3D1<br />-----------------------------<br />Safe mode is exited when the minimal replication condition is reached, plus an extension<br />time of 30 seconds. The minimal replication condition is when 99.9% of the blocks in<br />the whole filesystem meet their minimum replication level (which defaults to one, and<br />is set by dfs.replication.min).<br />In other words, safe mode ends once 99.9% of blocks (configurable via dfs.safemode.threshold.pct) reach their minimum replication level (default 1, configurable via dfs.replication.min).<br />dfs.safemode.threshold.pct&nbsp; float&nbsp; 0.999<br />The proportion of blocks in the system that must meet the minimum<br />replication level defined by dfs.replication.min before the namenode<br />will exit safe mode. 
Setting<br />this value to 0 or less forces the name-node not to start in safe mode.<br />Setting this value to more than 1 means the namenode never exits safe<br />mode.<br />-----------------------------<br />Safe mode can be driven by hand with dfsadmin -safemode value, where value is one of:<br />enter - enter safe mode<br />leave - force the NameNode out of safe mode<br />get - report whether safe mode is on<br />wait - block until safe mode ends.</div><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-03-28 16:55 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397110.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Hadoop cluster handling</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397108.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Thu, 28 Mar 2013 08:29:00 GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397108.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/397108.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397108.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/397108.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/397108.html</trackback:ping><description><![CDATA[<div>For the details, see the posts on this blog:<div>http://www.cnblogs.com/xia520pi/</div></div><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-03-28 16:29 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397108.html#Feedback" target="_blank" 
style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Successfully compiling the Hadoop Eclipse plugin (this approach works on Win 7)</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397107.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Thu, 28 Mar 2013 08:26:00 GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397107.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/397107.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397107.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/397107.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/397107.html</trackback:ping><description><![CDATA[<div><div>http://www.07net01.com/linux/tongguoeclipsexiangmubianyi_hadoop_1_0_3_eclipse_4_2___juno___plugin_20146_1350034161.html</div><br /><p>The procedure is as follows:</p><p>1. Download hadoop 1.0.3 (http://hadoop.apache.org/releases.html#Download) and unpack it into a directory of your choice (ideally an all-English path; a path containing Chinese characters caused problems for me).</p><p>2. Import the ..\hadoop-1.0.3\src\contrib\eclipse-plugin project into Eclipse; the default project name is MapReduceTools.</p><p>3.
 In the MapReduceTools project, create a lib directory and copy into it hadoop-core (obtained by renaming the hadoop-*.jar in the Hadoop root directory), commons-cli-1.2.jar, commons-lang-2.4.jar, commons-configuration-1.6.jar, jackson-mapper-asl-1.8.8.jar, jackson-core-asl-1.8.8.jar and commons-httpclient-3.0.1.jar.</p><p>4. Edit build-contrib.xml in the parent directory:</p><p>	Find &lt;property name="hadoop.root" location="${root}/../../../"/&gt;, change location to the directory where hadoop 1.0.3 is actually unpacked, and add below it:</p><p>      &lt;property name="eclipse.home" location="D:/Program Files/eclipse"/&gt;</p><p>      &lt;property name="version" value="1.0.3"/&gt;</p><p>5. Edit build.xml in the project directory:</p><p>  &lt;target name="jar" depends="compile" unless="skip.contrib"&gt;</p><p>    &lt;mkdir dir="${build.dir}/lib"/&gt;</p><p>    &lt;copy file="${hadoop.root}/hadoop-core-${version}.jar" tofile="${build.dir}/lib/hadoop-core.jar" verbose="true"/&gt;</p><p>    &lt;copy file="${hadoop.root}/lib/commons-cli-1.2.jar"  todir="${build.dir}/lib" verbose="true"/&gt;</p><p>    &lt;copy file="${hadoop.root}/lib/commons-lang-2.4.jar"  todir="${build.dir}/lib" verbose="true"/&gt;</p><p>    &lt;copy file="${hadoop.root}/lib/commons-configuration-1.6.jar"  todir="${build.dir}/lib" verbose="true"/&gt;</p><p>    &lt;copy file="${hadoop.root}/lib/jackson-mapper-asl-1.8.8.jar"  todir="${build.dir}/lib" verbose="true"/&gt;</p><p>    &lt;copy file="${hadoop.root}/lib/jackson-core-asl-1.8.8.jar"  todir="${build.dir}/lib" verbose="true"/&gt;</p><p>    &lt;copy file="${hadoop.root}/lib/commons-httpclient-3.0.1.jar"  todir="${build.dir}/lib" verbose="true"/&gt;</p><p>    &lt;jar</p><p>      jarfile="${build.dir}/hadoop-${name}-${version}.jar"</p><p>      manifest="${root}/META-INF/MANIFEST.MF"&gt;</p><p>      &lt;fileset dir="${build.dir}" includes="classes/ lib/"/&gt;</p><p>      &lt;fileset dir="${root}" includes="resources/ plugin.xml"/&gt;</p><p>    &lt;/jar&gt;</p><p>  &lt;/target&gt;</p><p>6. Right-click build.xml in Eclipse and choose Run As - Ant Build.</p><p>If you get a &#8220;package org.apache.hadoop.fs does not exist&#8221; error, edit build.xml:</p><p> &lt;path id="hadoop-jars"&gt;</p><p>       &lt;fileset dir="${hadoop.root}/"&gt;</p><p>          &lt;include name="hadoop-*.jar"/&gt;</p><p>       &lt;/fileset&gt; </p><p> &lt;/path&gt;</p><p>and add &lt;path refid="hadoop-jars"/&gt; inside &lt;path id="classpath"&gt;.</p><p>7. When the Ant build finishes, the compiled plugin is at \build\contrib\hadoop-eclipse-plugin-1.0.3.jar.</p><p>8. Check that the configuration attributes under META-INF/MANIFEST.MF in the built jar are complete; if not, complete them:</p><div>Bundle-ClassPath: classes/,lib/hadoop-core.jar,lib/commons-cli-1.2.jar<br />&nbsp;,lib/commons-lang-2.4.jar,lib/commons-configuration-1.6.jar,lib/jacks<br />&nbsp;on-mapper-asl-1.8.8.jar,lib/jackson-core-asl-1.8.8.jar,lib/commons-ht<br />&nbsp;tpclient-3.0.1.jar</div><br /><p>9. Drop the jar into eclipse/plugins, restart Eclipse, and check whether the plugin installed correctly.</p></div><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-03-28 16:26 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/03/28/397107.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>An example of using MapReduce with HBase</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395572.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Fri, 22 Feb 2013 06:12:00 GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395572.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/395572.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395572.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/395572.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/395572.html</trackback:ping><description><![CDATA[<div>http://jeffxie.blog.51cto.com/1365360/305538</div><br /><div>On <a href="http://www.nabble.com/Hadoop-lucene-users-f17067.html" target="_blank">the Hadoop user mailing list</a> I have seen users from China asking how to work with HBase, and noticed that HBase ships without an example, so it seems worth sharing my experience.<br /> In the example below we parse Apache logs and write the analysis into an HBase table: the client IP becomes the row key, and the access time, request method, protocol, browser, response status and so on from each log line become columns.<br /> <br /> First we create the HBase table that will store the data.<div><div id="code0"><pre>public static void creatTable(String table) throws IOException{
    HConnection conn = HConnectionManager.getConnection(conf);
    HBaseAdmin admin = new HBaseAdmin(conf);
    if(!admin.tableExists(new Text(table))){
        System.out.println("1. " + table + " table creating ... please wait");
        HTableDescriptor tableDesc = new HTableDescriptor(table);
        tableDesc.addFamily(new HColumnDescriptor("http:"));
        tableDesc.addFamily(new HColumnDescriptor("url:"));
        tableDesc.addFamily(new HColumnDescriptor("referrer:"));
        admin.createTable(tableDesc);
    } else {
        System.out.println("1. " + table + " table already exists.");
    }
    System.out.println("2. access_log files fetching using map/reduce");
}</pre></div></div><br /> Then we run a MapReduce job to fetch every line of the log. Since we only extract data and do not need to reduce the results, a Map class is all we have to write.<br /> <div><div id="code1"><pre>public static class MapClass extends MapReduceBase implements
        Mapper&lt;WritableComparable, Text, Text, Writable&gt; {

    @Override
    public void configure(JobConf job) {
        tableName = job.get(TABLE, "");
    }

    public void map(WritableComparable key, Text value,
            OutputCollector&lt;Text, Writable&gt; output, Reporter reporter)
            throws IOException {
        try {
            AccessLogParser log = new AccessLogParser(value.toString());
            if(table==null)
                table = new HTable(conf, new Text(tableName));
            long lockId = table.startUpdate(new Text(log.getIp()));
            table.put(lockId, new Text("http:protocol"), log.getProtocol().getBytes());
            table.put(lockId, new Text("http:method"), log.getMethod().getBytes());
            table.put(lockId, new Text("http:code"), log.getCode().getBytes());
            table.put(lockId, new Text("http:bytesize"), log.getByteSize().getBytes());
            table.put(lockId, new Text("http:agent"), log.getAgent().getBytes());
            table.put(lockId, new Text("url:" + log.getUrl()), log.getReferrer().getBytes());
            table.put(lockId, new Text("referrer:" + log.getReferrer()), log.getUrl().getBytes());

            table.commit(lockId, log.getTimestamp());
        } catch (ParseException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}</pre></div></div><br /> For each incoming line, the Map method hands the text to AccessLogParser, whose constructor matches the line against the regular expression "([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] \"([^\"]*)\" ([^ ]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\".*". The fields AccessLogParser extracts are then written into the HBase table, and with that the program is complete. To launch a MapReduce job, we still have to configure it.<br /> <div><div id="code2"><pre>public static void runMapReduce(String table,String dir) throws IOException{
    Path tempDir = new Path("log/temp");
    Path InputDir = new Path(dir);
    FileSystem fs = FileSystem.get(conf);
    JobConf jobConf = new JobConf(conf, LogFetcher.class);
    jobConf.setJobName("apache log fetcher");
    jobConf.set(TABLE, table);
    Path[] in = fs.listPaths(InputDir);
    if (fs.isFile(InputDir)) {
        jobConf.setInputPath(InputDir);
    } else {
        for (int i = 0; i &lt; in.length; i++) {
            if (fs.isFile(in[i])) {
                jobConf.addInputPath(in[i]);
            } else {
                Path[] sub = fs.listPaths(in[i]);
                for (int j = 0; j &lt; sub.length; j++) {
                    if (fs.isFile(sub[j])) {
                        jobConf.addInputPath(sub[j]);
                    }
                }
            }
        }
    }
    jobConf.setOutputPath(tempDir);
    jobConf.setMapperClass(MapClass.class);

    JobClient client = new JobClient(jobConf);
    ClusterStatus cluster = client.getClusterStatus();
    jobConf.setNumMapTasks(cluster.getMapTasks());
    jobConf.setNumReduceTasks(0);

    JobClient.runJob(jobConf);
    fs.delete(tempDir);
    fs.close();
}</pre></div></div><br /> 
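The log-parsing step can be sanity-checked on its own, with no Hadoop or HBase on the classpath, before the job is wired up. A minimal sketch with the plain JDK; the class name ApacheLogRegexDemo and the sample log line are illustrative assumptions, not part of the original example, and the `]` inside the timestamp class is escaped for Java's Pattern syntax:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ApacheLogRegexDemo {
    // Apache combined-log-format regex, same shape as the one AccessLogParser uses:
    // ip, identity, user, [time], "request", status, bytes, "referrer", "agent"
    public static final Pattern LOG = Pattern.compile(
            "([^ ]*) ([^ ]*) ([^ ]*) \\[([^\\]]*)\\] \"([^\"]*)\" " +
            "([^ ]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\".*");

    public static void main(String[] args) {
        // A made-up sample line for illustration
        String line = "127.0.0.1 - - [22/Feb/2013:14:12:00 +0800] "
                + "\"GET /index.html HTTP/1.1\" 200 2326 "
                + "\"http://example.com/\" \"Mozilla/5.0\"";
        Matcher m = LOG.matcher(line);
        if (m.matches()) {
            // group(1) is what the example uses as the HBase row key
            System.out.println("ip       = " + m.group(1));
            System.out.println("request  = " + m.group(5));
            System.out.println("status   = " + m.group(6));
            System.out.println("referrer = " + m.group(8));
        }
    }
}
```

Checking the groups this way makes it easy to see which capture feeds which HBase column before any cluster is involved.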
In runMapReduce above we first create a JobConf object, set the input and output paths, tell MapReduce which Map class to use and how many map and reduce tasks to run, and then submit the job through JobClient. More detailed material on MapReduce is on the <a href="http://wiki.apache.org/hadoop/HadoopMapReduce" target="_blank">Hadoop Wiki</a>.<br /> Download: the source plus a prebuilt jar, <a href="http://www.hadoop.org.cn/wp-content/uploads/2008/03/example-src.tgz" target="_blank">example-src.tgz</a>.<br /> The example is run with:<br /> <br /> bin/hadoop jar examples.jar logfetcher &lt;access_log file or  directory&gt; &lt;table_name&gt;<br /> <br /> How do you run the application? Assume the Hadoop distribution is unpacked at %HADOOP%.<br /> Copy the files under %HADOOP%\contrib\hbase\bin to %HADOOP%\bin, the files under %HADOOP%\contrib\hbase\conf to %HADOOP%\conf, the files under %HADOOP%\src\contrib\hbase\lib to %HADOOP%\lib, and %HADOOP%\src\contrib\hbase\hadoop-*-hbase.jar to %HADOOP%\lib. Then edit the configuration file hbase-site.xml and set your hbase.master, for example 192.168.2.92:60000. Distribute these files to the machines running Hadoop and add their addresses to the regionservers file. Run bin/start-hbase.sh to start HBase, copy your Apache log files into the apache-log directory on HDFS, and once startup completes run the following command.<br /> <br /> bin/hadoop jar examples.jar logfetcher apache-log apache<br /> <br /> Visit <a href="http://localhost:50030/" target="_blank">http://localhost:50030/</a> to watch the MapReduce job, and <a href="http://localhost:60010/" target="_blank">http://localhost:60010/</a> to watch HBase.<br /> <br />  <img src="http://hadoop.hadoopor.com/attachment.php?aid=MTI5fDFkMTUyYjRkfDEyNzIzNTE1NDh8YmRjMmRnWmFZc1R4Nm13SnliaWRGTzNjbzltdWdBdEJWWUdSN3ZBbEJ3VFFYYVU%3D&amp;noupdate=yes" id="aimg_129" alt="hbaseguiinterface.jpg" height="474" width="500" />  <br /> <br /> Once the MapReduce job has finished, visit <a href="http://localhost:60010/hql.jsp" target="_blank">http://localhost:60010/hql.jsp</a> and enter SELECT * FROM apache limit=50; in the Query box to see the rows that were inserted into the table.  <img 
src="http://hadoop.hadoopor.com/attachment.php?aid=MTI4fDBhMWJkZmQ5fDEyNzIzNTE1NDh8YmRjMmRnWmFZc1R4Nm13SnliaWRGTzNjbzltdWdBdEJWWUdSN3ZBbEJ3VFFYYVU%3D&amp;noupdate=yes" id="aimg_128" alt="hqlguiinterface.jpg" height="474" width="500" /></div><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-02-22 14:12 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395572.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Hadoop study notes: remote-debugging Hadoop from Eclipse</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395570.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Fri, 22 Feb 2013 06:06:00 GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395570.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/395570.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395570.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/395570.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/395570.html</trackback:ping><description><![CDATA[<div>http://www.blogjava.net/yongboy/archive/2012/04/26/376486.html</div><br /><div><p><strong>The plugin</strong></p> <p>Hadoop 1.0.2/src/contrib/eclipse-plugin ships only the plugin's source code, so here is a prebuilt Eclipse plugin I packaged:<br /><a href="https://skydrive.live.com/redir.aspx?cid=cf7746837803bc50&amp;resid=CF7746837803BC50%211277&amp;parid=CF7746837803BC50%211274&amp;authkey=%21ACiM_IinIoEmTz8" target="_blank">download link</a></p> <p>After downloading, drop it into eclipse/dropins (eclipse/plugins also works, but dropins is lighter-weight and recommended). Restart Eclipse and Map/Reduce appears among the perspectives.</p> <p><strong>Configuration</strong></p> <p>Click the blue elephant icon and create a new Hadoop connection:</p> <p><a 
href="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/2_2.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="2" alt="2" src="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/2_thumb.png" border="0" height="403" width="644" /></a> </p> <p>Take care to fill everything in correctly: some ports were changed, as was the default user to run as.</p> <p>For the specific settings, see</p> <p>If all is well, the project area shows</p> <p><a href="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/image_2.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" alt="image" src="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/image_thumb.png" border="0" height="363" width="644" /></a> </p> <p>From here the HDFS distributed filesystem can be managed normally: uploading, deleting and so on.</p> <p>To prepare for the test below, first create a directory user/root/input2, then upload two txt files into it:</p> <p>input1.txt contains: Hello Hadoop Goodbye Hadoop </p> <p>input2.txt contains: Hello World Bye World </p> <p>With HDFS ready, the test can begin.</p> <p><strong>The Hadoop project</strong></p> <p>Create a new Map/Reduce Project and point it at the local hadoop directory.</p> <p><a href="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/1_2.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="1" alt="1" src="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/1_thumb.png" border="0" height="405" width="644" /></a></p> <p>Create a test class WordCountTest:</p><div id="gist2477347">       <div>         <div class="gist-syntax">      <div>     <table class="highlight" cellpadding="0" cellspacing="0">       <tbody><tr>         <td>           <span id="file-wordcounttest-java-L1" rel="file-wordcounttest-java-L1">1</span>           <span id="file-wordcounttest-java-L2" 
rel="file-wordcounttest-java-L2">2</span>         </td>         <td>           <pre><div 
id="file-wordcounttest-java-LC1">package com.hadoop.learn.test;</div><div id="file-wordcounttest-java-LC2">&nbsp;</div><div id="file-wordcounttest-java-LC3">import java.io.IOException;</div><div id="file-wordcounttest-java-LC4">import java.util.StringTokenizer;</div><div id="file-wordcounttest-java-LC5">&nbsp;</div><div id="file-wordcounttest-java-LC6">import org.apache.hadoop.conf.Configuration;</div><div id="file-wordcounttest-java-LC7">import org.apache.hadoop.fs.Path;</div><div id="file-wordcounttest-java-LC8">import org.apache.hadoop.io.IntWritable;</div><div id="file-wordcounttest-java-LC9">import org.apache.hadoop.io.Text;</div><div id="file-wordcounttest-java-LC10">import org.apache.hadoop.mapreduce.Job;</div><div id="file-wordcounttest-java-LC11">import org.apache.hadoop.mapreduce.Mapper;</div><div id="file-wordcounttest-java-LC12">import org.apache.hadoop.mapreduce.Reducer;</div><div id="file-wordcounttest-java-LC13">import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;</div><div id="file-wordcounttest-java-LC14">import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;</div><div id="file-wordcounttest-java-LC15">import org.apache.hadoop.util.GenericOptionsParser;</div><div id="file-wordcounttest-java-LC16">import org.apache.log4j.Logger;</div><div id="file-wordcounttest-java-LC17">&nbsp;</div><div id="file-wordcounttest-java-LC18">/**</div><div id="file-wordcounttest-java-LC19"> * Runs the word count test job.</div><div id="file-wordcounttest-java-LC20"> * </div><div id="file-wordcounttest-java-LC21"> * @author yongboy</div><div id="file-wordcounttest-java-LC22"> * @date 2012-04-16</div><div id="file-wordcounttest-java-LC23"> */</div><div id="file-wordcounttest-java-LC24">public class WordCountTest {</div><div id="file-wordcounttest-java-LC25">	private static final Logger log = Logger.getLogger(WordCountTest.class);</div><div id="file-wordcounttest-java-LC26">&nbsp;</div><div id="file-wordcounttest-java-LC27">	public static class TokenizerMapper extends</div><div 
id="file-wordcounttest-java-LC28">			Mapper&lt;Object, Text, Text, IntWritable&gt; {</div><div id="file-wordcounttest-java-LC29">		private final static IntWritable one = new IntWritable(1);</div><div id="file-wordcounttest-java-LC30">		private Text word = new Text();</div><div id="file-wordcounttest-java-LC31">&nbsp;</div><div id="file-wordcounttest-java-LC32">		public void map(Object key, Text value, Context context)</div><div id="file-wordcounttest-java-LC33">				throws IOException, InterruptedException {</div><div id="file-wordcounttest-java-LC34">			log.info("Map key : " + key);</div><div id="file-wordcounttest-java-LC35">			log.info("Map value : " + value);</div><div id="file-wordcounttest-java-LC36">			StringTokenizer itr = new StringTokenizer(value.toString());</div><div id="file-wordcounttest-java-LC37">			while (itr.hasMoreTokens()) {</div><div id="file-wordcounttest-java-LC38">				String wordStr = itr.nextToken();</div><div id="file-wordcounttest-java-LC39">				word.set(wordStr);</div><div id="file-wordcounttest-java-LC40">				log.info("Map word : " + wordStr);</div><div id="file-wordcounttest-java-LC41">				context.write(word, one);</div><div id="file-wordcounttest-java-LC42">			}</div><div id="file-wordcounttest-java-LC43">		}</div><div id="file-wordcounttest-java-LC44">	}</div><div id="file-wordcounttest-java-LC45">&nbsp;</div><div id="file-wordcounttest-java-LC46">	public static class IntSumReducer extends</div><div id="file-wordcounttest-java-LC47">			Reducer&lt;Text, IntWritable, Text, IntWritable&gt; {</div><div id="file-wordcounttest-java-LC48">		private IntWritable result = new IntWritable();</div><div id="file-wordcounttest-java-LC49">&nbsp;</div><div id="file-wordcounttest-java-LC50">		public void reduce(Text key, Iterable&lt;IntWritable&gt; values,</div><div id="file-wordcounttest-java-LC51">				Context context) throws IOException, InterruptedException {</div><div id="file-wordcounttest-java-LC52">			log.info("Reduce key : " + key);</div><div 
id="file-wordcounttest-java-LC53">			log.info("Reduce value : " + values);</div><div id="file-wordcounttest-java-LC54">			int sum = 0;</div><div id="file-wordcounttest-java-LC55">			for (IntWritable val : values) {</div><div id="file-wordcounttest-java-LC56">				sum += val.get();</div><div id="file-wordcounttest-java-LC57">			}</div><div id="file-wordcounttest-java-LC58">			result.set(sum);</div><div id="file-wordcounttest-java-LC59">			log.info("Reduce sum : " + sum);</div><div id="file-wordcounttest-java-LC60">			context.write(key, result);</div><div id="file-wordcounttest-java-LC61">		}</div><div id="file-wordcounttest-java-LC62">	}</div><div id="file-wordcounttest-java-LC63">&nbsp;</div><div id="file-wordcounttest-java-LC64">	public static void main(String[] args) throws Exception {</div><div id="file-wordcounttest-java-LC65">		Configuration conf = new Configuration();</div><div id="file-wordcounttest-java-LC66">		String[] otherArgs = new GenericOptionsParser(conf, args)</div><div id="file-wordcounttest-java-LC67">				.getRemainingArgs();</div><div id="file-wordcounttest-java-LC68">		if (otherArgs.length != 2) {</div><div id="file-wordcounttest-java-LC69">			System.err.println("Usage: WordCountTest &lt;in&gt; &lt;out&gt;");</div><div id="file-wordcounttest-java-LC70">			System.exit(2);</div><div id="file-wordcounttest-java-LC71">		}</div><div id="file-wordcounttest-java-LC72">&nbsp;</div><div id="file-wordcounttest-java-LC73">		Job job = new Job(conf, "word count");</div><div id="file-wordcounttest-java-LC74">		job.setJarByClass(WordCountTest.class);</div><div id="file-wordcounttest-java-LC75">&nbsp;</div><div id="file-wordcounttest-java-LC76">		job.setMapperClass(TokenizerMapper.class);</div><div id="file-wordcounttest-java-LC77">		job.setCombinerClass(IntSumReducer.class);</div><div id="file-wordcounttest-java-LC78">		job.setReducerClass(IntSumReducer.class);</div><div id="file-wordcounttest-java-LC79">		job.setOutputKeyClass(Text.class);</div><div 
id="file-wordcounttest-java-LC80">		job.setOutputValueClass(IntWritable.class);</div><div id="file-wordcounttest-java-LC81">&nbsp;</div><div id="file-wordcounttest-java-LC82">		FileInputFormat.addInputPath(job, new Path(otherArgs[0]));</div><div id="file-wordcounttest-java-LC83">		FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));</div><div id="file-wordcounttest-java-LC84">&nbsp;</div><div id="file-wordcounttest-java-LC85">		System.exit(job.waitForCompletion(true) ? 0 : 1);</div><div id="file-wordcounttest-java-LC86">	}</div><div id="file-wordcounttest-java-LC87">}</div></pre>         </td>       </tr>     </tbody></table>   </div>          </div>          <div>           <a href="https://gist.github.com/yongboy/2477347/raw/0f14795f27dd698c8b598613ed287a523d8aff1d/WordCountTest.java" style="float:right">view raw</a>           <a href="https://gist.github.com/yongboy/2477347#file-wordcounttest-java" style="float:right; margin-right:10px; color:#666;">WordCountTest.java</a>           <a href="https://gist.github.com/yongboy/2477347">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.         </div>       </div> </div>  <p>Right-click the class and choose &#8220;Run Configurations&#8221;. In the dialog that pops up, open the &#8220;Arguments&#8221; tab and enter the program arguments in advance under &#8220;Program arguments&#8221;:</p> <blockquote> <p>hdfs://master:9000/user/root/input2 hdfs://master:9000/user/root/output2</p></blockquote> <p>Note: these arguments are for local debugging only, not for a real environment.</p> <p>Then click &#8220;Apply&#8221; and &#8220;Close&#8221;. Now you can right-click the class, choose &#8220;Run on Hadoop&#8221;, and run it.</p> <p>At this point, however, an exception like the following appears:</p> <blockquote> <p>12/04/24  15:32:44 WARN util.NativeCodeLoader: Unable to load native-hadoop  library for your platform... 
using builtin-java classes where applicable<br />12/04/24  15:32:44 ERROR security.UserGroupInformation:  PriviledgedActionException as:Administrator cause:java.io.IOException:  Failed to set permissions of path:  \tmp\hadoop-Administrator\mapred\staging\Administrator-519341271\.staging  to 0700<br />Exception in thread "main" java.io.IOException: Failed to  set permissions of path:  \tmp\hadoop-Administrator\mapred\staging\Administrator-519341271\.staging  to 0700<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:682)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:655)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)<br />&nbsp;&nbsp;&nbsp; at java.security.AccessController.doPrivileged(Native Method)<br />&nbsp;&nbsp;&nbsp; at javax.security.auth.Subject.doAs(Subject.java:396)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)<br />&nbsp;&nbsp;&nbsp; at com.hadoop.learn.test.WordCountTest.main(WordCountTest.java:85)</p></blockquote> <p>This is a file-permission problem on Windows; on Linux the same program runs fine and the problem does not occur.</p> 
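<p>For illustration, the root cause is that the java.io.File permission setters report failure through a boolean return value, and Hadoop's FileUtil converts a false return into the IOException above. Below is a minimal standalone sketch of that pattern; the class and helper are hypothetical simplifications for this post, not Hadoop's actual code:</p>

```java
import java.io.File;
import java.io.IOException;

public class PermissionCheckDemo {
    // Hypothetical, simplified stand-in for Hadoop's FileUtil.checkReturnValue:
    // converts a boolean result from a java.io.File permission call into an
    // IOException when the platform could not apply the permission.
    static void checkReturnValue(boolean rv, File p, String permission) throws IOException {
        if (!rv) {
            throw new IOException("Failed to set permissions of path: " + p + " to " + permission);
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("perm-demo", ".tmp");
        tmp.deleteOnExit();
        // On Windows with Hadoop 1.x, permission calls like this could return
        // false for POSIX-style modes, which is what triggers the exception.
        boolean rv = tmp.setWritable(true, true);
        checkReturnValue(rv, tmp, "0700");
        System.out.println("permissions applied to " + tmp.getName());
    }
}
```

This is why commenting out the check (as shown next) silences the error on Windows: the underlying permission call still fails, but the false return is no longer escalated.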
<p>The fix is to modify checkReturnValue in /hadoop-1.0.2/src/core/org/apache/hadoop/fs/FileUtil.java and simply comment out its body (somewhat crude, but on Windows the check can be skipped):</p><div id="gist2477544">       <div>         <div>      <div>     <table cellpadding="0" cellspacing="0">       <tbody><tr>         <td>           <pre><div id="file-fileutil-java-LC1">......</div><div id="file-fileutil-java-LC2">  private static void checkReturnValue(boolean rv, File p, </div><div id="file-fileutil-java-LC3">                                       FsPermission permission</div><div id="file-fileutil-java-LC4">                                       ) throws IOException {</div><div id="file-fileutil-java-LC5">    /**</div><div id="file-fileutil-java-LC6">	if (!rv) {</div><div id="file-fileutil-java-LC7">      throw new IOException("Failed to set permissions of path: " + p + </div><div id="file-fileutil-java-LC8">                            " to " + </div><div id="file-fileutil-java-LC9">                            String.format("%04o", permission.toShort()));</div><div id="file-fileutil-java-LC10">    }</div><div id="file-fileutil-java-LC11">	**/</div><div id="file-fileutil-java-LC12">  }</div><div id="file-fileutil-java-LC13">......</div></pre>         </td>       </tr>     </tbody></table>   </div>          </div>          <div>           <a href="https://gist.github.com/yongboy/2477544/raw/3c94505a05fe26f6c27375d854b831109517501f/FileUtil.java" style="float:right">view raw</a>           <a href="https://gist.github.com/yongboy/2477544#file-fileutil-java" style="float:right; margin-right:10px; color:#666;">FileUtil.java</a>           <a href="https://gist.github.com/yongboy/2477544">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.         </div>       </div> </div>  <p>Recompile and repackage hadoop-core-1.0.2.jar, then replace the hadoop-core-1.0.2.jar in the hadoop-1.0.2 root directory with it.</p> <p>A modified <a href="https://skydrive.live.com/redir.aspx?cid=cf7746837803bc50&amp;resid=CF7746837803BC50%211276&amp;parid=CF7746837803BC50%211274&amp;authkey=%21AJCcrNRX9RCF6FA" target="_blank">hadoop-core-1.0.2-modified.jar</a> is provided here; just replace the original hadoop-core-1.0.2.jar with it.</p> <p>After replacing it, refresh the project, set up the correct jar dependencies, and run WordCountTest again.</p> <p>On success, refresh the HDFS directory in Eclipse and you can see that the output2 directory has been generated:</p> <p><a href="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/image_4.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" alt="image" src="http://www.blogjava.net/images/blogjava_net/yongboy/WindowsLiveWriter/hadoopEclipseHadoop_EEE6/image_thumb_1.png" border="0" height="64" width="220" /></a> </p> <p>Open the &#8220;part-r-00000&#8221; file to see the sorted result:</p> <blockquote> <p>Bye&nbsp;&nbsp;&nbsp; 1<br />Goodbye&nbsp;&nbsp;&nbsp; 1<br />Hadoop&nbsp;&nbsp;&nbsp; 2<br />Hello&nbsp;&nbsp;&nbsp; 2<br />World&nbsp;&nbsp;&nbsp; 2</p></blockquote> 
<p>The program can be debugged in the same way: set breakpoints, then right-click -&gt; Debug As -&gt; Java Application (before each run, the output directory must be deleted manually).</p> <p>In addition, the plugin automatically generates a jar file and other files, including some of the concrete Hadoop configuration, under the Eclipse workspace\.metadata\.plugins\org.apache.hadoop.eclipse directory.</p> <p>More details can be explored over time.</p> <p><strong>An exception encountered</strong></p> <blockquote> <p>org.apache.hadoop.ipc.RemoteException:  org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create  directory /user/root/output2/_temporary. Name node is in safe mode.<br />The ratio of reported blocks 0.5000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2055)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2029)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:817)<br />&nbsp;&nbsp;&nbsp; at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)<br />&nbsp;&nbsp;&nbsp; at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)<br />&nbsp;&nbsp;&nbsp; at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)<br />&nbsp;&nbsp;&nbsp; at java.lang.reflect.Method.invoke(Method.java:597)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)<br />&nbsp;&nbsp;&nbsp; at java.security.AccessController.doPrivileged(Native Method)<br />&nbsp;&nbsp;&nbsp; at javax.security.auth.Subject.doAs(Subject.java:396)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)<br />&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)</p></blockquote> <p>On the master node, leave safe mode: </p> <blockquote> <p># bin/hadoop dfsadmin -safemode leave </p></blockquote> <p><strong>How to package</strong></p> <p>Packaging the created Map/Reduce project into a jar is simple enough to need little explanation. Just make sure the jar's META-INF/MANIFEST.MF contains a Main-Class entry:</p> <blockquote> <p>Main-Class: com.hadoop.learn.test.TestDriver</p></blockquote> <p>If third-party jars are used, add a Class-Path entry to MANIFEST.MF as well.</p> <p>The plugin also provides a MapReduce Driver wizard, which helps us run jobs on Hadoop under a given alias; this is especially useful when a project contains several Map/Reduce jobs.</p> <p>A MapReduce Driver only needs a main method that registers each job under an alias:</p><div id="gist2498401">       <div>         <div>      <div>     <table cellpadding="0" cellspacing="0">       <tbody><tr>         <td>           <pre><div id="file-testdriver-java-LC1">package com.hadoop.learn.test;</div><div id="file-testdriver-java-LC2">&nbsp;</div><div id="file-testdriver-java-LC3">import org.apache.hadoop.util.ProgramDriver;</div><div id="file-testdriver-java-LC4">&nbsp;</div><div id="file-testdriver-java-LC5">/**</div><div id="file-testdriver-java-LC6"> * </div><div id="file-testdriver-java-LC7"> * @author yongboy</div><div id="file-testdriver-java-LC8"> * @time 2012-4-24</div><div id="file-testdriver-java-LC9"> * @version 1.0</div><div id="file-testdriver-java-LC10"> */</div><div id="file-testdriver-java-LC11">public class TestDriver {</div><div id="file-testdriver-java-LC12">&nbsp;</div><div id="file-testdriver-java-LC13">	public static void main(String[] args) {</div><div id="file-testdriver-java-LC14">		int exitCode = -1;</div><div id="file-testdriver-java-LC15">		ProgramDriver pgd = new ProgramDriver();</div><div id="file-testdriver-java-LC16">		try {</div><div 
id="file-testdriver-java-LC17">			pgd.addClass("testcount", WordCountTest.class,</div><div id="file-testdriver-java-LC18">					"A test map/reduce program that counts the words in the input files.");</div><div id="file-testdriver-java-LC19">			pgd.driver(args);</div><div id="file-testdriver-java-LC20">&nbsp;</div><div id="file-testdriver-java-LC21">			exitCode = 0;</div><div id="file-testdriver-java-LC22">		} catch (Throwable e) {</div><div id="file-testdriver-java-LC23">			e.printStackTrace();</div><div id="file-testdriver-java-LC24">		}</div><div id="file-testdriver-java-LC25">&nbsp;</div><div id="file-testdriver-java-LC26">		System.exit(exitCode);</div><div id="file-testdriver-java-LC27">	}</div><div id="file-testdriver-java-LC28">}</div></pre>         </td>       </tr>     </tbody></table>   </div>          </div>          <div>           <a href="https://gist.github.com/yongboy/2498401/raw/83b1932a9f17ca5c4e7c20c54a847aae08ded175/TestDriver.java" style="float:right">view raw</a>           <a href="https://gist.github.com/yongboy/2498401#file-testdriver-java" style="float:right; margin-right:10px; color:#666;">TestDriver.java</a>           <a href="https://gist.github.com/yongboy/2498401">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.         
</div>       </div> </div>  <p>One small trick: right-click the MapReduce Driver class and choose Run on Hadoop; a jar is generated automatically under the Eclipse workspace\.metadata\.plugins\org.apache.hadoop.eclipse directory. Upload it to HDFS, or to the remote hadoop root directory, and run it:</p> <blockquote> <p># bin/hadoop jar LearnHadoop_TestDriver.java-460881982912511899.jar testcount input2 output3</p></blockquote> <p>OK, that concludes this article.</p></div><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-02-22 14:06 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395570.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Hadoop Job Submission Analysis (Part 5)</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395569.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Fri, 22 Feb 2013 06:05:00 GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395569.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/395569.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395569.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/395569.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/395569.html</trackback:ping><description><![CDATA[<div>http://www.cnblogs.com/spork/archive/2010/04/21/1717592.html</div><br /><div><div id="cnblogs_post_body"><p>　　From the previous article's analysis, we know that whether a Hadoop job is submitted to the Cluster or run Locally is closely tied to the configuration files in the conf folder; moreover, many other classes also depend on conf, so remember to put conf on your classpath when submitting a job.</p> <p>　　Because Configuration loads resources and files through the class loader of the current thread context, we take a dynamic-loading approach here: first add the required dependency libraries and resources, then build a URLClassLoader and install it as the current thread's context class loader.</p> <div> <pre><div> <span style="color: #0000ff;">public</span> <span 
style="color: #0000ff;">static</span><span style="color: #000000;"> ClassLoader getClassLoader() {<br />        ClassLoader parent </span><span style="color: #000000;">=</span><span style="color: #000000;"> Thread.currentThread().getContextClassLoader();<br />        </span><span style="color: #0000ff;">if</span><span style="color: #000000;"> (parent </span><span style="color: #000000;">==</span> <span style="color: #0000ff;">null</span><span style="color: #000000;">) {<br />            parent </span><span style="color: #000000;">=</span><span style="color: #000000;"> EJob.</span><span style="color: #0000ff;">class</span><span style="color: #000000;">.getClassLoader();<br />        }<br />        </span><span style="color: #0000ff;">if</span><span style="color: #000000;"> (parent </span><span style="color: #000000;">==</span> <span style="color: #0000ff;">null</span><span style="color: #000000;">) {<br />            parent </span><span style="color: #000000;">=</span><span style="color: #000000;"> ClassLoader.getSystemClassLoader();<br />        }<br />        </span><span style="color: #0000ff;">return</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> URLClassLoader(classPath.toArray(</span><span style="color: #0000ff;">new</span><span style="color: #000000;"> URL[</span><span style="color: #000000;">0</span><span style="color: #000000;">]), parent);<br />    }</span></div></pre> <div><a title="复制代码"><img src="http://common.cnblogs.com/images/copycode.gif" alt="复制代码" /></a></div></div> <p>　　代码很简单，废话就不多说了。调用例子如下：</p> <div> <pre><div><span style="color: #000000;">   EJob.addClasspath(</span><span style="color: #000000;">"</span><span style="color: #000000;">/usr/lib/hadoop-0.20/conf</span><span style="color: #000000;">"</span><span style="color: #000000;">);<br />   ClassLoader classLoader </span><span style="color: #000000;">=</span><span style="color: #000000;"> EJob.getClassLoader();<br />   
Thread.currentThread().setContextClassLoader(classLoader);</span></div></pre> </div> <p>　　设置好了类加载器，下面还有一步就是要打包Jar文件，就是让Project自打包自己的class为一个Jar包，我这里以标准Eclipse工程文件夹布局为例，打包的就是bin文件夹里的class。</p> <div><div><a title="复制代码"><img src="http://common.cnblogs.com/images/copycode.gif" alt="复制代码" /></a></div> <div id="cnblogs_code_open_e5d79072-c219-44a1-9025-8122614a7142"> <pre><div><span style="color: #0000ff;">    public</span> <span style="color: #0000ff;">static</span><span style="color: #000000;"> File createTempJar(String root) </span><span style="color: #0000ff;">throws</span><span style="color: #000000;"> IOException {<br />        </span><span style="color: #0000ff;">if</span><span style="color: #000000;"> (</span><span style="color: #000000;">!</span><span style="color: #0000ff;">new</span><span style="color: #000000;"> File(root).exists()) {<br />            </span><span style="color: #0000ff;">return</span> <span style="color: #0000ff;">null</span><span style="color: #000000;">;<br />        }<br />        Manifest manifest </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> Manifest();<br />        manifest.getMainAttributes().putValue(</span><span style="color: #000000;">"</span><span style="color: #000000;">Manifest-Version</span><span style="color: #000000;">"</span><span style="color: #000000;">, </span><span style="color: #000000;">"</span><span style="color: #000000;">1.0</span><span style="color: #000000;">"</span><span style="color: #000000;">);<br />        </span><span style="color: #0000ff;">final</span><span style="color: #000000;"> File jarFile </span><span style="color: #000000;">=</span><span style="color: #000000;"> File.createTempFile(</span><span style="color: #000000;">"</span><span style="color: #000000;">EJob-</span><span style="color: #000000;">"</span><span style="color: #000000;">, </span><span style="color: #000000;">"</span><span style="color: #000000;">.jar</span><span 
style="color: #000000;">"</span><span style="color: #000000;">, </span><span style="color: #0000ff;">new</span><span style="color: #000000;"> File(System<br />                .getProperty(</span><span style="color: #000000;">"</span><span style="color: #000000;">java.io.tmpdir</span><span style="color: #000000;">"</span><span style="color: #000000;">)));<br /><br />        Runtime.getRuntime().addShutdownHook(</span><span style="color: #0000ff;">new</span><span style="color: #000000;"> Thread() {<br />            </span><span style="color: #0000ff;">public</span> <span style="color: #0000ff;">void</span><span style="color: #000000;"> run() {<br />                jarFile.delete();<br />            }<br />        });<br /><br />        JarOutputStream out </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> JarOutputStream(</span><span style="color: #0000ff;">new</span><span style="color: #000000;"> FileOutputStream(jarFile),<br />                manifest);<br />        createTempJarInner(out, </span><span style="color: #0000ff;">new</span><span style="color: #000000;"> File(root), </span><span style="color: #000000;">""</span><span style="color: #000000;">);<br />        out.flush();<br />        out.close();<br />        </span><span style="color: #0000ff;">return</span><span style="color: #000000;"> jarFile;<br />    }<br /><br />    </span><span style="color: #0000ff;">private</span> <span style="color: #0000ff;">static</span> <span style="color: #0000ff;">void</span><span style="color: #000000;"> createTempJarInner(JarOutputStream out, File f,<br />            String base) </span><span style="color: #0000ff;">throws</span><span style="color: #000000;"> IOException {<br />        </span><span style="color: #0000ff;">if</span><span style="color: #000000;"> (f.isDirectory()) {<br />            File[] fl </span><span style="color: #000000;">=</span><span style="color: #000000;"> f.listFiles();<br />     
       </span><span style="color: #0000ff;">if</span><span style="color: #000000;"> (base.length() </span><span style="color: #000000;">&gt;</span> <span style="color: #000000;">0</span><span style="color: #000000;">) {<br />                base </span><span style="color: #000000;">=</span><span style="color: #000000;"> base </span><span style="color: #000000;">+</span> <span style="color: #000000;">"</span><span style="color: #000000;">/</span><span style="color: #000000;">"</span><span style="color: #000000;">;<br />            }<br />            </span><span style="color: #0000ff;">for</span><span style="color: #000000;"> (</span><span style="color: #0000ff;">int</span><span style="color: #000000;"> i </span><span style="color: #000000;">=</span> <span style="color: #000000;">0</span><span style="color: #000000;">; i </span><span style="color: #000000;">&lt;</span><span style="color: #000000;"> fl.length; i</span><span style="color: #000000;">++</span><span style="color: #000000;">) {<br />                createTempJarInner(out, fl[i], base </span><span style="color: #000000;">+</span><span style="color: #000000;"> fl[i].getName());<br />            }<br />        } </span><span style="color: #0000ff;">else</span><span style="color: #000000;"> {<br />            out.putNextEntry(</span><span style="color: #0000ff;">new</span><span style="color: #000000;"> JarEntry(base));<br />            FileInputStream in </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> FileInputStream(f);<br />            </span><span style="color: #0000ff;">byte</span><span style="color: #000000;">[] buffer </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span> <span style="color: #0000ff;">byte</span><span style="color: #000000;">[</span><span style="color: #000000;">1024</span><span style="color: #000000;">];<br />            </span><span style="color: #0000ff;">int</span><span style="color: 
#000000;"> n </span><span style="color: #000000;">=</span><span style="color: #000000;"> in.read(buffer);<br />            </span><span style="color: #0000ff;">while</span><span style="color: #000000;"> (n </span><span style="color: #000000;">!=</span> <span style="color: #000000;">-</span><span style="color: #000000;">1</span><span style="color: #000000;">) {<br />                out.write(buffer, </span><span style="color: #000000;">0</span><span style="color: #000000;">, n);<br />                n </span><span style="color: #000000;">=</span><span style="color: #000000;"> in.read(buffer);<br />            }<br />            in.close();<br />        }<br />    }</span></div></pre> </div> <div><a title="复制代码"><img src="http://common.cnblogs.com/images/copycode.gif" alt="复制代码" /></a></div></div> <p>　　这里的对外接口是createTempJar，接收参数为需要打包的文件夹根路径，支持子文件夹打包。使用递归处理法，依次把文件夹里的结构和 文件打包到Jar里。很简单，就是基本的文件流操作，陌生一点的就是Manifest和JarOutputStream，查查API就明了。</p> <p>　　好，万事具备，只欠东风了，我们来实践一下试试。还是拿WordCount来举例：</p> <div><div><a title="复制代码"><img src="http://common.cnblogs.com/images/copycode.gif" alt="复制代码" /></a></div> <div id="cnblogs_code_open_fc92678b-77e8-420b-abab-e1994c50d7ac"> <pre><div> <span style="color: #008000;">//</span><span style="color: #008000;"> Add these statements. 
XXX</span><span style="color: #008000;"><br /></span><span style="color: #000000;">        <span style="color: #3366ff;">File jarFile </span></span><span style="color: #3366ff;">= EJob.createTempJar("bin");<br />        EJob.addClasspath("/usr/lib/hadoop-0.20/conf");<br />        ClassLoader classLoader =</span><span style="color: #000000;"><span style="color: #3366ff;"> EJob.getClassLoader();<br />        Thread.currentThread().setContextClassLoader(classLoader);</span><br /><br />        Configuration conf </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> Configuration();<br />        String[] otherArgs </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> GenericOptionsParser(conf, args)<br />                .getRemainingArgs();<br />        </span><span style="color: #0000ff;">if</span><span style="color: #000000;"> (otherArgs.length </span><span style="color: #000000;">!=</span> <span style="color: #000000;">2</span><span style="color: #000000;">) {<br />            System.err.println(</span><span style="color: #000000;">"</span><span style="color: #000000;">Usage: wordcount &lt;in&gt; &lt;out&gt;</span><span style="color: #000000;">"</span><span style="color: #000000;">);<br />            System.exit(</span><span style="color: #000000;">2</span><span style="color: #000000;">);<br />        }<br /><br />        Job job </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> Job(conf, </span><span style="color: #000000;">"</span><span style="color: #000000;">word count</span><span style="color: #000000;">"</span><span style="color: #000000;">);</span><span style="color: #000000;"><br />        job.setJarByClass(WordCountTest.</span><span style="color: #0000ff;">class</span><span style="color: #000000;">);<br />        job.setMapperClass(TokenizerMapper.</span><span 
style="color: #0000ff;">class</span><span style="color: #000000;">);<br />        job.setCombinerClass(IntSumReducer.</span><span style="color: #0000ff;">class</span><span style="color: #000000;">);<br />        job.setReducerClass(IntSumReducer.</span><span style="color: #0000ff;">class</span><span style="color: #000000;">);<br />        job.setOutputKeyClass(Text.</span><span style="color: #0000ff;">class</span><span style="color: #000000;">);<br />        job.setOutputValueClass(IntWritable.</span><span style="color: #0000ff;">class</span><span style="color: #000000;">);<br />        FileInputFormat.addInputPath(job, </span><span style="color: #0000ff;">new</span><span style="color: #000000;"> Path(otherArgs[</span><span style="color: #000000;">0</span><span style="color: #000000;">]));<br />        FileOutputFormat.setOutputPath(job, </span><span style="color: #0000ff;">new</span><span style="color: #000000;"> Path(otherArgs[</span><span style="color: #000000;">1</span><span style="color: #000000;">]));<br />        System.exit(job.waitForCompletion(</span><span style="color: #0000ff;">true</span><span style="color: #000000;">) </span><span style="color: #000000;">?</span> <span style="color: #000000;">0</span><span style="color: #000000;"> : </span><span style="color: #000000;">1</span><span style="color: #000000;">);</span></div></pre> </div></div> <p>　　Run as Java Application... and we get a <span style="color: #ff0000;">No job jar file set</span> exception: apparently job.setJarByClass(WordCountTest.class) failed to set the job's Jar. Why is that?</p> <p>Because that method uses the class loader of WordCount.class to locate the Jar containing the class, and then sets that Jar for the job. Our job Jar, however, is only built at runtime, and the class loader of WordCount.class is the AppClassLoader, whose search path cannot be changed once the program is running, so setJarByClass cannot set the job Jar. We must set it directly with JobConf's setJar, like this: </p> <div> <pre><div><span 
style="color: #000000;">((JobConf)job.getConfiguration()).setJar(jarFile.toString());</span></div></pre> </div> <p> 　　Now let's modify the example above and add this statement (note that setJar takes a String, hence the toString()).</p> <div> <pre><div><span style="color: #000000;">Job job </span><span style="color: #000000;">=</span> <span style="color: #0000ff;">new</span><span style="color: #000000;"> Job(conf, </span><span style="color: #000000;">"</span><span style="color: #000000;">word count</span><span style="color: #000000;">"</span><span style="color: #000000;">);<br /></span><span style="color: #008000;">//</span><span style="color: #008000;"> And add this statement. XXX</span><span style="color: #008000;"><br /></span><span style="color: #3366ff;">((JobConf) job.getConfiguration()).setJar(jarFile.toString());</span></div></pre> </div> <p> 　　Run as Java Application again, and it finally works.</p> <p>　　This Run-on-Hadoop approach is simple to use and broadly compatible; give it a try. :)</p> <p>　　Due to time constraints this example was only tested in pseudo-distributed mode on Ubuntu, but in principle it applies to a real distributed cluster as well.</p> <p>&nbsp;&nbsp;&nbsp;&nbsp; <a href="http://files.cnblogs.com/spork/jobutil_2.rar">&gt;&gt;Download&lt;&lt;</a></p> <p>&nbsp;</p> <p>　　The end.</p></div></div><br /><br /><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-02-22 14:05 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395569.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Fixing the HBase MapReduce job exception in a pure Windows environment, no Cygwin needed</title><link>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395568.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Fri, 22 Feb 2013 06:03:00 
GMT</pubDate><guid>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395568.html</guid><wfw:comment>http://www.blogjava.net/wangxinsh55/comments/395568.html</wfw:comment><comments>http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395568.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/wangxinsh55/comments/commentRss/395568.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/wangxinsh55/services/trackbacks/395568.html</trackback:ping><description><![CDATA[<div>http://www.blogjava.net/paulwong/archive/2012/10/03/388977.html</div><br /><div>If you run an HBase MapReduce job from Eclipse on Windows, it throws an exception: by default the MapReduce task runs locally, and because files are created and given permissions the UNIX way, it fails with:<br /><br /><div style="padding: 4px 5px 4px 4px; border: 1px solid #cccccc; width: 98%; font-size: 13px; word-break: break-all; background-color: #eeeeee;"><span style="color: #000000;">java.lang.RuntimeException: Error </span><span style="color: #0000ff;">while</span><span style="color: #000000;"> running command to get file permissions : java.io.IOException: Cannot run program "ls": CreateProcess error=2, </span></div><br /><br />The fix is to send the job to a remote host, typically one running Linux, by adding this to hbase-site.xml:<br /><br /><div style="padding: 4px 5px 4px 4px; border: 1px solid #cccccc; width: 98%; font-size: 13px; word-break: break-all; background-color: #eeeeee;">&lt;property&gt;<br />&nbsp;&nbsp;&nbsp; &lt;name&gt;mapred.job.tracker&lt;/name&gt;<br />&nbsp;&nbsp;&nbsp; &lt;value&gt;master:9001&lt;/value&gt;<br />&lt;/property&gt;</div><br />You also need to turn off HDFS permission checking:<br /><br /><div style="padding: 4px 5px 4px 4px; border: 1px solid #cccccc; width: 98%; font-size: 13px; word-break: break-all; background-color: #eeeeee;">&lt;property&gt;<br />&nbsp;&nbsp;&nbsp; &lt;name&gt;dfs.permissions&lt;/name&gt;<br />&nbsp;&nbsp;&nbsp; &lt;value&gt;false&lt;/value&gt;<br />&lt;/property&gt;</div><br /><br />Also, because the job executes on the remote host, custom classes such as the Mapper/Reducer must be packaged into a jar and uploaded; for details see:<br />Hadoop job submission analysis (part 5): <a href="http://www.cnblogs.com/spork/archive/2010/04/21/1717592.html" target="_blank">http://www.cnblogs.com/spork/archive/2010/04/21/1717592.html</a><br /><br />After several days of digging this finally became clear: the Configuration is the job's configuration, and the remote JobTracker builds and runs the job from it. Since the remote host does not have the custom MapReduce classes, they must be packed into a jar and uploaded; this need not be done by hand every time, as it can be set in code:<br /><br /><div style="padding: 4px 5px 4px 4px; border: 1px solid #cccccc; width: 98%; font-size: 13px; word-break: break-all; background-color: #eeeeee;">conf.set("tmpjars", "d:/aaa.jar");</div><br /><br />Note also that on Windows the path separator is ";", so the generated jar list is ";"-separated, which the remote Linux host cannot parse; change it to ":" with:<br /><br /><div style="padding: 4px 5px 4px 4px; border: 1px solid #cccccc; width: 98%; font-size: 13px; word-break: break-all; background-color: #eeeeee;">System.setProperty("path.separator", ":");</div><br /><br />Reference:<br /><a href="http://www.cnblogs.com/xia520pi/archive/2012/05/20/2510723.html" target="_blank">http://www.cnblogs.com/xia520pi/archive/2012/05/20/2510723.html</a><br /><br /><br />Submitting a Job with the hadoop eclipse plugin and adding multiple third-party jars (polished version):<br /><a href="http://heipark.iteye.com/blog/1171923" target="_blank">http://heipark.iteye.com/blog/1171923</a>&nbsp;</div><br /><br /><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/wangxinsh55/" target="_blank">SIMONE</a> 2013-02-22 14:03 <a href="http://www.blogjava.net/wangxinsh55/archive/2013/02/22/395568.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>