﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava - so true - Category: Hadoop</title><link>http://www.blogjava.net/bacoo/category/35981.html</link><description>Keep the future in mind, create the future!</description><language>zh-cn</language><lastBuildDate>Wed, 07 Jan 2009 23:15:44 GMT</lastBuildDate><pubDate>Wed, 07 Jan 2009 23:15:44 GMT</pubDate><ttl>60</ttl><item><title>Learning InputFormat</title><link>http://www.blogjava.net/bacoo/archive/2009/01/07/250221.html</link><dc:creator>so true</dc:creator><author>so true</author><pubDate>Wed, 07 Jan 2009 01:40:00 GMT</pubDate><guid>http://www.blogjava.net/bacoo/archive/2009/01/07/250221.html</guid><wfw:comment>http://www.blogjava.net/bacoo/comments/250221.html</wfw:comment><comments>http://www.blogjava.net/bacoo/archive/2009/01/07/250221.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/bacoo/comments/commentRss/250221.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bacoo/services/trackbacks/250221.html</trackback:ping><description><![CDATA[<p>An InputFormat exists to derive a set of splits (InputSplit[]) from a JobConf, and then to pair that set with a suitable RecordReader (obtained via getRecordReader) that reads the records inside each split.</p>
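<p>For reference, the contract being described looks roughly like this in the old org.apache.hadoop.mapred API (reconstructed from memory of 0.18-era Hadoop, so treat it as a sketch rather than the exact source):</p>
<p>public interface InputFormat&lt;K, V&gt; {<br />
&nbsp;&nbsp;// carve the job's input into splits, roughly numSplits of them<br />
&nbsp;&nbsp;InputSplit[] getSplits(JobConf job, int numSplits) throws IOException;<br />
&nbsp;&nbsp;// pair a split with a reader that can parse its records<br />
&nbsp;&nbsp;RecordReader&lt;K, V&gt; getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException;<br />
}</p>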
<p>InputSplit extends the Writable interface, so an InputSplit effectively carries four methods: the read/write pair (readFields and write); getLength, which reports how much data the split covers; and getLocations, which reports the hosts the split lives on (blkLocations[blkIndex].getHosts()). Note that a block maps to either one split or several splits, so every split can take its host list from the block it belongs to; I would also guess that the block size is an integer multiple of the split size, since otherwise a split could straddle two blocks.</p>
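<p>Again as a rough sketch of the old API, with the two Writable methods inherited from the parent interface:</p>
<p>public interface InputSplit extends Writable {<br />
&nbsp;&nbsp;long getLength() throws IOException;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// bytes covered by this split<br />
&nbsp;&nbsp;String[] getLocations() throws IOException;&nbsp;// hosts where the data lives<br />
}<br />
// Writable contributes the other two of the four methods counted above:<br />
//&nbsp;&nbsp;void write(DataOutput out) throws IOException;<br />
//&nbsp;&nbsp;void readFields(DataInput in) throws IOException;</p>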
<p>As for RecordReader, the interface essentially exists to hand out a sequence of &lt;K,V&gt; pairs. Any class implementing it needs a constructor of the form &#8220;(Configuration conf, Class&lt; ? extends InputSplit&gt; split)&#8221;, because a RecordReader is targeted: it works against one particular kind of split and must be bound to it. The most important method in the interface is next: to read K and V, you first create the K and V objects with createKey and createValue, then pass them to next as arguments, and next fills in the objects it was given.</p>
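<p>The read pattern just described looks like this in code (a hedged sketch: inputFormat, split and job are placeholders here, and the key/value types are whatever the reader declares):</p>
<p>RecordReader&lt;LongWritable, Text&gt; reader =<br />
&nbsp;&nbsp;&nbsp;&nbsp;inputFormat.getRecordReader(split, job, Reporter.NULL);<br />
LongWritable key = reader.createKey();&nbsp;&nbsp;// allocate the reusable holders once<br />
Text value = reader.createValue();<br />
while (reader.next(key, value)) {&nbsp;&nbsp;// next() overwrites key and value in place<br />
&nbsp;&nbsp;// process (key, value); the same two objects are reused on every iteration<br />
}<br />
reader.close();</p>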
<p>A file (FileStatus) is stored as multiple blocks (BlockLocation[]), each of a fixed size (file.getBlockSize()). The size each split should get is computed (computeSplitSize(goalSize, minSize, blockSize)), the file of length file.getLen() is carved into splits of that size, the leftover tail smaller than one full split gets a split of its own, and the final result of splitting the file is returned (return splits.toArray(new FileSplit[splits.size()])).</p>
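<p>The split-size computation referred to boils down to clamping the per-split goal between the configured minimum and the block size; in FileInputFormat it is essentially:</p>
<p>protected long computeSplitSize(long goalSize, long minSize, long blockSize) {<br />
&nbsp;&nbsp;// goalSize = total input bytes / requested number of splits<br />
&nbsp;&nbsp;return Math.max(minSize, Math.min(goalSize, blockSize));<br />
}</p>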
<p>A job reads its input paths (conf.get("mapred.input.dir", "")), which yield a Path[]. For each Path you obtain a file system (FileSystem fs = p.getFileSystem(job)) and from it a FileStatus[] (FileStatus[] matches = fs.globStatus(p, inputFilter)). Each FileStatus is then examined: if it is a directory, every FileStatus stat : fs.listStatus(globStat.getPath(), inputFilter) under it is added to the result set; if it is a plain file, it is added directly. Put briefly, a job collects all the files under input.dir, each recorded as a FileStatus.</p>
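<p>Spelled out as code, that enumeration is roughly the following (a sketch in old-API terms; inputFilter stands in for the job's configured PathFilter, and the real code also guards against a null glob result):</p>
<p>List&lt;FileStatus&gt; result = new ArrayList&lt;FileStatus&gt;();<br />
for (Path p : FileInputFormat.getInputPaths(job)) {&nbsp;&nbsp;// from mapred.input.dir<br />
&nbsp;&nbsp;FileSystem fs = p.getFileSystem(job);<br />
&nbsp;&nbsp;for (FileStatus globStat : fs.globStatus(p, inputFilter)) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;if (globStat.isDir()) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// a directory contributes every file directly under it<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for (FileStatus stat : fs.listStatus(globStat.getPath(), inputFilter)) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;result.add(stat);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;} else {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;result.add(globStat);&nbsp;&nbsp;// a plain file goes in directly<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;}<br />
}</p>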
<p>The official description of MultiFileSplit reads: &#8220;A sub-collection of input files. Unlike {@link FileSplit}, MultiFileSplit class does not represent a split of a file, but a split of input files into smaller sets. The atomic unit of split is a file.&#8221; A MultiFileSplit holds several small files, each of which should belong to a single block; getLocations returns the getHosts of the blocks behind all the member files, and getLength returns the total size of all the files.</p>
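<p>A minimal sketch of how a file-granularity split can answer those two queries (the class and field names here are mine, not the real MultiFileSplit):</p>
<p>class FileSetSplitSketch {<br />
&nbsp;&nbsp;private JobConf job;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// needed to reach each file's FileSystem<br />
&nbsp;&nbsp;private Path[] paths;&nbsp;&nbsp;&nbsp;&nbsp;// whole files: the atomic units of the split<br />
&nbsp;&nbsp;private long[] lengths;&nbsp;&nbsp;// one length per file<br />
<br />
&nbsp;&nbsp;public long getLength() {&nbsp;&nbsp;// total bytes across all member files<br />
&nbsp;&nbsp;&nbsp;&nbsp;long total = 0;<br />
&nbsp;&nbsp;&nbsp;&nbsp;for (long l : lengths) total += l;<br />
&nbsp;&nbsp;&nbsp;&nbsp;return total;<br />
&nbsp;&nbsp;}<br />
<br />
&nbsp;&nbsp;public String[] getLocations() throws IOException {<br />
&nbsp;&nbsp;&nbsp;&nbsp;Set&lt;String&gt; hosts = new HashSet&lt;String&gt;();<br />
&nbsp;&nbsp;&nbsp;&nbsp;for (int i = 0; i &lt; paths.length; i++) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FileSystem fs = paths[i].getFileSystem(job);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FileStatus stat = fs.getFileStatus(paths[i]);<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// union of the hosts of every block behind every member file<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for (BlockLocation blk : fs.getFileBlockLocations(stat, 0, lengths[i])) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hosts.addAll(Arrays.asList(blk.getHosts()));<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;return hosts.toArray(new String[hosts.size()]);<br />
&nbsp;&nbsp;}<br />
}</p>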
<p>MultiFileInputFormat's getSplits returns a collection of MultiFileSplits, that is, clusters of small files. A simple example makes this clear: suppose the job has 5 small files of sizes 2, 3, 5, 1 and 4, and we ask for 3 splits in total. First double avgLengthPerSplit = ((double)totLength) / numSplits is computed, which here gives 5; the files are then carved into three clusters: {files 1 and 2}, {file 3}, {files 4 and 5}. If the five sizes were instead 2, 5, 3, 1, 4, we would get four clusters: {file 1}, {file 2}, {files 3 and 4}, {file 5} (see the sketch of this grouping rule below). Also, getRecordReader is still an abstract method in this class, so subclasses must implement it.</p>
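<p>Here is the promised sketch of a grouping rule that reproduces both examples (my reconstruction, not the shipped code; the exact boundary handling differs between Hadoop versions): start a new cluster whenever adding the next file would push the current cluster past avgLengthPerSplit.</p>
<p>static List&lt;List&lt;Long&gt;&gt; group(long[] lengths, int numSplits) {<br />
&nbsp;&nbsp;long totLength = 0;<br />
&nbsp;&nbsp;for (long l : lengths) totLength += l;<br />
&nbsp;&nbsp;double avg = ((double) totLength) / numSplits;<br />
<br />
&nbsp;&nbsp;List&lt;List&lt;Long&gt;&gt; clusters = new ArrayList&lt;List&lt;Long&gt;&gt;();<br />
&nbsp;&nbsp;List&lt;Long&gt; current = new ArrayList&lt;Long&gt;();<br />
&nbsp;&nbsp;long size = 0;<br />
&nbsp;&nbsp;for (long l : lengths) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;if (!current.isEmpty() &amp;&amp; size + l &gt; avg) {<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;clusters.add(current);&nbsp;&nbsp;// close the current cluster<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;current = new ArrayList&lt;Long&gt;();<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;size = 0;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;current.add(l);<br />
&nbsp;&nbsp;&nbsp;&nbsp;size += l;<br />
&nbsp;&nbsp;}<br />
&nbsp;&nbsp;if (!current.isEmpty()) clusters.add(current);<br />
&nbsp;&nbsp;return clusters;<br />
}<br />
// group(new long[]{2,3,5,1,4}, 3) -&gt; [[2, 3], [5], [1, 4]]<br />
// group(new long[]{2,5,3,1,4}, 3) -&gt; [[2], [5], [3, 1], [4]]</p>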
<div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bacoo/" target="_blank">so true</a> 2009-01-07 09:40 <a href="http://www.blogjava.net/bacoo/archive/2009/01/07/250221.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>SSH points to watch when configuring distributed Hadoop</title><link>http://www.blogjava.net/bacoo/archive/2008/11/15/240625.html</link><dc:creator>so true</dc:creator><author>so true</author><pubDate>Fri, 14 Nov 2008 17:25:00 GMT</pubDate><guid>http://www.blogjava.net/bacoo/archive/2008/11/15/240625.html</guid><wfw:comment>http://www.blogjava.net/bacoo/comments/240625.html</wfw:comment><comments>http://www.blogjava.net/bacoo/archive/2008/11/15/240625.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/bacoo/comments/commentRss/240625.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bacoo/services/trackbacks/240625.html</trackback:ping><description><![CDATA[<p>Configuring passwordless ssh access:<br />
For example, A is the server and B the client, and B wants to reach A over ssh without a password; then B's public key must be placed in A's authorized_keys file.</p>
<p>1. First, A must be set up to allow this kind of access:<br />
In A's /etc/ssh/sshd_config, set the following two options:<br />
RSAAuthentication yes<br />
PubkeyAuthentication yes</p>
<p>2. B generates id_rsa.pub, and the contents of that file are appended (with &#8220;&gt;&gt;&#8221;) to the end of A's authorized_keys file.</p>
<p>3. On B, ssh to A's IP or hostname now logs into A without a password.</p>
<p>But this only works under a precondition that many people overlook, and they go through a lot of trouble without success; I was one of them, and it took me a long time to find the cause.<br />
Both A and B have many accounts, and the ssh command issued on B does not specify which account on A to connect to. So what is the implicit default? The account you are currently using on B (say it is named haha) is taken as the account you want to connect to on A. You can name the target account explicitly with ssh -l haha [hostname] or ssh haha@[hostname]; under the implicit rule, the system simply uses your current account on B as the account to log into on A.<br />
So the precondition for passwordless access is: A and B must have accounts with exactly the same name, including case. (This bit me badly: I was connecting from cygwin on Windows to a Linux machine, the Windows account name started with an uppercase letter while the Linux one started with a lowercase letter, and it took me a long time to spot the culprit.) This is also why configuring distributed Hadoop requires the exact same user name on every machine.</p>
<p>While we are on these caveats, one more reminder: in step 2 of the three steps above, the id_rsa.pub file must be generated under that same matching account, or it still will not work.</p>
<div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bacoo/" target="_blank">so true</a> 2008-11-15 01:25 <a href="http://www.blogjava.net/bacoo/archive/2008/11/15/240625.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>A simple shell script</title><link>http://www.blogjava.net/bacoo/archive/2008/11/15/240624.html</link><dc:creator>so true</dc:creator><author>so true</author><pubDate>Fri, 14 Nov 2008 17:23:00 GMT</pubDate><guid>http://www.blogjava.net/bacoo/archive/2008/11/15/240624.html</guid><wfw:comment>http://www.blogjava.net/bacoo/comments/240624.html</wfw:comment><comments>http://www.blogjava.net/bacoo/archive/2008/11/15/240624.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/bacoo/comments/commentRss/240624.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/bacoo/services/trackbacks/240624.html</trackback:ping><description><![CDATA[<p>Writing a shell script like this today did not actually take much effort, so I am not excited because some long struggle finally bore fruit; I am excited because I can finally settle down, do what I enjoy, and concentrate on learning what I ought to learn. I used to study all sorts of miscellaneous things, and with no clear goal it was painful. At the root, it was because I did not know what career I would pursue: I only knew I wanted to work in IT, not the concrete direction, so everything seemed worth learning, and that process was miserable for me, because I am the sort of person who likes to do things solidly: either don't do a thing at all, or do it well. I once read an article on restlessness among programmers, brilliantly written, and I recognized many of the restless habits it described in myself, which depressed me. Now the job is settled, I know what to learn, my goal is focused, and life is wonderful.</p>
<p>To borrow Steve Jobs's words:</p>
<p>The only way to be truly satisfied is to do what you believe is great work, and the only way to do great work is to love what you do!</p>
<p>I think anyone who reaches this point is genuinely fortunate. Strive, fight, realize your own worth, and be satisfied with your own performance: that is what I often tell myself.</p>
<p>Now my job is settled and so is my girlfriend, which is to say my future wife; what remains is to strive, to work hard, to fight.</p>
<p>I am grateful to have met such a partner, who supports me and cares about me. I don't know whether I will be very successful, but I do know that with such a supportive companion, whatever I do, I do it with a steady heart. With her I am very happy, and I will surely bring her happiness too, I promise!</p>
<p>&nbsp;</p>
<p>All right, here is the code:</p>
<p>#!/bin/sh</p>
<p>cd /hadoop/logs</p>
<p># each Hadoop daemon writes its own *.log file in this directory<br />
var="`ls *.log`"<br />
cur=""<br />
name=""<br />
file=log_name.txt</p>
<p>if [ -e $file ]; then<br />
&nbsp;rm $file<br />
fi</p>
<p>for cur in $var<br />
do<br />
&nbsp;# the node type is field 3 of the log file name, e.g. hadoop-SYSTEM-namenode-host.log<br />
&nbsp;name=`echo $cur | cut -d'-' -f3`<br />
&nbsp;<br />
&nbsp;# earlier attempt, disabled: awk's single quotes keep $name from expanding<br />
&nbsp;#cat $cur | grep ^2008 | awk '{print $0 " [`echo $name`]"}' &gt;&gt; $file<br />
&nbsp;# keep lines that start with a timestamp and tag each one with its node type<br />
&nbsp;cat $cur | grep ^2008 | sed "s/^.*$/&amp;[$name]/" &gt;&gt; $file<br />
done</p>
<p># sort the merged log chronologically (every line starts with a timestamp)<br />
cp $file __temp.txt <br />
sort __temp.txt &gt;$file<br />
rm __temp.txt</p>
<p>The output looks like this:</p>
<p>2008-11-14 10:08:47,671 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG: [namenode]<br />
2008-11-14 10:08:48,140 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000[namenode]<br />
2008-11-14 10:08:48,171 INFO org.apache.hadoop.dfs.NameNode: Namenode up at: bacoo/192.168.1.34:9000[namenode]<br />
2008-11-14 10:08:48,171 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null[namenode]<br />
2008-11-14 10:08:48,234 INFO org.apache.hadoop.dfs.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext[namenode]<br />
2008-11-14 10:08:48,875 INFO org.apache.hadoop.dfs.FSNamesystemMetrics: Initializing FSNamesystemMeterics using context object:org.apache.hadoop.metrics.spi.NullContext[namenode]<br />
2008-11-14 10:08:48,875 INFO org.apache.hadoop.fs.FSNamesystem: fsOwner=Zhaoyb,None,root,Administrators,Users,Debugger,Users[namenode]<br />
2008-11-14 10:08:48,875 INFO org.apache.hadoop.fs.FSNamesystem: isPermissionEnabled=true[namenode]<br />
2008-11-14 10:08:48,875 INFO org.apache.hadoop.fs.FSNamesystem: supergroup=supergroup[namenode]<br />
2008-11-14 10:08:48,890 INFO org.apache.hadoop.fs.FSNamesystem: Registered FSNamesystemStatusMBean[namenode]<br />
2008-11-14 10:08:48,953 INFO org.apache.hadoop.dfs.Storage: Edits file edits of size 4 edits # 0 loaded in 0 seconds.[namenode]<br />
2008-11-14 10:08:48,953 INFO org.apache.hadoop.dfs.Storage: Image file of size 80 loaded in 0 seconds.[namenode]<br />
2008-11-14 10:08:48,953 INFO org.apache.hadoop.dfs.Storage: Number of files = 0[namenode]<br />
2008-11-14 10:08:48,953 INFO org.apache.hadoop.dfs.Storage: Number of files under construction = 0[namenode]<br />
2008-11-14 10:08:48,953 INFO org.apache.hadoop.fs.FSNamesystem: Finished loading FSImage in 657 msecs[namenode]<br />
2008-11-14 10:08:49,000 INFO org.apache.hadoop.dfs.StateChange: STATE* Leaving safe mode after 0 secs.[namenode]<br />
2008-11-14 10:08:49,000 INFO org.apache.hadoop.dfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes[namenode]<br />
2008-11-14 10:08:49,000 INFO org.apache.hadoop.dfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks[namenode]<br />
2008-11-14 10:08:49,609 INFO org.mortbay.util.Credential: Checking Resource aliases[namenode]<br />
2008-11-14 10:08:50,015 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4[namenode]<br />
2008-11-14 10:08:50,015 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs][namenode]<br />
2008-11-14 10:08:50,015 INFO org.mortbay.util.Container: Started HttpContext[/static,/static][namenode]<br />
2008-11-14 10:08:54,656 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@17f11fb[namenode]<br />
2008-11-14 10:08:55,453 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/][namenode]<br />
2008-11-14 10:08:55,468 INFO org.apache.hadoop.fs.FSNamesystem: Web-server up at: 0.0.0.0:50070[namenode]<br />
2008-11-14 10:08:55,468 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50070[namenode]<br />
2008-11-14 10:08:55,468 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@61a907[namenode]<br />
2008-11-14 10:08:55,484 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting[namenode]<br />
2008-11-14 10:08:55,484 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting[namenode]<br />
2008-11-14 10:08:55,515 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,515 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,515 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,515 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,515 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,531 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,531 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,531 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,531 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting[namenode]<br />
2008-11-14 10:08:55,531 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000: starting[namenode]<br />
2008-11-14 10:08:56,015 INFO org.apache.hadoop.dfs.NameNode.Secondary: STARTUP_MSG: [secondarynamenode]<br />
2008-11-14 10:08:56,156 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SecondaryNameNode, sessionId=null[secondarynamenode]<br />
2008-11-14 10:08:56,468 WARN org.apache.hadoop.dfs.Storage: Checkpoint directory \tmp\hadoop-SYSTEM\dfs\namesecondary is added.[secondarynamenode]<br />
2008-11-14 10:08:56,546 INFO org.mortbay.util.Credential: Checking Resource aliases[secondarynamenode]<br />
2008-11-14 10:08:56,609 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4[secondarynamenode]<br />
2008-11-14 10:08:56,609 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs][secondarynamenode]<br />
2008-11-14 10:08:56,609 INFO org.mortbay.util.Container: Started HttpContext[/static,/static][secondarynamenode]<br />
2008-11-14 10:08:56,953 INFO org.mortbay.jetty.servlet.XMLConfiguration: No WEB-INF/web.xml in file:/E:/cygwin/hadoop/webapps/secondary. Serving files and default/dynamic servlets only[secondarynamenode]<br />
2008-11-14 10:08:56,953 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@b1a4e2[secondarynamenode]<br />
2008-11-14 10:08:57,062 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/][secondarynamenode]<br />
2008-11-14 10:08:57,078 INFO org.apache.hadoop.dfs.NameNode.Secondary: Secondary Web-server up at: 0.0.0.0:50090[secondarynamenode]<br />
2008-11-14 10:08:57,078 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50090[secondarynamenode]<br />
2008-11-14 10:08:57,078 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@18a8ce2[secondarynamenode]<br />
2008-11-14 10:08:57,078 WARN org.apache.hadoop.dfs.NameNode.Secondary: Checkpoint Period&nbsp;&nbsp; :3600 secs (60 min)[secondarynamenode]<br />
2008-11-14 10:08:57,078 WARN org.apache.hadoop.dfs.NameNode.Secondary: Log Size Trigger&nbsp;&nbsp;&nbsp; :67108864 bytes (65536 KB)[secondarynamenode]<br />
2008-11-14 10:08:59,828 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG: [jobtracker]<br />
2008-11-14 10:09:00,015 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=9001[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,031 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9001: starting[jobtracker]<br />
2008-11-14 10:09:00,125 INFO org.mortbay.util.Credential: Checking Resource aliases[jobtracker]<br />
2008-11-14 10:09:01,703 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4[jobtracker]<br />
2008-11-14 10:09:01,703 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs][jobtracker]<br />
2008-11-14 10:09:01,703 INFO org.mortbay.util.Container: Started HttpContext[/static,/static][jobtracker]<br />
2008-11-14 10:09:02,312 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@1cd280b[jobtracker]<br />
2008-11-14 10:09:08,359 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/][jobtracker]<br />
2008-11-14 10:09:08,375 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9001[jobtracker]<br />
2008-11-14 10:09:08,375 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030[jobtracker]<br />
2008-11-14 10:09:08,375 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=[jobtracker]<br />
2008-11-14 10:09:08,375 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50030[jobtracker]<br />
2008-11-14 10:09:08,375 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@16a9b9c[jobtracker]<br />
2008-11-14 10:09:12,984 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING[jobtracker]<br />
2008-11-14 10:09:56,894 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG: [datanode]<br />
2008-11-14 10:10:02,516 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: [tasktracker]<br />
2008-11-14 10:10:08,768 INFO org.apache.hadoop.dfs.Storage: Formatting ...[datanode]<br />
2008-11-14 10:10:08,768 INFO org.apache.hadoop.dfs.Storage: Storage directory /hadoop/hadoopfs/data is not formatted.[datanode]<br />
2008-11-14 10:10:11,343 INFO org.apache.hadoop.dfs.DataNode: Registered FSDatasetStatusMBean[datanode]<br />
2008-11-14 10:10:11,347 INFO org.apache.hadoop.dfs.DataNode: Opened info server at 50010[datanode]<br />
2008-11-14 10:10:11,352 INFO org.apache.hadoop.dfs.DataNode: Balancing bandwith is 1048576 bytes/s[datanode]<br />
2008-11-14 10:10:16,430 INFO org.mortbay.util.Credential: Checking Resource aliases[tasktracker]<br />
2008-11-14 10:10:17,976 INFO org.mortbay.util.Credential: Checking Resource aliases[datanode]<br />
2008-11-14 10:10:20,068 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4[datanode]<br />
2008-11-14 10:10:20,089 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs][datanode]<br />
2008-11-14 10:10:20,089 INFO org.mortbay.util.Container: Started HttpContext[/static,/static][datanode]<br />
2008-11-14 10:10:20,725 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4[tasktracker]<br />
2008-11-14 10:10:20,727 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs][tasktracker]<br />
2008-11-14 10:10:20,727 INFO org.mortbay.util.Container: Started HttpContext[/static,/static][tasktracker]<br />
2008-11-14 10:10:27,078 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/localhost[jobtracker]<br />
2008-11-14 10:10:32,171 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 192.168.1.167:50010 storage DS-1556534590-127.0.0.1-50010-1226628640386[namenode]<br />
2008-11-14 10:10:32,187 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/192.168.1.167:50010[namenode]<br />
2008-11-14 10:13:57,171 WARN org.apache.hadoop.dfs.Storage: Checkpoint directory \tmp\hadoop-SYSTEM\dfs\namesecondary is added.[secondarynamenode]<br />
2008-11-14 10:13:57,187 INFO org.apache.hadoop.fs.FSNamesystem: Number of transactions: 5 Total time for transactions(ms): 0 Number of syncs: 3 SyncTimes(ms): 4125 [namenode]<br />
2008-11-14 10:13:57,187 INFO org.apache.hadoop.fs.FSNamesystem: Roll Edit Log from 192.168.1.34[namenode]<br />
2008-11-14 10:13:57,953 INFO org.apache.hadoop.dfs.NameNode.Secondary: Downloaded file fsimage size 80 bytes.[secondarynamenode]<br />
2008-11-14 10:13:57,968 INFO org.apache.hadoop.dfs.NameNode.Secondary: Downloaded file edits size 288 bytes.[secondarynamenode]<br />
2008-11-14 10:13:58,593 INFO org.apache.hadoop.fs.FSNamesystem: fsOwner=Zhaoyb,None,root,Administrators,Users,Debugger,Users[secondarynamenode]<br />
2008-11-14 10:13:58,593 INFO org.apache.hadoop.fs.FSNamesystem: isPermissionEnabled=true[secondarynamenode]<br />
2008-11-14 10:13:58,593 INFO org.apache.hadoop.fs.FSNamesystem: supergroup=supergroup[secondarynamenode]<br />
2008-11-14 10:13:58,640 INFO org.apache.hadoop.dfs.Storage: Edits file edits of size 288 edits # 5 loaded in 0 seconds.[secondarynamenode]<br />
2008-11-14 10:13:58,640 INFO org.apache.hadoop.dfs.Storage: Number of files = 0[secondarynamenode]<br />
2008-11-14 10:13:58,640 INFO org.apache.hadoop.dfs.Storage: Number of files under construction = 0[secondarynamenode]<br />
2008-11-14 10:13:58,718 INFO org.apache.hadoop.dfs.Storage: Image file of size 367 saved in 0 seconds.[secondarynamenode]<br />
2008-11-14 10:13:58,796 INFO org.apache.hadoop.fs.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0 Number of syncs: 0 SyncTimes(ms): 0 [secondarynamenode]<br />
2008-11-14 10:13:58,921 INFO org.apache.hadoop.dfs.NameNode.Secondary: Posted URL 0.0.0.0:50070putimage=1&amp;port=50090&amp;machine=192.168.1.34&amp;token=-16:145044639:0:1226628551796:1226628513000[secondarynamenode]<br />
2008-11-14 10:13:59,078 INFO org.apache.hadoop.fs.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0 Number of syncs: 0 SyncTimes(ms): 0 [namenode]<br />
2008-11-14 10:13:59,078 INFO org.apache.hadoop.fs.FSNamesystem: Roll FSImage from 192.168.1.34[namenode]<br />
2008-11-14 10:13:59,265 WARN org.apache.hadoop.dfs.NameNode.Secondary: Checkpoint done. New Image Size: 367[secondarynamenode]<br />
2008-11-14 10:29:02,171 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 0 time(s).[secondarynamenode]<br />
2008-11-14 10:29:04,187 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 1 time(s).[secondarynamenode]<br />
2008-11-14 10:29:06,109 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 2 time(s).[secondarynamenode]<br />
2008-11-14 10:29:08,015 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 3 time(s).[secondarynamenode]<br />
2008-11-14 10:29:10,031 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 4 time(s).[secondarynamenode]<br />
2008-11-14 10:29:11,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 5 time(s).[secondarynamenode]<br />
2008-11-14 10:29:13,843 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 6 time(s).[secondarynamenode]<br />
2008-11-14 10:29:15,765 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 7 time(s).[secondarynamenode]<br />
2008-11-14 10:29:17,671 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 8 time(s).[secondarynamenode]<br />
2008-11-14 10:29:19,593 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 9 time(s).[secondarynamenode]<br />
2008-11-14 10:29:21,078 ERROR org.apache.hadoop.dfs.NameNode.Secondary: Exception in doCheckpoint: [secondarynamenode]<br />
2008-11-14 10:29:21,171 ERROR org.apache.hadoop.dfs.NameNode.Secondary: java.io.IOException: Call failed on local exception[secondarynamenode]<br />
2008-11-14 10:34:23,156 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 0 time(s).[secondarynamenode]<br />
2008-11-14 10:34:25,078 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 1 time(s).[secondarynamenode]<br />
2008-11-14 10:34:27,078 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 2 time(s).[secondarynamenode]<br />
2008-11-14 10:34:29,078 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 3 time(s).[secondarynamenode]<br />
2008-11-14 10:34:31,000 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 4 time(s).[secondarynamenode]<br />
2008-11-14 10:34:32,906 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 5 time(s).[secondarynamenode]<br />
2008-11-14 10:34:34,921 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 6 time(s).[secondarynamenode]<br />
2008-11-14 10:34:36,828 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 7 time(s).[secondarynamenode]<br />
2008-11-14 10:34:38,640 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 8 time(s).[secondarynamenode]<br />
2008-11-14 10:34:40,546 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Bacoo/192.168.1.34:9000. Already tried 9 time(s).[secondarynamenode]<br />
2008-11-14 10:34:41,468 ERROR org.apache.hadoop.dfs.NameNode.Secondary: Exception in doCheckpoint: [secondarynamenode]<br />
2008-11-14 10:34:41,468 ERROR org.apache.hadoop.dfs.NameNode.Secondary: java.io.IOException: Call failed on local exception[secondarynamenode]<br />
2008-11-14 10:38:43,359 INFO org.apache.hadoop.dfs.NameNode.Secondary: SHUTDOWN_MSG: [secondarynamenode]<br />
</p>
I believe this puts the generated logs in clean chronological order, with each line tagged at the end with the node type it came from.
<div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/bacoo/" target="_blank">so true</a> 2008-11-15 01:23 <a href="http://www.blogjava.net/bacoo/archive/2008/11/15/240624.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>