﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-Python, Java, Life, etc-随笔分类-Build Website</title><link>http://www.blogjava.net/pyguru/category/470.html</link><description>A blog of technology and life.</description><language>zh-cn</language><lastBuildDate>Fri, 02 Mar 2007 01:59:37 GMT</lastBuildDate><pubDate>Fri, 02 Mar 2007 01:59:37 GMT</pubDate><ttl>60</ttl><item><title>Add RSS feeds to your Web site with Perl XML::RSS</title><link>http://www.blogjava.net/pyguru/archive/2005/02/17/1268.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Wed, 16 Feb 2005 19:04:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/17/1268.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1268.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/17/1268.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1268.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1268.html</trackback:ping><description><![CDATA[<span class="mdeck">
Guest Contributor, TechRepublic<br>
December  22, 2004<br>
URL: <a href="http://www.builderau.com.au/architect/webservices/0,39024590,39171461,00.htm">http://www.builderau.com.au/architect/webservices/0,39024590,39171461,00.htm</a><p>
</p></span>
<!-- Story Body BEGIN -->
<br>

<span class="mdeck">
<span class="mdeck"><img src="http://www.builderau.com.au/resources/images/TechRepublic150x36.gif" alt="TechRepublic" height="36" width="150"><br><br><span class="smdeck">Take advantage of the XML::RSS CPAN package, which is specifically designed to read and parse RSS feeds.</span>  
<p>
You've probably already heard of RSS, the XML-based format which allows
Web sites to publish and syndicate the latest content on their site to
all interested parties. RSS is a boon to the lazy Webmaster, because
(s)he no longer has to manually update his or her Web site with new
content.
</p><p>Instead, all a Webmaster has to do is plug in an RSS client,
point it to the appropriate Web sites, and sit back and let the site
"update itself" with news, weather forecasts, stock market data, and
software alerts. You've already seen, in <a href="http://www.builderau.com.au/architect/webservices/0,39024590,39131860,00.htm" target="_blank">previous articles</a>,
how you can use the ASP.NET platform to manually parse an RSS feed and
extract information from it by searching for the appropriate elements.
But I'm a UNIX guy, and I have something that's even better than
ASP.NET. It's called Perl.
</p><p>
<span class="subhead1">Installing XML::RSS</span>
<br>RSS parsing in Perl is usually handled by the XML::RSS CPAN
package. Unlike ASP.NET, which comes with a generic XML parser and
expects you to manually write RSS-parsing code, the XML::RSS package is
specifically designed to read and parse RSS feeds. When you give
XML::RSS an RSS feed, it converts the various &lt;item&gt;s in the feed
into array elements, and exposes numerous methods and properties to
access the data in the feed. XML::RSS currently supports versions 0.9,
0.91, and 1.0 of RSS.
</p><p>
Written entirely in Perl, XML::RSS isn't included with Perl by default, and you must install it from <a href="http://search.cpan.org/%7Ekellan/XML-RSS-1.05/lib/RSS.pm" target="_blank">CPAN</a>.
Detailed installation instructions are provided in the download
archive, but by far the simplest way to install it is to use the CPAN
shell, as follows:
</p><p>
<span class="code">
shell&gt; perl -MCPAN -e shell<br>
cpan&gt; install XML::RSS
</span>
</p><p>If you use the CPAN shell, dependencies will be automatically
downloaded for you (unless you told the shell not to download dependent
modules). If you manually download and install the module, you may need
to download and install the XML::Parser module before XML::RSS can be
installed. The examples in this tutorial also need the LWP::Simple
package, so you should download and install that one too if you don't
already have it.
</p><p>
<span class="subhead1">Basic usage</span>
<br>For our example, we'll assume that you're interested in displaying
the latest geek news from Slashdot on your site. The URL for Slashdot's
RSS feed is located <a href="http://slashdot.org/index.rss" target="_blank">here</a>. The script in <b>Listing A</b> retrieves this feed, parses it, and turns it into a human-readable HTML page using XML::RSS:
</p><p>
<b>Listing A</b>
</p><p>
</p><pre>#!/usr/bin/perl

# import packages
use strict;
use warnings;
use XML::RSS;
use LWP::Simple;

# initialize object
my $rss = XML::RSS-&gt;new();

# get RSS data; get() returns undef on failure
my $raw = get('http://www.slashdot.org/index.rss');
die "Could not retrieve RSS feed\n" unless defined $raw;

# parse RSS feed
$rss-&gt;parse($raw);

# print HTML header and page
print "Content-Type: text/html\n\n";
print "&lt;html&gt;&lt;head&gt;&lt;basefont face=\"Arial\" size=\"8\"&gt;&lt;/head&gt;&lt;body&gt;";
print "&lt;table border=\"1\" cellpadding=\"5\" cellspacing=\"0\" width=\"300\"&gt;";
print "&lt;tr&gt;&lt;td align=\"center\" bgcolor=\"Silver\"&gt;" . $rss-&gt;channel('title') . "&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;";

# print titles and URLs of news items
foreach my $item (@{$rss-&gt;{'items'}})
{
        my $title = $item-&gt;{'title'};
        my $url = $item-&gt;{'link'};
        print "&lt;a href=\"$url\"&gt;$title&lt;/a&gt;&lt;p&gt;";
}

# print footers
print "&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;";
print "&lt;/body&gt;&lt;/html&gt;";
</pre>
<p>
Place the script in your Web server's cgi-bin/ directory. Remember to
make it executable, and then browse to it using your Web browser. After
a short wait for the RSS file to download, you should see something
like <b>Figure A</b>.
</p><p>
</p><center>
<b>Figure A</b>
<p>

<img src="http://www.builderau.com.au/resources/images/rssfeedsa.gif"><br>
Slashdot RSS feed</p></center>
<p>
 
</p></span>
<span class="mdeck">How does the script in <b>Listing A</b> work? Well,
the first task is to get the RSS feed from the remote system to the
local one. This is accomplished with the LWP::Simple package, which
simulates an HTTP client and opens up a network connection to the
remote site to retrieve the RSS data. An XML::RSS object is created,
and this raw data is then passed to it for processing.
<p>
The various elements of the RSS feed are converted into Perl structures, and a <i>foreach()</i>
loop is used to iterate over the array of items. Each item contains
properties representing the item name, URL and description; these
properties are used to dynamically build a readable list of news items.
Each time Slashdot updates its RSS feed, the list of items displayed by
the script above will change automatically, with no manual intervention
required.
</p><p>
The script in <b>Listing A</b> will work with other RSS feeds as well: simply alter the URL passed to LWP::Simple's <i>get()</i> function, and watch as the list of items displayed by the script changes.
</p><p>
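The same fetch, parse, and iterate flow can be sketched outside Perl as well. What follows is a minimal illustration in Python using only the standard library, not a port of XML::RSS. The function name <i>render_feed</i> and the inline sample feed are made up for the example, and the sketch assumes a plain RSS 0.91-style document with no XML namespaces (an RSS 1.0/RDF feed would need namespace-aware lookups).

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for data fetched over HTTP.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
    <item><title>First &amp; foremost</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def render_feed(raw):
    """Return an HTML fragment listing the items of an RSS 0.91-style feed."""
    channel = ET.fromstring(raw).find("channel")
    lines = ["<h3>%s</h3>" % html.escape(channel.findtext("title", default="")), "<ul>"]
    for item in channel.findall("item"):
        # Escape feed text before placing it in HTML; feeds are untrusted input.
        title = html.escape(item.findtext("title", default=""))
        url = html.escape(item.findtext("link", default=""), quote=True)
        lines.append('<li><a href="%s">%s</a></li>' % (url, title))
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_feed(SAMPLE_RSS))
```

Escaping the title and link is a design choice worth copying in any language: the feed comes from someone else's server, so its text should never be pasted into your page unescaped.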
</p><hr width="100%">

<b>Here are some RSS feeds to get you started</b>
<p>
</p><ul><li><a href="http://www.builderau.com.au/feeds.htm" target="_blank">Builder AU</a>
</li><li><a href="http://www.thinkgeek.com/thinkgeek.rdf" target="_blank">Thinkgeek</a> 
</li><li><a href="http://www.cnet.com/4520-6022-5115113.html" target="_blank">CNET</a> 
</li><li><a href="http://www.syndic8.com/" target="_blank">Syndic8</a> 
</li><li><a href="http://www.weatherclicks.com/cgi-bin/weather/hw3.cgi?config=&amp;forecast=pass&amp;pass=tafINT" target="_blank">Local weather forecasts</a>
</li></ul>
<p>

<b>Tip:</b> Notice that the RSS channel name (and description) can be obtained with the object's <i>channel()</i> method, which accepts any one of three arguments (title, description or link) and returns the corresponding channel value.
 </p><hr width="100%">
<p>
<span class="subhead1">Adding multiple sources and optimising performance</span>
<br>
So that takes care of adding a feed to your Web site. But hey, why limit yourself to one when you can have many? <b>Listing B</b>, a revision of <b>Listing A</b>,
sets up an array containing the names of many different RSS feeds, and
iterates over the array to produce a page containing multiple channels
of information.
</p><p>
<b>Listing B</b>
</p><p>
</p><pre>#!/usr/bin/perl

# import packages
use strict;
use warnings;
use XML::RSS;

# local copies of the RSS feeds, refreshed separately
# (the file names here are placeholders)
my @channels = ('slashdot.rss', 'freshmeat.rdf');

# print HTML header and page
print "Content-Type: text/html\n\n";
print "&lt;html&gt;&lt;head&gt;&lt;basefont face=\"Arial\" size=\"8\"&gt;&lt;/head&gt;&lt;body&gt;";

# iterate over the list of feeds
foreach my $channel (@channels)
{
        # parse the local RSS file with a fresh object
        my $rss = XML::RSS-&gt;new();
        $rss-&gt;parsefile($channel);

        # print channel title
        print "&lt;table border=\"1\" cellpadding=\"5\" cellspacing=\"0\" width=\"300\"&gt;";
        print "&lt;tr&gt;&lt;td align=\"center\" bgcolor=\"Silver\"&gt;" . $rss-&gt;channel('title') . "&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;";

        # print titles and URLs of news items
        foreach my $item (@{$rss-&gt;{'items'}})
        {
                my $title = $item-&gt;{'title'};
                my $url = $item-&gt;{'link'};
                print "&lt;a href=\"$url\"&gt;$title&lt;/a&gt;&lt;p&gt;";
        }

        print "&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;";
}

# print footers
print "&lt;/body&gt;&lt;/html&gt;";
</pre>
<p>
<b>Figure B</b> shows you what it looks like.
</p><p>
</p><center>
<b>Figure B</b>
<p>

<img src="http://www.builderau.com.au/resources/images/rssfeedsb.gif"><br>
Several RSS feeds 
</p></center>
<p>
You'll notice, if you're sharp-eyed, that <b>Listing B</b> uses the <i>parsefile()</i>
method to read a local version of the RSS file, instead of using LWP to
retrieve it from the remote site. This revision results in improved
performance, because the script no longer has to make an HTTP request
for the RSS data source every time it is executed. Fetching the RSS
file on each run is not only slow (because of the time taken to
download the file), it's also wasteful: the source RSS file is unlikely
to change on a minute-by-minute basis, so by fetching the same data
over and over again, you're simply wasting bandwidth. A better solution
is to retrieve the RSS data source once, save it to a local file, and
use that local file to generate your page.
</p><p>Depending on how often the source file gets updated, you can
write a simple shell script to download a fresh copy of the file on a
regular basis.
</p><p>
Here's an example of such a script:
</p><p>
<span class="code">
#!/bin/bash<br>

/bin/wget http://www.freshmeat.net/backend/fm.rdf -O freshmeat.rdf
</span>
</p><p> 
This script uses the <i>wget</i> utility (included with most Linux distributions) to download and save the RSS file to disk. Add this to your system <i>crontab</i>, and set it to run on an hourly or daily basis.
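</p><p>If <i>cron</i> isn't available to you, the retrieve-once-and-reuse idea can be approximated inside the script itself, by refreshing the local copy only when it is older than some limit. The following is a minimal sketch in Python, not part of XML::RSS or LWP. The names <i>cached_feed</i> and <i>max_age</i> are invented for the example, and the download step is passed in as a function so that the sketch stays independent of any particular HTTP client.

```python
import os
import tempfile
import time


def cached_feed(path, download, max_age=3600):
    """Return feed text from `path`, calling download() only if the copy is stale.

    `download` is any zero-argument callable returning the feed as a string
    (for example a wrapper around an HTTP client); `max_age` is in seconds.
    """
    fresh = os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age
    if not fresh:
        data = download()
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(data)
    with open(path, encoding="utf-8") as fh:
        return fh.read()


if __name__ == "__main__":
    calls = []

    def fake_download():
        # Stand-in for a real network fetch; records how often it is called.
        calls.append(1)
        return "<rss/>"

    with tempfile.TemporaryDirectory() as tmp:
        local = os.path.join(tmp, "freshmeat.rdf")
        cached_feed(local, fake_download)   # first call downloads
        cached_feed(local, fake_download)   # second call reuses the file
        print("downloads:", len(calls))
```

The trade-off against the cron approach is that the occasional visitor who arrives just after the copy has expired still waits for the download.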
</p><p>If you find performance unacceptably low even after using local
copies of RSS files, you can take things a step further, by generating
a static HTML snapshot from the script above, and sending that to
clients instead. To do this, comment out the line printing the
"Content-Type" header in the script above and then run the script from
the console, redirecting the output to an HTML file. Here's how:
</p><p>
<span class="code">
$ ./rss.cgi &gt; static.html</span>
</p><p>Now, simply serve this HTML file to your users. Since the file
is a static file and not a script, no server-side processing takes
place before the server transmits it to the client. You can run the
command-line above from your <i>crontab</i>
to regenerate the HTML file on a regular basis. Performance with a
static file should be noticeably better than with a Perl script.
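</p><p>One detail to watch when regenerating the snapshot on a schedule: a plain shell redirect empties the target file before the script has finished, so a visitor arriving at the wrong moment may get a partial page. A common way around this is to write to a temporary file and rename it into place. The sketch below shows the idea in Python; <i>write_snapshot</i> is a made-up helper name, and the atomic-rename behaviour it relies on assumes that the temporary file and the target are on the same filesystem.

```python
import os
import tempfile


def write_snapshot(html_text, target):
    """Write html_text to target via a temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(target))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(html_text)
        # mkstemp creates the file readable by its owner only; open it up
        # so that a web server running as another user can serve it.
        os.chmod(tmp_path, 0o644)
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, target)
    except BaseException:
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        page = os.path.join(tmp, "static.html")
        write_snapshot("<html><body>old</body></html>", page)
        write_snapshot("<html><body>new</body></html>", page)
        with open(page, encoding="utf-8") as fh:
            print(fh.read())
```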
</p><p>
Looks easy? What are you waiting for—get out there and start hooking your site up to your favorite RSS news feeds.
</p></span></span><img src ="http://www.blogjava.net/pyguru/aggbug/1268.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-17 03:04 <a href="http://www.blogjava.net/pyguru/archive/2005/02/17/1268.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Lilina: Building a Personal Portal with an RSS Aggregator (Write once, publish anywhere)</title><link>http://www.blogjava.net/pyguru/archive/2005/02/17/1267.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Wed, 16 Feb 2005 19:00:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/17/1267.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1267.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/17/1267.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1267.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1267.html</trackback:ping><description><![CDATA[<h3>Lilina: Building a Personal Portal with an RSS Aggregator (Write once, publish anywhere)</h3>


<p>While collecting RSS parsing tools recently, I came across <a href="http://magpie.sourceforge.net/">MagPieRSS</a> and <a href="http://lilina.sourceforge.net/">Lilina</a>, which is built on top of it. Lilina's main features:</p>


<p>1. Web-based RSS management: adding and deleting feeds, OPML export, a back-end RSS cache (to avoid putting too much load on the source servers), and a ScriptLet, a JavaScript bookmarklet for instant subscription similar to the Del.icio.us one;</p>


<p>2. Front-end publishing: I have switched my home page to Lilina, which publishes the weblogs of several friends I read regularly and also saves me a lot of work updating my own pages. It requires <strong>php 4.3 + mbstring iconv</strong><br>
<img alt="lilina.png" src="http://www.chedong.com/blog/archives/lilina.png" height="441" width="394"><br>
Open-source software keeps getting better at i18n: PHP 4.3.x built with '--enable-mbstring' '--with-iconv' copes reasonably well with RSS published in UTF-8 as well as in other Chinese character sets.<br>
Thanks are due to <a href="http://minutillo.com/steve/weblog/2004/6/17/php-xml-and-character-encodings-a-tale-of-sadness-rage-and-data-loss">Steve for his work on transcoding in PHP</a> and his XML hacking on <a href="http://magpierss.sourceforge.net/">MagPieRSS</a>. So far, at least, <a href="http://weblog.chedong.com/archives/000496.html">Add to My Yahoo! still does not handle UTF-8 RSS subscriptions well</a>.</p>


<p>I remember that at the beginning of the year <a href="http://blog.timetide.net/">Wen Xin</a> introduced the concept of the <a href="http://www.wen-xin.net/document/blog-socialnetwork-personal-portal-wenxin.ppt">personal portal</a> at a CNBlog seminar. As RSS matures within CMS technology, more and more services let individual users build a portal around their own needs, which fits the Internet's trend toward <a href="http://www.google.com/search?q=define%3Adecentralization">decentralization</a>. With Add to My Yahoo!, for example, users can easily subscribe to news from many more sources. Imagine aggregating and publishing your own del.icio.us bookmarks, flickr photos, and Yahoo! news through an RSS aggregator like this one, and think how quickly that content would spread.</p>


<p>Just as software development achieves "Write once, run anywhere" through an intermediate platform or virtual machine, information publishing achieves "Write once, publish anywhere" through the intermediate layer of RSS/XML.</p>


<div id="a000027more"><div id="more">
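<p>Installing Lilina requires PHP 4.3 or later with support for the iconv and mbstring functions. Check your <a href="http://www.chedong.com/phpMan.php/phpinfo">PHP module support</a> for '--enable-mbstring' '--with-iconv'.</p>

<p>The server also has to be able to send outbound RPC requests to external servers, which 51.NET does not support. <a href="http://signup.powweb.com/powweb-bin/referer.cgi?account_id=100811">PowWeb's service</a> seems quite good; many packages come installed by default:</p>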

<p>iconv<br>
iconv support  enabled<br>
iconv implementation  unknown<br>
iconv library version  unknown</p>

<p>Directive Local Value Master Value<br>
iconv.input_encoding ISO-8859-1 ISO-8859-1<br>
iconv.internal_encoding ISO-8859-1 ISO-8859-1<br>
iconv.output_encoding ISO-8859-1 ISO-8859-1</p>

<p>mbstring<br>
Multibyte Support  enabled<br>
Japanese support  enabled<br>
Simplified chinese support  enabled<br>
Traditional chinese support  enabled<br>
Korean support  enabled<br>
Russian support  enabled<br>
Multibyte (japanese) regex support  enabled</p>

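<p>Unpack the installation archive (the downloaded file has a .gz extension but is really a .tgz, so rename it first) and upload it to the appropriate directory on the server. Make sure the cache directory and the current directory are writable, then set the parameters in conf.php and you are ready to go.</p>

<p>Suggestions from He Dong:<br>
1) In the right-hand column, the first entry, sources, would look better with an image, like hobby and the friend links.<br>
2) The pile of search boxes looks a bit cluttered; I suggest keeping just one and moving the rest to a second-level page.<br>
3) Turn the contact details and the cc notice into a single line or an image each and put them in the right-hand column; the details can go on a second-level page, because I doubt many people read that text closely.<br>
4) If possible, could you translate lilina's header links into Chinese?</p>

<p>Some planned improvements:<br>
1. Trim overly long summaries, which can be done by looking for the second "&lt;/p&gt;&lt;p&gt;";<br>
2. Grouping: output the RSS feeds in groups;</p>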
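
<p>The summary-trimming idea in item 1 can be sketched as follows. This is an illustration in Python rather than Lilina's PHP, and one possible reading of "find the second paragraph break": keep everything before the second break and close the paragraph. The function name <i>trim_summary</i> and its parameters are invented for the example.</p>

```python
def trim_summary(summary_html, marker="</p><p>", keep=2):
    """Cut summary_html just before the `keep`-th occurrence of `marker`."""
    pos = -1
    for _ in range(keep):
        pos = summary_html.find(marker, pos + 1)
        if pos == -1:
            # Fewer than `keep` paragraph breaks: already short, leave as is.
            return summary_html
    # Re-close the paragraph that was cut open.
    return summary_html[:pos] + "</p>"


if __name__ == "__main__":
    long_summary = "<p>one</p><p>two</p><p>three</p><p>four</p>"
    print(trim_summary(long_summary))
```

<p>A string search like this only works when the feed writes paragraph breaks in exactly that form; markup with whitespace or attributes between the tags would need a real HTML parser.</p>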

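<p>Changing the default display period: by default Lilina shows articles published within the last day. To use a different period, find this line:<br>
$TIMERANGE = ( $_REQUEST['hours'] ? $_REQUEST['hours']*3600 : 3600*24 ) ;</p>

<p>and change it.</p>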
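
<p>As a worked example of what that expression computes: the <i>hours</i> request parameter, when present and non-zero, is multiplied by 3600 to get seconds; otherwise the range falls back to 3600*24, i.e. one day. The Python sketch below only approximates the PHP line for illustration. <i>time_range_seconds</i> and <i>default_hours</i> are invented names, and unlike PHP it falls back to the default for anything that is not a whole number.</p>

```python
def time_range_seconds(hours=None, default_hours=24):
    """Seconds of history to show: hours * 3600 if given, else the default."""
    try:
        value = int(hours)
    except (TypeError, ValueError):
        # Missing or non-numeric parameter: behave as if it were absent.
        value = 0
    return (value or default_hours) * 3600


if __name__ == "__main__":
    print(time_range_seconds())        # no parameter: one day
    print(time_range_seconds("168"))   # hours=168: seven days
```

<p>So a seven-day view can be had either by requesting the page with hours=168 or by changing the fallback from 3600*24 to 3600*24*7.</p>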

<p>RSS is a lightweight protocol that can aggregate all of your resources: WIKI / BLOG / mail. From now on, wherever you write, anything with an RSS interface can be gathered and republished in some way, which greatly improves the efficiency of personal knowledge management and of publishing and distribution.</p>

<p>My understanding of RSS used to be very shallow: isn't it just a DTD? Only when I really dug into the parsers did I realize how important namespaces are. A good protocol should be like this: not that there is nothing left to add, but that there is certainly nothing left to take away. Actually achieving that is very, very hard.</p>

<p>I will also try the Java parsers and extend them into the <a href="http://sourceforge.net/projects/weblucene/">WebLucene</a> project. More: <a href="http://java-source.net/open-source/rss-rdf-tools">open-source RSS parser resources for Java</a>.</p>

<p>I also found two packages for parsing RSS with Perl:<br>
parsing RSS with <a href="http://search.cpan.org/%7Eebosrup/RSS-Parser-Lite/">XML::RSS::Parser::Lite</a> and <a href="http://search.cpan.org/%7Etima/XML-RSS-Parser-2.15/">XML::RSS::Parser</a></p>

<p>Sample code for XML::RSS::Parser::Lite:</p>

<p>#!/usr/bin/perl -w<br>
# $Id$<br>
# XML::RSS::Parser::Lite sample</p>

<p>use strict;<br>
use XML::RSS::Parser::Lite;<br>
use LWP::Simple;</p>

<p><br>
my $xml = get("http://www.klogs.org/index.xml");<br>
my $rp = new XML::RSS::Parser::Lite;<br>
$rp-&gt;parse($xml);</p>

<p># print blog header<br>
print "&lt;a href=\"".$rp-&gt;get('url')."\"&gt;" . $rp-&gt;get('title') . " - " . $rp-&gt;get('description') . "&lt;/a&gt;\n";</p>

<p># convert item to &lt;li&gt;<br>
print "&lt;ul&gt;";<br>
for (my $i = 0; $i &lt; $rp-&gt;count(); $i++) {<br>
        my $it = $rp-&gt;get($i);<br>
        print "&lt;li&gt;&lt;a href=\"" . $it-&gt;get('url') . "\"&gt;" . $it-&gt;get('title') . "&lt;/a&gt;&lt;/li&gt;\n";<br>
}<br>
print "&lt;/ul&gt;";</p>

<p>Installation:<br>
    requires SOAP-Lite</p>

<p>Pros:<br>
    simple methods; supports fetching remote feeds;</p>

<p>Cons:<br>
    supports only the three fields title, url, and description; no date/time field.</p>

<p>I plan to use it for a simple RSS fetch-and-sync service, so that everyone can publish the RSS feeds they subscribe to.</p>

<p><br>
 Sample code for XML::RSS::Parser:<br>
#!/usr/bin/perl -w<br>
# $Id$<br>
# XML::RSS::Parser sample with Iconv charset convert</p>

<p>use strict;<br>
use XML::RSS::Parser;<br>
use Text::Iconv;<br>
my $converter = Text::Iconv-&gt;new("utf-8", "gbk");</p>

<p><br>
my $p = new XML::RSS::Parser;<br>
my $feed = $p-&gt;parsefile('index.xml');</p>

<p># output some values<br>
my $title = XML::RSS::Parser-&gt;ns_qualify('title',$feed-&gt;rss_namespace_uri);<br>
# may cause error this line: print $feed-&gt;channel-&gt;children($title)-&gt;value."\n";<br>
print "item count: ".$feed-&gt;item_count()."\n\n";<br>
foreach my $i ( $feed-&gt;items ) {<br>
   map { print $_-&gt;name.": ".$converter-&gt;convert($_-&gt;value)."\n" } $i-&gt;children;<br>
   print "\n";<br>
}</p>

<p>Pros:<br>
    can output the data directly field by field, and offers a lower-level interface;</p>

<p>Cons:<br>
    cannot parse a remote RSS feed directly; the feed has to be downloaded first and then parsed;</p>

<p>2004-12-14: <br>
Through a Trackback on cnblog I learned about the <a href="http://planetplanet.org/">Planet RSS aggregator</a></p>

<p>Installing Planet: after unpacking, run python planet.py examples/config.ini in that directory, and the output directory will contain index.html generated from the default sample feeds, along with opml.xml, rss.xml, and other outputs (a nice touch).</p>

<p>I tried it with a few RSS feeds. The UTF-8 ones were fine, but the GBK ones all came out garbled. The only code in planetlib.py that deals with XML character sets is the following; it appears that everything that is not UTF-8 gets treated as iso8859_1:<br>
        try:<br>
            data = unicode(data, "utf8").encode("utf8")<br>
            logging.debug("Encoding: UTF-8")<br>
        except UnicodeError:<br>
            try:<br>
                data = unicode(data, "iso8859_1").encode("utf8")<br>
                logging.debug("Encoding: ISO-8859-1")<br>
            except UnicodeError:<br>
                data = unicode(data, "ascii", "replace").encode("utf8")<br>
                logging.warn("Feed wasn't in UTF-8 or ISO-8859-1, replaced " +<br>
                             "all non-ASCII characters.")</p>

<p>I have been learning about Python's unicode handling lately. It strikes me as a very concise language, with a fairly good try ... except mechanism and logging.</p>
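
<p>Why the GBK feeds come out garbled can be reproduced in a few lines. ISO-8859-1 maps every possible byte to a character, so in the quoted logic the second branch never raises an error and the final fallback is never reached. The sketch below is written for Python 3 (the quoted code is Python 2) and is not Planet's code. <i>decode_feed</i> and its candidate list are assumptions made for illustration, trying GBK before ISO-8859-1.</p>

```python
def decode_feed(data, candidates=("utf-8", "gbk", "iso8859_1")):
    """Decode raw feed bytes with the first candidate encoding that succeeds.

    iso8859_1 accepts every byte sequence, so it must come last; anything
    listed after it would never be tried.
    """
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # Only reachable if the candidate list omits iso8859_1.
    return data.decode("ascii", "replace"), "ascii"


if __name__ == "__main__":
    gbk_bytes = "聚合器".encode("gbk")

    # What the quoted logic effectively does: UTF-8 fails, ISO-8859-1 "succeeds".
    try:
        gbk_bytes.decode("utf-8")
    except UnicodeDecodeError:
        print(repr(gbk_bytes.decode("iso8859_1")))  # garbled text, but no error

    text, used = decode_feed(gbk_bytes)
    print(used, text)
```

<p>Guessing like this is only a heuristic: some ISO-8859-1 text also happens to be valid GBK and would be decoded wrongly. Honouring the encoding declared in the XML prolog or the HTTP headers, when there is one, is the more reliable route.</p>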

<p>Concerns about MagPieRSS performance:<br>
The main performance difference between Planet and MagPieRSS lies in the caching mechanism. For more on using caching to speed up Web services, see <a href="http://www.chedong.com/tech/cache.html">Cacheable CMS design</a>.</p>

<p>As you can see, Lilina's cache works by walking the RSS files in the cache directory on every request; if a cached file has expired, it also has to fetch from the RSS source on the fly. It therefore cannot support too many RSS subscriptions on the back end or heavy concurrent traffic on the front end (both cause a lot of I/O).</p>

<p>Planet is a back-end script that periodically aggregates the subscribed RSS feeds and writes the result out as a static file.</p>

<p>In fact, if you put a wget script in front of MagPieRSS that periodically dumps the output of index.php to index.html, and have every visit hit the index.html cache first, isn't that the same as Planet generating a static index.html cache every hour?</p>

<p>So on shared hosting that does not let you configure your own server-side scripts, Planet simply cannot run.</p>

<p>For more on parsing GBK-encoded XML in PHP, see:<br>
<a href="http://weblog.chedong.com/archives/000598.html">An analysis of UTF-8 and GBK RSS parsing in MagPieRSS</a></p>

<p>2004-12-19 <br>
As Isaac Mao said at the SocialBrain 2005 forum: <strong>Blog is a 'Window', also could be a 'Bridge'</strong>. A blog is the "window" through which a person or organization faces the outside world, while RSS makes it easy to combine these windows and become the "bridge" between them. With such an intermediate publishing layer, blogs move beyond single-point publishing to self-service P2P distribution, and the importance of RSS for spreading content over the network is becoming ever clearer.</p>
</div></div>


<p class="posted">Posted by chedong at December 11, 2004 12:34 AM
<br>
Last Modified at December 19, 2004 04:40 PM
</p>












<h2 id="comments">Comments</h2>


<div id="c100">
<p>How can I change the default to show 7 days of news? Thanks.</p>
</div>

<p class="posted">Posted by: <a href="mailto:honren@tom.com" rel="nofollow">honren</a>  at December 12, 2004 10:20 PM</p>

<div id="c102">
<p>I have been using lilina for a while now.<br>
<a href="http://news.yanfeng.org/" rel="nofollow">http://news.yanfeng.org</a><br>
I tweaked the UI a little.<br>
It would be great if you could improve it.</p>
</div>

<p class="posted">Posted by: <a href="http://yanfeng.org/blog" rel="nofollow">mulberry</a>  at December 13, 2004 09:24 AM</p>

<div id="c138">
<p>Comrade Che, haven't you noticed that your home page has become extremely slow since you started using lilina? Give it up, or at least don't use it as your home page; lilina is still technically immature.</p>
</div>

<p class="posted">Posted by: <a href="http://www.oaspro.com/" rel="nofollow">kalen</a>  at December 16, 2004 10:33 AM</p>

<div id="c156">
<p>You might consider using drupal.</p>
</div>

<p class="posted">Posted by: <a href="http://shunz.8866.org/" rel="nofollow">shunz</a>  at December 28, 2004 06:46 PM</p>

<div id="c185">
<p>You could try the one I built: <a href="http://blog.terac.com/" rel="nofollow">http://blog.terac.com</a></p>

<p>It fetches the blogs every 3 hours, takes the 5 newest entries from each, sorts and aggregates them, generates static XML, and formats it for display with XSL...</p>
</div>

<p class="posted">Posted by: <a href="http://blog.terac.com/go/andy" rel="nofollow">andy</a>  at January  6, 2005 12:53 PM</p>

<div id="c253">
<p>Comrade Che Dong, this is not a good idea :P<br>
RSS is already out there on the Web. Aggregating it on your own page not only lowers the quality of your home page but also confuses search engines, producing exactly the effect you denounced: "portal sites damaging the enthusiasm to create." Better not to aggregate!</p>
</div>
<img src ="http://www.blogjava.net/pyguru/aggbug/1267.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-17 03:00 <a href="http://www.blogjava.net/pyguru/archive/2005/02/17/1267.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Using RSS News Feeds with Perl</title><link>http://www.blogjava.net/pyguru/archive/2005/02/17/1266.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Wed, 16 Feb 2005 18:59:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/17/1266.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1266.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/17/1266.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1266.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1266.html</trackback:ping><description><![CDATA[<h2>Abstract</h2>

<p>The Rich Site Summary (RSS) format, previously known as the RDF Site
Summary, has quietly become the dominant format for distributing news
headlines on the Web.</p>

 <!--   Don't believe me? Well, go visit
<a href="http://www.xmltree.com" target="_new">http://www.xmltree.com</a>,
a superb online directory for XML, and search for <B>RSS</B>.   -->
<p>In this Mother of Perl tutorial, we will write a short Perl script
(less than 100 lines) that retrieves an XML RSS file from the Web or
local file system and converts it to HTML. Using a Server Side Include
(SSI) or similar method, you can easily add news headlines from any
number of sources to your Web site.</p>


<h2>History</h2>

<p>Where did RSS come from, you ask? Netscape invented the RSS format for "channels" on Netscape Netcenter (<a href="http://my.netscape.com/" target="_new">http://my.netscape.com</a>). It was released to the public in March of 1999. The first non-Netscape Web site to incorporate the new format was <a href="http://www.scripting.com/" target="_new">Scripting News</a>, a popular technology news site run by Dave Winer, president of <a href="http://www.userland.com/" target="_new">Userland Software</a>
(think Frontier). Interestingly enough, Scripting News had been using
its own XML format, scriptingNews, since December of 1997.
</p>
<p>In May of 1999, Dave Winer released a new version of the
scriptingNews XML format, which added new content-rich elements.
Netscape followed suit by adopting most of the new scriptingNews
elements into RSS 0.91, which was released in July of 1999.
</p>
<p>Userland Software also rolled out their own flavor of my.netscape.com. If you haven't already guessed, it's available at <a href="http://my.userland.com/" target="_new">http://my.userland.com</a>.

</p>
<p> As far as I know, RSS is the most widely used XML format on the
Web today. RSS headlines are available for many popular news sites like
<a href="http://slashdot.org/" target="_new">Slashdot</a>,
<a href="http://www.forbes.com/" target="_new">Forbes</a>, and <a href="http://news.com.com/" target="_new">CNET News.com</a>, and the list is growing daily.

</p>
<p>In a time when "stickiness" is a good thing, displaying news headlines
on your Web site can really help give it the extra "oomph" that will
encourage users to return. After all, users can read your
president's bio only so many times.
</p>
<h2>Required Modules</h2>

<p>For rss2html.pl to work on your system, you should have a recent
version of Perl installed, 5.003 or better. 5.005 is recommended. You
will also need the XML::Parser and XML::RSS modules installed.</p>


<p>To install the modules on a *nix system, type:<br>
<code><b>perl -MCPAN -e "install XML::Parser"</b></code><br>
<code><b>perl -MCPAN -e "install XML::RSS"</b></code>

</p>
<p>If you're using a win32 machine (Win95/98/NT), you will need a recent
installation of ActiveState Perl. If you don't have a recent version,
visit <a href="http://www.activestate.com/" target="_new">http://www.activestate.com</a>.</p>


<p>To install XML::Parser on a win32 machine type:<br>
<code><b>ppm install XML-Parser</b></code></p>


<p>To install XML::RSS on a win32 machine (you must have a C compiler and nmake):
</p>
<ul>
<li>Download the module from: <a href="http://search.cpan.org/dist/XML-RSS/">http://search.cpan.org/dist/XML-RSS/</a>
</li><li>Uncompress the zip file and cd to the XML-RSS-0.5 directory
</li><li>type: <b>perl Makefile.PL</b>
</li><li>type: <b>nmake</b>
</li><li>type: <b>nmake install</b>
</li>
</ul>
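<p>Once the installation steps above are done, it can help to confirm that Perl actually finds the modules. The following is a minimal sketch, not part of rss2html.pl; the helper name <code>module_status</code> is made up for this illustration.</p>

```perl
use strict;
use warnings;

# Report whether a module can be loaded by the perl that runs this script.
# Returns the string 'installed' or 'missing'.
# (module_status is a hypothetical helper, not part of rss2html.pl.)
sub module_status {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # XML::RSS -> XML/RSS.pm
    return eval { require $file; 1 } ? 'installed' : 'missing';
}

# The three modules the tutorial relies on.
for my $module (qw(XML::Parser XML::RSS LWP::Simple)) {
    printf "%-12s %s\n", $module, module_status($module);
}
```

<p>If any line reports <code>missing</code>, repeat the matching installation step before running the script.</p>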


<p>Next, we'll examine the RSS format in more detail.

</p>
<p>
      <table align="center" bgcolor="#0033ff" border="0" cellpadding="1" width="250"><tbody><tr><td>
      <table bgcolor="#000000" border="0" cellpadding="3" cellspacing="0" width="100%">
      <tbody><tr align="center" bgcolor="#0033ff"><td><font color="#ffffff"><b>rss2html.pl</b></font></td>
      <td bgcolor="#eeeeee"><a href="http://www.webreference.com/perl/tutorial/8/rss2html.pl" target="_new">Get the source</a></td></tr><tr>
      <td colspan="2" bgcolor="#ffffff">This script converts an RSS file on the Web or local file system to HTML.</td></tr></tbody></table>
      </td></tr></tbody></table></p>







<h2>RSS 0.9</h2>

<p>The first public version of RSS, 0.9, includes basic headline information. 
Below is an example RSS file for Freshmeat.net, a popular news site for 
Linux software:

</p>
<p><table bgcolor="#eeeeee" border="0" cellpadding="5">
<tbody><tr><td>
<code></code><pre>&lt;?xml version="1.0"?&gt;<br>&lt;rdf:RDF<br>xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"<br>xmlns="http://my.netscape.com/rdf/simple/0.9/"&gt;<br>  <br>&lt;channel&gt;<br>&lt;title&gt;freshmeat.net&lt;/title&gt;<br>&lt;link&gt;http://freshmeat.net&lt;/link&gt;<br>&lt;description&gt;the one-stop-shop for all your Linux softwar needs&lt;/description&gt;<br>&lt;/channel&gt;<br> <br>&lt;image&gt;<br>&lt;title&gt;freshmeat.net&lt;/title&gt;<br>&lt;url&gt;http://freshmeat.net/images/fm.mini.jpg&lt;/url&gt;<br>&lt;link&gt;http://freshmeat.net&lt;/link&gt;<br>&lt;/image&gt;<br>  <br>&lt;item&gt;<br>&lt;title&gt;Geheimnis 0.59&lt;/title&gt;<br>&lt;link&gt;http://freshmeat.net/news/1999/06/21/930004162.html&lt;/link&gt;<br>&lt;/item&gt;<br>  <br>&lt;item&gt;<br>&lt;title&gt;Firewall Manager 1.3 PRO&lt;/title&gt;<br>&lt;link&gt;http://freshmeat.net/news/1999/06/21/930004148.html&lt;/link&gt;<br>&lt;/item&gt;<br>  <br>&lt;textinput&gt;<br>&lt;title&gt;quick finder&lt;/title&gt;<br>&lt;description&gt;Use the text input below to search the fresh<br>meat application database&lt;/description&gt;<br>&lt;name&gt;query&lt;/name&gt;<br>&lt;link&gt;http://core.freshmeat.net/search.php3&lt;/link&gt;<br>&lt;/textinput&gt;<br><br>&lt;/rdf:RDF&gt;<br></pre>
</td></tr></tbody></table>

</p>
<p>The first major element is <b><code>channel</code></b> which contains
the following elements:
</p>
<ul>
<li><b><code>title</code></b> - the title of the channel
</li><li><b><code>link</code></b> - the link to the channel Web site
</li><li><b><code>description</code></b> - short description of the channel
</li>
</ul>


<p>An RSS channel may also contain an <b><code>image</code></b>
 element as in the example above which contains the following elements:
</p>
<ul>
<li><b><code>title</code></b> - the text describing the image
</li><li><b><code>url</code></b> - the URL of the image
</li><li><b><code>link</code></b> - the URL that the image is linked to
</li>
</ul>


<p>The <b><code>item</code></b> element contains the real channel
content which is comprised of a <b><code>title</code></b> and a
<b><code>link</code></b> element. An RSS file may contain up to 
15 items.

</p>
<p>An RSS 0.9 file may optionally contain a <b><code>textinput</code></b>
element which allows users to type a string into an HTML text input field and
submit it via the HTTP GET method to the URL specified in the
<b><code>link</code></b> element.

</p>
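<p>To make the <code>textinput</code> behavior concrete, here is a rough sketch of how the request URL could be assembled from the <code>link</code> and <code>name</code> elements of the Freshmeat example. The helpers <code>percent_encode</code> and <code>textinput_url</code> are hypothetical names, the text is assumed to be plain ASCII, and spaces are written as <code>%20</code> although browsers typically send <code>+</code> for form data.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character that is not an
# unreserved URL character. Assumes ASCII (single-byte) input.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Hypothetical helper: build a GET URL from a textinput record and the
# string the user typed into the text field.
sub textinput_url {
    my ($textinput, $value) = @_;
    return $textinput->{link} . '?'
         . percent_encode($textinput->{name}) . '='
         . percent_encode($value);
}

# Values copied from the textinput element in the example above.
my %textinput = (
    title => 'quick finder',
    name  => 'query',
    link  => 'http://core.freshmeat.net/search.php3',
);

print textinput_url(\%textinput, 'perl rss'), "\n";
# prints http://core.freshmeat.net/search.php3?query=perl%20rss
```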
<p>Next, we will examine RSS 0.91 which was released by Netscape in July
of 1999.

 
</p>



<h2>RSS 0.91</h2>

<p>The latest version of RSS added a few new elements. Below is a
sample RSS file from <a href="http://www.xml.com/" target="_new">XML.com</a>,
an excellent XML resource site:

</p>
<p>
</p>
<p><table bgcolor="#eeeeee" border="0" cellpadding="5">
<tbody><tr><td>
<code></code><pre>&lt;?xml version="1.0"?&gt;<br><br>&lt;!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd"&gt;<br><br>&lt;rss version="0.91"&gt;<br><br>  &lt;channel&gt;<br>    &lt;title&gt;XML News and Features from XML.com&lt;/title&gt;<br>    &lt;description&gt;XML.com features a rich mix of information and services for the XML community.&lt;/description&gt;<br>    &lt;language&gt;en-us&lt;/language&gt;<br>    &lt;link&gt;http://xml.com/pub&lt;/link&gt;<br>    &lt;copyright&gt;Copyright 1999, O'Reilly and Associates and Seybold Publications&lt;/copyright&gt;<br>    &lt;managingEditor&gt;dale@xml.com (Dale Dougherty)&lt;/managingEditor&gt;<br>    &lt;webMaster&gt;peter@xml.com (Peter Wiggin)&lt;/webMaster&gt;<br><br>  &lt;image&gt;<br>    &lt;title&gt;XML News and Features from XML.com&lt;/title&gt;<br>    &lt;url&gt;http://xml.com/universal/images/xml_tiny.gif&lt;/url&gt;<br>    &lt;link&gt;http://xml.com/pub&lt;/link&gt;<br>    &lt;width&gt;88&lt;/width&gt;<br>    &lt;height&gt;31&lt;/height&gt;<br>  &lt;/image&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Issue: XML Data Servers&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;Although not everyone agrees that XML should become a full-fledged data-management discipline, object-database vendors are busy repositioning their object-database products as XML data servers. Jon Udell looks at one of these, Object Design's eXcelon and finds it a solid product.&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;O'Reilly Labs Review: Object Design's eXcelon 1.1&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/excelon/index.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;Jon Udell takes a look at eXcelon, Object Design's XML data servers, and explains its user interface and general approach to XML.   
&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Report from Montreal&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/excelon/montreal.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;Lisa Rein reports from MetaStructures 99 and XML Developers' Day.&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Reviews: Bluestone Software's XML Suite: Promising App, Rough Around the Edges&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/bluestone/index.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;Our reviewer tested Bluestone's XML Suite   (XML Server and Visual XML) on the   Windows NT platform, simulating a two-way exchange of   business information between a book publisher and   book stores. The results were encouraging (with a   few caveats).&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Interviews: CBL: Ecommerce Componentry&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/glushko/glushko.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;In this audio interview, Bob Glushko of Commerce One talks about the Common Business Library (CBL) as a set of building blocks for XML document types and schemas used in ecommerce.&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Backends Sharing Data&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/rpc/index.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;What if you could script remote procedure calls between web sites as easily as you can between programs? Edd Dumbill shows how it can be done in PHP.&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Back Issue: XML Suite&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/18/index.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt; Barry Nance runs Bluestone's   XML Suite through the paces. 
The tools show promise   for passing data between databases and XML. But there are still a   few kinks to be worked out.&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;Back Issue: XML-RPC&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/1999/08/11/index.html?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;A major promise of XML is its ability to pass data simply from one place to another, regardless of platform. In this issue, Edd Dumbill shows how to use XML-RPC in PHP to pass data from a web site to a PDA.&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: InDelv XML/XSL Client Version 0.4.&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-a?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;      A posting from Rob Brown reports on the public availability of the new InDelv XML Client version 0.4.  This version represent an upgrade to InDelv's previously released XML Browser, but "it has been renamed as a 'Client' to reflect the fact that it now contains both an XML/XSL browser and an XML/XSL editor.  The browser is available free for all uses. The editor comes packaged with the browser as a demo, which can later be upgraded to a full commercial version. This is a 100% Java appl...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: OpenJade Development Team Releases OpenJade 1.3pre1 (Beta).&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-g?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;      A recent posting from Avi Kivity and the OpenJade Development Team announced the release of OpenJade 1.3pre1 (Beta).  "OpenJade is the DSSSL user community's open source implementation of DSSSL, Document Style Semantics and Specification Language, an ISO standard for rendering SGML and XML documents.  OpenJade is based on James Clark's widely used Jade.  
OpenJade 1.3pre1 is a more complete implementation of the DSSSL standard, and introduces many new features, including (1) Implementat...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: IBM XML Parser Update: XML4C2 Version 2.3.1 Released.&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-b?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;      Dean Roddey posted an announcement for the update of XML4C.  IBM's XML for C++ parser (XML4C) "is a validating XML parser written in a portable subset of C++.  XML4C makes it easy to give an application the ability to read and write XML data. Its two shared libraries provide classes for parsing, generating, manipulating, and validating XML documents.  XML4C is faithful to the XML 1.0 Recommendation and associated standards (DOM 1.0, SAX 1.0). Source code, samples and API documentation ...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: Platform for Privacy Preferences (P3P) Specification Working Draft.&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-h?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;     As part of the W3C P3P Activity, a fifth public working draft of the Platform for Privacy Preferences (P3P) Specification has been published for review by W3C members.  The working draft "describes the Platform for Privacy Preferences (P3P).  P3P enables Web sites to express their privacy practices and enables users to exercise preferences over those practices. 
P3P compliant products will allow users to be informed of site practices (in both machine and human readable formats), to deleg...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: Extended XLink with XSLT.&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-c?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;      Nikita Ogievetsky (President, Cogitech, Inc.) posted an announcement for the availability of slides from the Metastructures '99 presentation "HTML Form Templates with XML.  All in One and One for All.  XSLT template library for WEB applications."  The paper describes building XSLT template library for web applications.  The goal was to "demonstrate data processing on the web made easy with XSL transformations: Generate a data maintenance web with data-structure controlled by XML, scree...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: HyBrick Web Site Reopens.&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-d?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;      A posting from Toshimitsu Suzuki (Fujitsu Laboratories Ltd.) to the XLXP-DEV mailing list recently announced the reopening of the HyBrick Web site.   'HyBrick' is "an advanced SGML/XML browser developed by Fujitsu Laboratories, the research arm of Fujitsu.  HyBrick is based on an architecture that supports advanced linking and formatting capabilities.  HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of James Clark's SP and Jade.   
HyBrick supports: (1) Both v...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;item&gt;<br>    &lt;title&gt;News: Extended DocBook Synopses Version 1.0.&lt;/title&gt;<br>    &lt;link&gt;http://xml.com/pub/coverpage/newspage.html#ni1999-08-27-e?wwwrrr_rss&lt;/link&gt;<br>    &lt;description&gt;      Norman Walsh has posted an announcement for a preliminary release of 'Extended DocBook Synopses'.  Extended DocBook Synopses is a customization layer that extends DocBook, "adding a function synopsis element, ClassSynopsis for modern, mostly object-oriented, programming languages such as Java, C++, Perl, and IDL."  DocBook is an SGML [and XML] DTD maintained by the DocBook Technical Committee of OASIS that particularly well suited to books and papers about computer hardware and softwar...<br>&lt;/description&gt;<br>  &lt;/item&gt;<br><br>  &lt;/channel&gt;<br>&lt;/rss&gt;<br></pre>
</td></tr></tbody></table>

</p>
<p>Notice that there are more descriptive elements for the channel, image,
and item elements. These are referred to as "fat elements" because they
contain a more detailed description of each channel item.

 
</p>



<h2>The XML::RSS Module</h2>

<p>Now that you've had a chance to glance at two RSS examples, it's time to 
introduce the XML::RSS module. XML::RSS is a subclass of XML::Parser,
a Perl module maintained by Clark Cooper that utilizes James Clark's
Expat C library. XML::RSS was developed to simplify the task of 
manipulating and parsing RSS files. A deep understanding of XML is not 
a prerequisite for using XML::RSS since the XML details are hidden 
inside the class interface.

</p>
<p>While XML::RSS is capable of creating RSS files, we will be
focusing on parsing existing RSS files in this column. You can read
more about the capabilities of XML::RSS in the module's
documentation or by typing:<br>
<b><code>perldoc XML::RSS</code></b>

</p>
<h2>The Code</h2>

<p>Well, let's look at the code, shall we? 
<a href="http://www.webreference.com/perl/tutorial/8/source.html#16" target="_new">Lines 16-17</a> load the XML::RSS
and LWP::Simple modules. We've already talked about XML::RSS in brief, but
what does LWP::Simple do? Good question! The answer is simple (pun intended).
It's a procedural interface for interacting with a Web server. It's
also the little cousin of LWP::UserAgent, a fuller object-oriented interface.
We'll be using one of the library's subroutines later in the code to fetch
an RSS file from the Web.

</p>
<p>In <a href="http://www.webreference.com/perl/tutorial/8/source.html#20" target="_new">lines 20-21</a> we initialize two
variables that we're going to use later.

</p>
<p><a href="http://www.webreference.com/perl/tutorial/8/source.html#25" target="_new">Line 25</a> starts the main
code body. The first thing we do is verify that the user
typed exactly one command-line parameter. This parameter is then assigned
to the <b><code>$arg</code></b> variable in 
<a href="http://www.webreference.com/perl/tutorial/8/source.html#28" target="_new">line 28</a>.

</p>
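<p>The argument check described above might look roughly like the following. This is a sketch rather than the script's actual code: the helper name <code>single_argument</code> and the wording of the usage message are assumptions.</p>

```perl
use strict;
use warnings;

# Return the single command-line argument, or die with a usage message
# when the user supplied none or more than one.
# (single_argument and the message text are hypothetical.)
sub single_argument {
    my @args = @_;
    die "Usage: rss2html.pl <RSS file or URL>\n" unless @args == 1;
    return $args[0];
}

# In a script this would be called as:
#   my $arg = single_argument(@ARGV);
```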
<p>Next we create a new instance of the XML::RSS class and assign the 
reference to the <b><code>$rss</code></b> variable on 
<a href="http://www.webreference.com/perl/tutorial/8/source.html#31" target="_new">line 31</a>.

</p>
<p>Now we must determine whether the command-line parameter the user
entered is an HTTP URL or a file on the local file system 
(<a href="http://www.webreference.com/perl/tutorial/8/source.html#34" target="_new">lines 34-46</a>). On 
<a href="http://www.webreference.com/perl/tutorial/8/source.html#34" target="_new">line 34</a>, we use a
regular expression to look for the characters <b><code>http:</code></b>.

</p>
<p>If the command-line argument starts with these characters, we can safely
assume that the user intends to retrieve an RSS file from a Web server.
On <a href="http://www.webreference.com/perl/tutorial/8/source.html#35" target="_new">line 35</a> we pass the
argument to the <b><code>get()</code></b> function, which is a part of
LWP::Simple, and assign the results to the <b><code>$content</code></b>
variable. On <a href="http://www.webreference.com/perl/tutorial/8/source.html#36" target="_new">line 36</a> we call
<code>die()</code> if <code>$content</code> is empty. If this happens,
it means there was an error retrieving the RSS file. If the RSS file
was downloaded successfully, <code>$rss-&gt;parse($content)</code> is called
which parses the RSS file and stores the results in the object's internal
structure (<a href="http://www.webreference.com/perl/tutorial/8/source.html#38" target="_new">line 38</a>).

</p>
<p>If the command-line argument does not contain the <code>http:</code>
characters, we assume the argument is a file instead of a URL on
<a href="http://www.webreference.com/perl/tutorial/8/source.html#41" target="_new">lines 41-46</a>. The
first thing we do is assign the value of <b><code>$arg</code></b>
to the <b><code>$file</code></b> variable and test for the existence of 
the file (<a href="http://www.webreference.com/perl/tutorial/8/source.html#42" target="_new">lines 42-43</a>).

</p>
<p>Then we call <b><code>$rss-&gt;parsefile($file)</code></b>
 (<a href="http://www.webreference.com/perl/tutorial/8/source.html#41" target="_new">line 45</a>), which parses
the RSS file and stores the results in the object's internal structure.
The <code>parsefile()</code> method parses a file, whereas the 
<code>parse()</code> method parses the string that's passed to it.

</p>
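<p>Put together, the URL-or-file decision walked through above can be sketched as follows. The sub names <code>source_kind</code> and <code>load_feed</code> and the error messages are illustrative assumptions; the calls to <code>get()</code>, <code>parse()</code> and <code>parsefile()</code> are the ones the text describes. The pattern here is assumed to be anchored at the start of the argument and case-insensitive.</p>

```perl
use strict;
use warnings;

# Decide how to treat the command-line argument: anything that starts
# with "http:" is taken to be a URL, everything else a local file.
# Note that an "https:" URL would be treated as a file name by this check.
sub source_kind {
    my ($arg) = @_;
    return $arg =~ /^http:/i ? 'url' : 'file';
}

# Load a feed from a URL or a local file and return the XML::RSS object.
# The modules are loaded on demand, so source_kind() works without them.
sub load_feed {
    my ($arg) = @_;
    require XML::RSS;
    my $rss = XML::RSS->new;

    if (source_kind($arg) eq 'url') {
        require LWP::Simple;
        # get() returns undef when the request fails.
        my $content = LWP::Simple::get($arg);
        die "Could not retrieve $arg\n"
            unless defined $content && length $content;
        $rss->parse($content);        # parse a string
    }
    else {
        die "File \"$arg\" does not exist\n" unless -e $arg;
        $rss->parsefile($arg);        # parse a file on disk
    }
    return $rss;
}
```

<p>Keeping the decision in its own small sub makes it easy to check on its own, without touching the network or the file system.</p>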
<p>Lastly, we call the <b><code>print_html</code></b> subroutine on 
<a href="http://www.webreference.com/perl/tutorial/8/source.html#49" target="_new">line 49</a>, which converts
the RSS object into nicely formatted HTML.

</p>
<h2>print_html</h2>

<p>As you examine this subroutine, you will begin to understand
the internal structure of the XML::RSS object. The critical portion
of the subroutine is contained on
<a href="http://www.webreference.com/perl/tutorial/8/source.html#76" target="_new">lines 76-79</a>. In this 
<code>foreach</code> loop, we iterate over each of the RSS items.

</p>
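<p>As a rough illustration of that loop, the sketch below walks a list of items and emits one link per headline. It is not the code from rss2html.pl: the sub names are made up, and a hand-built hash stands in for the parsed object, on the assumption that the items are available as an array reference under the <code>items</code> key, each with a <code>title</code> and a <code>link</code>.</p>

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attributes.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turn the items of an RSS-like structure into an HTML list.
# (items_to_html is a hypothetical stand-in for print_html.)
sub items_to_html {
    my ($rss) = @_;
    my $html = "<ul>\n";
    foreach my $item (@{ $rss->{items} }) {
        # Skip incomplete items instead of printing empty links.
        next unless defined $item->{title} && defined $item->{link};
        $html .= sprintf qq{<li><a href="%s">%s</a></li>\n},
            escape_html($item->{link}), escape_html($item->{title});
    }
    return $html . "</ul>\n";
}

# Hand-built data shaped like the Freshmeat example shown earlier.
my $rss = {
    items => [
        { title => 'Geheimnis 0.59',
          link  => 'http://freshmeat.net/news/1999/06/21/930004162.html' },
        { title => 'Firewall Manager 1.3 PRO',
          link  => 'http://freshmeat.net/news/1999/06/21/930004148.html' },
    ],
};

print items_to_html($rss);
```

<p>Escaping the title and link is a design choice of this sketch; it keeps a stray <code>&amp;</code> or <code>&lt;</code> in a headline from breaking the page that includes the output.</p>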
<p>Next, let's take a look at rss2html.pl in action.<br>
</p>
<h2>rss2html.pl in Action</h2>

<p>I've added the following cron jobs that run once per hour on 
the Webreference server (Scheduler is the NT counterpart):

</p>
<p>
<code>rss2html.pl http://slashdot.org/slashdot.rdf &gt; slashdot.html<br>
rss2html.pl http://freshmeat.net/backend/fm.rdf &gt; freshmeat.html<br>
rss2html.pl http://www.linuxtoday.com/backend/my-netscape.rdf &gt; linuxtoday.html<br>
rss2html.pl http://www.xml.com/xml/news.rdf &gt; xmlnews.html<br>
rss2html.pl http://www.perlxml.com/rdf/moperl.rdf &gt; mop.html

</code></p>
<p>The commands above fetch the RSS files off the Web and convert them to 
HTML. Using Server-Side Includes (SSI), I've included the results below:

</p>
<p>
<table bgcolor="#000000" border="0" width="200"><tbody><tr><td>
<table bgcolor="#ffffff" border="0" cellpadding="4" cellspacing="1" width="100%">
  <tbody><tr>
  <td align="center" bgcolor="#eeeeee" valign="middle"><font color="#000000" face="Arial,Helvetica"><b><a href="http://slashdot.org/">Slashdot: </a></b></font></td></tr>
<tr><td>
<center>
<p><a href="http://slashdot.org/"><img src="http://images.slashdot.org/topics/topicslashdot.gif" alt="Slashdot: " border="0"></a></p></center><p>
</p><li><a href="http://slashdot.org/article.pl?sid=05/02/16/1354235&amp;from=rss">WiMax Technology Could Blanket the US?</a><br>
</li><li><a href="http://slashdot.org/article.pl?sid=05/02/16/1244221&amp;from=rss">Hitchhiker's Guide to the Galaxy Trailer</a><br>
</li><li><a href="http://it.slashdot.org/article.pl?sid=05/02/16/1529213&amp;from=rss">Microsoft Anti-Spyware to Be Free of Charge</a><br>
</li><li><a href="http://science.slashdot.org/article.pl?sid=05/02/16/1514241&amp;from=rss">ACM to Honor TCP/IP Creators with Turing Award</a><br>
</li><li><a href="http://yro.slashdot.org/article.pl?sid=05/02/16/1343233&amp;from=rss">New Rules Proposed on Electronic Evidence</a><br>
</li><li><a href="http://slashdot.org/article.pl?sid=05/02/16/1330218&amp;from=rss">Intel From Behind the Curtain</a><br>
</li><li><a href="http://science.slashdot.org/article.pl?sid=05/02/16/1250234&amp;from=rss">Kyoto Protocol Comes Into Force</a><br>
</li><li><a href="http://slashdot.org/article.pl?sid=05/02/16/0451246&amp;from=rss">Cory Doctorow's 'I, Robot' Posted</a><br>
</li><li><a href="http://slashdot.org/article.pl?sid=05/02/16/0312249&amp;from=rss">Straczynski Offers To Re-Boot Star Trek</a><br>
</li><li><a href="http://linux.slashdot.org/article.pl?sid=05/02/16/0157239&amp;from=rss">Building The MareNostrum COTS Supercomputer</a><br>
<form method="get" action="http://slashdot.org/search.pl">
Search Slashdot stories<br> 
<input name="query" type="text"><br>
<input value="Search Slashdot" type="submit">
</form>
</li></td>
</tr>
</tbody></table>
</td></tr></tbody></table>

<table bgcolor="#000000" border="0" width="200"><tbody><tr><td>
<table bgcolor="#ffffff" border="0" cellpadding="4" cellspacing="1" width="100%">
  <tbody><tr>
  <td align="center" bgcolor="#eeeeee" valign="middle"><font color="#000000" face="Arial,Helvetica"><b><a href="http://freshmeat.net/">freshmeat.net announcements (Global)</a></b></font></td></tr>
<tr><td>
<li><a href="http://freshmeat.net/releases/188020/">Zolera SOAP Infrastructure 1.7 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187991/">XBible 3.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188019/">PDFdirectory 0.2.04 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188018/">XC-AST 0.7.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188017/">Imagero Reader 1.73 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188016/">GNU ccAudio2 0.4.0 (Testing branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188013/">quisp 1.27 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188011/">shsql 1.27 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188010/">samhain 2.0.4 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188008/">CANDIDv2 2.40 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/188007/">libferris 1.1.46 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188006/">FUDforum 2.6.10 (Stable branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188005/">HORRORss 1.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188003/">Roxen WebServer 4.0.325-release 4 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188001/">Configuration File Library 1.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/188000/">Goggles 0.7.11 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187998/">Pluto DCE library 2.0.0.9 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187997/">Pluto Bi-Directional Comm library 2.0.0.9 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187995/">zen Platform 2.0.4 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713776;10469304;d?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730402idlp">ADV: Gimme Shelter</a><br>
</li><li><a href="http://freshmeat.net/releases/187994/">MIME Email message class 2005.02.15 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187993/">ELF statifier 1.6.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187992/">SekHost 1.2 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187989/">ulogd 1.21 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187987/">Journaled Files LIBrary 0.1.0-0.0.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187986/">FastTemplate.php3 1.2.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187985/">iptables 1.3.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187983/">Very Simple Control Protocol Daemon 0.1.4 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187967/">C Parameters 0.9.0 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187965/">eXtreme Project Management Tool 0.7beta1 (Development branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187954/">gccc 1.099 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187982/">Magellan Metasearch 1.00-RC3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187981/">CAN Abstraction Layer 0.1.4 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187980/">TreeLine 0.11.1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187979/">GNOME Sensors Applet 0.6.1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187977/">iODBC Driver Manager and SDK 3.52.2 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187976/">DISLIN 8.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187975/">Pluto Home 2.0.0.9 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187973/">Expense Report Software 1.07 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187971/">Yzis M3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187970/">Q Light Controller 2.4.1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187968/">Menc 0.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187963/">Another File Integrity Checker 2.7-0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187962/">BibShelf 1.4.0-1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187961/">Eleven 1.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187959/">Linice 2.5 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187957/">JDirt 1.3 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187956/">Nazghul 0.4.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187955/">Rush 2005 0.4.10 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187953/">Monesa 0.24.1 (Stable branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187952/">Persist.NET 0.9.1 beta (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187950/">Roundup 0.8 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187949/">Aquarium Web Application Framework 2.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187948/">sn9c102 Video Grabber 1.7.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187947/">GRAVEMAN 0.3.8 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187946/">viewurpmi 0.2 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187945/">NuFW 1.0-rc1 (Stable branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187944/">OpenSceneGraph Editor 0.6.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187943/">HPGS - HPGl Script 0.6.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187942/">lustre 1.4.1-rc1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187938/">IBM HeapAnalyzer 1.3.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187937/">CANDIDv2 2.3.6 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187935/">NetSPoC 2.5 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187932/">Metal Mech 0.0.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187931/">radmind 1.5.0 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187930/">iPodBackup 1.4 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187907/">db4o 4.3 (Mono branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187927/">web2ldap 0.15.9 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187926/">Mantissa 5.6 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187925/">Drone IRC Bot 1.2 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187923/">NoFuss POS 0.06 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187922/">xlog 1.1 (Stable branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187921/">ActiveBPEL 1.0.7 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187920/">Java Embedded Python 1.1 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187919/">Neveredit 0.8 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187917/">The friendly interactive shell 1.1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187915/">Webmatic 2.0.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187914/">JTMOS Operating System Build 7700 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187913/">BIRD 1.0.10 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187916/">Tune in 2 Me 050215 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187912/">HMSCalc 3.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187911/">Information Currency Web Services 0.0.4 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187908/">Nitro + Og 0.10.0 (Default branch)</a><br>
</li><li><a href="http://ad.doubleclick.net/clk;11713626;10469167;e?http://infoworld.com/spotlights/sbc/main.html?lpid0101035400730403idlp">ADV: Dialing for Dollars</a><br>
</li><li><a href="http://freshmeat.net/releases/187910/">Just For Fun Network Management System 0.8.0 (Stable branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187906/">rxvt-unicode 5.1 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187905/">PHPEmaillist 0.3 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187904/">ulogd-php 1.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187903/">mod_access_rbl2 1.0 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187886/">5lack10.1 0.8 (Default branch)</a><br>
</li><li><a href="http://freshmeat.net/releases/187585/">profusemail 0.9.1 (Default branch)</a><br>
</li></td>
</tr>
</tbody></table>
</td></tr></tbody></table>

<table bgcolor="#000000" border="0" width="200"><tbody><tr><td>
<table bgcolor="#ffffff" border="0" cellpadding="4" cellspacing="1" width="100%">
  <tbody><tr>
  <td align="center" bgcolor="#eeeeee" valign="middle"><font color="#000000" face="Arial,Helvetica"><b><a href="http://linuxtoday.com/">Linux Today</a></b></font></td></tr>
<tr><td>
<center>
<p><a href="http://linuxtoday.com/"><img src="http://linuxtoday.com/pics/ltnet.png" alt="Linux Today" border="0"></a></p></center><p>
</p><li><a href="http://linuxtoday.com/news_story.php3?ltsn=2005-02-16-012-26-NW-CY">LWN.net: FSF Announces New Executive Director</a><br>
</li><li><a href="http://linuxtoday.com/news_story.php3?ltsn=2005-02-16-010-26-NW-EV-NV">LinuxPlanet: Novell Takes Enterprise Security Focus</a><br>
</li><li><a href="http://linuxtoday.com/news_story.php3?ltsn=2005-02-16-019-26-OS-BZ-LL">CNET News: HP: Don't Like Software Patents? Learn to Deal</a><br>
</li><li><a href="http://linuxtoday.com/news_story.php3?ltsn=2005-02-16-018-26-NW-BZ-EV">internetnews.com: CA Chief: Innovate, Cooperate</a><br>
</li><li><a href="http://linuxtoday.com/news_story.php3?ltsn=2005-02-16-011-26-NW-EV">Boston Herald: Linux Show Plans BCEC Move</a><br>
<form method="get" action="http://linuxtoday.com/search.php3">
Search Linux Today:<br> 
<input name="query" type="text"><br>
<input value="Search" type="submit">
</form>
</li></td>
</tr>
</tbody></table>
</td></tr></tbody></table>

<table bgcolor="#000000" border="0" width="200"><tbody><tr><td>
<table bgcolor="#ffffff" border="0" cellpadding="4" cellspacing="1" width="100%">
  <tbody><tr>
  <td align="center" bgcolor="#eeeeee" valign="middle"><font color="#000000" face="Arial,Helvetica"><b><a href="http://www.xml.com/">XML.com</a></b></font></td></tr>
<tr><td>
<center>
<p><a href="http://www.xml.com/"><img src="http://www.xml.com/universal/images/xml_tiny.gif" alt="XML.com" border="0"></a></p></center><p>
</p><li><a href="http://www.xml.com/pub/a/2005/02/09/xml-http-request.html">Features: Very Dynamic Web Interfaces</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/02/09/cssorxsl.html">Features: Comparing CSS and XSL: A Reply from Norm Walsh</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/02/09/xforms.html">Features: Top 10 XForms Engines</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/02/02/tmapi.html">Features: An Introduction to TMAPI</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/02/02/silent.html">XML Tourist: The Silent Soundtrack</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/02/02/xpath2.html">Transforming XML: The XPath 2.0 Data Model</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/26/simile.html">Features: SIMILE: Practical Metadata for the Semantic Web</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/26/hacking-ooo.html">Features: Hacking Open Office</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/26/formtax.html">Features: Formal Taxonomies for the U.S. Government</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/19/review.html">Features: Reviewing the Architecture of the World Wide Web</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/19/print.html">Features: Printing XML: Why CSS Is Better than XSL</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/19/amara.html">Python and XML: Introducing the Amara XML Toolkit</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/12/comega.html">Features: Introducing Comega</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/12/saml2.html">Features: SAML 2: The Building Blocks of Federated Identity</a><br>
</li><li><a href="http://www.xml.com/pub/a/2005/01/05/restful.html">The Restful Web: Amazon's Simple Queue Service</a><br>
<p><sub>Copyright 2004, O'Reilly Media, Inc.</sub></p>
</li></td>
</tr>
</tbody></table>
</td></tr></tbody></table>



 
</p>





<h2>Conclusion</h2>

<p>Well, we've shown in this column that Perl can really pack a wallop
in a short amount of code. With rss2html.pl, anyone can automatically
add a news feed to their Web site.

</p>
<p>For more information on RSS, you might try visiting the following sites:
</p>
<ul>
<li><a href="http://my.userland.com/" target="_new">http://my.userland.com</a>
</li><li><a href="http://www.scripting.com/" target="_new">http://www.scripting.com</a>
</li><li><a href="http://www.perlxml.com/" target="_new">http://www.perlxml.com</a>
</li>
</ul>


<p>
      <table align="center" bgcolor="#0033ff" border="0" cellpadding="1" width="250"><tbody><tr><td>
      <table bgcolor="#000000" border="0" cellpadding="3" cellspacing="0" width="100%">
      <tbody><tr align="center" bgcolor="#0033ff"><td><font color="#ffffff"><b>rss2html.pl</b></font></td>
      <td bgcolor="#eeeeee"><a href="http://www.webreference.com/perl/tutorial/8/rss2html.pl" target="_new">Get the source</a></td></tr><tr>
      <td colspan="2" bgcolor="#ffffff">This script converts an RSS file on the Web or local file system to HTML.</td></tr></tbody></table>
      </td></tr></tbody></table></p>







<img src ="http://www.blogjava.net/pyguru/aggbug/1266.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-17 02:59 <a href="http://www.blogjava.net/pyguru/archive/2005/02/17/1266.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>The Python Web services developer: RSS for Python</title><link>http://www.blogjava.net/pyguru/archive/2005/02/17/1265.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Wed, 16 Feb 2005 18:48:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/17/1265.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1265.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/17/1265.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1265.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1265.html</trackback:ping><description><![CDATA[<span class="atitle2">Content syndication for the Web</span><br>
<table border="0" cellpadding="0" cellspacing="0">
<tbody><tr align="left" valign="top"><td><p>Level: Introductory</p></td></tr></tbody>
</table>
<table border="0" cellpadding="0" cellspacing="0">
<tbody><tr align="left" valign="top"><td>





      <br>
</td></tr></tbody>
</table>
<p><a href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#author1"><name>Mike Olson</name></a> (<a href="mailto:mike.olson@fourthought.com?cc=&amp;subject=RSS%20for%20Python">mike.olson@fourthought.com</a>), Principal Consultant, Fourthought, Inc.<br><a href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#author2"><name>Uche Ogbuji</name></a> (<a href="mailto:uche.ogbuji@fourthought.com?cc=&amp;subject=RSS%20for%20Python">uche.ogbuji@fourthought.com</a>), Principal Consultant, Fourthought, Inc.<br><br> 13 Nov  2002</p>
<blockquote><img src="http://www-106.ibm.com/developerworks/i/c-pythondev.gif" alt="Column icon" align="left" border="0" height="38" width="38">RSS
is one of the most successful XML services ever. Despite its chaotic
roots, it has become the community standard for exchanging content
information across Web sites. Python is an excellent tool for RSS
processing, and Mike Olson and Uche Ogbuji introduce a couple of
modules available for this purpose.</blockquote>

<p>RSS is an abbreviation with several expansions: "RDF Site Summary,"
"Really Simple Syndication," "Rich Site Summary," and perhaps others.
Behind this confusion of names is an astonishing amount of politics for
such a mundane technological area. RSS is a simple XML format for
distributing summaries of content on Web sites. It can be used to share
all sorts of information including, but not limited to, news flashes,
Web site updates, event calendars, software updates, featured content
collections, and items on Web-based auctions.</p>


<p>RSS was created by Netscape in 1999 to allow content to be gathered
from many sources into the Netcenter portal (which is now defunct). The
UserLand community of Web enthusiasts became early supporters of RSS,
and it soon became a very popular format. The popularity led to strains
over how to improve RSS to make it even more broadly useful. These
strains led to a fork in RSS development. One group chose an approach
based on RDF, in order to take advantage of the great number of RDF
tools and modules, and another chose a more stripped-down approach. The
former is called RSS 1.0, and the latter RSS 0.91. Just last month the
battle flared up again with a new version of the non-RDF variant of
RSS, which its creators are calling "RSS 2.0."</p>


<p>RSS 0.91 and 1.0 are very popular, and used in numerous portals and
Web logs. In fact, the blogging community is a great user of RSS, and
RSS lies behind some of the most impressive networks of XML exchange in
existence. These networks have grown organically, and are really the
most successful networks of XML services anywhere. RSS is an XML
service by virtue of being an exchange of XML information over an
Internet protocol (the vast majority of RSS exchange is simple HTTP GET
of RSS documents). In this article, we introduce just a few of the many
Python tools available for working with RSS. We don't provide a
technical introduction to RSS, because you can find this in so many
other articles (see <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#resources">Resources</a>).
We recommend first that you gain a basic familiarity with RSS, and that
you understand XML. Understanding RDF is not required.</p>

<p>[We consider RSS an 'XML service' rather than a 'Web service' due to
the use of XML descriptions but the lack of use of WSDL. -- Editors]</p>

<p><a name="h1"><span class="atitle2">RSS.py</span></a><br>
Mark Nottingham's RSS.py is a Python library for RSS processing. It is
very complete and well-written. It requires Python 2.2 and PyXML 0.7.1.
Installation is easy; just download the Python file from Mark's home
page and copy it to somewhere in your <code>PYTHONPATH</code>.
</p>


<p>Most users of RSS.py need only concern themselves with two classes it provides: <code>CollectionChannel</code> and <code>TrackingChannel</code>.  The latter seems the more useful of the two.  <code>TrackingChannel</code> is a data structure that contains all the RSS data indexed by the key of each item.  <code>CollectionChannel</code>
is a similar data structure, but organized more as RSS documents
themselves are, with the top-level channel information pointing to the
item details using hash values for the URLs. You will probably use the
utility namespace declarations in the <code>RSS.ns</code> structure.  <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code1">Listing 1</a>
is a simple script that downloads and parses an RSS feed for Python
news, and prints out all the information from the various items in a
simple listing.</p>


<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
<tbody><tr><td><pre><code><br>  <br>from RSS import ns, CollectionChannel, TrackingChannel<br><br>#Create a tracking channel, which is a data structure that<br>#Indexes RSS data by item URL<br>tc = TrackingChannel()<br><br>#Returns the RSSParser instance used, which can usually be ignored<br>tc.parse("http://www.python.org/channews.rdf")<br><br>RSS10_TITLE = (ns.rss10, 'title')<br>RSS10_DESC = (ns.rss10, 'description')<br><br>#You can also use tc.keys()<br>items = tc.listItems()<br>for item in items:<br>    #Each item is a (url, order_index) tuple<br>    url = item[0]<br>    print "RSS Item:", url<br>    #Get all the data for the item as a Python dictionary<br>    item_data = tc.getItem(item)<br>    print "Title:", item_data.get(RSS10_TITLE, "(none)")<br>    print "Description:", item_data.get(RSS10_DESC, "(none)")<br><br><br><br></code></pre></td></tr></tbody>
</table>


<p>We start by creating a <code>TrackingChannel</code> instance, and then populate it with data parsed from the RSS feed at <code>http://www.python.org/channews.rdf</code>.
RSS.py uses tuples as the property names for RSS data. This may seem an
unusual approach to those not used to XML processing techniques, but it
is actually a very useful way of being very precise about what was in
the original RSS file. In effect, an RSS 0.91 <code>title</code>
element is not considered to be equivalent to an RSS 1.0 one. There is
enough data for the application to ignore this distinction, if it
likes, by ignoring the namespace portion of each tuple; but the basic
API is wedded to the syntax of the original RSS file, so that this
information is not lost. In the code, we use this property data to
gather all the items from the news feed for display. Notice that we are
careful not to assume which properties any particular item might have.
We retrieve properties using the safe form as seen in the code below.</p>


<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
<tbody><tr><td><pre><code><br><br>    print "Title:", item_data.get(RSS10_TITLE, "(none)")<br><br></code></pre></td></tr></tbody>
</table>


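<p>This form provides a default value if the property is not found. Compare it with the direct lookup in the following example, which raises a <code>KeyError</code> on a Python dictionary when the key is missing.</p>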


<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
<tbody><tr><td><pre><code><br><br>    print "Title:", item_data[RSS10_TITLE]<br><br></code></pre></td></tr></tbody>
</table>
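
<p>The two lookups above can be tried without RSS.py installed. The following is a minimal sketch, written for Python 3 and using only the standard library, so it differs from the Python 2 listings in this article. The <code>item_properties()</code> and <code>get_local()</code> helpers and the sample item are hypothetical names made up for illustration; the dictionary only imitates the tuple-keyed layout described above and does not reproduce the internals of RSS.py.</p>

```python
import xml.etree.ElementTree as ET

RSS10_NS = "http://purl.org/rss/1.0/"  # namespace URI of RSS 1.0 elements

# Hypothetical sample item, modeled on the first entry of the example feed
SAMPLE_ITEM = (
    '<item xmlns="http://purl.org/rss/1.0/">'
    "<title>Python 2.2.2b1</title>"
    "<link>http://www.python.org/2.2.2/</link>"
    "</item>"
)


def item_properties(xml_text):
    """Map (namespace, local name) tuples to the text of each child element."""
    props = {}
    for child in ET.fromstring(xml_text):
        # ElementTree spells qualified names as "{namespace}local"
        if child.tag.startswith("{"):
            namespace, local = child.tag[1:].split("}", 1)
        else:
            namespace, local = None, child.tag
        props[(namespace, local)] = child.text
    return props


def get_local(props, local, default=None):
    """Look a property up by local name only, ignoring its namespace."""
    for (_namespace, name), value in props.items():
        if name == local:
            return value
    return default


item_data = item_properties(SAMPLE_ITEM)
RSS10_TITLE = (RSS10_NS, "title")
RSS10_DESC = (RSS10_NS, "description")

# Safe form: a default is returned when the property is absent
print("Title:", item_data.get(RSS10_TITLE, "(none)"))
print("Description:", item_data.get(RSS10_DESC, "(none)"))

# Ignoring the namespace portion of each tuple
print("Title, any namespace:", get_local(item_data, "title", "(none)"))

# Unsafe form: a direct lookup of a missing property raises KeyError
try:
    item_data[RSS10_DESC]
except KeyError:
    print("Direct lookup raised KeyError")
```

<p>Keeping the namespace in the key is what lets an application tell an RSS 1.0 <code>title</code> from an RSS 0.91 one, while a helper along the lines of <code>get_local()</code> is one way to ignore that distinction when it does not matter.</p>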


<p>This precaution is necessary because you never know what elements are used in an RSS feed.  <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code2">Listing 2</a> shows the output from <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code1">Listing 1</a>.</p>


<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
<tbody><tr><td><pre><code><br><br>$ python listing1.py <br>RSS Item: http://www.python.org/2.2.2/<br>Title: Python 2.2.2b1<br>Description: (none)<br>RSS Item: http://sf.net/projects/spambayes/<br>Title: spambayes project<br>Description: (none)<br>RSS Item: http://www.mems-exchange.org/software/scgi/<br>Title: scgi 0.5<br>Description: (none)<br>RSS Item: http://roundup.sourceforge.net/<br>Title: Roundup 0.4.4<br>Description: (none)<br>RSS Item: http://www.pygame.org/<br>Title: Pygame 1.5.3<br>Description: (none)<br>RSS Item: http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/<br>Title: Pyrex 0.4.4.1<br>Description: (none)<br>RSS Item: http://www.tundraware.com/Software/hb/<br>Title: hb 1.88<br>Description: (none)<br>RSS Item: http://www.tundraware.com/Software/abck/<br>Title: abck 2.2<br>Description: (none)<br>RSS Item: http://www.terra.es/personal7/inigoserna/lfm/<br>Title: lfm 0.9<br>Description: (none)<br>RSS Item: http://www.tundraware.com/Software/waccess/<br>Title: waccess 2.0<br>Description: (none)<br>RSS Item: http://www.krause-software.de/jinsitu/<br>Title: JinSitu 0.3<br>Description: (none)<br>RSS Item: http://www.alobbs.com/pykyra/<br>Title: PyKyra 0.1.0<br>Description: (none)<br>RSS Item: http://www.havenrock.com/developer/treewidgets/index.html<br>Title: TreeWidgets 1.0a1<br>Description: (none)<br>RSS Item: http://civil.sf.net/<br>Title: Civil 0.80<br>Description: (none)<br>RSS Item: http://www.stackless.com/<br>Title: Stackless Python Beta<br>Description: (none)<br><br></code></pre></td></tr></tbody>
</table>


<p>Of course, you would expect somewhat different output because the
news items will have changed by the time you try it. The RSS.py channel
objects also provide methods for adding and modifying RSS information.
You can write the result back to RSS 1.0 format using the <code>output()</code> method.  Try this out by writing back out the information parsed in <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code1">Listing 1</a>.  Kick off the script in interactive mode by running: <code>python -i listing1.py</code>.  At the resulting Python prompt, run the following example.</p>


<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
<tbody><tr><td><pre><code><br><br>&gt;&gt;&gt; result = tc.output(items)<br>&gt;&gt;&gt; print result<br><br></code></pre></td></tr></tbody>
</table>


<p>The result is an RSS 1.0 document printed out. You must have RSS.py
version 0.42 or more recent for this to work, because there is a bug in the <code>output()</code> method in earlier versions.</p>


<p><a name="h2"><span class="atitle2">rssparser.py</span></a><br>
Mark Pilgrim offers another module for RSS file parsing. It doesn't
provide all the features and options that RSS.py does, but it does
offer a very liberal parser, which deals well with all the confusing
diversity in the world of RSS. To quote from the rssparser.py page:
</p>


<blockquote xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/">You
see, most RSS feeds suck. Invalid characters, unescaped ampersands
(Blogger feeds), invalid entities (Radio feeds), unescaped and invalid
HTML (The Register's feed most days). Or just a bastardized mix of RSS
0.9x elements with RSS 1.0 elements (Movable Type feeds).</blockquote>


<blockquote xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/">Then
there are feeds, like Aaron's feed, which are too bleeding edge. He
puts an excerpt in the description element but puts the full text in
the content:encoded element (as CDATA). This is valid RSS 1.0, but
nobody actually uses it (except Aaron), few news aggregators support
it, and many parsers choke on it. Other parsers are confused by the new
elements (guid) in RSS 0.94 (see Dave Winer's feed for an example). And
then there's Jon Udell's feed, with the <code>fullitem</code> element that he just sort of made up.</blockquote>


<p>It's funny to consider this in the light of the fact that XML and
Web services are supposed to increase interoperability. Anyway,
rssparser.py is designed to deal with all the madness.</p>
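
<p>To see why feeds like these trouble a strict parser, consider the unescaped ampersand mentioned in the quote. The following is a minimal sketch for Python 3 that uses the standard library's <code>xml.etree.ElementTree</code> as a stand-in for any conforming XML parser; it is not what RSS.py or rssparser.py use internally, and the <code>try_parse()</code> helper and both sample strings are hypothetical.</p>

```python
import xml.etree.ElementTree as ET

# Two hypothetical feed fragments that differ only in the ampersand
GOOD = "<item><title>Tom &amp; Jerry</title></item>"
BAD = "<item><title>Tom & Jerry</title></item>"  # unescaped ampersand


def try_parse(xml_text):
    """Return the item's title, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text).findtext("title")
    except ET.ParseError:
        # A conforming parser must reject the document outright
        return None


print(try_parse(GOOD))
print(try_parse(BAD))
```

<p>The first call yields the title and the second yields <code>None</code>, because a conforming XML parser has to stop at the first well-formedness error. A liberal parser takes the opposite approach and tries to recover whatever it can from such input.</p>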


<p>Installing rssparser.py is also very easy. You download the Python
file (see Resources), rename it from "rssparser.py.txt" to
"rssparser.py", and copy it to your <code>PYTHONPATH</code>. I also
suggest getting the optional timeoutsocket module, which improves the
timeout behavior of socket operations in Python, and thus makes
fetching RSS feeds less likely to stall the application thread in case
of error.</p>
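
<p>The idea behind a socket timeout can be sketched with current Python 3, where the standard library's <code>urllib.request.urlopen()</code> accepts a <code>timeout</code> argument. This is an illustration under that assumption, not the timeoutsocket module itself and not how rssparser.py fetches feeds; the <code>fetch_feed()</code> helper and its default of ten seconds are hypothetical.</p>

```python
import urllib.request


def fetch_feed(url, timeout=10.0):
    """Return the raw bytes at url, or None if the request fails or times out."""
    try:
        # timeout is in seconds and applies to blocking socket operations,
        # so an unresponsive server cannot stall the caller indefinitely
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both subclasses of OSError
        return None
```

<p>A caller might write <code>data = fetch_feed("http://www.python.org/channews.rdf", timeout=5.0)</code> and treat a <code>None</code> result as "skip this feed for now" rather than letting one slow server hold up the whole application.</p>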


<p><a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code3">Listing 3</a> is a script that is the equivalent of <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code1">Listing 1</a>, but using rssparser.py, rather than RSS.py.</p>


<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
<tbody><tr><td><pre><code><br>  <br>import rssparser<br>#Parse the data, returns a tuple: (data for channels, data for items)<br>channel, items = rssparser.parse("http://www.python.org/channews.rdf")<br><br>for item in items:<br>    #Each item is a dictionary mapping properties to values<br>    print "RSS Item:", item.get('link', "(none)")<br>    print "Title:", item.get('title', "(none)")<br>    print "Description:", item.get('description', "(none)")<br><br><br><br></code></pre></td></tr></tbody>
</table>


<p>As you can see, the code is much simpler. The trade-off between
RSS.py and rssparser.py is largely that the former has more features,
and maintains more syntactic information from the RSS feed. The latter
is simpler, and a more forgiving parser (the RSS.py parser only accepts
well-formed XML).</p>


<p>The output should be the same as in <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/webservices/library/ws-pyth11.html#code2">Listing 2</a>.</p>


<p><a name="h1"><span class="atitle2">Conclusion</span></a><br>
There are many Python tools for RSS, and we don't have space to cover
them all. Aaron Swartz's page of RSS tools is a good place to start
looking if you want to explore other modules out there. RSS is easy to
work with in Python, because of all the great modules available for it.
The modules hide all the chaos brought about by the history and
popularity of RSS. If your XML service needs mostly involve the
exchange of descriptive information for Web sites, we highly recommend
using the most successful XML service technology in use today.
</p>


<p>Next month, we will explain how to use e-mail packages for Python for writing Web services over SMTP.</p>


<p><a name="resources"><span class="atitle2">Resources</span></a></p>
<ul>
<li>Check out the previous installments of <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www-106.ibm.com/developerworks/library/ws-pythcol.html">The Python Web services developer</a> columns.<br><br></li><li>There are several resources on RSS in IBM developerWorks.  
  <ul xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/"><li><a href="http://www-106.ibm.com/developerworks/library/w-rss.html">An introduction to RSS news feeds</a>, by James Lewin, is older, but a good place to start.  It covers RSS 0.91 and 1.0, and Perl interfaces. (<i>developerWorks</i>, November 2000) </li><li><a href="http://www-106.ibm.com/developerworks/library/x-tiphdln.html">Grab headlines from a remote RDF file</a>, by Nicholas Chase, shows some XSLT and JSP code for processing RSS 0.91 and 1.0.  (<i>developerWorks</i>, April 2002) </li></ul><br></li><li>XML.com also has several articles on RSS.  Read <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.xml.com/pub/a/2000/07/17/syndication/rss.html">RSS: Lightweight Web Syndication</a>, by Rael Dornfest, for a good general introduction.  In <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.xml.com/pub/a/2001/05/02/semanticwebsite.html">Building a Semantic Web Site</a>, Eric van der Vlist provides a great technical introduction based on very practical examples.  
<a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.xml.com/pub/a/2000/07/05/deviant/rss.html">RSS Modularization</a>, by Leigh Dodds, follows some very interesting conversation at a crucial juncture in RSS development.<br><br></li><li>Mark Nottingham is the author of <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.mnot.net/python/RSS.py">RSS.py</a>, and has a lot of other handy stuff on his <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.mnot.net/">home page</a>, including an excellent <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.mnot.net/rss/tutorial/">RSS Tutorial for Content Publishers and Webmasters</a>.<br><br></li><li>Mark Pilgrim is the author of <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://diveintomark.org/archives/2002/08/13.html#ultraliberal_rss_parser">rssparser.py</a>, an "ultra liberal" RSS parser.  The code is available as a <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://diveintomark.org/projects/misc/rssparser.py.txt">text download</a>.  
If you install it, I also recommend getting <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.timo-tasi.org/python/timeoutsocket.py">timeoutsocket.py</a>.<br><br></li><li>Fredrik Lundh, the author of xmlrpclib.py and soaplib.py, is working on <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://effbot.org/zone/effnews.htm">The EffNews Project: Building an RSS Newsreader</a>, a Python project for creating a GUI front end for reading news from RSS feeds.<br><br></li><li><a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://www.oreillynet.com/%7Erael/lang/python/peerkat/">Peerkat</a> is a resource aggregator written in Python that allows people to use RSS to manage the Web content they follow.<br><br></li><li>Aaron Swartz maintains a <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://blogspace.com/rss/tools">list of RSS tools</a> for all languages and platforms.<br></li>
</ul>
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody><tr><td><a name="author1"></a><span class="atitle2">About the authors</span><br><img xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" alt="Photo of Mike Olson" src="http://www-106.ibm.com/developerworks/i/p-olson.jpg" align="left" border="0" height="80" width="64">
 Mike Olson is a consultant and co-founder of <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://fourthought.com/">Fourthought Inc.</a>,
a software vendor and consultancy specializing in XML solutions for
enterprise knowledge management applications. Fourthought develops <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://4suite.org/">4Suite</a>, an open source
platform for XML middleware.  You can contact Mr. Olson at <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="mailto:mike.olson@fourthought.com">mike.olson@fourthought.com</a>.</td></tr><tr><td><p><a name="author2"><br></a><img xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" alt="Photo of Uche Ogbuji" src="http://www-106.ibm.com/developerworks/i/p-uche.jpg" align="left" border="0" height="80" width="64">
 Uche Ogbuji is a consultant and co-founder of <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://fourthought.com/">Fourthought Inc.</a>,
a software vendor and consultancy specializing in XML solutions for
enterprise knowledge management applications. Fourthought develops <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="http://4suite.org/">4Suite</a>,
an open source
platform for XML middleware. Mr. Ogbuji is a Computer Engineer and
writer born in Nigeria, living and working in Boulder, Colorado, USA.
You can contact Mr. Ogbuji at <a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerWorks/" href="mailto:uche.ogbuji@fourthought.com">uche.ogbuji@fourthought.com</a>.</p></td></tr></tbody>
</table>
<img src ="http://www.blogjava.net/pyguru/aggbug/1265.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-17 02:48 <a href="http://www.blogjava.net/pyguru/archive/2005/02/17/1265.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>"universal" RSS feed parser</title><link>http://www.blogjava.net/pyguru/archive/2005/02/17/1264.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Wed, 16 Feb 2005 18:40:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/17/1264.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1264.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/17/1264.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1264.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1264.html</trackback:ping><description><![CDATA[<h2>Feed Parser</h2>

<p>This is a "universal" feed parser, suitable for reading syndicated
feeds as produced by weblogs, news sites, wikis, and many other types
of sites. It handles Atom feeds, CDF, and the <a href="http://diveintomark.org/archives/2004/02/04/incompatible-rss">nine different versions of RSS</a>.</p>


<p>This project is now <a href="http://sourceforge.net/projects/feedparser/">hosted at SourceForge</a>.  Please check there for updates.  This page contains old news and is no longer updated.  (2004-06-21)</p>
<img src ="http://www.blogjava.net/pyguru/aggbug/1264.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-17 02:40 <a href="http://www.blogjava.net/pyguru/archive/2005/02/17/1264.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>How to replace a string in multiple files?</title><link>http://www.blogjava.net/pyguru/archive/2005/02/16/1243.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Tue, 15 Feb 2005 18:18:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/16/1243.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1243.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/16/1243.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1243.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1243.html</trackback:ping><description><![CDATA[
perl -pi -e 's/str1/str2/g' your_files
<img src ="http://www.blogjava.net/pyguru/aggbug/1243.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-16 02:18 <a href="http://www.blogjava.net/pyguru/archive/2005/02/16/1243.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Quick Reference to Common CVS Commands</title><link>http://www.blogjava.net/pyguru/archive/2005/02/15/1236.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Tue, 15 Feb 2005 15:39:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/15/1236.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1236.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/15/1236.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1236.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1236.html</trackback:ping><description><![CDATA[<font size="5"><b>Quick Reference to Common CVS Commands</b></font>
<div align="right"><font face="verdana,arial" size="2">蓝森林 http://www.lslnet.com September 2, 2002 11:08</font></div>
<br>

<p>Author: 车东 (Che Dong)<br>
</p>
<p><a href="mailto:chedong@bigfoot.com">chedong@bigfoot.com</a>&nbsp;                 
</p>
<p>
</p>
<p>Last updated: 2002-08-30 13:18:41
</p>

<p>Copyright notice: this article may be freely reproduced, provided the original source and the author are credited.<br>
</p>

<p>Overview: CVS is a client/server system. Multiple developers record file revisions through one central version control system, which keeps their files in sync.
</p>

<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CVS server (file repository)<br>
&nbsp;&nbsp;&nbsp;&nbsp; /&nbsp;&nbsp;&nbsp;&nbsp;
|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \<br>
(version synchronization)<br>
&nbsp;&nbsp; /&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \&nbsp;&nbsp;<br>
Developer 1&nbsp; Developer 2&nbsp;&nbsp; Developer 3
</p>

<p>The main topics of this article follow. Developers can concentrate on sections 2 and 6; CVS administrators need to know rather more.
</p>

<ol>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#init">CVS environment initialization</a>: setting up the CVS environment (for administrators)</li>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#daily">Daily use of CVS</a>: the CVS commands used most often in day-to-day development (for developers and administrators)</li>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#branch">Branch development in CVS</a>: running lines of development with different schedules and goals in parallel (for administrators)</li>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#ssh">CVS user authentication</a>: remote user authentication over SSH, secure and simple (for administrators)</li>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#cvsweb">CVSWEB</a>: a web interface to CVS that makes comparing code revisions much more efficient (for administrators)</li>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#tag">CVS TAG</a>: adding $Id$ to code comments to make the development process easier to track (for developers)</li>
<li><a href="http://www.lslnet.com/linux/docs/linux-3874.htm#vss">CVS vs VSS</a>: a comparison of CVS and Visual SourceSafe</li>
</ol>

<p>In most systems, 20% of the features cover 80% of the needs, and CVS is no exception. The features below are the most commonly used ones and probably amount to less than 10% of all CVS command options. Pick up the rest through practice: learn as much as you use, and learn the rest when you need it.
</p>

<p><br>
<a name="init"><b>CVS Environment Initialization<br>
</b></a><b>============</b></p>

<p>Environment setup: point CVSROOT at the path of the CVS repository<br>
tcsh<br>
setenv CVSROOT /path/to/cvsroot<br>               
bash<br>
CVSROOT=/path/to/cvsroot ; export CVSROOT</p>
       
<p>Settings for a remote CVS server, covered later in this article:<br>
CVSROOT=:ext:$USER@test.server.address#port:/path/to/cvsroot CVS_RSH=ssh; export
CVSROOT CVS_RSH<br>
<br>
Initialization: initialize the CVS repository.<br>
cvs init</p>
    
<p>Importing a project for the first time<br>
cvs import -m "write some comments here" project_name vendor_tag
release_tag<br>
After this runs, all source files and directories are imported under /path/to/cvsroot/project_name<br>
<i>vendor_tag: the vendor tag<br>
release_tag: the release tag</i></p>
   
<p>
Checking out a project: export the code from the CVS repository<br>
cvs checkout project_name<br>
<i>cvs creates the project_name directory and checks out the latest version of the source code into it. This checkout is not the same concept as check out in Visual SourceSafe: the counterpart of Visual SourceSafe's check out is cvs update, and check in corresponds to cvs commit.</i><br>
<br>
<a name="daily"><b>Daily Use of CVS</b></a><b>&nbsp;&nbsp;</b><br>
<b>=============</b></p>

<p><b>Note: after the first checkout, you no longer synchronize files with cvs checkout. Instead, change into the project_name directory created by cvs checkout project_name and carry out the per-file synchronization operations (add, modify, delete) there.</b></p>
   
<p>Updating files to the latest version:<br>
cvs update<br>
<i>If no file name is given, cvs synchronizes the files in all subdirectories. You can also name a specific file or directory to synchronize:<br>
</i>cvs update file_name<br>
<i>It is best to do this at the start of each working day and before committing your work to the CVS repository, and to make a habit of "update first, then modify". Unlike Visual SourceSafe, CVS has no concept of file locking; all conflicts are resolved before commit. If someone else modifies a file and commits it to the repository while you are editing it, CVS reports the conflict and automatically marks the conflicting region like this:<br>
&lt;&lt;&lt;&lt;&lt;&lt;&lt; file_name<br>
content in your file<br>
=======<br>
content on cvs server<br>
&gt;&gt;&gt;&gt;&gt;&gt;&gt; revision_number<br>
It is then up to you to decide which content to keep.<br>
Version conflicts generally arise when several people modify the same file, but this is a project-management problem and should not be left for CVS to solve.</i></p>
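<p>A small sketch of how unresolved conflicts might be caught before a commit. CVS writes conflict regions between lines that begin with &lt;&lt;&lt;&lt;&lt;&lt;&lt;, ======= and &gt;&gt;&gt;&gt;&gt;&gt;&gt;; the helper name <code>has_conflict_markers</code> is hypothetical and not part of CVS.</p>

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 if FILE still contains conflict markers
# of the kind cvs update leaves behind, non-zero otherwise.
has_conflict_markers() {
    # Seven marker characters at the start of a line: "<<<<<<< name",
    # a bare "=======", or ">>>>>>> revision".
    grep -Eq '^(<<<<<<< |=======$|>>>>>>> )' "$1"
}
```

<p>Usage: <code>has_conflict_markers file_name &amp;&amp; echo "resolve conflicts in file_name first"</code>. A file that legitimately contains a line of exactly seven equals signs would be reported as well, so treat the result as a hint rather than proof.</p>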

<p>Committing your changes to the CVS repository:<br>
cvs commit -m "write some comments here" file_name</p>
               
<p><i>Note: many CVS operations only take effect once they are confirmed with cvs commit, and it is best to commit one file at a time. On commit you must also supply a log message, so that other developers understand why the change was made. If you leave out -m "comments" and run `cvs commit file_name` directly, cvs launches the system default text editor (usually vi) and asks you to enter the message.<br>
The quality of log messages matters: always write one, and make it meaningful so that other developers can understand the change.<br>
A poor message makes it hard for other developers to see quickly what happened, for example: -m "bug fixed" or even -m ""<br>
A good message, which may even be written in Chinese: -m "在用户注册过程中加入了Email地址校验" ("Added email address validation to the user registration process")</i>
<br>
<br>
Changing the log message of a revision: committing one file at a time is a good habit, but sooner or later you forget to name the file and commit several files with the same message. The following command lets you change the message for a given revision of a file:<br>
cvs admin -m 1.3:"write some comments here" file_name<br>
<br>
Adding files<br>
After creating the new file, for example: touch new_file<br>
cvs add new_file<br>
<i>Note: for images, Word documents and other non-text files, use the cvs
add -kb option; otherwise the files may be corrupted.<br>
For example: cvs add -kb new_file.gif<br>
</i>
Then commit the change with a message:
<br>
cvs ci -m "write some comments here"</p>
               
<p>
Removing files:<br>
After physically deleting the source file, for example: rm file_name<br>
cvs rm file_name<br>
Then commit the change with a message:<br>
cvs ci -m "write some comments here"<br>
The first two steps above can be combined:<br>
cvs rm -f file_name<br>
cvs ci -m "why delete file"<br>
</p>
          
<p><i>Note: many cvs commands have abbreviated forms: commit=&gt;ci; update=&gt;up;
checkout=&gt;co; remove=&gt;rm;</i></p>
            
<p>          
<br>
Adding a directory:<br>
cvs add dir_name<br>
<br>
Viewing the change history: cvs log file_name<br>
cvs history file_name<br>
<br>
Viewing the differences between two revisions of a file:<br>
cvs diff -r1.3 -r1.5 file_name<br>
Viewing the differences between the working file (which may have been modified) and the corresponding file in the repository:<br>
cvs diff file_name<br>
The cvs web interface offers a more convenient way to locate changes and compare revisions; see the cvsweb section later in this article for installation and configuration.</p>
       
<p>The correct way to restore an old revision with CVS:<br>
If you run cvs update -r1.2 file_name<br>
the command sets a sticky tag "1.2" on file_name,
even though all you wanted was to revert it to revision 1.2.<br>
The correct way to restore the revision is: cvs update -p -r1.2 file_name &gt;file_name<br>
If you have already set a sticky tag by accident, clear it with cvs update -A</p>
             
<p>Moving files: renaming a file<br>
cvs has no cvs move or cvs rename, because both operations can be done with cvs remove
old_file_name followed by cvs add new_file_name.</p>
            
<p>
Deleting and moving directories:<br>
The easiest way is to have the administrator move or delete the corresponding directory directly under CVSROOT. (Every subdirectory of a CVS project is independent, and any of them can be moved directly under $CVSROOT to become a new standalone project: like a tree from which any branch, once cut off, can live on its own.) After directories have been changed this way, ask the developers to check out the project again with cvs
checkout project_name, or to synchronize with cvs update -dP.</p>
            
<p><a name="branch"><b>CVS Branch: Parallel Development on Multiple Branches<br>
</b></a><b>=============================</b></p>

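<p>Marking a milestone: each file has its own revision number. When the project reaches a certain stage, you can give all files one common milestone tag, so that the project can later be checked out by that tag. This is also the basis for developing on multiple branches.<br>
cvs tag release_1_0</p>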
             
<p>Starting a new milestone:<br>
cvs commit -r 2 (marks all files as entering 2.x development)</p>
            
<p><i>Note: a CVS revision number need not have any direct relationship to the release version of the software package. Still, keeping the file revision numbers in line with the release version makes maintenance easier.</i></p>
        
<p>Suppose a problem is found in 1.x while 2.x is under development, and 2.x is not yet fit for use. Create a branch release_1_0_patch from the previously tagged milestone release_1_0:<br>
cvs rtag -b -r release_1_0 release_1_0_patch proj_dir</p>
            
<p>Some developers then check out the release_1_0_patch branch into a separate directory to fix the urgent problems in 1.0,<br>
cvs checkout -r release_1_0_patch proj_dir<br>
while everyone else continues to develop 2.x on the main trunk.</p>
        
<p>After the bugs are fixed on release_1_0_patch, tag a bug-fix release of 1.0:<br>
cvs tag release_1_0_patch_1</p>
             
<p>If these fixes are also needed in 2.0, they can be merged from release_1_0_patch_1 into the current code in the 2.0 working directory:<br>
cvs update -j release_1_0_patch_1</p>
             
<p><b>CVS Remote Authentication: </b><a name="ssh"><b>Remote Access to CVS over SSH<br>
</b></a><b>================================</b></p>

<p>Using cvs's own remote authentication is cumbersome: you have to define the server, user groups and user names, set passwords and so on, and it is not secure. Authenticating against local system accounts and transporting over SSH is a better approach. Put the following in /etc/profile on each client machine:<br>
CVSROOT=:ext:$USER@test.server.address#port:/path/to/cvsroot CVS_RSH=ssh; export
CVSROOT CVS_RSH<br>
Every local user on every client machine then maps to the account of the same name on the CVS server.<br>
<br>
If the SSH port on the CVS server is not the default 22, or the default SSH ports of the client and the CVS server differ, it can happen that even after setting<br>
:ext:$USER@test.server.address#port:/path/to/cvsroot&nbsp;<br>
<br>
it still does not work, and you get errors such as:<br>
ssh: test.server.address#port: Name or service not known<br>
cvs [checkout aborted]: end of file from server (consult above messages if any)<br>
<br>
The solution is a small script that specifies the port (an alias will not work; it produces a "file not found" error):<br>
Create the file /usr/bin/ssh_cvs:<br>
#!/bin/sh<br>
/path/to/ssh -p 34567 "$@"<br>
Then: chmod +x /usr/bin/ssh_cvs<br>
and set CVS_RSH=ssh_cvs; export CVS_RSH</p>
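<p>The wrapper steps above can be sketched as one shell function. The function name <code>make_ssh_wrapper</code> and its two arguments are hypothetical, and this sketch assumes ssh is found on the PATH rather than at a fixed /path/to/ssh.</p>

```shell
#!/bin/sh
# Hypothetical helper: write an executable wrapper script that runs ssh on a
# fixed port, suitable for use as CVS_RSH.
# Usage: make_ssh_wrapper /usr/bin/ssh_cvs 34567
make_ssh_wrapper() {
    wrapper=$1
    port=$2
    # The generated script forwards all of its arguments to ssh unchanged.
    printf '#!/bin/sh\nexec ssh -p %s "$@"\n' "$port" > "$wrapper"
    chmod +x "$wrapper"
}
```

<p>After creating the wrapper, point cvs at it with <code>CVS_RSH=/usr/bin/ssh_cvs; export CVS_RSH</code>, using whatever path you passed to the function.</p>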
    
<p>Note: port here means the SSH port of the server, not the cvs pserver port<br>
<br>
<a name="cvsweb"><b>CVSWEB: Making It Faster for Programmers to Compare File Changes<br>
</b></a><b>================================</b></p>

<p>CVSWEB is the web interface to CVS, and it makes locating changes much more efficient for programmers:<br>
For a live example, see: <a href="http://www.freebsd.org/cgi/cvsweb.cgi">http://www.freebsd.org/cgi/cvsweb.cgi</a></p>

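<p>Downloading CVSWEB: many versions with richer features and interfaces have evolved from the original CVSWEB. This is the one I personally find easiest to install and configure:<br>
<a href="http://www.spaghetti-code.de/software/linux/cvsweb/">http://www.spaghetti-code.de/software/linux/cvsweb/</a><br>
<br>
Download and unpack:<br>
tar zxf cvsweb.tgz<br>
Put the configuration file cvsweb.conf somewhere safe (for example, in the same directory as the Apache configuration),<br>
then edit cvsweb.cgi so that the CGI script can find the configuration file:<br>
$config = $ENV{'CVSWEB_CONFIG'} || '/path/to/apache/conf/cvsweb.conf';<br>
<br>
Change to /path/to/apache/conf and edit cvsweb.conf:</p>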

<ol>
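<li>
Set the CVSROOT path:<br>
%CVSROOT = (<br>
'Development' =&gt; '/path/to/cvsroot', #&lt;== change this to point to the local CVSROOT<br>
);</li><li>
Hide deleted files by default:<br>
"hideattic" =&gt; "1",#&lt;== do not show deleted files by default</li><li>The configuration file cvsweb.conf also lets you customize the page header description: change $long_intro to whatever text you need</li>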
</ol>

<p>
CVSWEB must not be opened up to just anyone, so it needs web user authentication:<br>
First generate the password file:<br>
/path/to/apache/bin/htpasswd -c cvsweb.passwd user<br>
<br>
Edit httpd.conf and add:<br>
&lt;Directory "/path/to/apache/cgi-bin/cvsweb/"&gt;<br>
AuthName "CVS Authorization"<br>
AuthType Basic<br>          
AuthUserFile /path/to/cvsweb.passwd<br>          
require valid-user<br>          
&lt;/Directory&gt;<br>
<br>
<a name="tag"><b><b style="color: black; background-color: rgb(255, 255, 102);">CVS</b> TAGS: who? when?<br>
</b></a><b>====================</b></p>

<p>Putting $Id$ in a comment at the top of each program file is a good habit. cvs automatically expands and updates it into the form: file_name
version time user_name, for example: cvs_card.txt,v 1.1 2002/04/05
04:24:12 chedong Exp. This information tells you who last modified the file and when.<br>
<br>
Some commonly used default file templates:<br>
default.php<br>
&lt;?php<br>
/*<br>
* Copyright (c) 2002 Company Name.<br>       
* $Header$<br>      
*/<br>
<br>
?&gt;
</p>
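<p>As a rough illustration of what the expanded $Id$ keyword contains, the following sketch pulls the revision, date, time and author out of such a line with sed. The helper name <code>parse_cvs_id</code> is hypothetical, and the pattern assumes the field layout shown in the example above (file_name,v revision date time author state).</p>

```shell
#!/bin/sh
# Hypothetical helper: print "revision date time author" from a line that
# contains an expanded CVS Id keyword; prints nothing if no keyword is found.
parse_cvs_id() {
    printf '%s\n' "$1" |
        sed -n 's/.*\$Id: [^ ]*,v \([^ ]*\) \([^ ]*\) \([^ ]*\) \([^ ]*\) .*\$.*/\1 \2 \3 \4/p'
}
```

<p>For example, <code>parse_cvs_id '$Id: cvs_card.txt,v 1.1 2002/04/05 04:24:12 chedong Exp $'</code> prints the four fields separated by spaces.</p>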

<p>====================================
<br>
Default.java: note the difference between an ordinary file-header comment, which starts with /*, and a JAVADOC comment, which starts with /**<br>
/*<br>
* Copyright (c) 2002 Company Name.<br>       
* $Header$<br>      
*/<br>
<br>
package com.netease;<br>       
<br>
import java.io.*;<br>
<br>
/**<br>
* comments here<br>       
*/<br>
public class Default {<br>       
&nbsp;&nbsp;&nbsp; /**<br>       
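&nbsp;&nbsp;&nbsp; * method comments here<br>
&nbsp;&nbsp;&nbsp; *<br>
&nbsp;&nbsp;&nbsp; * @return a string representation of this object<br>
&nbsp;&nbsp;&nbsp; */<br>
&nbsp;&nbsp;&nbsp; public String toString() {<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return super.toString();<br>
&nbsp;&nbsp;&nbsp; }<br>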
}
</p>

<p>====================================
<br>
default.pl:<br>
#!/usr/bin/perl -w<br>    
# Copyright (c) 2002 Company Name.<br>     
# $Header$<br>    
<br>
# file comments here<br>     
<br>
use strict;<br>     
</p>

<p><a name="vss"><b><b style="color: black; background-color: rgb(255, 255, 102);">CVS</b> vs VSS</b></a>　<br>
===========
</p>

<p>CVS has no file-locking mode. In VSS, a check out also records that the file is locked by the person who checked it out.
</p>

<p>CVS uses update and commit; VSS uses check out and check in.
</p>

<p>In CVS, automatic keyword expansion is turned on by default. This brings a potential problem: binary files that were not added with -kb may be corrupted when cvs expands keywords in them.
</p>

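<p>In Visual
SourceSafe this feature is called Keyword Expansion. It is off by default and has to be turned on in the options, where you also specify the file types to be scanned for keywords: *.txt,*.java,*.html...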
</p>

<p>Keywords common to both Visual
SourceSafe and CVS:<br>
$Header$<br>
$Author$<br>
$Date$    
<br>
$Revision$
</p>

<p>Stick to the common keywords wherever possible, so that code can be tracked easily in both CVS and VSS.
</p>

<p>　
</p>

<p>Related resources:</p>

<p>CVS HOME:<br>
<a href="http://www.cvshome.org/">http://www.cvshome.org</a></p>

<p>CVS FAQ:<br>
<a href="http://www.loria.fr/%7Emolli/cvs-index.html">http://www.loria.fr/~molli/cvs-index.html</a><br>
<br>
Related sites:<br>
<a href="http://directory.google.com/Top/Computers/Software/Configuration_Management/Tools/Concurrent_Versions_System/">http://directory.google.com/Top/Computers/Software/Configuration_Management/Tools/Concurrent_Versions_System/</a></p>

<p>Free CVS book:<br>
<a href="http://cvsbook.red-bean.com/">http://cvsbook.red-bean.com/</a></p>

<p>CVS command quick reference card:<br>
<a href="http://www.refcards.com/about/cvs.html">http://www.refcards.com/about/cvs.html</a></p>


<br>
<p>Source: <a href="http://www.chedong.com/tech/cvs_card.html" target="_blank">http://www.chedong.com/tech/cvs_card.html</a></p>
<img src ="http://www.blogjava.net/pyguru/aggbug/1236.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-15 23:39 <a href="http://www.blogjava.net/pyguru/archive/2005/02/15/1236.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Hardware and DB configuration for a WEB/APPLICATION/DATABASE server</title><link>http://www.blogjava.net/pyguru/archive/2005/02/15/1194.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Mon, 14 Feb 2005 18:09:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/15/1194.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1194.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/15/1194.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1194.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1194.html</trackback:ping><description><![CDATA[<font color="#408080">: &nbsp;I want to set up a WEB/APPLICATION/DATABASE server and plan to use LINUX. I expect roughly 2000 users&nbsp;</font><br>

<font color="#408080">: &nbsp;online at the same time (200 CONCURRENT TRANSACTIONs would be enough). I have only used REDHAT LINUX as a general development&nbsp;</font><br>

<font color="#408080">: &nbsp;platform and have never run it as a server for a large user base. I plan to build the machine myself: 2 PROCESSORs,&nbsp;</font><br>

<font color="#408080">: &nbsp;2.8G XEON, 2G-4G of MEMORY, a 120G - 300G hard disk (SATA or SCSI). My questions are:&nbsp;</font><br>

<font color="#408080">: &nbsp;1. Is this hardware configuration adequate?&nbsp;</font><br>

<font color="#408080">: &nbsp;2. Which LINUX is best? REDHAT, FREEBSD, SUSE, or something else?&nbsp;</font><br>

<font color="#408080">: &nbsp;3. Which DB is best, POSTGRESQL or MYSQL? MYSQL now supports TRANSACTIONs too, but POSTGRESQL&nbsp;</font><br>

<font color="#408080">: &nbsp;seems to have many features close to ORACLE's, although I have never used it.&nbsp;</font><br>

<font color="#408080">: &nbsp;4. For the APPLICATION SERVER I plan to use TOMCAT 5.0 + JDK 1.5. I used to hear that TOMCAT cannot handle a large user base,&nbsp;</font><br>

<font color="#408080">: &nbsp;and I don't know whether that is still true.&nbsp;</font><br>

<font color="#408080">: &nbsp;5. Any other suggestions?&nbsp;</font><br>

<font color="#408080">: &nbsp;&nbsp;</font><br>

<font color="#408080">: &nbsp;Many thanks.&nbsp;</font><br>

<font color="#408080">: &nbsp;&nbsp;</font><br>

&nbsp;<br>

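It mainly depends on how complex these transactions are. In general it should be fine, but if there is a lot of&nbsp;<br>

varchar, blob and similar data, it gets risky.&nbsp;<br>

&nbsp;<br>

As for the OS, I recommend a commercial Linux; we mostly use RHAS. SuSE is also good. Since you will be running&nbsp;<br>

Java and the like, don't use FreeBSD. The advantage of a commercial Linux is that you don't have to worry much about software upgrades and maintenance.&nbsp;<br>

&nbsp;<br>

For the DB, commercial Oracle or DB2 will perform much better if you can use them. If you need to save money, I still suggest&nbsp;<br>

MySQL, but tune it well, and design the Business Logic to minimize round trips&nbsp;<br>

to the DB. Another drawback of MySQL is that it does not support Stored Procedures, but you can put that Business&nbsp;<br>

Logic outside the database.&nbsp;<br>

&nbsp;<br>

The Application Server may be the biggest problem. Tomcat is basically a lightweight Web/Servlet&nbsp;<br>

Server; with a large user base it will struggle, because it lacks some supporting features. Also, you have a lot of Transactions, and&nbsp;<br>

Tomcat itself has no Persistence support. If you want to implement transactions at this layer,&nbsp;<br>

you will probably have to add something else, such as an EJB container or Hibernate. The more data you cache at this layer,&nbsp;<br>

the less you depend on MySQL. Some small system data sets can, with a few tricks,&nbsp;<br>

be loaded into this layer in advance, which greatly reduces the interaction with the database.&nbsp;<br>

&nbsp;<br>

J2SE 5.0 is said to perform better, but I think using it is too aggressive; it is not Stable enough. Without transactions&nbsp;<br>

it would not be a problem. Also, only Tomcat 5.5 and later versions can run on J2SE 5.0.&nbsp;<br>

For servers, BEA's JRockit VM is good. <br>
<br>

For an ordinary service with at most one or two hundred people online at once, a reasonably well-configured PC will do.&nbsp;<br>

With more users, the key is plenty of memory: the more the better.<br>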
<img src ="http://www.blogjava.net/pyguru/aggbug/1194.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-15 02:09 <a href="http://www.blogjava.net/pyguru/archive/2005/02/15/1194.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title> Virtual hosts in Apache</title><link>http://www.blogjava.net/pyguru/archive/2005/02/14/1157.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Sun, 13 Feb 2005 18:56:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/14/1157.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1157.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/14/1157.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1157.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1157.html</trackback:ping><description><![CDATA[<h3 class="post-title">
	 
	 Virtual hosts in Apache
	 
    </h3>

    

	         
	
The following is a working configuration from Vhosts.conf. The document root may not
allow directory listings, so put an index.html in each one to test the virtual host<br>
<br>
===============================================<br>
<br>
# Listen for virtual host requests on all IP addresses<br>
NameVirtualHost *:80<br>
<br>
&lt;VirtualHost *:80&gt;<br>
DocumentRoot /var/www/html/web<br>
ServerName web.mydomain.com<br>
<br>
# Other directives here<br>
&lt;/VirtualHost&gt;<br>
<br>
&lt;VirtualHost *:80&gt;<br>
DocumentRoot /var/www/html/news<br>
ServerName news.mydomain.com<br>
<br>
# Other directives here<br>
&lt;/VirtualHost&gt;<br>
<br>
&lt;VirtualHost *:80&gt;<br>
DocumentRoot /var/www/html/photo<br>
ServerName photo.mydomain.com<br>
<br>
# Other directives here<br>
&lt;/VirtualHost&gt;<br>
<img src ="http://www.blogjava.net/pyguru/aggbug/1157.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-14 02:56 <a href="http://www.blogjava.net/pyguru/archive/2005/02/14/1157.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title> map linux network drive in Windows</title><link>http://www.blogjava.net/pyguru/archive/2005/02/14/1156.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Sun, 13 Feb 2005 18:41:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/14/1156.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1156.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/14/1156.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1156.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1156.html</trackback:ping><description><![CDATA[<p>First, create a local Unix account: </p>
  
<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
 <tbody><tr><td><pre><code><br>　　useradd -r myaccount<br></code></pre></td></tr></tbody> 
</table>
    
<p>This command creates a Unix account named myaccount (the -r flag makes it a system account).</p>
   
<p>Then create a Samba user based on it:</p>
  
<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
 <tbody><tr><td><pre><code><br>　　smbadduser myaccount:mysmbact<br></code></pre></td></tr></tbody> 
</table>
   
<p>Or:</p>
  
<table bgcolor="#cccccc" border="1" cellpadding="5" cellspacing="0" width="100%">
 <tbody><tr><td><pre><code><br>　　smbpasswd -a myaccount<br></code></pre></td></tr></tbody> 
</table>
    
<p>The Samba password is separate from the Unix account password.<br></p>
 
<p><span id="ArticleContent1_ArticleContent1_lblContent">Note: whenever you update the Samba configuration file, you must restart Samba with <i>/etc/init.d/samba restart</i> (Debian).</span></p>
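<p>Once the Samba account exists, the share can also be mapped from a Windows command prompt with net use instead of the Explorer dialog. The sketch below only builds that command line; the server name linuxbox, the share name myshare, and the drive letter Z: are hypothetical placeholders, and myaccount is the account created above.</p>

```python
def net_use_argv(drive, server, share, user):
    """Build the argument list for the Windows `net use` command."""
    # UNC path of the Samba share, in the form \\server\share
    unc = "\\\\" + server + "\\" + share
    # "*" asks net use to prompt for the Samba password instead of
    # placing it on the command line.
    return ["net", "use", drive, unc, "*", "/user:" + user]


# Placeholder values: adjust to your own server and share names.
argv = net_use_argv("Z:", "linuxbox", "myshare", "myaccount")
print(" ".join(argv))
```

<p>The printed line can be pasted into a Windows command prompt; the password it asks for is the Samba one, not the Unix one.</p>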
 Then, in Windows, map the network drive using that username and its Samba password.<img src ="http://www.blogjava.net/pyguru/aggbug/1156.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-14 02:41 <a href="http://www.blogjava.net/pyguru/archive/2005/02/14/1156.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>The Apache Web Server</title><link>http://www.blogjava.net/pyguru/archive/2005/02/14/1155.html</link><dc:creator>pyguru</dc:creator><author>pyguru</author><pubDate>Sun, 13 Feb 2005 18:35:00 GMT</pubDate><guid>http://www.blogjava.net/pyguru/archive/2005/02/14/1155.html</guid><wfw:comment>http://www.blogjava.net/pyguru/comments/1155.html</wfw:comment><comments>http://www.blogjava.net/pyguru/archive/2005/02/14/1155.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/pyguru/comments/commentRss/1155.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/pyguru/services/trackbacks/1155.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; Summary: The Apache Web Server&nbsp;= = = = = = = = = = = = = = = = = = = = = = = = = = = = = == = = = = = = = = = = = =In This ChapterChapter20The Apache Web ServerDownlo...&nbsp;&nbsp;<a href='http://www.blogjava.net/pyguru/archive/2005/02/14/1155.html'>Read more</a><img src ="http://www.blogjava.net/pyguru/aggbug/1155.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/pyguru/" target="_blank">pyguru</a> 2005-02-14 02:35 <a href="http://www.blogjava.net/pyguru/archive/2005/02/14/1155.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>