    Guest Contributor, TechRepublic
    December 22, 2004
    URL: http://www.builderau.com.au/architect/webservices/0,39024590,39171461,00.htm

    Take advantage of the XML::RSS CPAN package, which is specifically designed to read and parse RSS feeds.

    You've probably already heard of RSS, the XML-based format which allows Web sites to publish and syndicate the latest content on their site to all interested parties. RSS is a boon to the lazy Webmaster, because (s)he no longer has to manually update his or her Web site with new content.

    Instead, all a Webmaster has to do is plug in an RSS client, point it to the appropriate Web sites, and sit back and let the site "update itself" with news, weather forecasts, stock market data, and software alerts. You've already seen, in previous articles, how you can use the ASP.NET platform to manually parse an RSS feed and extract information from it by searching for the appropriate elements. But I'm a UNIX guy, and I have something that's even better than ASP.NET. It's called Perl.

    Installing XML::RSS
    RSS parsing in Perl is usually handled by the XML::RSS CPAN package. Unlike ASP.NET, which comes with a generic XML parser and expects you to manually write RSS-parsing code, the XML::RSS package is specifically designed to read and parse RSS feeds. When you give XML::RSS an RSS feed, it converts the various <item>s in the feed into array elements, and exposes numerous methods and properties to access the data in the feed. XML::RSS currently supports versions 0.9, 0.91, and 1.0 of RSS.

    Written entirely in Perl, XML::RSS isn't included with Perl by default, and you must install it from CPAN. Detailed installation instructions are provided in the download archive, but by far the simplest way to install it is to use the CPAN shell, as follows:

    shell> perl -MCPAN -e shell
    cpan> install XML::RSS

    If you use the CPAN shell, dependencies will be automatically downloaded for you (unless you told the shell not to download dependent modules). If you manually download and install the module, you may need to download and install the XML::Parser module before XML::RSS can be installed. The examples in this tutorial also need the LWP::Simple package, so you should download and install that one too if you don't already have it.
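    If you prefer to install the prerequisites yourself, the same CPAN shell works for those modules too; for example:

    shell> perl -MCPAN -e shell
    cpan> install XML::Parser
    cpan> install LWP::Simple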

    Basic usage
    For our example, we'll assume that you're interested in displaying the latest geek news from Slashdot on your site. The URL for Slashdot's RSS feed is http://www.slashdot.org/index.rss. The script in Listing A retrieves this feed, parses it, and turns it into a human-readable HTML page using XML::RSS:

    Listing A

    #!/usr/bin/perl

    # import packages
    use XML::RSS;
    use LWP::Simple;

    # initialize object
    $rss = new XML::RSS();

    # get RSS data
    $raw = get('http://www.slashdot.org/index.rss');

    # parse RSS feed
    $rss->parse($raw);

    # print HTML header and page
    print "Content-Type: text/html\n\n";
    print ""; print ""; print "";
    print "";
    print "
    " . $rss->channel('title') . "
    "; # print titles and URLs of news items foreach my $item (@{$rss->{'items'}}) { $title = $item->{'title'}; $url = $item->{'link'}; print "$title

    "; } # print footers print "

    ";
    print "";

    Place the script in your Web server's cgi-bin/ directory. Remember to make it executable, and then browse to it using your Web browser. After a short wait for the RSS file to download, you should see something like Figure A.

    Figure A: Slashdot RSS feed

    How does the script in Listing A work? Well, the first task is to get the RSS feed from the remote system to the local one. This is accomplished with the LWP::Simple package, which simulates an HTTP client and opens up a network connection to the remote site to retrieve the RSS data. An XML::RSS object is created, and this raw data is then passed to it for processing.

    The various elements of the RSS feed are converted into Perl structures, and a foreach() loop is used to iterate over the array of items. Each item contains properties representing the item name, URL and description; these properties are used to dynamically build a readable list of news items. Each time Slashdot updates its RSS feed, the list of items displayed by the script above will change automatically, with no manual intervention required.
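    If you also want to show a summary under each headline, every item exposes a description property as well; a minimal variation of the loop (assuming the feed actually populates descriptions) might look like this:

    # print titles, URLs and descriptions of news items
    foreach my $item (@{$rss->{'items'}}) {
        $title = $item->{'title'};
        $url = $item->{'link'};
        $desc = $item->{'description'};
        print "<a href=\"$url\">$title</a><br />$desc<br /><br />";
    }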

    The script in Listing A will work with other RSS feeds as well—simply alter the URL passed to LWP's get() method, and watch as the list of items displayed by the script changes.
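    For instance, pointing the script at Freshmeat's feed (the same feed used in the caching example later in this article) is a one-line change:

    # fetch a different feed; the rest of the script stays the same
    $raw = get('http://www.freshmeat.net/backend/fm.rdf');
    $rss->parse($raw);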



    Tip: Notice that the RSS channel name (and description) can be obtained with the object's channel() method, which accepts any one of three arguments (title, description or link) and returns the corresponding channel value.
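    For example, to print a small channel header using all three values, something along these lines would do the trick:

    # print channel title, description and home page link
    print "<h1>" . $rss->channel('title') . "</h1>";
    print "<p>" . $rss->channel('description') . "</p>";
    print "<a href=\"" . $rss->channel('link') . "\">" . $rss->channel('link') . "</a>";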


    Adding multiple sources and optimising performance
    So that takes care of adding a feed to your Web site. But hey, why limit yourself to one when you can have many? Listing B, a revision of Listing A, sets up an array containing the names of several different RSS feeds, and iterates over the array to produce a page containing multiple channels of information.

    Listing B

    #!/usr/bin/perl

    # import packages
    use XML::RSS;

    # list of local copies of RSS feeds
    # (file names are examples; use whatever names you saved the feeds under)
    @files = ("slashdot.rss", "freshmeat.rdf");

    # print HTML header and page
    print "Content-Type: text/html\n\n";
    print "<html>";
    print "<head></head>";
    print "<body>";

    # iterate over the list of feeds
    foreach my $file (@files) {

        # initialize object and parse local RSS file
        my $rss = new XML::RSS();
        $rss->parsefile($file);

        # print channel title
        print "<h2>" . $rss->channel('title') . "</h2>";

        # print titles and URLs of news items
        foreach my $item (@{$rss->{'items'}}) {
            $title = $item->{'title'};
            $url = $item->{'link'};
            print "<a href=\"$url\">$title</a><br />";
        }
    }

    # print footers
    print "</body>";
    print "</html>";

    Figure B shows you what it looks like.

    Figure B: Several RSS feeds

    You'll notice, if you're sharp-eyed, that Listing B uses the parsefile() method to read local copies of the RSS files, instead of using LWP to retrieve them from the remote sites. This revision improves performance, because it does away with the need to make an HTTP request for each RSS data source every time the script is executed. Fetching the RSS files on each script run not only slows things down (because of the time taken to retrieve each file), it's also inefficient; the source files are unlikely to change on a minute-by-minute basis, and by fetching the same data over and over again you're simply wasting bandwidth. A better solution is to retrieve each RSS data source once, save it to a local file, and use that local file to generate your page.

    Depending on how often the source file gets updated, you can write a simple shell script to download a fresh copy of the file on a regular basis.

    Here's an example of such a script:

    #!/bin/bash
    /bin/wget http://www.freshmeat.net/backend/fm.rdf -O freshmeat.rdf

    This script uses the wget utility (included with most Linux distributions) to download and save the RSS file to disk. Add this to your system crontab, and set it to run on an hourly or daily basis.
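    A crontab entry for an hourly refresh might look like this (the path is just an example; point it at wherever you saved the script):

    # fetch a fresh copy of the RSS file at the top of every hour
    0 * * * * /home/user/bin/fetch_rss.sh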

    If you find performance unacceptably low even after using local copies of RSS files, you can take things a step further, by generating a static HTML snapshot from the script above, and sending that to clients instead. To do this, comment out the line printing the "Content-Type" header in the script above and then run the script from the console, redirecting the output to an HTML file. Here's how:

    $ ./rss.cgi > static.html

    Now, simply serve this HTML file to your users. Since it's a static file and not a script, no server-side processing takes place before the server transmits it to the client. You can run the command line above from your crontab to regenerate the HTML file on a regular basis. Performance with a static file should be noticeably better than with a Perl script.
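    To automate the snapshot as well, a second crontab entry can regenerate the static page shortly after each feed refresh; again, the paths here are placeholders for your own layout:

    # rebuild the static page five minutes past every hour
    5 * * * * cd /var/www/cgi-bin && ./rss.cgi > /var/www/html/static.html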

    Looks easy? What are you waiting for—get out there and start hooking your site up to your favorite RSS news feeds.
