kottke.org

...is a weblog about the liberal arts 2.0 edited by Jason Kottke since March 1998 (archives). You can read about me and kottke.org here. If you've got questions, concerns, or interesting links, send them along.

Automatic discovery of RSS feeds

Now that all of kottke.org is in MT, I can start worrying about what to do with things like RSS feeds. I've been following the developments concerning the automatic discovery of RSS feeds, written about extensively on dive into mark (more here and here). Basically, you insert the following code:

<link rel="alternate" type="application/rss+xml" title="RSS" href="url/to/rss/file">

into your Web page and then all an RSS aggregator needs to do is check that Web page for your RSS feed instead of you having to provide the aggregator with a specific URL. Pretty slick really.
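As a rough illustration of the aggregator's side of this, here is a minimal Python sketch that scans a page for that link tag and resolves its href against the page URL. It is not taken from any real aggregator; the names `FeedLinkFinder` and `discover_feed` and the example.com URLs are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkFinder(HTMLParser):
    """Records the href of the first RSS <link rel="alternate"> tag seen."""

    def __init__(self):
        super().__init__()
        self.feed_href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.feed_href is not None:
            return
        # Attribute values can be None (e.g. <link disabled>), so normalise them.
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        if "alternate" in rels and a.get("type", "").lower() == "application/rss+xml":
            self.feed_href = a.get("href") or None


def discover_feed(page_url, html):
    """Return the absolute feed URL advertised by the page, or None."""
    finder = FeedLinkFinder()
    finder.feed(html)
    if finder.feed_href is None:
        return None
    return urljoin(page_url, finder.feed_href)


# Hypothetical page and URLs, for illustration only.
page = (
    '<html><head>'
    '<link rel="alternate" type="application/rss+xml" title="RSS" href="index.xml">'
    '</head></html>'
)
print(discover_feed("http://example.com/", page))
```

A page with no matching link tag simply yields `None`, and the aggregator would have to ask the user for the feed URL directly.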

However, I have a couple of concerns about how this works:

1. My understanding is that when a Web browser loads a page, it downloads all the documents referenced in the <link> tags. That's how stylesheets work. Does this mean that every time someone loads up my Web site, they're going to get this RSS file as well, whether they want it or not? For popular sites, depending on the size of the RSS file, that could add up to several megabytes in additional bandwidth...and possible additional bandwidth charges. Does the "rel" attribute being set to "alternate" take care of this?

2. Do the aggregators need to check my Web page each time they download the RSS file or are they going to cache the location and then only check the Web page once a week or so for a possible location update? Again, serving two files when only one is called for could get costly, especially if an aggregator is calling for it multiple times a day.

Can anyone shed some light on this?

By Jason Kottke    Jun 3, 2002 at 12:08 am

There are 15 reader comments

jjg    Jun 02 2002    11:10PM

1. The browser does not load every document referenced in the tags. If the "rel" attribute is not set to "stylesheet", most browsers (IE, Netscape, Mozilla, Opera) just ignore the tag. iCab is the only browser I know of that does anything with any link tag with a "rel" attribute other than "stylesheet". Link tags have lots of intriguing possibilities, virtually all of which have gone untapped.

2. I guess it depends on what the people writing the aggregator software choose to do. I believe the original idea was that the link tag would tell the aggregator "don't subscribe to this file, go get that other one instead" if someone tried to subscribe to the wrong file. In that scenario, presumably the aggregator would never come back to your HTML page. Probably the best scenario would be that, if the RSS file goes 404, the aggregator would then go back to the HTML page to check for a new URL in the link tag.
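The "best scenario" in point 2 could be sketched roughly like this in Python. This is only one possible reading of the idea, not how any particular aggregator works: `refresh_feed` is a made-up name, and `fetch` and `discover` are hypothetical callables the caller supplies (a real one would do HTTP requests and parse the link tag).

```python
def refresh_feed(feed_url, page_url, fetch, discover):
    """Return (feed_url, body); body is None when nothing could be fetched.

    fetch(url) -> (status, body) and discover(page_url, html) -> url or None
    are hypothetical callables supplied by the caller.
    """
    status, body = fetch(feed_url)
    if status == 200:
        return feed_url, body
    if status != 404:
        return feed_url, None  # some other failure: keep the cached URL
    # Only a 404 sends us back to the HTML page to look for a new link tag.
    page_status, html = fetch(page_url)
    new_url = discover(page_url, html) if page_status == 200 else None
    if new_url is None or new_url == feed_url:
        return feed_url, None
    status, body = fetch(new_url)
    return new_url, (body if status == 200 else None)


# Canned responses standing in for the network (all URLs are made up).
responses = {
    "http://example.com/old.xml": (404, ""),
    "http://example.com/": (200, "<html>...</html>"),
    "http://example.com/new.xml": (200, "<rss/>"),
}
fake_fetch = lambda url: responses.get(url, (404, ""))
fake_discover = lambda page_url, html: "http://example.com/new.xml"

print(refresh_feed("http://example.com/old.xml", "http://example.com/",
                   fake_fetch, fake_discover))
```

In the normal case the HTML page is never requested; it is only consulted after the cached feed URL returns a 404.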

Anil    Jun 03 2002    12:06AM

Mozilla can handle link tags, too. The LINK toolbar was pulled from 1.0 at the last minute due to performance issues, but I'm sure by 1.1 we'll see it return to Mozilla.

When activated, it appears on any page with a link tag, and provides navigation to referenced pages.

Which, incidentally, gives me a lot of hope. Because the link tag will be the savior of the web. I believe it.

Steven Garrity    Jun 03 2002    6:54AM

I've noticed that Mozilla makes another use of the LINK tag. Rather than looking for sitename.com/favicon.ico for the favourites icon, as Internet Explorer does, they use the LINK tag to link to the icons.

<LINK REL="icon" HREF="images/mozilla-16.png" TYPE="image/png">

Looking for root/favicon.ico, as IE does, suffers the same problems that Jason points out for autodetecting RSS - on some sites I manage, favicon.ico is one of the top requested files, whether it exists or not.

Mark Pilgrim    Jun 03 2002    9:35AM

To clarify: iCab does not auto-download the LINKed file, it merely displays the TITLE in a drop-down menu. As others have noted, Mozilla used to (and will again someday) display all such LINKs in the Link Toolbar, if it is enabled in the View menu. Lynx displays them along the top of the document.

And no, news aggregators like Radio and Amphetadesk will only check the LINK tag if the user enters the main site URL to try to subscribe; they then cache the RSS feed URL and never go back to the HTML.

Other scripts (such as my auto-linkbacks on my weblog) can take advantage of the LINK tag to expose the RSS feed for a site, given the site URL.

The LINK tag is part of the HTML 4 spec, documented here:

http://www.w3.org/TR/REC-html40/types.html#type-links

Other interesting uses for LINK tags: establishing the place of a page within a hierarchy, using rel="home", rel="up", rel="prev", rel="next" LINK tags. I do this both on my weblog and in my book. View source:
http://diveintomark.org/archives/2002/06/02.html
http://diveintopython.org/odbchelper_list.html

Joe    Jun 03 2002    10:27AM

"2. Do the aggregators need to check my Web page each time they download the RSS file..."

Aggie doesn't do this. It uses the web page to resolve the URL of the RSS file. It stores only the URL of the RSS file and discards the info about the web page.
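The resolve-once behaviour described here might look something like the following Python sketch. It is not Aggie's actual code: the `Aggregator` class and the injected `fetch` and `discover` functions are hypothetical stand-ins, and the request log exists only to show which URLs get hit.

```python
class Aggregator:
    """Keeps only feed URLs; the HTML page is consulted at subscribe time."""

    def __init__(self, fetch, discover):
        self.fetch = fetch        # hypothetical: fetch(url) -> body text
        self.discover = discover  # hypothetical: discover(page_url, html) -> feed URL or None
        self.feed_urls = []       # page URLs are never stored

    def subscribe(self, url):
        body = self.fetch(url)
        # If the page advertises no feed, assume the URL is the feed itself.
        feed_url = self.discover(url, body) or url
        if feed_url not in self.feed_urls:
            self.feed_urls.append(feed_url)
        return feed_url

    def poll(self):
        # Regular updates touch the cached feed URLs only.
        return {url: self.fetch(url) for url in self.feed_urls}


# Fake network and discovery so the example runs offline (URLs are made up).
requests = []

def fake_fetch(url):
    requests.append(url)
    return "<rss/>" if url.endswith(".xml") else "<html>...</html>"

def fake_discover(page_url, html):
    return page_url + "index.xml" if "<html" in html else None

agg = Aggregator(fake_fetch, fake_discover)
agg.subscribe("http://example.com/")
agg.poll()
agg.poll()
print(requests)
```

The request log shows the home page being fetched once, at subscribe time, and only the feed file after that.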

jkottke    Jun 03 2002    10:55AM

Thanks guys. Sounds like my concerns aren't a problem at all.

And sorry about the lack of preview on the comments here. Still working out the kinks.

Ben Hammersley    Jun 03 2002    3:59PM

Mozilla DOES display all Link Rels in the Link toolbar. Go to View/Show-Hide/Show Navigation Bar and turn it on. Go to http://rss.benhammersley.com and click on an entry's permalink to see it in very-much-full-effect.

Mark Pilgrim    Jun 03 2002    4:34PM

Ben: unfortunately, the Link toolbar has been removed from Mozilla 1.0 due to performance concerns. See: http://bugzilla.mozilla.org/show_bug.cgi?id=102992

justin    Jun 03 2002    9:50PM

jason: just in case you aren't aware, you're already publishing an RSS feed (i won't point to it though since you haven't yet made up your mind). i'm sure you can get rid of it if you so choose, but MT puts it there by default, and it is publicly accessible.

jkottke    Jun 03 2002    10:24PM

Yeah, I know the feed is there. Just haven't told anyone about it yet.

Ben Hammersley    Jun 04 2002    11:15AM

Mark: Not in RC3 it's not. In fact I have it on all the time, and I'm using build 2002053104 right now. I think that's an old bug...

Shmuel    Jun 04 2002    11:44AM

What about pages (a site) that have more than one RSS file associated with them? For instance, say Jason chose to keep a list of books he was reading on the front page of his site along with the regular blog content. How would the tools respond to two links to RSS files from the same page?

Example:


Mark Pilgrim    Jun 04 2002    12:09PM

Ben: I'm running 20020529 and the option is just gone from the View menu. Maybe it's been fixed; that'd be great.

Shmuel: multiple RSS files are fine, just have one LINK tag for each of them, with different titles. I do this for the category-specific feeds on my site. View-source on diveintomark.org for an example.
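To make the multiple-feed case concrete, here is a small Python sketch of how a tool could collect every advertised feed along with its title and let the user pick. The feed titles, file names, and the `FeedLinkCollector` and `list_feeds` names are invented for the example; they are not the markup from diveintomark.org.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkCollector(HTMLParser):
    """Collects (title, href) for every RSS <link rel="alternate"> tag."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        is_alternate = "alternate" in a.get("rel", "").lower().split()
        is_rss = a.get("type", "").lower() == "application/rss+xml"
        if is_alternate and is_rss and a.get("href"):
            self.feeds.append((a.get("title", ""), a["href"]))


def list_feeds(page_url, html):
    """Return [(title, absolute_url), ...] in document order."""
    collector = FeedLinkCollector()
    collector.feed(html)
    return [(title, urljoin(page_url, href)) for title, href in collector.feeds]


# Hypothetical front page with two feeds, one per LINK tag.
page = """<head>
<link rel="alternate" type="application/rss+xml" title="Weblog" href="/index.xml">
<link rel="alternate" type="application/rss+xml" title="Books" href="/books.xml">
</head>"""

for title, url in list_feeds("http://example.com/", page):
    print(title, url)
```

The distinct titles are what let a tool show the user a meaningful choice between the feeds.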

mattjacob    Jun 05 2002    4:00PM

In build 2002053012 (v1.0 release), the option is also no longer there. I hope they bring it back.

This thread is closed to new comments. Thanks to everyone who responded.
