kottke.org

...is a weblog about the liberal arts 2.0 edited by Jason Kottke since March 1998 (archives). You can read about me and kottke.org here. If you've got questions, concerns, or interesting links, send them along.

Tags and kottke.org

A few months ago, I began tagging my remaindered links with keywords toward some still-unspecified goal. For instance, this recent post about an interview with Ruth Reichl got tagged with "nyc food restaurants ruthreichl books interviews". As I said, I haven't figured out what to do with them yet, but the other day I whipped up a little PHP script to see how the kottke.org tagspace was shaping up. Here are a few results:

# of entries tagged: 933
total # of tags: 3960
# of distinct tags: 1376
tags per entry: 4.244
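
For anyone who wants to run the same kind of count on their own site, here is a rough sketch of how numbers like these can be computed. It is written in Python rather than PHP and is not the script mentioned above. It assumes each entry's tags are stored as one space-separated string, as in the Ruth Reichl example; the function name `tag_stats` and most of the sample data are made up for illustration.

```python
from collections import Counter


def tag_stats(entries):
    """Summarize a list of space-separated tag strings, one string per entry."""
    # Split each entry into tags; entries with no tags are not counted as tagged.
    tagged = [e.split() for e in entries if e.split()]
    counts = Counter(tag for tags in tagged for tag in tags)
    total = sum(counts.values())
    return {
        "entries_tagged": len(tagged),
        "total_tags": total,
        "distinct_tags": len(counts),
        "tags_per_entry": round(total / len(tagged), 3) if tagged else 0.0,
        # (tag, count) pairs, most frequent first
        "most_popular": counts.most_common(),
    }


# Hypothetical sample data; only the first string comes from the post.
sample = [
    "nyc food restaurants ruthreichl books interviews",
    "nyc movies lists",
    "science photography lists",
    "",  # an untagged entry
]

stats = tag_stats(sample)
print(stats["entries_tagged"], stats["total_tags"], stats["distinct_tags"])
print(stats["tags_per_entry"])
print(stats["most_popular"][:2])
```

The same `most_popular` list, cut off at the top 20, gives a ranking like the one that follows.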

Most popular tags (#):
science (80)
nyc (80)
movies (80)
business (73)
food (68)
photography (62)
funny (57)
books (53)
lists (53)
www (43)
music (43)
weblogs (40)
art (39)
design (34)
restaurants (34)
sports (34)
apple (33)
google (32)
technology (29)
nostalgia (27)

That's a fairly accurate description of both what the site is about and what I am interested in. Two of my favorite tags are "lists" and "bestof". Here's a sampling from each of those tags:

lists:
100 people who are qualified to carry the "Bad Mothafucka" wallet besides Pulp Fiction's Jules Winnfield
Photo essay of the Hubble Telescope's top ten discoveries
50 Things to Do with Your iPod
Twelve ways to think differently
Pickup Lines Used by Mario [of Mario Bros. fame]
20 things gamers want from the next generation of game consoles
Money Magazine on the 50 smartest things you can do with your money
40 things that only happen in the movies
24 different ways to lace your shoes

bestof:
Is Shaq the greatest NBA player of all time?
Spin names Radiohead's OK Computer the best album from the last 20 years
BusinessWeek Design Award winners for 2005
BBC Radio 4 poll results for Greatest Philosopher Ever!!
New bookmark: interesting Flickr photos from the last 24 hours, automagically determined

The dream is to go back and tag every single entry on the site -- currently ~8700 -- but it would take me approximately forever and I'm not sure it's worth the time and debilitating injuries to my wrists and fingers from all the typing. I've thought about a few alternative approaches (and their associated downsides):

  • Feed all my URLs into del.icio.us via the API and scrape out the tags most commonly associated with those links and posts. I literally haven't looked at the API, so I don't know if this is even possible. Also, I'm not sure I want to trust the del.icio.us community to collaboratively tag my posts and links...there would probably be a significant amount of correction and addition of tags by hand.
  • Use Yahoo's Term Extraction service to build a list of keywords based on an analysis of my posts and the content of the pages I point to within a post or remaindered links. I have no idea how well this would work in practice, especially in returning terms that make good tags. Probably a lot of hand-correction here too.
  • Get my readers (that's you!) to tag them for me using the list of tags I've already used as a guideline. Unfortunately, you should never trust anyone over 30 or anyone who has access to an HTML textarea into which they can type anything they want. Given enough time, I could probably come up with a system that minimizes the damage a particular malcontent could do, but as with the other two options, I'm still left with a fair amount of correction by hand. A bigger problem I have with this option is that there's a lot in it for me (and the site), but I'm not sure there's any real incentive for any of you to spend 20 minutes tagging kottke.org posts (I believe this chore would be the first entry in the dictionary under "mindless busywork"), so I'd feel weird about asking.
  • Some combination of the above approaches.
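
As one illustration of the damage-limiting idea in the reader-tagging option, here is a minimal Python sketch. Everything in it is an assumption rather than a description of anything running on kottke.org: reader suggestions are accepted only if they match the existing tag vocabulary, and a tag sticks to an entry only once a minimum number of different readers have suggested it. The function name, the `min_votes` threshold, the reader names, and the entry ID are all hypothetical.

```python
from collections import Counter, defaultdict


def accept_reader_tags(submissions, vocabulary, min_votes=2):
    """Filter reader-suggested tags.

    submissions: iterable of (reader_id, entry_id, raw_text) tuples, where
    raw_text is whatever the reader typed into the textarea.
    vocabulary: set of tags already in use on the site.
    Returns {entry_id: sorted list of accepted tags}.
    """
    votes = defaultdict(Counter)  # entry_id -> tag -> number of distinct readers
    seen = set()                  # (reader_id, entry_id, tag) already counted
    for reader_id, entry_id, raw_text in submissions:
        for tag in raw_text.lower().split():
            if tag not in vocabulary:
                continue  # drop anything that isn't an existing tag
            key = (reader_id, entry_id, tag)
            if key in seen:
                continue  # one vote per reader per tag per entry
            seen.add(key)
            votes[entry_id][tag] += 1
    return {
        entry_id: sorted(t for t, n in counts.items() if n >= min_votes)
        for entry_id, counts in votes.items()
    }


# Hypothetical submissions for a made-up entry ID.
vocabulary = {"nyc", "food", "books", "movies"}
submissions = [
    ("alice", 101, "NYC food"),
    ("bob", 101, "nyc <script>alert(1)</script>"),
    ("bob", 101, "nyc nyc food"),
    ("mallory", 101, "movies movies movies"),
]

accepted = accept_reader_tags(submissions, vocabulary)
print(accepted)
```

A filter like this only blocks junk and single-reader ballot stuffing. It can't tell whether an agreed-upon tag is actually a good one, so the hand-correction problem doesn't go away.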

So yeah, that's where I am with the tagging.

By Jason Kottke    Aug 17, 2005 at 12:26 pm    kottke.org   tags
