kottke.org posts about tags
While not as extensive as Rex's collection Dec 10 2007
Made some slight tweaks to the site, Aug 08 2007
Made some slight tweaks to the site, mostly on the front page. Might need to refresh the stylesheet (shift-reload or cmd-reload usually works) to see them properly. The left column is a bit wider and tags now appear on posts. The monthly archive pages are a bit busted, but I'm hoping to have them running again soon.
kottke.org tags Jun 08 2007
After working on this -- on again and off again, mostly off -- for much too long, I'm pleased to say that a significant chunk of kottke.org now has tags (around 5,100 entries are tagged, out of ~13,000). Right now, the only way to access them is through individual tag pages, but after all the bugs are ironed out, I'll be putting them in different places around the site (front page, main archive page, etc.).
Each tag page lists all the entries¹ on the site that are tagged with that particular word...some good examples to start you off are: photography, economics, lists, infoviz, food, nyc, cities, restaurants, video, timelapse, interviews, language, maps, and fashion. Each page also has a list of tags related to that particular tag and further down in the sidebar, you'll find lists of recently popular tags, all-time popular tags, a few favorite tags of mine, and some random tags...lots of stuff to explore.
I've tweaked the design as well: the main column is a little wider, the post metadata look/feel is consistent among short posts and long posts, faint dotted lines now separate all entries, and per-entry tags were added to the post metadata. I'm testing all that out for eventual site-wide use. Questions, comments, bug reports, etc. are welcome...send them on in.
Update: I almost forgot, the nsfw tag.
¹ Not all the entries exactly. Until I figure out how to do some pagination, I've limited the number of entries to 100 for each tag page. The movies page was more than 1 MB when all the entries were listed.
Fotolog overtaking Flickr? Jan 31 2007
Fotolog launched in May 2002 and grew quite quickly at first. They'd clearly hit upon a good idea: sharing photos among groups of friends. As Fotolog grew, they ran into scaling problems...the site got slow and that siphoned off resources that could have been used to add new features to the site, etc. Problems securing funding for online businesses during the 3-4 years after the dot com bust didn't help matters either.
Flickr launched in early 2004. By the end of their first year of operation, they had a cleaner design than Fotolog, more features for finding and organizing photos, and most of the people I knew on Fotolog had switched to Flickr more or less exclusively. They also had trouble with scaling issues and downtime. Flickr got the scaling issues under control and the site became one of the handful of companies to exemplify the so-called Web 2.0 revitalization of the web. The founders landed on tech magazine covers, news magazine covers, and best-of lists, the folks who built the site gave talks at technology conferences, and the company eventually sold to Yahoo! for a reported $30 million.
So. Then. Here's where it gets puzzling. According to Alexa¹, Fotolog is now the 26th most popular site on the web and recently became more popular than Flickr (currently #39). Here's the comparison between the two over the last 3 years:
This is a somewhat stunning result because by all of the metrics held in high esteem by the technology media, Web 2.0 pundits, and those selling technology and design products & services, Flickr should be kicking Fotolog's ass. Flickr has more features, a better design, better implementation of most of Fotolog's features, more free features, critical praise, a passionate community, and access to the formidable resources & marketing power of Yahoo! And yet, Fotolog is right there with them. Perhaps this is a sign that those folks trapped in the Web 2.0 bubble are not being critical enough about what is responsible for success on the Web circa-2007. (As an aside, MySpace didn't really fit the Web 2.0 mold either, nobody really talked about it until after it got huge, and yet here it is. And then there's Craigslist, which is more Web 0.5 than 2.0, and is one of the most popular sites on the web. Google too.)
What's going on here then? I can think of three possibilities (there are probably more):
1. Fotolog is very popular with Portuguese and Spanish speakers, especially in Brazil. According to Wikipedia, almost 1/3rd of all Fotolog users are from Brazil and Chile. In comparing the two sites, what could account for this difference? Fotolog has a Spanish language option while Flickr does not (although I'm not sure when the Spanish version of Fotolog launched). Flickr is more verbose and text-intensive than Fotolog and much of Flickr's personality & utility comes from the text while Fotolog is almost text-free; as a non-Spanish speaker, I could navigate the Spanish-language version quite easily. Gene Smith noted that a presentation made by a Brazilian internet company said that "Flickr is unappealing to Brazilians because they want to the customize the interface to express their individual identities".
Cameron Marlow noticed that Orkut is set to pass MySpace as the world's most popular social networking site (Orkut is also very popular in Brazil), saying that "Orkut's growth reinforces the fact that the value of social networking services, and social software in general, comes from the base of active users, not the set of features they offer". Marlow also notes that Alexa's non-US reporting has improved over the past year, which might be the reason for Fotolog's big jump in early 2006. If Alexa's global reporting had been robust from the beginning, Fotolog may have been neck and neck with Flickr the whole time.
2. Flickr is more editorially controlled than Fotolog. The folks who run Flickr subtly and indirectly discourage poor quality photo contributions. Yes, upload your photos, but make them good. And the community reinforces that constraint to the point where it might seem restricting to some. Fotolog doesn't celebrate excellence like that...it's more about the social aspect than the photos.
3. Maybe tags, APIs, and Ajax aren't the silver bullets we've been led to believe they are. Fotolog, MySpace, Orkut, YouTube, and Digg have all proven that you can build compelling experiences and huge audiences without heavy reliance on so-called Web 2.0 technologies. Whatever Web 2.0 is, I don't think its success hinges on Ajax, tags, or APIs.
Update: You can see how much Fotolog depends on international usage for its traffic from this graph from Compete. They only use US statistics to compile their data. I don't have access to the Comscore ratings, but they only count US usage and, like Alexa, undercount Firefox and Safari users. (thx, walter)
¹ Usual disclaimers about Alexa's correctness apply. The point is that among some large number of users, Fotolog is as popular as (or even more popular than) Flickr. Whether those users are representative of the web as a whole, I dunno.
Tag frequency and popularity acceleration Nov 03 2006
As many of you don't know, I've been working less-than-diligently¹ on a project with the eventual goal of adding tags to kottke.org. I posted some early results back in August of 2005. The other day, I started thinking about how tags could help people get a sense of what's been talked about recently on the site, like Flickr's listing of hot tags. I started by compiling a list of tags from the last 300 entries and ordering them by how many times they were used over that period. Here is the top 20 (with # of instances in parentheses):
photography (33), books (26), art (26), science (22), tv (21), movies (21), lists (20), video (17), nyc (16), weblogs (15), design (14), interviews (13), bestof (13), business (12), thewire (12), food (11), sports (11), games (10), language (10), music (9)
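A rough sketch of that counting step in Python, assuming each entry is available as a list of tag strings, newest first. The `entries` sample and the `top_recent_tags` name are made up for illustration; this is not the script that produced the list above.

```python
from collections import Counter

def top_recent_tags(entries, window, limit):
    """Count tags across the newest `window` entries and return the
    `limit` most frequent ones as (tag, count) pairs."""
    counts = Counter()
    for tags in entries[:window]:
        # set() so a tag repeated within one entry is only counted once
        counts.update(set(tags))
    # Highest count first; ties are broken alphabetically so the order is stable
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:limit]

# Hypothetical sample data, newest entry first.
entries = [
    ["photography", "nyc"],
    ["books", "photography"],
    ["thewire", "tv", "davidsimon"],
    ["photography", "art"],
]
print(top_recent_tags(entries, window=3, limit=2))
```

With the sample data, only the first three entries fall inside the window, so the fourth photography entry is ignored.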
The items in bold also appear in the top 50 of the all-time popular tags, so obviously this list isn't telling us anything new about what's going on around here. To weed those always-popular tags from the list, I compared the recent frequency of each tag with its all-time frequency and came up with a list of tags that are freakishly popular right now compared to how popular they usually are. Call this list a measure of the popularity acceleration of each tag. The top 20:
blindside (3), pablopicasso (3), ghostmap (3), davidsimon (5), poptech2006 (4), thewire (12), andywarhol (3), michaellewis (4), education (4), youtube (4), richarddawkins (5), realestate (3), crime (8), working (8), school (3), dvd (4), georgewbush (4), stevenjohnson (5), writing (4), photoshop (3)
(Note: I also removed tags with fewer than three instances from this list and the ones below.) Now we're getting somewhere. None of these appear in the top 50 all-time list. But it's still not that accurate a list of what's been going on here recently. I've posted 3 times about Photoshop, but you can't discount entirely the 33 posts about photography. What's needed is a mix of the two lists: generally popular tags that are also popular right now (first list) + generally unpopular tags that are popular right now (second list). So I blended the two lists together in different proportions:
75% recent / 25% all-time:
davidsimon (5), poptech2006 (4), ghostmap (3), pablopicasso (3), blindside (3), thewire (12), andywarhol (3), michaellewis (4), education (4), photography (33), art (26), youtube (4), tv (21), richarddawkins (5), books (26), crime (8), video (17), working (8), realestate (3), science (22)
67% recent / 33% all-time:
davidsimon (5), poptech2006 (4), pablopicasso (3), ghostmap (3), blindside (3), thewire (12), andywarhol (3), photography (33), art (26), michaellewis (4), education (4), tv (21), books (26), youtube (4), video (17), science (22), richarddawkins (5), crime (8), movies (21), lists (20)
50% recent / 50% all-time:
thewire (12), davidsimon (5), photography (33), poptech2006 (4), blindside (3), ghostmap (3), pablopicasso (3), art (26), books (26), tv (21), science (22), movies (21), lists (20), andywarhol (3), video (17), michaellewis (4), education (4), nyc (16), weblogs (15), crime (8)
25% recent / 75% all-time:
photography (33), art (26), books (26), tv (21), science (22), movies (21), lists (20), thewire (12), video (17), nyc (16), weblogs (15), davidsimon (5), poptech2006 (4), design (14), interviews (13), bestof (13), blindside (3), ghostmap (3), pablopicasso (3), business (12)
The 75% and 67% recent lists look like a nice mix of the newly & perennially popular and a fairly accurate representation of the last 3 weeks of posts on kottke.org.
Digression for programmers and math enthusiasts only: I'm curious to know how others would have handled this issue. I approached the problem in the most straightforward manner I could think of (using simple algebra) and the results are pretty good, but it seems like an approach that makes use of an equation that approximates the distribution of the popularity of the tags (which roughly follows a power law curve) would work better. Here's what I did for each tag (using the nyc tag as an example):
# of recent entries: 300
# of total entries: 3399
# of recent instances of the nyc tag: 16
# of total instances of the nyc tag: 247
# of instances of the most frequent recent tag: 33
# of instances of the most frequent tag, all-time: 272
Calculate the recent and all-time frequencies of the nyc tag:
16/300 = 0.0533
247/3399 = 0.0726
Then divide the recent frequency by the all-time frequency to get the popularity acceleration:
0.0533/0.0726 = 0.7342
That's how much more popular the nyc tag is now than it has been all-time. In other words, the nyc tag is 0.7342 times as popular over the last 300 entries as it has been overall...~1/4 less popular than it usually is. To get the third list with the 75% emphasis on popularity acceleration and 25% on all-time popularity, I started by normalizing the popularity acceleration and all-time frequency by dividing the nyc tag values by the top value of the group in each case (11.33 is the popularity acceleration of the blindside tag and 0.11 is the recent frequency of the photography tag (33/300)):
0.7342/11.33 = 0.0647
0.0533/0.11 = 0.4845
So, the nyc tag has a popularity acceleration of 0.0647 times that of the blindside tag and has a recent frequency that is 0.4845 times that of the most popular recent tag. Then:
0.0647*0.75 + 0.4845*0.25 = 0.1696
Calculate this number for each recent tag, rank them from highest to lowest, and you get the third list above. Now, it seems to me that I may have fudged something in the last two steps, but I'm not exactly sure. And if I did, I don't know what got fudged. Any help or insight would be appreciated.
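For anyone who wants to poke at the numbers, here is the nyc calculation as a small Python sketch. The function names are invented for illustration, and the second term uses the recent frequency, as the arithmetic above does. It keeps full floating-point precision, so results can differ from the hand-rounded figures in the last digit.

```python
def frequency(instances, entries):
    """Share of entries that carry the tag."""
    return instances / entries

def blended_score(accel, recent_freq, max_accel, max_recent_freq, weight):
    """Normalize both measures against the top value in their group, then
    mix them: `weight` on acceleration, the rest on recent frequency."""
    normalized_accel = accel / max_accel
    normalized_freq = recent_freq / max_recent_freq
    return normalized_accel * weight + normalized_freq * (1 - weight)

# Figures for the nyc tag, taken from the post.
recent_entries, total_entries = 300, 3399
nyc_recent, nyc_total = 16, 247

recent_freq = frequency(nyc_recent, recent_entries)    # about 0.0533
all_time_freq = frequency(nyc_total, total_entries)    # about 0.0727
accel = recent_freq / all_time_freq                    # about 0.734

score = blended_score(
    accel,
    recent_freq,
    max_accel=11.33,           # blindside tag, from the post
    max_recent_freq=33 / 300,  # photography tag, from the post
    weight=0.75,
)
print(round(accel, 3), round(score, 2))
```

Running the same function over every recent tag and sorting by score would give a blended ranking; changing `weight` to 0.67, 0.5, or 0.25 corresponds to the other mixes.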
¹ Great artists ship. Mediocre artists ship slowly.
Five suggested Flickr tags. Merlin brings the Apr 14 2006
Five suggested Flickr tags. Merlin brings the funny. "Rows Of Seated White Men Typing At Conferences".
David Weinberger has some rough notes from Nov 01 2005
The funny thing about TagTagger is that Oct 31 2005
The funny thing about TagTagger is that it probably would be useful to tag tags; it could help tag ecosystems like del.icio.us and Flickr better determine how tagged items are related. Think of it as defining tags...the tag "andywarhol" could be metatagged something like "andy warhol nyc artist person art popculture modernart".
A series of art projects based on Oct 28 2005
A series of art projects based on Flickr. The Flickr tag cloud tshirt is clever; the printing on the shirts is reversed so that you can read them in the mirror..."the [Flickr user's] narrative is actually addressing himself while claiming to address others". (via ia)
Tags and kottke.org Aug 17 2005
A few months ago, I began tagging my remaindered links with keywords toward some still-unspecified goal. For instance, this recent post about an interview with Ruth Reichl got tagged with "nyc food restaurants ruthreichl books interviews". As I said, I haven't figured out what to do with them yet, but the other day I whipped up a little PHP script to see how the kottke.org tagspace was shaping up. Here are a few results:
# of entries tagged: 933
total # of tags: 3960
# of distinct tags: 1376
tags per entry: 4.244
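The four figures above are simple to reproduce. This is a Python sketch, not the PHP script mentioned earlier, and it assumes each entry's tags are stored as one space-separated string like the Ruth Reichl example; `tagspace_stats` and the sample data are hypothetical.

```python
def tagspace_stats(tag_strings):
    """Summarize a list of space-separated tag strings, one per entry."""
    # Entries with an empty tag string are treated as untagged and skipped
    tagged = [s.split() for s in tag_strings if s.strip()]
    all_tags = [tag for tags in tagged for tag in tags]
    per_entry = round(len(all_tags) / len(tagged), 3) if tagged else 0.0
    return {
        "entries_tagged": len(tagged),
        "total_tags": len(all_tags),
        "distinct_tags": len(set(all_tags)),
        "tags_per_entry": per_entry,
    }

sample = [
    "nyc food restaurants ruthreichl books interviews",  # from the post
    "lists bestof movies",                                # hypothetical
    "",                                                   # untagged entry
]
print(tagspace_stats(sample))
```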
Most popular tags (#):
That's a fairly accurate description of both what the site is about and what I am interested in. Two of my favorite tags are "lists" and "bestof". Here's a sampling from each of those tags:
100 people who are qualified to carry the "Bad Mothafucka" wallet besides Pulp Fiction's Jules Winfield
Photo essay of the Hubble Telescope's top ten discoveries
50 Things to Do with Your iPod
Twelve ways to think differently
Pickup Lines Used by Mario [of Mario Bros. fame]
20 things gamers want from the next generation of game consoles
Money Magazine on the 50 smartest things you can do with your money
40 things that only happen in the movies
24 different ways to lace your shoes
Is Shaq the greatest NBA player of all time?
Spin names Radiohead's OK Computer the best album from the last 20 years
BusinessWeek Design Award winners for 2005
BBC Radio 4 poll results for Greatest Philosopher Ever!!
New bookmark: interesting Flickr photos from the last 24 hours, automagically determined
The dream is to go back and tag every single entry on the site -- currently ~8700 -- but it would take me approximately forever and I'm not sure it's worth the time and debilitating injuries to my wrists and fingers from all the typing. I've thought about a few alternative approaches (and their associated downsides):
- Feed all my URLs into del.icio.us via the API and scrape out the tags most commonly associated with those links and posts. I literally haven't looked at the API, so I don't know if this is even possible. Also, I'm not sure I want to trust the del.icio.us community to collaboratively tag my posts and links...there would probably be a significant amount of correction and addition of tags by hand.
- Use Yahoo's Term Extraction service to build a list of keywords based on an analysis of my posts and the content of the pages I point to within a post or remaindered links. I have no idea how well this would work in practice, especially in returning terms that make good tags. Probably a lot of hand-correction here too.
- Getting my readers (that's you!) to tag them for me using the list of tags I've already used as a guideline. Unfortunately, you should never trust anyone over 30 or anyone who has access to an HTML textarea into which they can type anything they want. Given enough time, I could probably come up with a system that minimizes the damage a particular malcontent could do, but as with the other two options, I'm still left with a fair amount of correction by hand. A bigger problem I have with this option is there's a lot in it for me (and the site), but I'm not sure there's any real incentive for any of you to spend 20 minutes tagging kottke.org posts (I believe this chore would be the first entry in the dictionary under "mindless busywork"), so I'd feel weird about asking.
- Some combination of the above approaches.
So yeah, that's where I am with the tagging.
flickrTagFight pits tag against tag in a Aug 16 2005
flickrTagFight pits tag against tag in a folksonomic battle to the death. fTF has already started a conflict in my household...results of the kottke(145)/megnut(34) tag smackdown are being hotly disputed.
Keyword Assistant plug-in fixes iPhoto's stupid ass keyword-adding interface. Software developers, say it with me: "auto complete, auto complete, auto complete!"