Weblogs and power laws
FEB 09 2003
Many systems and phenomena follow a power law distribution. A power law applies to a system when large is rare and small is common. The distribution of individual wealth is a good example of this: there are very few rich people and lots & lots of poor folks. A familiar way to think about power laws is the 80/20 rule: 80% of the wealth is controlled by 20% of the population.
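To make the 80/20 idea concrete, here's a rough sketch that measures how much of a total the top 20% of holders control. The `top_share` helper and the wealth figures are made up for illustration (each holder simply gets 1000 divided by their rank); they are not real wealth data.

```python
def top_share(values, top_fraction=0.2):
    """Return the fraction of the total held by the largest `top_fraction` of values."""
    ordered = sorted(values, reverse=True)
    # Always count at least one holder, even for tiny lists.
    count = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:count]) / sum(ordered)


# Hypothetical "wealth" figures: the holder at rank r gets 1000 / r.
wealth = [1000 / rank for rank in range(1, 101)]
share = top_share(wealth)
print(f"Top 20% hold {share:.0%} of the total")
```

With these invented numbers the top 20% end up with roughly 70% of the total, which is in the same neighborhood as the 80/20 rule. A steeper falloff than 1/rank would push the share higher.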
It's been shown that the distribution of links on the web scales according to a power law, so it comes as no surprise that the distribution of links to weblogs does as well. Taking the 100 most-linked-to weblogs on Technorati as a data set (specifically from 1/24/03), I used Excel to plot the data and fit a curve to it:
The data conforms quite well to a power law curve. The R-squared value, a measure of how well the curve fits the data (1.0 is a perfect fit), is 0.9918. I ran a similar analysis on the top 200 inbound referrers to kottke.org and found that the data also fit a power law curve (R-squared = ~0.95). Clay Shirky showed that the distribution of the number of outbound links in the LiveJournal community follows a power law. Paul Hammond has observed a similar pattern with his outgoing links.**
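If you don't have Excel handy, a fit like this can be sketched in a few lines of Python. This is one common approach, not necessarily the exact calculation Excel performs: take the logarithm of both rank and link count, fit a straight line by least squares, and compute R-squared on the log-transformed values. The `fit_power_law` function is a made-up name, and the data below is a synthetic stand-in, not the Technorati numbers.

```python
import math


def fit_power_law(ranks, counts):
    """Fit counts ~ a * rank**b by least squares on log-transformed data.

    Returns (a, b, r_squared), with r_squared computed in log-log space.
    """
    xs = [math.log(r) for r in ranks]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope of the line = power law exponent
    log_a = mean_y - b * mean_x   # intercept = log of the scale factor
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1 - ss_res / ss_tot


# Synthetic stand-in data (NOT the Technorati numbers): a power law
# with exponent -0.6 plus a small deterministic wobble.
ranks = list(range(1, 101))
counts = [800 * r ** -0.6 * (1 + 0.05 * math.sin(r)) for r in ranks]

a, b, r2 = fit_power_law(ranks, counts)
print(f"y = {a:.1f} * x^{b:.3f}, R-squared = {r2:.4f}")
```

To try it on real data, replace `ranks` and `counts` with your own list (rank 1 being the most linked). Every count has to be greater than zero, since the logarithm of zero is undefined.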
This NEC study reveals that how far a set of data deviates from a power law correlates with how much competition is present in the system. The better the fit, the more competitive the environment is. Again, no surprise that the system of weblogs is a highly competitive one.
But what are weblogs competing for? Matt Webb posits that power laws arise due to scarcity. Links themselves can't be scarce (a page can hold as many links as you care to put on it), but they are a measure of something that is: people.
More specifically, the time that people have for visiting sites and linking to sites is limited. Mary only has so much time for visiting weblogs; if she goes to BoingBoing, she doesn't have time for MetaFilter. Some visitors are linkers and they link what they visit. Similarly, linkers have only so much time for linking. Sam can link to 20 sites about airplanes, but he can't link to 5000. The scarcity of people's time results in a distribution of links that can be described by a power law.
** Other places you *might* find power laws in the weblog world if you took the time to look: Daypop Top 40, Blogdex top links, the Blogging Ecosystem (in both "most linked" and "most prolific linkers" data sets), average # of posts per weblog, average # of words per post, average # of smileys per post, # of visitors per weblog, # of comments per post per weblog, and so on...
Further reading on weblogs, power laws, small worlds, the 80/20 rule, the rich get richer phenomenon, Zipf's Law, Pareto's Law, etc.:
Small worlds & LiveJournal (Matt Webb)
Like bloggers link like bloggers (Steve Himmer)
The weblog them, the weblog us (Tom Coates)
Internet Navigators Think Small (MSNBC)
Scarcity and power laws (Matt Webb)
Ecosystems, Power Laws, Counters (N.Z. Bear)
Power Laws, Weblogs, and Inequality (Clay Shirky)
Small Worlds (Duncan Watts)
Linked: The New Science of Networks (Albert-László Barabási)
Nexus: Small Worlds and the Groundbreaking Science of Networks (Mark Buchanan)
Ubiquity: The Science of History, or Why the World Is Simpler Than We Think (Mark Buchanan)
Six Degrees: The Science of a Connected Age (Duncan Watts)