26 May 2004
How would you sample home pages and weblogs in the UK? My definition would be: “sites that are not primarily in furtherance of professional goals (e.g. online CVs, artists’ galleries of their own work, etc.), are not explicitly temporary, are substantially the work of a single individual, and are not closed to the public either explicitly (through a password) or implicitly (for example, collections of photos from an event without an accompanying narrative that are only meant to be accessed by a small group for a short time, even if they are openly available online).”

If I had a long list of random UK home pages, however, I could weed out the ones that didn’t belong myself.

I thought about sampling randomly from directories compiled by Geocities or Freeserve/Wanadoo, but when I looked it seemed they no longer index their pages. Do they have directories somewhere that I missed?

Using Yahoo or DMoz would introduce obvious biases because submission is not automatic.

Tripod still does have “directories of its UK users”:http://www.tripod.lycos.co.uk/directory/homepages/ and it seems like the best bet so far, but how representative would Tripod users be of all users? Searching for ‘personal home page uk’ in Google gets me nowhere.

How should I balance blogs with home pages? The stats from Pew suggest I should include about one blog for every four home pages. What do you think is the best way to randomly sample weblogs? There used to be a master directory of Blogger ones. Is there still? Is there any up-to-date info on the relative popularity of the various weblogging platforms?
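For what it’s worth, once I have candidate lists, the draw itself is the easy part. Here is a rough Python sketch of what I have in mind, assuming I already hold one list of home page URLs and one of weblog URLs. The function name, its parameters and the example.org addresses are all made up for illustration, and the 1:4 split is just my reading of the Pew figures.

```python
import random

def sample_sites(home_pages, blogs, n_total, blog_parts=1, home_parts=4, seed=None):
    """Draw a simple random sample with about blog_parts blogs per home_parts home pages."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # With a 1:4 ratio, blogs make up one fifth of the whole sample.
    n_blogs = round(n_total * blog_parts / (blog_parts + home_parts))
    n_home = n_total - n_blogs
    if n_blogs > len(blogs) or n_home > len(home_pages):
        raise ValueError("not enough candidate sites to fill the sample")
    # random.sample draws without replacement, so no site is picked twice.
    return rng.sample(home_pages, n_home), rng.sample(blogs, n_blogs)

# Hypothetical candidate lists; real ones would come from a directory.
home_candidates = [f"http://example.org/home/{i}" for i in range(200)]
blog_candidates = [f"http://example.org/blog/{i}" for i in range(80)]

homes, weblogs = sample_sites(home_candidates, blog_candidates, n_total=50, seed=2004)
print(len(homes), "home pages and", len(weblogs), "weblogs")
```

This only fixes the proportions. It does nothing about the harder problem of whether the candidate lists are representative in the first place, and I would still weed out by hand the sites that don’t fit my definition.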

Jill Walker mentions how people visiting a post on the “Noetech blog”:http://blog.noetech.com/archives/2004/04/13/overhaulin.shtml that mentioned watching a TV show seem to think the blogger actually runs that show.

It’s bizarre but this sort of thing has happened to me, too. I “mentioned”:http://blog.org/archives/000617.html months ago that Philip Pullman’s trilogy was being streamed by the BBC and I received several comments (since removed to avoid confusion) that clearly suggested the commenters thought that Pullman was reading or even writing the blog. It’s as if readers just skimmed looking for their keywords, ignored the context and blurted out whatever was in their heads…

I guess if I want readers “all I need to do”:http://www.google.com/press/zeitgeist.html is talk about how I enjoyed American Idol during spring break and when I finished I listened to Howard Stern talk about Iraq with Halle Berry and Lindsay Lohan.

Thanks to “Lila”:http://blog.mathemagenic.com/ for the link