How would you sample home pages and weblogs in the UK? My definition would be: sites that are not primarily in furtherance of professional goals (e.g. online CVs or artists’ galleries of their own work), are not explicitly temporary, are substantially the work of a single individual, and are not closed to the public, either explicitly (through a password) or implicitly (for example, collections of photos from an event, without an accompanying narrative, that are meant to be accessed only by a small group for a short time, even if they are openly available online).
If I had a long list of random UK home pages, however, I could weed out the ones that didn’t belong myself.
I thought about sampling randomly from directories compiled by Geocities or Freeserve/Wanadoo, but I looked and it seems they no longer index their pages. Do they have directories somewhere that I missed?
Using Yahoo or DMoz would introduce obvious biases because submission is not automatic.
Tripod still has “directories of its UK users”:http://www.tripod.lycos.co.uk/directory/homepages/, and that seems like the best bet so far, but how representative would Tripod users be of all users? Searching for ‘personal home page uk’ in Google gets me nowhere.
How should I balance blogs with home pages? The stats from Pew suggest I should include about one blog for every four home pages. What do you think is the best way to randomly sample weblogs? There used to be a master directory of Blogger ones. Is there still? Is there any up-to-date info on the relative popularity of the various weblogging platforms?
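To make the one-in-four split concrete, here is a rough sketch of how the draw could work once two candidate lists exist. Everything in it is an assumption for illustration: the function name @sample_sites@, the placeholder lists and the sample size of 100 are made up, and it takes for granted that each list is already a reasonable sampling frame, which is exactly the open question above.

```python
import random


def sample_sites(home_pages, blogs, total, blog_share=(1, 4), seed=None):
    """Draw a simple random sample from each list, keeping the given ratio.

    blog_share=(1, 4) means one blog for every four home pages.
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    blog_part, home_part = blog_share
    n_blogs = round(total * blog_part / (blog_part + home_part))
    n_home = total - n_blogs
    # random.sample draws without replacement and raises ValueError
    # if a list is shorter than the number requested.
    return rng.sample(home_pages, n_home), rng.sample(blogs, n_blogs)


if __name__ == "__main__":
    # Placeholder URLs standing in for real directory listings.
    home_pages = [f"http://example.org/home/{i}" for i in range(1000)]
    blogs = [f"http://example.org/blog/{i}" for i in range(500)]
    chosen_home, chosen_blogs = sample_sites(home_pages, blogs, total=100, seed=1)
    print(len(chosen_home), len(chosen_blogs))
```

Sites that fail the definition can then be weeded out by hand and replaced with further draws from the same list, which keeps the ratio intact.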