How would you sample home pages and weblogs in the UK? My definition would be: sites that are not primarily in furtherance of professional goals (e.g. online CVs, artists' galleries of their own work, etc.), are not explicitly temporary, are substantially the work of a single individual, and are not closed to the public either explicitly (through a password) or implicitly (for example, collections of photos from an event without an accompanying narrative that are only meant to be accessed by a small group for a short time, even if they are openly available online).
If I had a long list of random UK home pages, however, I could weed out the ones that didn’t belong myself.
I thought about sampling randomly from directories compiled by Geocities or Freeserve/Wanadoo but I looked and it seems they no longer index their pages. Do they have directories somewhere I missed?
Using Yahoo or DMoz would introduce obvious biases because submission is not automatic.
Tripod still does have “directories of its UK users”:http://www.tripod.lycos.co.uk/directory/homepages/ and it seems like the best bet so far, but how representative would Tripod users be of all users? Searching for ‘personal home page uk’ in Google gets me nowhere.
How should I balance blogs with home pages? The stats from Pew suggest I should include about one blog for every four home pages. What do you think is the best way to randomly sample weblogs? There used to be a master directory of Blogger ones. Is there still? Is there any up-to-date info on the relative popularity of the various weblogging platforms?
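For what it's worth, the one-blog-to-four-home-pages balance could be applied as a simple stratified draw once the two candidate lists exist. The sketch below is only an illustration: the function names `allocate_sample` and `draw_sample` are hypothetical, and it assumes the lists of blog and home page URLs have already been compiled and weeded by hand.

```python
import random

def allocate_sample(total, ratio=(1, 4)):
    """Split a total sample size into (blogs, home_pages).

    ratio is (blogs, home pages); the default 1:4 is the balance
    suggested by the Pew figures mentioned above.
    """
    blog_part, home_part = ratio
    blogs = round(total * blog_part / (blog_part + home_part))
    return blogs, total - blogs

def draw_sample(blog_urls, home_page_urls, total, seed=None):
    """Draw a simple random sample, without replacement, from each list.

    Both lists are assumed to be already-cleaned candidate URLs and
    must be at least as long as their share of the sample.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable
    n_blogs, n_home = allocate_sample(total)
    return rng.sample(blog_urls, n_blogs), rng.sample(home_page_urls, n_home)
```

With a target of 100 sites this allocates 20 blogs and 80 home pages. It does nothing about the harder problem of how representative the underlying lists are.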
The way that Herring et al. randomly sample weblogs is by using http://blo.gs/. I think these cataloguing websites are perhaps the best way of doing it. As for homepages, it’s more of a toughie. Homepage webrings might be another route, although again, registration is not automatic. Also, I would question your definition of homepages, since many blogs are used primarily to further professional goals (e.g. academic weblogs discussing research).
Comment by mark brady — 27 May 2004 @ 10:40 am
I think this is a really hard question. I gave up on a project back in 1996 because I just couldn’t figure out how you could possibly get a random sample of personal homepages. Are you looking for a random sample? I’d say that’s just about impossible. And are you looking for any and all types of UK personal pages? Here is an idea. It may not work depending on what you’re interested in.
You could do a search on Google (or whatever search engine) for something like this:
site:.uk friends family homepage
The first few hits may not be relevant, but you could skip down to the 200th result quickly and start looking around there. (Don’t bother hitting the arrow that many times, just alter the number in the URL to skip ahead.) By the way, as I have noted earlier you can’t go past the 1000th result in Google, which is quite a bummer in such cases.
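To make the skip-ahead trick concrete, here is a rough sketch of building such a URL. It assumes the number in question is the `start` query parameter and that each results page shows ten hits. The helper name `results_url` is made up for illustration.

```python
from urllib.parse import urlencode

# The 1000-result ceiling mentioned above.
MAX_RESULTS = 1000

def results_url(query, start, per_page=10):
    """Build a search URL that begins at the given result offset.

    Assumes the offset is carried in a `start` query parameter and
    that a page holds `per_page` results.
    """
    if start < 0 or start + per_page > MAX_RESULTS:
        raise ValueError("offset falls outside the first 1000 results")
    params = urlencode({"q": query, "start": start})
    return "https://www.google.com/search?" + params

# Jump straight to the page that begins after the 200th result.
print(results_url("site:.uk friends family homepage", 200))
```

Picking the offsets at random, rather than always starting at 200, would spread the pages you look at across the whole range the search engine will show.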
How many pages are you looking for?
Of course, if you are particularly interested in seeing whether people mention family and friends on their homepages, then obviously this method doesn’t work. But perhaps it could be tweaked depending on your particular interest (or better yet, non-interest, for sampling purposes).
Comment by eszter — 27 May 2004 @ 4:27 pm