22 November 2004
A few days ago Google announced that its index had grown from 4bn to 8bn pages.

That made me wonder what proportion of the web it covers now, and how much I was missing before. The last major survey of search engine coverage was in 1999 (Lawrence, Steve, and C. Lee Giles. 1999. “Accessibility and Distribution of Information on the Web”:http://www.wwwmetrics.com/ Nature, 1999, 107-109.) and concluded that no single search engine covered more than 16% of the visible web (coverage is probably better now). And what of the (much larger) invisible web? See Bergman, M. K. (2001) “The Deep Web: Surfacing Hidden Value”:http://www.press.umich.edu/jep/07-01/bergman.html, The Journal of Electronic Publishing, 7 (1) for more on that…


  1. The un-indexed part of the web is mostly pr0n sites, warez sites, and SEO pages. I don’t think we’re missing anything 🙂

    Comment by Harald — 22 November 2004 @ 2:59 am

  2. I came upon your site while surfing Blog Faces. You certainly have some interesting post(s). I enjoyed my time here. Happy Thanksgiving. 🙂

    Comment by Dariana — 23 November 2004 @ 7:59 pm

