Weblog on the Internet and public policy, journalism, virtual community, and more from David Brake, a Canadian academic, consultant and journalist

Archive for July, 2004

21 July 2004
Filed under: Academia, Search Engines at 10:08 am

As I “posted earlier”:http://blog.org/archives/001126.html I would like to find a way to draw a random sample of home pages from the UK. As it turns out, if you search for “personal home page” and specify that you are only interested in UK pages, Google and Yahoo will give you a selection that includes lots of home pages (the UK versions of both understand whether sites are UK or not, though the algorithm is not perfect). But I worry a little that there are lots of home pages that do not include the text ‘home page’ prominently and that they might tend to be a different kind of home page, so tacitly excluding them might skew the results.

I also found that the two largest ISPs in the UK (I think) – “AOL UK”:http://hometown.aol.co.uk/mt.ssp?c=9011000 and “Wanadoo”:http://www.wanadoo.co.uk/sitebuilder/search.htm (formerly Freeserve) – have pages where you can search home pages created by their members. If you search these for a common word like “the”, you can also get a seemingly random sample, but this might be tainted by any demographic skew in the kind of people who choose to use those tools. What do you think of using that as a method?

Are there other ways of sampling by keyword you could suggest? Any articles about web page sampling you can recommend?
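For what it is worth, here is a minimal sketch of how the drawing step might look once the candidate URLs have been collected by hand from the search engines and the ISP directories. Everything in it is an assumption made for illustration: the function names, the source labels and the example.org addresses are hypothetical placeholders, not real results.

```python
import random


def pool_candidates(sources):
    """Merge URL lists from several sources into one de-duplicated, sorted list."""
    pooled = set()
    for urls in sources.values():
        for url in urls:
            cleaned = url.strip()
            if cleaned:
                pooled.add(cleaned)
    # Sorting removes any dependence on set ordering, so a given seed is reproducible.
    return sorted(pooled)


def draw_sample(candidates, k, seed=2004):
    """Draw up to k URLs without replacement using a seeded generator."""
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


# Hypothetical placeholder data, not real search results.
sources = {
    "search_engine": ["http://example.org/~alice/", "http://example.org/~bob/"],
    "isp_directory": ["http://example.org/~bob/", "http://example.net/carol/ "],
}
candidates = pool_candidates(sources)
sample = draw_sample(candidates, 2)
print(sample)
```

This only ensures that every pooled URL has the same chance of being picked. It does nothing about pages that never made it into the pool, so the worries above about keyword and demographic skew still apply.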

P.S. I came across an “attempt at automating page classification”:http://students.iiit.ac.in/~kranthi/professional/papers/ieee_wpcds_1.shtml which the authors claimed works but unless I could somehow run it myself on a collection of UK URLs (and defend its reliability) it probably wouldn’t be of much help. I also ran across a second paper on “automated web page classification”:http://csdl.computer.org/comp/trans/tk/2004/01/k0070abs.htm but I couldn’t access it and it didn’t look as if it could help in any case unless I was trying to build my own search engine.

20 July 2004

A European pundit, Thierry Chervel, complains that key European newspapers and ‘cultural journals’ are not available online and suggests this impoverishes Europe’s public sphere. To prove his point he cites the failure of an initiative by Jürgen Habermas, who wanted to launch his “Kerneuropa-initiative” against the Iraq war and the “new Europe” via various European newspapers:

He published his own article in the Frankfurter Allgemeine Zeitung, and assigned his colleagues to the Suddeutsche Zeitung, to the El Pais and in the Corriere della Serra. None of these papers however published the articles online. An interested intellectual in Madrid, Paris or Berlin would have had to go the main train station and purchase four newspapers from three different countries. A few days later, the debate was quickly forgotten.

Had Habermas invested a few thousand Euros to build his small website, had he published his article and those of his colleagues simultaneously in English, the sensation would have been big.

Well, it is not clear that this would have happened (and it seems that Habermas’ statement “actually is available online”:http://www.faz.net/s/Rub117C535CDF414415BB243B181B8B60AE/Doc~ECBE3F8FCE2D049AE808A3C8DBD3B2763~ATpl~Ecommon~Scontent.html), but the general point is an interesting one. It would certainly be nice if the major non-English-language European newspapers and magazines published their articles online for free and translated them into English – it would give a much broader perspective to the online audience – but it is unlikely to happen, alas.

Mark Liberman “posted”:http://itre.cis.upenn.edu/~myl/languagelog/archives/001168.html his own interesting comment on and critique of this article, asserting (correctly, I suspect) that the root cause of this problem is not so much economic conservatism on the part of European newspapers as a larger “Internet illiteracy” on the part of many mainstream European intellectuals (including Thierry Chervel, who does not have a website of his own). Hopefully this will change over time…

19 July 2004

The Cybergypsies: A True Tale of Lust, War, & Betrayal on the Electronic Frontier by Indra Sinha is yet another book about unusual experiences online, but with several key points of interest. Many such books were written by over-excited US journalists who just dipped into that world. This one was written by someone based in the UK who had a life outside the online world (a responsible job, wife and child) but who got very involved in online communities. It’s also of some historical interest because he was writing about the pre-Internet online world, where being online 24/7 wouldn’t just cost you time but a considerable amount of money.

He gives an interesting, colourful and personal glimpse of what life online was like back then for some. But though the book appears to be an autobiography, it is written in a deliberately poetical/impressionistic style, leaving the reader uncertain how much of what they’ve read they can believe.

If there is someone out there reading this blog who was around online in the UK back in the early to mid 90s, hung out on Shades or the Vortex, met ‘bear’ there and has read the book I would be interested in your comments (public or private). How was he seen in those communities after he published? I have a feeling I have met one or two people who were there…

18 July 2004
Filed under: London, Personal at 11:37 am

Thanks in part to the lobbying of the “Newington Green Action Group”:http://ngag.org/, which I helped to run for several years, the council has given my local park and its surroundings a “substantial facelift”:http://community.webshots.com/album/164183779IQuNzm.

In the seven years since I moved in, the neighborhood has already changed significantly – we have gained a “genuine French patisserie”:http://www.n16mag.com/issue20/p13i20.htm, several restaurants and a “vegetarian deli”:http://www.myhackney.co.uk/hackney/restaurants-newingtongreen.htm among other amenities. With the boost that the newly laid-out park will give, I hope what was once a neglected traffic roundabout will become the neighborhood focus it always should have been and the benefits will be felt by all who live here for generations to come.

17 July 2004

“David Huffaker”:http://www.eyec.com/’s master’s thesis, “Gender Similarities and Differences in Online Identity and Language Use among Teenage Bloggers”:http://cct.georgetown.edu/thesis/DavidHuffaker.pdf, has received some attention from BBC News because of its finding that (surprise, surprise) teens tended to reveal more personal details on blogs than in chatrooms and forums. This chimes immediately with the Daily Mail-reader paranoia about cyber-stalkers…

16 July 2004

My supervisors have been active in the “CRIS”:http://www.crisinfo.org/ (Communication Rights in the Information Society) programme and have called my attention to its work. On this year’s CRIS agenda is:

The CRIS Global Governance Project, sponsored by the Ford Foundation. The project’s aim is to support the emergence at national level of the concept of communication rights … advocacy on governance issues including civil society participation in governance structures… and in various global governance fora.

If you’re an academic interested in the connection between media participation and civil society, take a look and join in!

15 July 2004

As a Canadian (generally big-housian) living in the UK (generally small-housian), I have been struck by the different attitudes toward space in different countries and (I believe) through that towards possessions. “Donella H. Meadows”:http://www.pcdf.org/meadows/ (now deceased) of the “Sustainability Institute”:http://www.sustainer.org/ wrote about a calendar showing the “lifestyles of people around the world”:http://www.menzelphoto.com/gallery/mw.htm including their homes. She notes that on the calendar the number of family members living together varied between 4 and 13 and the houses ranged in size from 200 square feet (a six-person yurt in Mongolia) to 4850 square feet for eight people (in Kuwait City).


Bhutanese family with all their possessions in front of their house from the book “Material World”:http://www.menzelphoto.com/gallery/mw.htm

Our ‘two bedroom’ flat is tiny by Canadian standards but a moderate size for Londoners (something like 600 square feet?). It has become apparent to me that we don’t have room for any more things (whether in storage or in active use) but it doesn’t bother me all that much yet – it helps us to keep our lives simple. Thank goodness I have almost limitless digital storage available at least!

Has anyone tried the experiment where you put a sticker on anything you use in the course of a year and at the end of the year you throw away anything without a sticker?

12 July 2004
Filed under: Search Engines, Software reviews at 8:18 am

I’ve heard for a while that Microsoft plans to produce a single search tool that finds data on your hard disk and on the Internet, but I had always assumed they meant to deliver it in their next operating system (Longhorn) in 2006. Now, according to Yusuf Mehdi, head of Microsoft’s MSN division, it seems this technology will be released within 12 months. Apple also plans to incorporate this kind of search in its OS but, “as with Windows”:http://blog.org/archives/cat_search_engines.html#001061, third-party apps for Mac OS X are “already available”:http://www.wired.com/news/mac/0,2125,64070,00.html to search your hard disk.

11 July 2004

Here’s something truly hair-raising I’m glad I didn’t know about at the time. Remember in all those movies where the nuclear missiles require a top-secret code to launch? It turns out for about a decade in the US the secret code was 00000000. Apparently, ‘Strategic Air Command remained far less concerned about unauthorized launches than about the potential of these safeguards to interfere with the implementation of wartime launch orders.’

10 July 2004

An article from the Chicago Tribune about how a neighbourhood email list helped bring neighbours together. “Keith Hampton”:http://mysocialnetwork.net/’s new research appears to “show the same thing”:http://web.mit.edu/giving/spectrum/spring04/internet-connection.html but with a caveat (not noted in the article about it I just linked to). From what I remember of a presentation he gave a while ago, online-enhanced networking only seems to take place in areas already conducive to neighbour-to-neighbour contact – when, for example, it was tried in an urban apartment block it didn’t take off.

There was a similar earlier article about a virtual community in Orange County I “blogged about”:http://blog.org/archives/cat_virtual_communities.html#000949 but the LA Times’ link no longer works (curse them!).
