Weblog on the Internet and public policy, journalism, virtual community, and more from David Brake, a Canadian academic, consultant and journalist

Archive for the 'Search Engines' Category

21 May 2004

Regular readers will know (archive item 1, “archive item 2”:https://blog.org/archives/001061.html) that I am keen to find search tools for the files on my own hard disk (and email). So far I have been dissatisfied but it seems Google is about to enter this market if you believe the rumours about Project Puffin. “According to the New York Times”:http://news.com.com/2100-1011_3-5215707.html Google’s desktop search software has been in use within the company for about a year.

Don’t expect Googling your hard disk to be as effective as Googling the web, though. Google’s web searching relies heavily on the ubiquitous cross-linking between web pages to indicate the importance of one page over another for a given search, and most people’s hard disks don’t contain that kind of handy cross-referencing.
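To make the point concrete, here is a minimal sketch of link-based ranking in Python. It is a simplified, textbook-style PageRank iteration, not Google's actual algorithm, and the page names, file names, damping factor and iteration count are all assumptions chosen for illustration. Pages that attract more links end up with higher scores, while a set of files with no links between them all score the same.

```python
def rank_by_links(links, damping=0.85, iterations=50):
    """Score pages by incoming links (simplified PageRank-style iteration).

    links maps each page name to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * score[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * score[page] / n
                for p in pages:
                    new[p] += share
        score = new
    return score


# Hypothetical miniature "web": pages link to one another.
web = {"home": ["about", "news"], "about": ["home"], "news": ["home", "about"]}
# Hypothetical hard disk: files rarely link to each other.
disk = {"report.doc": [], "notes.txt": [], "budget.xls": []}

print(rank_by_links(web))
print(rank_by_links(disk))
```

With no links to count, every file on the imaginary disk gets an identical score, so a desktop search tool has to fall back on other signals such as word matches or file dates.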

Microsoft is also looking at this kind of thing of course but I’m not sure I want to wait for the next version of MS’s operating system and upgrade to it in order to take advantage of their new search features.

5 May 2004

I can’t improve on the Berkman Center’s blog entry:

An international team of researchers has launched a new program to map censorship of the Internet.  The Open Net Initiative — a partnership of the Berkman Center, the University of Cambridge, and the University of Toronto — has formally begun tracking international filtering of the Internet.  As the Berkman Center’s Jonathan Zittrain explains, “The aim of the ONI is to excavate, analyze, and report censorship and surveillance practices in a rigorous, ongoing fashion.”  Read more about the project in this News Release.

14 April 2004
Filed under: Search Engines at 2:26 pm

Before they redesigned, when you did a search using “Google”:http://www.google.com/ and some of the results were also found in Google Directory (which is really a re-skinning of the “Open Directory”:http://dmoz.org/ project), you would see a link straight to the relevant category page alongside the search results. The recent redesign seems to have removed this feature. So if you want to see what categories a search of yours fits into, you have to search Google Directory separately, and there is no way to change this in the preferences. Can I get my old Google back please?

4 April 2004

Disappointingly, the top entry if you search for “jew” in Google is an awful anti-semitic site. Fortunately, a weblog campaign has emerged and they are encouraging people to link “jew”:http://en.wikipedia.org/wiki/Jew to the relevant Wikipedia entry. Please do likewise – if enough people do this, we can drive the anti-semitic site to number two. It’s a pity the Wikipedia entry, informative as it is, does not contain links to material explicitly challenging the lies peddled on ‘Jew Watch’ but I’m sure there is something around one could link to. I had a quick look at the “Anti-Defamation League”:http://www.adl.org site but didn’t find anything there and I have a dim recollection that they are themselves ideologically dubious anyway.

Thanks to “Crooked Timber”:http://www.crookedtimber.org/archives/001631.html for the link.

24 March 2004

I continue to look for a good cheap way of searching my local hard disk as easily as I search the web. Jeremy Wagstaff has just produced a handy master list of hard disk indexers. I am still toying with all of them. All I want is decent Boolean search and Acrobat support. DTSearch has this but it also has a crappy interface and costs too much for consumer use.

80-20 doesn’t integrate with non-Outlook email (I use Eudora) – indeed, if you don’t use Outlook it really really doesn’t want to install at all. X1’s price seems to have gone up from free to $50 to $100 and it doesn’t offer Boolean search. The latest entry, “HotBot Desktop”:http://www.infotoday.com/newsbreaks/nb040322-1.shtml, doesn’t offer Boolean search either, though they say they are using DTSearch’s technology, which should have been able to provide this function. I’ll still be taking a good look at it though.
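For anyone unsure what I mean by Boolean search, here is a rough Python sketch of the idea: build an index of which words appear in which documents, then combine word lists with AND, OR and NOT. It is only an illustration under my own assumptions; the file names and contents are made up, it handles plain text only (no Acrobat files), and it says nothing about how DTSearch or any of the other products work internally.

```python
import re


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = {}
    for name, text in documents.items():
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index.setdefault(word, set()).add(name)
    return index


def search(index, all_docs, must=(), any_of=(), must_not=()):
    """Return documents matching: all of must AND any of any_of AND none of must_not."""
    results = set(all_docs)
    for word in must:
        # AND: keep only documents that contain this word.
        results &= index.get(word.lower(), set())
    if any_of:
        # OR: keep documents that contain at least one of these words.
        either = set()
        for word in any_of:
            either |= index.get(word.lower(), set())
        results &= either
    for word in must_not:
        # NOT: drop documents that contain this word.
        results -= index.get(word.lower(), set())
    return results


# Hypothetical files standing in for a hard disk.
docs = {
    "budget.txt": "Quarterly budget for the Google project",
    "memo.txt": "Memo about the search project deadline",
    "letter.txt": "Letter to the bank about the budget",
}
index = build_index(docs)

# "project AND NOT google"
print(search(index, docs, must=["project"], must_not=["google"]))
# "budget AND (google OR bank)"
print(sorted(search(index, docs, must=["budget"], any_of=["google", "bank"])))
```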

26 February 2004

Edward Felten posted defending the way Google search results are delivered, suggesting that the ‘votes of web authors’ are a fair way to determine website prominence in search. This touched off an interesting argument. As one poster pointed out, “think about how you will feel when a search on evolution brings up creationist sites explaining why evolution is wrong and evil. That’s a widespread view in the U.S., currently under-represented online but that may well change as the net penetrates more deeply into society.”

I was also struck by the comment by “Armature”:http://abstractfactory.blogspot.com/ in which he points out, “One interesting side effect of Vivisimo’s clustering of search results is that Vivisimo’s less vulnerable to tyranny of the majority than Googlocracy.”

2 February 2004

Most of the way down an article in the New York Times, The Coming Search Wars (MS vs Google), comes an interesting revelation:

“an ambitious secret effort known as Project Ocean, according to a person involved with the operation. With the cooperation of Stanford University, Google now plans to digitize the entire collection of the vast Stanford Library published before 1923, which is no longer limited by copyright restrictions. The project could add millions of digitized books that would be available exclusively via Google.”

It’s just a pity the number of years we have to wait to get ahold of copyright material keeps lengthening…

29 January 2004

I have never understood why, with all the advances there have been in browsers, neither Mozilla nor IE has developed a proper database for managing bookmarks. If you have more than a few dozen bookmarks it becomes increasingly difficult to keep track of them all. The “Keeping Found Things Found”:http://kftf.ischool.washington.edu/ project has discovered that people can’t find sites they visited earlier, but it seems to have developed a needlessly baroque way to deal with this problem. I have been using “Powermarks”:http://www.kaylon.com/ for several years and now have more than 5,000 bookmarks in a simple database which lets me get to any of them almost instantly. It may be the most useful software I have ever bought…

8 January 2004

It has been noted before that search engines’ algorithms don’t magically provide the ‘best’ results for any query – they only provide the best matches using a given algorithm, and that algorithm can be biased. The latest issue of “First Monday”:http://firstmonday.org/ – an excellent e-journal – includes a detailed examination of one key aspect. Dr “Susan L Gerhart”:http://pr.erau.edu/~gerharts/ has attempted to determine whether the problems with such algorithms tend to conceal controversies. Her results (from a small-scale study) don’t seem to show consistent failures, but she nonetheless suggests that search engines may indeed suppress controversy. She adduces some interesting arguments for why this might be the case, alongside recommendations for search engine programmers on how to produce more representative results.

31 December 2003

More evidence (if more were needed) that search engines like Google have a certain amount of unaccountable power. A satirical site that (among many other things) passed on instructions on how to make a search for ‘miserable failure’ come back with a George Bush page found that “it had been banned from using Google to advertise”:http://www.blather.net/shitegeist/000169.htm. It turns out you can’t place ads using Google for a site criticising an individual unless the site is clearly labelled “satire”. Of course the site still turns up in Google searches…

It’s possible that it wasn’t so much the anti-Bush sentiment that annoyed Google’s ad staff as the incitement to ‘game’ Google.
