Weblog on the Internet and public policy, journalism, virtual community, and more from David Brake, a Canadian academic, consultant and journalist

Archive for the 'Search Engines' Category

30 July 2003

ChefMoz is a clever idea but a little under-cooked at present. Looking at the London section it has 172 restaurants listed and categorised (out of c. 10,000 available restaurants) and just 24 reviews linked – the Paris entry has 226 entries and 31 reviews. The search engine is pretty limited in its ability to use the categories that have been input. Nonetheless, it is an idea that deserves to go far and I hope it gets developed a little more. If you want to know where to eat in, say, Afghanistan (where conventional restaurant guides may fail to cover you) dmoz may have the answer one day – right now it just has one review.

The main existing London restaurant guides I used to rely on online – Zagats, the Evening Standard and Time Out – all now charge to use them.

Thanks to Danny O’Brien’s Oblomovka for the link

3 July 2003
Filed under: Censorship, Search Engines at 9:08 pm

Ben Edelman at the excellent Berkman Center for the Internet and Society has done a quick and dirty Empirical Analysis of Google SafeSearch which indicates (not surprisingly) that using “Safe Search” to prevent unwanted porn links coming up on your kids’ searches also accidentally (I have to assume) hides pages by the US Congress, NASA’s shuttle programme and numerous entries from Grolier Encyclopedia. It also lets through “numerous sites with sexually-explicit content in response to searches that unambiguously seek such materials, even as the majority of sexually-explicit content does seem to be blocked.”

As Edelman points out, if you use SafeSearch you will never know what was blocked or even how much was blocked so you can’t judge how much is missing. There is also no formal mechanism for warning organizations they have been blocked and no appeals process if they have been improperly blocked.

Yet more evidence (if more were needed) for my concern that search engines have a lot of tacit and even unintended power without a great deal of scrutiny.

2 July 2003
Filed under: Search Engines at 9:24 pm

At least this report from Veritest seems to suggest so. Mind you it was commissioned by Inktomi. Yahoo bought Inktomi recently so if they switch back to Inktomi for their search engine (as opposed to directory) results it could provide Google with serious competition.

5 June 2003

Researchers have found that good information doesn’t always drown out bad in recommender systems. In fact, the research done by the creators of movielens shows if you give (for example) a movie a higher rating than it “deserves” other people will also be inclined to give it a high rating. So “innocent” people will unconsciously “play along” with people trying to influence the system and reinforce their dirty work. Unfortunately for the creators of recommender systems, users will notice when overall a recommender system’s results are poor. The writers of the academic paper (available in full here) suggest one way to avoid this problem would be to hide the rating of a film from users who want to rate it themselves so they aren’t influenced by others’ ratings.

I didn’t find collaborative filtering useful when I did use it, but that was nearly ten years ago when the MIT Media Lab was playing about with what became Firefly. Perhaps if my DVD Recorder was smarter and networked with other such recorders to compare my TV/film preferences with others’ without my needing to enter the details by hand it would have enough data to be able to adequately predict my viewing tastes. Personally I suspect mine are atypical enough that it would be difficult to predict what I would like mathematically. Then again, most people probably think they are unique in this respect!

18 May 2003
Filed under: Search Engines, Weblogs at 10:38 am

Tom Coates suggests that because weblogs tend to link more to other sites that have useful information or views on a subject, one can get to “100% information saturation on any given subject in the blogosphere without reading anywhere near 100% of the weblogs in it”.

But while he qualifies the statement later on to “100% of the information available in the blogosphere” there remains an unspoken assumption that because there are so many weblogs and sites out there, there will be thoughtful posts on any given subject. But how many – say – liberal arts professors have weblogs or even websites? How many trade union leaders or indeed politicians? How many people living in the developing world?

Moreover the idea that more incoming links > a more informative opinion is flawed. People often link to things they disagree with or think are stupid(*), and sites that start being the most popular have a substantial advantage in likelihood of being linked to again simply because more people visit them and have a chance to see a view. If an issue like “will Venice sink into the sea” turns up in the news, and a weblogger who has hitherto laboured in obscurity happens to be the world expert on the subject, what is the likelihood that enough people will find her to make her weblog rise to visibility through the haze of links that are popular just because the posters are?

This is also the problem with search engines like Google that weight pages by number of inbound links – the results appear good, but lots of stuff that might be better remains obscure because it doesn’t yet have lots of incoming links. In fact the more useful the links you do find are the more dangerous it is because you may fail to realise the extent of what’s missing.
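The "popular because already popular" effect described above can be illustrated with a toy simulation. This is only a sketch under my own assumptions, not a model of how Google or any real weblog index works: the function names, the number of sites and the "+1" starting weight are all invented for illustration. Each new link either picks a site uniformly at random, or picks one with probability proportional to the links it already has.

```python
import random


def simulate_links(num_sites=50, num_new_links=5000, preferential=True, seed=0):
    """Hand out links one at a time and return inbound-link counts per site.

    With preferential=True a site is chosen with probability proportional to
    (links already received + 1); otherwise every site is equally likely.
    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    inbound = [0] * num_sites
    for _ in range(num_new_links):
        if preferential:
            # The +1 gives sites with no links yet a small chance of being found.
            weights = [count + 1 for count in inbound]
        else:
            weights = [1] * num_sites
        target = rng.choices(range(num_sites), weights=weights, k=1)[0]
        inbound[target] += 1
    return inbound


def top_share(inbound, top_n=5):
    """Fraction of all links held by the top_n most-linked sites."""
    ranked = sorted(inbound, reverse=True)
    return sum(ranked[:top_n]) / sum(ranked)


if __name__ == "__main__":
    rich_get_richer = simulate_links(preferential=True)
    even_handed = simulate_links(preferential=False)
    print("top 5 share, preferential linking:", round(top_share(rich_get_richer), 3))
    print("top 5 share, uniform linking:     ", round(top_share(even_handed), 3))
```

Every site in this toy model is equally "good", so any concentration of links among the top few is produced purely by the head start that early links confer, which is the worry about the obscure Venice expert.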

(*) Some people (can’t remember who offhand, though) have the clever idea of trying to put metadata into the HTML of links saying useful stuff like “I think this link contains useful/useless information” or “I agree/disagree with this link”. They want to come up with something that could be accepted as a standard. Of course the “installed base” of links out there is colossal so even if successful it would take years for this innovation to have much of an impact.
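To make the idea concrete, here is a rough sketch of how a crawler might tally links by stated opinion rather than just counting them. The `data-opinion` attribute is purely hypothetical, made up for this example (I am not describing any actual proposal or standard), as are the class and function names.

```python
from html.parser import HTMLParser


class OpinionLinkCounter(HTMLParser):
    """Tallies links per URL, split by the linker's stated opinion.

    The `data-opinion` attribute is hypothetical, invented for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.tally = {}  # url -> {"agree": n, "disagree": n, "unstated": n}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        url = attributes.get("href")
        if url is None:
            return
        opinion = attributes.get("data-opinion", "unstated")
        # Anything other than the two recognised values counts as unstated,
        # which is what the existing "installed base" of links would be.
        if opinion not in ("agree", "disagree"):
            opinion = "unstated"
        counts = self.tally.setdefault(
            url, {"agree": 0, "disagree": 0, "unstated": 0}
        )
        counts[opinion] += 1


def tally_opinions(html_text):
    """Parse a chunk of HTML and return the per-URL opinion tally."""
    parser = OpinionLinkCounter()
    parser.feed(html_text)
    parser.close()
    return parser.tally
```

A ranking scheme could then discount the "disagree" links instead of treating every inbound link as a vote in favour, though plain links with no opinion attached would still dominate for a long time.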

27 February 2003
Filed under: Search Engines at 10:12 pm

I was just referred to this article at searchenginewatch which points out that almost no search engine uses the “keyword” metatag any more because it has been so widely misused in the past.

It’s sad (but perhaps inevitable) that a concept that was designed to help build the Semantic Web (a web where computers will understand better what is on web pages and their relationship to each other) has been rendered useless by commercial misuse.

P.S. Talking about commercial poisoning of search engines, Overture has bought the FAST web search engine – once a promising competitor to Altavista and the rest. Now like Altavista it has been purchased by a company that gets its money from providing paid-for web links to search engines (and about which I am rather suspicious).

21 February 2003

Brian’s Buzz – it searches 12 external sources of Windows information plus two of his own.

Thanks to Follow Me Here for the tip.

20 February 2003

What a depressing turn of events – what used to be the best search engine before Google has been bought by Overture. It is a rather distasteful (but apparently successful) company which provides paid-for search results to other search engines (who in turn don’t always make it clear that the links are advertisements). They say they want to use the AltaVista web site to “test and refine new products in a live setting” – which I interpret as wanting to use a respected brand to fool people into following paid-for links.

P.S. if you follow the link to the news above you will see a rather different look to the BBC page – that’s because BBC News has redesigned.

Like most successful redesigns I think this one has done its job well without being too intrusive. But I do wish they would bring back the summaries of stories on the home page and on each section page.

16 February 2003

As Dan Gillmor points out this step means Blogging Goes Big-Time. Google does appear to have an unerring nose for buying up companies and organizations doing cool stuff. It’s just a little worrying that one company might end up controlling large chunks of both web consumption (through search) and web creation (through blogger). Still, it’s hard to argue with something that will give self-publishing a big boost, and Google has mostly used its power responsibly. I have some concerns about their privacy policies though (see this and earlier posts of mine in the same category, and this – admittedly a little paranoid – overview).

One might ask “what bad things could realistically emerge from the Google/Blogger merger anyway?” Well, you may remember last month the Chinese authorities shut access to sites hosted by blogspot.com. I believe that has been resolved already but now Google owns Blogger and there is some evidence that Google is willing to “do business” with China’s censors. See this Wired interview.

I have recently written a review of the academic literature about search engines which had some further Google-related comment.

Other comments have been made by Ben and Mena Trott (who created the software this weblog runs on), Neil Macintosh @ The Guardian, Azeem Azhar and Cory @ BoingBoing.

[Later] There’s also coverage from Slashdot and the BBC.

1 February 2003
Filed under: Search Engines, Useful web resources at 5:27 pm

Googlert “performs regular Google searches on your behalf and sends an email alert containing any new results that appear”. Now you can keep an eye on what’s new on the web that interests you (or simply what is new about you!) without lifting a finger.

Thanks to boingboing and megnut for the link
