Updates on the Internet and its social and public policy implications, useful websites, political/cultural musings and more from a UK-based academic, internet consultant and journalist

Archive for the 'Interesting facts' Category

20 December 2012

Given the huge amount of data now available online, I am having great difficulty persuading my journalism students of the value of looking elsewhere (for example, in a library). One way to do so, I thought, might be to show them how little of what was written in the pre-web and early web era is currently available online. I don’t have a good source of data to hand about this, so I just put together this graph, pulling figures out of my head. Can anyone volunteer a better source of data for this? Someone from Google Books perhaps? [Update - Jerome McDonough came up with a great response which I have pasted below this graph]

If the question is restated as what percentage of standard, published books, newspapers and journals are available via open access on the web, the answer is pretty straightforward: an extremely small percentage. Some points you can provide your students:

* The Google Books Project has digitized about 20 million volumes (as of last March); they estimate the total number of books ever published at about 130 million, so obviously the largest comprehensive scanning operation for print has only handled about 15% of the world’s books by their own admission.

* The large majority of what Google has scanned is still in copyright, since the vast majority of books are still in copyright: the 20th century produced a huge amount of new published material. An analysis of library holdings in WorldCat in 2008 showed that about 18% of library holdings were pre-1923 (and hence in the public domain). Assuming similar proportions hold for Google, they can only make full view of texts available for around 3.6 million books. That’s a healthy number of books, but obviously a small fraction of 130 million, and more importantly, you can’t look at most of the 20th century material, which is going to be the stuff of greatest interest to journalists. You might look at the analysis of Google Books as a research collection by Ed Jones (http://www.academia.edu/196028/Google_Books_as_a_General_Research_Collection) for more discussion of this. There’s also an interesting discussion by John Price Wilkin of rights issues around the HathiTrust collection that you might be interested in: http://www.clir.org/pubs/ruminations/01wilkin [I wonder what the situation is like for Amazon's quite extensive "Look inside the book" programme?]
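For anyone who wants to check or update the back-of-envelope figures in the two points above, here is a minimal Python sketch. It only uses the numbers quoted in the text (20 million volumes scanned, 130 million books ever published, 18% pre-1923); the function name and parameter names are made up for illustration, and applying the WorldCat proportion to Google's scans is the same assumption the text makes, not an established fact.

```python
def estimate_full_view(scanned, total_published, public_domain_share):
    """Rough estimate of how many scanned books can be shown in full view.

    Assumes (as the text does) that the public-domain share observed in
    WorldCat holdings also applies to the scanned collection.
    """
    scanned_share = scanned / total_published        # share of all books scanned
    full_view = scanned * public_domain_share        # scanned books out of copyright
    full_view_share = full_view / total_published    # share of all books in full view
    return scanned_share, full_view, full_view_share


# Figures quoted in the text above
scanned_share, full_view, full_view_share = estimate_full_view(
    scanned=20_000_000,
    total_published=130_000_000,
    public_domain_share=0.18,
)

print(f"Scanned so far: {scanned_share:.1%} of all books")
print(f"Full view possible for about {full_view:,.0f} books")
print(f"That is {full_view_share:.1%} of all books ever published")
```

On these figures, full-view access covers under 3% of everything ever published, which is the point worth making to students. Swapping in newer counts only requires changing the three arguments.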

As for newspapers, I think if you look at the Library of Congress’s information on the National Digital Newspaper Program at http://chroniclingamerica.loc.gov/about/ you’ll see a somewhat different problem. LC is very averse to anything that might smack of copyright violation, so the vast majority of its efforts are focused on digitization of older, out-of-copyright material.  A journalist trying to do an article on news-worthy events of 1905 in the United States is going to find a lot more online than someone trying to find information about 1993.

Now, the above having been said, a lot of material is available *commercially* that you can’t get through Google Books or library digitization programs trying to stay on the right side of fair use law in the U.S.  If you want to pay for access, you can get at more.  But even making that allowance, I suspect there is more that has never been put into digital format than there is available either for free or for pay on the web at this point.  But I have to admit, trying to get solid numbers on that is a pain.

[Thanks again to Jerome, and thanks to Lois Scheidt for passing my query on around her Library Science friends...]

12 October 2011
Filed under: Academia, Interesting facts at 2:17 pm

I’ve been looking at Amabile, T. (1996) Creativity in Context : Update to the Social Psychology of Creativity, Westview Press, Boulder, Colorado ; Oxford. In it I learned Dean K. Simonton tried to find out the effects of stress on creativity by, among other things, correlating the creativity of Beethoven, Mozart and other composers with the intensity of the wars affecting their countries at the time. I also just learned that “it is said that Schiller kept rotting apples in his desk drawer because the aroma helped him concentrate on writing poetry… Dr Johnson required a purring cat, and orange peel, and plenty of tea to drink.” I much prefer Johnson’s prescription to Schiller’s!

22 June 2011
Filed under: Interesting facts, journalism at 5:37 pm

While doing  a little research into the state of the journalism industry globally (with a little help from “The Changing Business of Journalism and its Implications for Democracy“), I came across the following striking figures:

Between 2004 and 2008 newspaper circulation increased 16.4% in South America, 16.1% in Asia, and 14.2% in Africa according to a report by the World Association of Newspapers. afaqs!, an Indian media, advertising and marketing organisation, said print media readership in India rose from 232 million in 2000 to 302 million in 2007. The 2010 China Media Industry Report estimated the total value of the country’s media industries in 2009 was 490bn yuan (£47bn), up from 211bn in 2004.

Of course, journalism faces well-publicised challenges from the internet and from the greying of its consumers in the developed world, but across much of the developing world burgeoning middle classes, democratisation in many countries and an array of new communication technologies are contributing to major growth in the size of media industries. Not all of this, by any means, will get fed back into the kind of journalism publics need around the world, but some at least should…

16 June 2011

The LSE recently hosted Abhijit Banerjee and Esther Duflo, who delivered a talk (MP3) about their new book Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty. I had already read about some of their interesting findings: that a small incentive to attend would encourage a big increase in immunisation among poor people (and reduce the cost per immunisation), and that even hungry people, when given more money, tend to spend it on better tasting food rather than more nutritious food. I didn’t expect them also to comment on academic streaming and on electronic voting, but in both cases they had interesting things to say from the developing world perspective.

Whatever the problems with electronic voting (and many have been identified), there is some evidence that, because electronic voting machines are more user-friendly for the less literate, in Brazil they apparently helped to increase successful voting by the poor and thus changed the political complexion in their favour.

As for academic streaming, a common argument against it is that less capable students, if they are all lumped together, are not inspired by the example of more able pupils, and that the able pupils tend to get neglected because teachers have to concentrate on teaching to the lowest common denominator. This may be the case in some educational systems, but one study they highlighted found that less able students benefited significantly from streaming because, they suggest, teachers in India tend to concentrate on helping the most able students.

Although their work has been criticised in some quarters for neglecting the macro-level systemic and political problems that cause difficulties for the poor, this seems to be mere quibbling: it is beyond the scope of even the most able scholars to give a complete picture of how to tackle poverty. Their approach, which concentrates on finding the best solution to a series of common problems of the poor in different contexts using randomised controlled trials, seems to me a refreshing and thought-provoking one, and if you can’t afford the book I recommend you have a look around their extensive website, which includes links to a profusion of relevant studies.

2 December 2010


This podcast interview by Jesse Brown with the creator of Dinosaur Comics and this web interview about the brief but dazzling success of a short story collection, ‘Machine of Death’, are interesting at a number of levels.

Briefly, a group of well-known web comic creators got together and found contributors from among their readers for this short story collection that they would then illustrate. No mainstream publisher would touch it because it didn’t contain material from authors they recognised, so they thought they would self-publish it. And they organized the fan base they had gathered from their web comic activity to buy the book all at once in order to get media attention. It worked and the book hit number 1 for several hours on Amazon US (though as they said it only took “thousands” of sales to do this – it’s now at #1192). A few days later, they released the full text of the book free as a downloadable PDF.

This phenomenon has naturally excited a number of the proponents of “new authorship” models and it is indeed an impressive achievement, but I would add a few cautionary notes to this tale:

Ryan North says he is able to make a ‘comfortable living’ from t-shirt sales driven by his free online comic strip but wouldn’t say how much this amounted to (and his standards of ‘comfortable’ may have been formed by his recent status as an impecunious grad student).

The book benefited from promotion by the fan bases of several well-known web comics authors, was promoted on a number of very prominent sites like boingboing, and falls into the sci-fi/fantasy genre. It may even be a great read (I don’t know yet but I have started downloading the podcast). Taken together this constitutes a nearly ‘perfect storm’ in favour of this book.

The broader question for the future of this model has to be how replicable it is. At the moment this is newsworthy – the economic significance of online-driven publication will be proven when tens of thousands instead of (I’m guessing) a few hundred authors can earn enough in this way to afford to bypass the conventional publishing system.

Of course none of this should take away from the fact that even if this is not the start of an economic revolution for new authors it may well be the start of a cultural revolution enabling many more people to become published authors (even if with a rather different notion of what being ‘published’ means). It is this as much as anything else I intend to explore in my upcoming research.

27 August 2010

I have long known that one of the UN’s key prerequisites for reaching the Millennium Development Goals is that developed countries should donate a paltry 0.7% of their GNP to aid projects (at present nearly all fall well short of this). I just found out (via the Economist) that there’s another, even more ambitious but contrasting, target. It seems that poor old NATO is suffering because most of its member nations are not spending up to the 2% of GDP target it has set for military expenditure. Would it be too much to ask that countries reach the 0.7% aid target first?

13 April 2010

I’ve been listening to the free Librivox audiobook of this for fun and I was surprised given that it was written in 1905 at how liberal its politics are – it contains often sympathetic references to most of the better known people’s revolts. I was also struck that although it was aimed at children it has in several places explanations of the Greek and Roman derivations of some of the vocabulary.

28 July 2009
Filed under: Interesting facts, Personal at 1:43 pm

I was thirsty on a long-distance train recently. I wasn’t planning to have a coke when I got to the buffet car, but to my surprise it was the cheapest drink – 2/3 the cost of the smoothie ‘healthy option’ – cheaper even than tea. I looked at the health notes – 29% of a whole day’s sugar per serving! Then I looked at the fine print – this single 500ml bottle contained two “servings”, so it was nearly 60% of a day’s sugar ration all by itself!!

30 January 2009
Filed under: Interesting facts, Old media at 6:13 pm

I happened to be looking at the Oxford English Dictionary and I discovered that there’s a recent addition to it: Rashomon (n.) – “…resembling or suggestive of the film Rashomon, esp. in being characterized by multiple conflicting or differing … interpretations.”

You know you’ve arrived when your work becomes an OED-recognised description of something…

The American Petroleum Institute back in the 1950s produced a piece of propaganda not just about the importance and benefits of oil but about the importance of competition in the economy – ironic since only 43 years previously Standard Oil was one of the biggest and most ruthless near-monopolies until it was broken up by the US government.
