Weblog on the Internet and public policy, journalism, virtual community, and more from David Brake, a Canadian academic, consultant and journalist

Archive for the 'Old media' Category

20 December 2012

Given the huge amount of data now available online, I am having great difficulty persuading my journalism students of the value of looking elsewhere (for example, a library). One way to do so, I thought, might be to show them how little of what was written in the pre- and early-web era is currently available online. I don’t have a good source of data to hand about this, so I just put together this graph pulling figures out of my head – can anyone volunteer a better source of data for this? Someone from Google Books perhaps? [Update – Jerome McDonough came up with a great response, which I have pasted below this graph]

If the question is restated as what percentage of standard, published books, newspapers and journals is available via open access on the web, the answer is pretty straightforward: an extremely small percentage.  Some points you can provide your students:

* The Google Books Project has digitized about 20 million volumes (as of last March); they estimate the total number of books ever published at about 130 million, so obviously the largest comprehensive scanning operation for print has only handled about 15% of the world’s books by their own admission.

* The large majority of what Google has scanned is still in copyright, since the vast majority of books are still in copyright – the 20th century produced a huge amount of new published material.  An analysis of library holdings in WorldCat in 2008 showed that about 18% of library holdings were pre-1923 (and hence in the public domain).  Assuming similar proportions hold for Google, it can only make full-view texts available for around 3.6 million books.  That’s a healthy number of books, but obviously a small fraction of 130 million, and more importantly, you can’t look at most of the 20th-century material, which is going to be the stuff of greatest interest to journalists.  You might look at the analysis of Google Books as a research collection by Ed Jones (http://www.academia.edu/196028/Google_Books_as_a_General_Research_Collection) for more discussion of this.  There’s also an interesting discussion of rights issues around the HathiTrust collection that John Price Wilkin did, which you might be interested in: http://www.clir.org/pubs/ruminations/01wilkin [I wonder what the situation is like for Amazon’s quite extensive “Look inside the book” programme?]
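[To make the back-of-envelope arithmetic above explicit, here is a minimal sketch using only the estimates Jerome quotes – the inputs are his figures, not authoritative data, and the calculation is purely illustrative:]

```python
# Back-of-envelope check of the figures quoted above.
# All inputs are estimates from the post, not authoritative data.
books_ever_published = 130_000_000  # Google's estimate of all books ever published
google_scanned = 20_000_000         # volumes Google Books had digitized (as of March)
public_domain_share = 0.18          # share of 2008 WorldCat holdings that were pre-1923

scanned_share = google_scanned / books_ever_published
full_view_books = google_scanned * public_domain_share

print(f"Share of all books scanned: {scanned_share:.0%}")              # ~15%
print(f"Books Google could show in full view: {full_view_books:,.0f}") # ~3,600,000
```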

As for newspapers, I think if you look at the Library of Congress’s information on the National Digital Newspaper Program at http://chroniclingamerica.loc.gov/about/ you’ll see a somewhat different problem. The LC is very averse to anything that might smack of copyright violation, so the vast majority of its efforts are focused on digitization of older, out-of-copyright material.  A journalist trying to do an article on newsworthy events of 1905 in the United States is going to find a lot more online than someone trying to find information about 1993.

Now, the above having been said, a lot of material is available *commercially* that you can’t get through Google Books or library digitization programs trying to stay on the right side of fair use law in the U.S.  If you want to pay for access, you can get at more.  But even making that allowance, I suspect more has never been put into digital format than is available either for free or for pay on the web at this point.  I have to admit, though, trying to get solid numbers on that is a pain.

[Thanks again to Jerome, and thanks to Lois Scheidt for passing my query on around her Library Science friends…]

5 October 2012
Filed under: journalism, Old media, Online media at 7:39 pm

When I joined New Scientist in 1995 as Net Editor I wondered (and have wondered ever since) why it largely covers the natural sciences and not the social sciences. I assumed this had something to do with the ongoing intellectual and ideological struggle between ‘hard’ sciences and ‘soft’ sciences and the related divide between qualitative and quantitative research. Imagine my surprise when, thanks to the 3rd October podcast of Thinking Allowed, I discovered that the same people who launched New Scientist had also launched New Society (50 years ago yesterday), explicitly as a social-scientific publication.

I just remember New Society – it was merged with the New Statesman in 1988, a year after I arrived here in the UK (more detailed memories can be found on the podcast and in this recollection in the THES). Wouldn’t it be nice if, on the anniversary of New Society’s birth, New Scientist were inspired to broaden its remit and introduce a New Society section? After all, there’s no reason to keep off New Society’s patch now…

22 June 2012
Filed under: journalism, Old media, Online media, research at 12:55 pm

I’m writing a book chapter at the moment about the use of “user-generated content” by journalists from the traditional media, and to justify why I concentrate on the traditional media I thought I’d dig up a statistic or two about how dependent the public remains on traditional media for its news. I went looking for an update of Robert W. McChesney’s “The Titanic Sails On: Why the Internet won’t sink the media giants”, written in 2000, and found his 2011 updated book The Death and Life of American Journalism. On page 17 I found this striking statement: “Harvard’s Alex S Jones estimates that 85% of all professionally reported news originates with daily newspapers and that he has seen credible sources place that figure closer to 95%”. Thinking this sounded like an interesting study, I looked up the source and found Alex Jones’ book Losing the News: The Future of the News That Feeds Democracy. On page 4 he says, “my own estimate is that 85% of professionally reported accountability news comes from newspapers, but I have heard guesses from credible sources that go as high as 95%” (emphases mine). In other words, either Jones has failed to cite his own research or (more probably) McChesney is reporting second- and third-hand guesswork.

This kind of thing really annoys me, particularly when it takes me several minutes to get to the bottom of what turns out to be nothing more than a guess, and particularly when I know that there are a number of studies that discuss the sources of news with a greater degree of rigour. For example, there is How News Happens, which argues that in Baltimore in 2009, 95% of original news stories came from traditional news outlets, particularly newspapers (although its methodology has come under fire), or Paterson’s fascinating 2007 study showing that the leading online news sources (and to a lesser extent newspapers) are heavily dependent on news agency copy.

12 January 2012

I’ve just been listening to a segment on TV and TV on demand on the BBC’s Media Show and it has reminded me just how far outside the mainstream my media consumption practices are. The average British household apparently ‘watches’ four hours of TV a day – a record high figure. This probably includes ambient, sporadically viewed ‘TV on in the corner’, but still, how on earth do they find the time? I probably watch an average of an hour of TV a week. X Factor has been an extraordinary success for ITV – I have never watched it (and probably haven’t watched ITV at all in a year). The channel I view programmes from most is probably (you guessed it) BBC4. Even with the proliferation of DVRs, TV on demand etc., people still watch 88% of their television ‘live’. I watch or listen to almost nothing in that way any more. By far the bulk of my audiovisual media consumption comes in (audio) podcast form – about 1.5 hours a day – because I can do it while doing other things, e.g. cycling to and from work.
It’s really odd to realise just how far outside of the media consumption mainstream I am (and it’s hard for me to imagine myself into the heads of more typical media consumers).

4 March 2011
Filed under: e-books, new readership, Old media at 8:28 pm

I’m unimpressed by Harper Collins’ move to limit to 26 the number of times an e-book bought by a library can be loaned. Most limits on the distribution of content are at least partly justified by the fact that they are designed to prevent new copyright-infringing uses of that content (even if in practice they also limit fair uses of it). This new rule, however, stops libraries from operating in perfectly normal, legitimate ways. One might conceivably argue that some limit could be set to account for the fact that physical books bought by libraries have always had the physical limitation of being lendable only a certain number of times before they deteriorated (that’s why libraries tend to buy more hardbacks). But would a hardback become effectively unreadable after 26 readings, as a Harper Collins e-book now will be? And is that the rationale they are offering libraries?

4 February 2011

1) I started my new job as Senior Lecturer in the Division of Journalism and Communication at the University of Bedfordshire this week and have enjoyed meeting my new colleagues (and collecting my new MacBook Pro).
2) I just met my editor at Palgrave and agreed to write a book (my first full-length academic one) provisionally titled “Sharing Our Lives Online: Risks and Exposure in Social Media” – likely to be delivered in 2013. I plan to blog about it as I write using the “Sharing Our Lives Online” category, so keep an eye on that…
3) On my way back from that meeting I discovered that my wife has also just found a position for when her current one finishes, which, given the turbulent situation in the NHS where she works, is a big relief.

Of course I would be open to receiving further good news, but these three bits of news are certainly enough to be starting with!

15 September 2010
Filed under: Academia, new authorship, Old media, research at 12:59 pm

I missed this when it first came around in April: according to Bowker, which owns ‘Books in Print’, the US publisher that published the most titles – 272,930, very close to the number of new titles published by all traditional publishers combined – is Bibliobazaar, a finding that was then written up by Publishers Weekly. It turns out they specialise in packaging and reselling out-of-print books via print on demand. This is more evidence of the long tail in action, though of course each book probably only gets a few new readers, and it is not clear whether the figures for these new publishers mix new titles with newly reissued titles and with any title from their back catalogues, which makes comparison difficult. It would be interesting to know more about what sells well, and to whom, in that market – if there are discernible patterns – and how this differs from mainstream publishing.

8 September 2010

There has been much concern about people using the internet to select only news and information they already know they are interested in and that agrees with their point of view. I have found that increasingly the “omnivore” blog from bookforum.com has been fulfilling that role for me, bringing me articles every week on the future of books, of journalism or of academia. Unfortunately, I am starting to suffer from punditry fatigue. Read too much on the same subject from newspapers and magazines – even if the subject is important to you – and it all starts to blur together after a while. In truth, it shows up the problems even with good journalism as compared to academic work: there is copious opinion but often little reference, or only selective reference, to new data or even to new arguments or approaches to the issues. Yet I feel I still need to read or at least skim it all in case I miss some new piece of information. Perhaps I would be better off just relying on the stuff that my peers circulate via the blogosphere and twittersphere?

6 September 2010
Filed under: journalism, Old media, Online media at 11:32 am

The NYT just ran a piece on how various high-profile US newsrooms use web traffic figures to inform their judgement about the news. Most seem to claim that low traffic stats don’t cause them to withdraw resources from stories that aren’t getting traffic, but interestingly, buried in there is some evidence from the NYT itself that its blogs don’t have the same status as the paper’s traditional product. According to its executive editor, Bill Keller, “we don’t let metrics dictate our assignments and play because we believe readers come to us for our judgment”, but “Mr. Keller added that the paper would, for example, use the data to determine which blogs to expand, eliminate or tweak.”

13 April 2010

I’ve been listening to the free Librivox audiobook of this for fun, and I was surprised, given that it was written in 1905, at how liberal its politics are – it contains often-sympathetic references to most of the better-known people’s revolts. I was also struck that, although it was aimed at children, it includes in several places explanations of the Greek and Roman derivations of some of the vocabulary.
