Weblog on the Internet and public policy, journalism, virtual community, and more from David Brake, a Canadian academic, consultant and journalist
7 March 2014


If you use images online as a journalist, you need to ensure you have the rights to put them on your site legally. Doing a Google image search, clicking on “search tools” and selecting “usage rights” is one way to ensure you can use what you find, but image libraries like Getty Images also contain a huge number of very high-quality images (more than 35 million at last count), including pictures relating to the latest news. That is why Getty can charge for them and puts watermarks over the images you can see for free, so you don’t pirate them. Now, however, tired of trying to fight the many online pirates of its content, Getty seems to have decided to make it easy for people to use its images online for free in controlled ways, with attribution.

They are defining “non-commercial” (and therefore permissible) uses of their images quite broadly, so as long as you use their image-embedding tool you should be able to use their many pictures legitimately on most journalistic projects online (for print use you would still need to purchase them). There is already speculation that the other major picture agencies may do likewise. Here’s how to take advantage of Getty Images’ new embed feature (and its limitations).

Getty’s “front page” for searching embeddable images is here.
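For anyone wondering what you actually get from the embed tool: it hands you a small snippet of HTML (an iframe pointing at Getty’s servers) that you paste into your page. The sketch below, in Python, only illustrates the general shape of such a snippet; the asset ID and the “et”/“sig” tokens are made-up placeholders, and in practice you would always copy the ready-made markup from Getty’s own embed dialog rather than build it yourself.

```python
# Illustrative only: builds something shaped like a Getty Images embed snippet.
# The asset ID and the "et"/"sig" tokens are hypothetical placeholders; the
# real snippet should be copied from Getty's embed dialog, not constructed.

def getty_embed_snippet(asset_id: str, width: int = 594, height: int = 465) -> str:
    """Return an iframe-style embed snippet for a (hypothetical) asset ID."""
    src = f"//embed.gettyimages.com/embed/{asset_id}?et=EXAMPLE_TOKEN&sig=EXAMPLE_SIG"
    return (
        f'<iframe src="{src}" width="{width}" height="{height}" '
        f'frameborder="0" scrolling="no"></iframe>'
    )

if __name__ == "__main__":
    print(getty_embed_snippet("123456789"))
```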

13 February 2014

I love hearing about the latest digital tools that help one operate as a journalist/researcher, whether that be Twitter search and monitoring tools, bookmark-management tools, people-search tools and so on. “Search: Theory and Practice in Journalism Online” by Dick is particularly good at finding and describing this stuff – but I am not aware of any articles that bring the different pieces together to describe all the key online tools a journalist uses and how they fit into a workflow. I plan to come up with something myself to share with students, and if I do I will post it here, but I would love to hear what other people are using.

22 January 2014

I’m as excited as anyone about the potential for organizations and governments to use the ever-increasing amounts of data we’re ‘sharing’ (I prefer the less value-laden ‘giving off’) because of our love of smartphones and the like. So I enjoyed this presentation by Tom Raftery about “mining social media for good”.

(Slideshare ‘deck’ here)

And I am sure his heart is in the right place, but as I read through the transcript of his talk a few of his ‘good’ cases started to seem a little less cheering.

Waze, which was recently bought by Google, is a GPS application, which is great, but it’s a community one as well. So you go in and you join it and you publish where you are, you plot routes.

If there are accidents on route, or if there are police checkpoints on route, or speed cameras, or hazards, you can click to publish those as well.

Hm – avoiding accidents and hazards, sure – but speed cameras are there for a reason, and I can see why giving everyone forewarning of police checkpoints might not be such a hot idea either.

In law enforcement social media is huge, it’s absolutely huge. A lot of the police forces now are actively mining Facebook and Twitter for different things. Like some of them are doing it for gang structures, using people’s social graph to determine gang structures. They also do it for alibis. All my tweets are geo-stamped, or almost all, I turned it off this morning because I was running out of battery, but almost all my tweets are geo-stamped. So that’s a nice alibi for me if I am not doing anything wrong.

But similarly, it’s a way for authorities to know where you were if there is an issue that you might be involved in, or not.

To be fair Tom does note that this is “more of a dodgy use” than the others. And what about this?

A couple of years ago Nestlé got Greenpeace. They were sourcing palm oil for making their confectionery from unsustainable sources, from — Sinar Mas was the name of the company and they were deforesting Indonesia to make the palm oil.

So Greenpeace put up a very effective viral video campaign to highlight this […] Nestlé put in place a Digital Acceleration Team who monitor very closely now mentions of Nestlé online and as a result of that this year, for the first time ever, Nestlé are in the top ten companies in the world in the Reputation Institute’s Repute Track Metric.

Are we talking about a company actually changing its behaviour here, or one using its financial power to drown out dissent?

You should definitely check out this talk and transcript, and if we’re going to have all this data flowing around about us it does seem sensible to use some of it for good ends – there are certainly many worthy ideas outlined in it. But if even a presentation about the good uses of social-media data mining contains material that is alarming, maybe we should be asking more loudly whether the potential harms outweigh these admitted goods.

31 December 2013

Wordle of Sharing Our Lives Online: Risks and Exposure in Social Media

Just as the old year passes, I have finished off the last substantive chapter of my upcoming book. Now all I have to do is:

  • Add a concluding chapter
  • Go through and fill in all the [some more clever stuff here] bits
  • Check the structure and ensure I haven’t repeated myself too often
  • Incorporate comments from my academic colleagues and friends
  • Submit to publisher
  • Incorporate comments from my editor and their reviewers
  • Index everything
  • Deal with inevitable proofing fiddly bits
  • Pace for months while physical printing processes happen… then…
  • I Haz Book!

Doesn’t seem like too much further, does it?

Update Jan 2, 2014 –  I have finished my draft concluding chapter, which ends, “[some form of ringing final summing-up here!]”

10 November 2013

Tracking my paper's readership using academia.edu

Just as we are all finding out how much the government has been tracking our meta-data, a whole ecosystem of public-facing meta-data tracking services is arising, giving us the chance to measure our own activity and track the diffusion of our messages across the web. This is particularly noticeable when looking at Twitter but other social media also increasingly offer sophisticated analytics tools.

Thus it was that as my latest open access paper, “Are We All Online Content Creators Now? Web 2.0 and Digital Divides”, went live two days ago, I found myself not just mentioning it to colleagues but feeling obliged to update multiple profiles and services across the web – Facebook, Twitter, academia.edu, Mendeley and LinkedIn. I found to my surprise (by tracking my announcement tweet using Buffer) that only 1% of the thousands of people I have ‘reached’ so far seem to have checked my abstract. On the other hand, my academia.edu announcement has brought me twice as many readers. More proof that it’s not how many followers you have but what kind that matters most.
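To make that comparison concrete, here is a minimal sketch of the click-through arithmetic these analytics tools invite. The numbers are made up (my own figures are only approximate), but they are of roughly the magnitudes described above:

```python
# A minimal sketch with hypothetical numbers: "reach" alone says little;
# what matters is how many of those reached actually click through.

announcements = {
    # channel: (people notionally reached, clicks on the abstract)
    "twitter_via_buffer": (3000, 30),   # ~1% click-through (illustrative)
    "academia_edu":       (300, 60),    # smaller audience, far higher engagement
}

for channel, (reached, clicks) in announcements.items():
    rate = clicks / reached * 100
    print(f"{channel}: reached {reached}, clicks {clicks}, click-through {rate:.1f}%")
```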

Pleasingly, from Academia.edu I can also see that my paper has already been read in Canada, the US, Guyana, South Africa, the Netherlands, Germany, Poland, and of course the UK.

The biggest surprise? Google can find my paper already on academia.edu but has not yet indexed the original journal page!

I will share more data as I get it if my fellow scholars are interested. Anyone else have any data to share?

23 October 2013

It has long been understood by scientists (but not by enough parents) that the amount children are talked to has a crucial impact on their later educational development, so I was pleased to see the New York Times pick up this story. However, it rather wastes the opportunity because it is so clumsily written – particularly in its handling of statistics.

The first paragraph is confusing and unhelpful: “…by age 3, the children of wealthier professionals have heard words millions more times than those of less educated parents.” Clearly, rich kids don’t hear millions of times more words than poor ones, but that might be what you pick up from a quick scan. Further down the story we read that “because professional parents speak so much more to their children, the children hear 30 million more words by age 3 than children from low-income households” – unfortunately, this is meaningless unless you know how many million words both kinds of children heard overall. The scale of the difference is only hinted at near the end of the piece, when you finally learn (via a different study) that “some of the children, who were 19 months at the time, heard as few as 670 ‘child-directed’ words in one day, compared with others in the group who heard as many as 12,000”.
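To see why the totals matter, here is a rough back-of-envelope calculation using the only concrete figures in the piece (670 versus 12,000 child-directed words per day, from the study of 19-month-olds). The assumption that daily exposure stays roughly constant over the first three years is mine, purely for illustration:

```python
# Back-of-envelope: how daily word exposure scales to an age-3 total.
# The 670 and 12,000 words-per-day figures come from the study cited in the
# NYT piece; treating them as constant over ~3 years is my own simplification.

days_to_age_three = 3 * 365

low_per_day, high_per_day = 670, 12_000

low_total = low_per_day * days_to_age_three      # ~0.7 million words
high_total = high_per_day * days_to_age_three    # ~13 million words

print(f"low-exposure total by age 3:  {low_total:,} words (~{low_total/1e6:.1f}M)")
print(f"high-exposure total by age 3: {high_total:,} words (~{high_total/1e6:.1f}M)")
print(f"gap: {high_total - low_total:,} words")
# The point: a headline figure like "30 million more words" only means
# something once you know the baselines it is being compared against.
```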

Very annoyingly, despite calling the 20-year-old study mentioned in the first paragraph a “landmark”, the piece provides no link to the study on the website, nor any information to help readers find it later. The story mentions that the new findings are based on a “small sample” but doesn’t say how small.

Crucially, while the piece seems to suggest that pre-kindergarten schooling could make up for this gap, it presents no evidence for this. Intuitively, a big push to get parents to talk to their babies and small children would be a much more effective way to tackle this particular problem, since parents spend far more time with their children than any educator could.

Ironically, there was a much better-explained story on the same issue, also from the NYT, back in April – but not, alas, in the print edition.

So Tim, could you take this as a reasonable excuse to bring some important research to the public eye? And Motoko (whose work on the future of reading I have liked a great deal), could you go back to the piece online and tidy it up a bit if you get the chance?

28 March 2013

Like many a tech-savvy parent, I am trying to divert my kid’s gaming attention towards Minecraft – with some success. There’s a ‘legacy’ iBook G4 he can use, but getting the program to run at all was difficult, and now that it is running I have found it runs unusably slowly, even with all the graphical options I could find turned down (and with non-working sound). This to run a game that is deliberately designed to look ‘retro’ and which I imagine could have run on a Mac LC c. 1990 if suitably coded! Since it’s a very popular game with a hyperactive development community, I thought there was bound to be a way to make things work better. Alas, nothing I tried (mainly Magic Launcher launching OptiFine Light) seemed to work, and it took me several hours of forum reading, installation and tweaking just to get this far.

It’s not a new observation, but what makes older machines like my nine-year-old Mac laptop obsolete does not actually seem to be the speed or capability of the underlying hardware so much as the steady ratcheting up of the assumptions that software makes. Somewhere (presumably in Java, which is Minecraft’s runtime environment) I’m guessing there’s a whole load of unnecessary code added over the last nine years which has dragged what should be a perfectly usable game down to a useless speed.

Just to drag this back to academic relevance for a moment, this is to my mind a good example of how the structure of the computer industry aggravates digital divides by gradually demanding users ‘upgrade’ their software to the point that their machines stop working, well before the end of their ‘natural’ lives.

PS If anyone has managed to get Minecraft working adequately on a Mac of similar vintage please share any tips…

25 March 2013
Filed under: e-books, Personal at 10:56 pm

I just bought a (paper) copy of Cannery Row and caught myself thinking “£9 for 148 pages? For a book published in 1945 (which I would prefer to be in the public domain by now)?” And yet why not? It’s a work of well-established literary value, attractively produced, with a 16-page introduction (thanks, Penguin Classics). I’m a fast reader, but even so this will likely give me several hours of reading pleasure – more if I reread it later or lend it to a friend.

Alas, years of immersion in free or cheap digital content (plus free access to academic libraries and exam copies of the texts I think relevant to the courses I run) seem to have undermined my willingness to shell out for content – even though I frequently remind my journalism students that if they won’t pay for content, they can hardly expect others to pay for theirs when they get out into the working world!

Makes me feel like going and shelling out £27.95 for some Hemingway short stories just to balance out my stinginess…

6 March 2013
Filed under: Call for help, journalism, Online media at 11:35 am

Much of the discussion about where the journalism industry is heading suggests that freelancing will increase while staff jobs decline (see for example here and Paulussen 2012), but Felix Salmon at Reuters has just written an interesting piece suggesting that most online content will be written by staff writers rather than freelancers, because online journalism is just too fast and frequent to make sense as a freelance business. His piece was inspired by Nate Thayer, who complained recently about being asked to write for a major US magazine for free (for the exposure). The key paragraphs are here:

The exchange has particular added poignancy because it’s not so many years since the Atlantic offered Thayer $125,000 to write six articles a year for the magazine. How can the Atlantic have fallen so far, so fast — to go from offering Thayer $21,000 per article a few years ago, to offering precisely zero now? The simple answer is just the size of the content hole: the Atlantic magazine only comes out ten times per year, which means it publishes roughly as many articles in one year as the Atlantic’s digital operations publish in a week. When the volume of pieces being published goes up by a factor of 50, the amount paid per piece is going to have to go down.
But there’s something bigger going on at the Atlantic, too. Cohn told me the Atlantic now employs some 50 journalists, just on the digital side of things: that’s more than the Atlantic magazine ever employed, and it’s emblematic of a deep difference between print journalism and digital journalism. In print magazines, the process of reporting and editing and drafting and rewriting and art directing and so on takes months: it’s a major operation. The journalist — the person doing most of the writing — often never even sees the magazine’s offices, where a large amount of work goes into putting the actual product together.
The job putting a website together, by contrast, is much faster and more integrated. Distinctions blur: if you work for theatlantic.com, you’re not going to find yourself in a narrow job like photo editor, or assignment editor, or stylist. Everybody does everything — including writing, and once you start working there, you realize pretty quickly that things go much more easily and much more quickly when pieces are entirely produced in-house than when you outsource the writing part to a freelancer. At a high-velocity shop like Atlantic Digital, freelancers just slow things down — as well as producing all manner of back-end headaches surrounding invoicing and the like.
This is an interesting take on the issue, but I am afraid it paints an over-optimistic picture of the future of “digital journalism”. It should be remembered that The Atlantic is one of the most successful and most digitally focused of American publications. Felix suggests that “it’s much, much easier to get a job paying $60,000 a year working for a website than it is to cobble together $60,000 a year working freelance for a variety of different websites.” I am very sceptical that any but a few of those who work full-time at the profusion of new digital content enterprises or offshoots of existing products will be earning anything like that sum – there’s just too much competition. I would expect many or most “jack of all trades” full-time or near-full-time digital producers to end up on some form of precarious contract, working from home.
Update: Alexis Madrigal, who oversees the Atlantic’s technology channel, has responded to the Thayer affair with a rather gonzo post about their business model and why it leads to ill-paying or unpaid invitations to blog.
I would be most interested in any more solid evidence in this area, whether about the incomes and backgrounds of these new digital journalists or about the casualisation of journalism more generally.

Paulussen, S. (2012). Technology and the Transformation of News Work: Are Labor Conditions in (Online) Journalism Changing? In E. Siapera & A. Veglis (Eds.), The handbook of global online journalism. Chichester: John Wiley

20 December 2012

Given the huge amount of data now available online, I am having great difficulty persuading my journalism students of the value of looking elsewhere (for example, a library). One way to do so, I thought, might be to show them how little of what was written in the pre- and early-web era is currently available online. I don’t have a good source of data to hand on this, so I just put together this graph pulling figures out of my head – can anyone volunteer a better source of data for this? Someone from Google Books, perhaps? [Update – Jerome McDonough came up with a great response, which I have pasted below the graph]

If the question is restated as what percentage of standard, published books, newspapers and journals are available via open access on the web, the answer is pretty straightforward: an extremely small percentage.  Some points you can provide your students:

* The Google Books Project has digitized about 20 million volumes (as of last March); they estimate the total number of books ever published at about 130 million, so obviously the largest comprehensive scanning operation for print has only handled about 15% of the world’s books by their own admission.

* The large majority of what Google has scanned is still in copyright, since the vast majority of books are still in copyright – the 20th century produced a huge amount of new published material.  An analysis of library holdings in WorldCat in 2008 showed that about 18% of library holdings were pre-1923 (and hence in the public domain).  Assuming similar proportions hold for Google, they can only make full view of texts available for around 3.6 million books.  That’s a healthy number of books, but obviously a small fraction of 130 million, and more importantly, you can’t look at most of the 20th century material, which is going to be the stuff of greatest interest to journalists.  You might look at the analysis of Google Books as a research collection by Ed Jones (http://www.academia.edu/196028/Google_Books_as_a_General_Research_Collection) for more discussion of this.  There’s also an interesting discussion of rights issues around the HathiTrust collection that John Price Wilkin did, which you might be interested in: http://www.clir.org/pubs/ruminations/01wilkin [I wonder what the situation is like for Amazon’s quite extensive “Look inside the book” programme?]

As for newspapers, I think if you look at the Library of Congress’s information on the National Digital Newspaper Program at http://chroniclingamerica.loc.gov/about/ you’ll see a somewhat different problem. LC is very averse to anything that might smack of copyright violation, so the vast majority of its efforts are focused on digitization of older, out-of-copyright material.  A journalist trying to do an article on newsworthy events of 1905 in the United States is going to find a lot more online than someone trying to find information about 1993.

Now, the above having been said, a lot of material is available *commercially* that you can’t get through Google Books or library digitization programs trying to stay on the right side of fair use law in the U.S.  If you want to pay for access, you can get at more.  But even making that allowance, I suspect there is more that has never been put into digital format than there is available either for free or for pay on the web at this point.  But I have to admit, trying to get solid numbers on that is a pain.

[Thanks again to Jerome, and thanks to Lois Scheidt for passing my query on around her Library Science friends…]
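For what it’s worth, the arithmetic behind Jerome’s estimates is easy to reproduce. A quick sketch using the figures from his reply (20 million volumes scanned, roughly 130 million books ever published, about 18% of holdings pre-1923):

```python
# Reproducing the rough figures in Jerome's reply.

books_ever_published = 130_000_000   # Google's own estimate of books ever published
google_scanned = 20_000_000          # volumes digitized as of March 2012
share_pre_1923 = 0.18                # WorldCat analysis, 2008: public-domain share

scanned_fraction = google_scanned / books_ever_published
full_view_estimate = google_scanned * share_pre_1923

print(f"share of all books scanned: {scanned_fraction:.0%}")                            # ~15%
print(f"scanned books likely viewable in full: ~{full_view_estimate / 1e6:.1f} million")  # ~3.6 million
```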
