Sarah and I are currently working in Berlin, at least until early September, and as luck would have it, the Web of Data Meetup group held its third meetup here, in Fjord's rather lovely German HQ, right by Checkpoint Charlie (making this our only venture into the former West Berlin on the trip so far, if only by metres).
Not having acquainted ourselves with the full schedule, we'd anticipated perhaps two or three presentations and a bit of informal networking. Not a bit of it. Rather, the seven-hour-plus afternoon and evening session was tightly run, intelligently programmed and both enlightening and engaging. While being detailed enough to be meaningful, it somehow remained sufficiently service-/strategy-/product-focussed to be graspable by the non-engineers in the room – about half of us, it turned out.
Billed as Data Journalism, the workshop looked at a range of issues which arise from the uses (and possible misuses) of data and metadata in contemporary journalism – in its print, broadcast and online guises – and asked what fundamental changes are affecting the industry as a result.
Jonathan Gray of the Open Knowledge Foundation set the scene at the afternoon's opening, making clear the degree to which governments and businesses the world over were opening up their records, and pointing to the vast opportunities this opens for journalists and campaigners.
Tom Scott from the BBC talked about the thinking and build process behind the fabulous BBC Wildlife Finder, and how it used broadcast metadata to aggregate video clips around various genres of biology, from animal type to ecozone. He pointed out something which emerged over and over during the afternoon – that smart use of data and tagging allowed the dynamic creation and aggregation of content which, done "by hand", would be financially and logistically untenable. He had one of the great lines of the afternoon, too: "People care about things, not web pages." Quite.
Deutsche Presse-Agentur's Gerd Kamp then took us more directly into the world of journalism, discussing DPA's use of data in everything from scoping stories regionally (hugely important when syndicating stories to local news outlets) to creating dynamically-generated infographics and maps. Gerd made the astute observation that many of the CMSs used by journalists already included metadata tagging functionality but that too few journalists were incentivised to use it.
More from the BBC: Silver Oliver and Jem Rayfield from BBC News talked about the broadcaster's online coverage of the World Cup, taking us through, respectively, the IA and data approaches necessary to disaggregate content from its usual "linear" context and instead present it around rather meaningful, dynamically-created pages (over 800 of them, by the end of the tournament).
The Guardian's information architect Martin Belam gave a funny and thought-provoking overview of the online newspaper's use of data in the creation of stories and campaigns, looking in some detail at the way The Guardian has been able to use crowd-sourcing collaborations with its readers to sift through otherwise unwieldy – and hence pretty much useless – data sets. Martin has blogged about the day, too.
Political scientist Ole Wintermann of The Bertelsmann Foundation's Future Challenges sounded a slightly cautionary note among the enthusiasm. It's all well and good the media – and hence both the public and civil society in general – having access to unprecedented amounts of data, but inferring causal meaning from data, be it about climate or demographics, remained a huge challenge. In a moment somewhat reminiscent of Neville Brody's recent appearance on Newsnight, he was especially scathing about the reliance of journalists and policy makers on infographics to communicate ideas without really examining substance – especially substance in terms of causality. Sarah and I both felt it was one of the most adroit interventions of the afternoon (and I'll have more to say on the matter in a post in the near future).
The presentations ended with the University of British Columbia's Eric Ulken talking about the use of data at his former employer, the LA Times. Not passing up a golden opportunity to foment a slightly stroppy discussion, he talked us through a, let's say, somewhat controversial use of data by the paper: its publication of teacher performances in the city. It certainly raised the room's temperature, and rightly so. Responsibility and caution need to be applied in data journalism as much as in the venerable profession's other guises – and arguably more so.
So a thoroughly enjoyable, stimulating and, yes, rather tiring day. Our thanks to Georgi Kobilarov of linked data specialists Uberblic Labs for organising it all.