Open Data Special @ PICNIC ’10

Frank Kresin, Karen van der Molen and Tom Demeyer, all of Waag Society, Amsterdam, hosted a whole day of discussion on Open Data at the 5th PICNIC media festival last Friday, 24 September. The day consisted of presentations from a panel of five experts in the morning and an extremely well-attended, bar-camp-like lab session in the afternoon. The aim of Waag’s involvement in Open Data, as is their hallmark, is to build a bridge between policy-driven top-down approaches and grass-roots bottom-up ones. For top-down read: the overdue implementation of the EU PSI directive, dating from 2003, and in the case of the Netherlands the principle that everything government publishes is public unless there are specific reasons for it not to be.

Rufus Pollock (Director of the Open Knowledge Foundation and the Mead Fellow in Economics at Emmanuel College, University of Cambridge), a great mixture of activist and economist, opened the panel by citing his own demand that ‘we want the data raw and we want it now’ (http://rufuspollock.org/misc/). Open data, according to Pollock, means the freedom for anyone, commercial or noncommercial, military or local community, based in rural India or a European capital, to use, re-use and distribute any data, whether geographical, statistical, economic, legal, electoral or financial, as long as it is non-personal. He portrayed open data as a platform on which anybody could build their own services and businesses. That way it could create a ‘read/write society’ to replace the read-only version we have now, in which the authorities dictate what people should understand from government data. Instead, anybody should be allowed to read all the details and write their own evaluation, since ‘the best thing to do with your data will be thought of by someone else’ (Pollock).

How would data evolve if all data were open? With that question in mind, Julian Tait of Future Everything started to work on the Open Data Cities initiative in Greater Manchester (http://2010.futureeverything.org/lab/opendata/odi). One obstacle was that Greater Manchester itself consists of 10 local boroughs, which means 10 local authorities with 10 different agendas. Authorities were also fearful: fearful of getting sued by government agencies (!), fearful because they did not know what data they held, and particularly fearful of public opinion and the press. So he devised a kind of guerrilla strategy, bringing together data users from the community and data managers from local authorities. By breaking down the ‘them versus us’ mentality they created narratives and prepared arguments that helped local authority executives to see themselves as ‘cool’ when opening up their data. Opening up data would also help them to reduce the costs of freedom of information requests, an estimated four to five million pounds for Greater Manchester. Still, there remain problems to be solved, such as who is going to host the massive amount of data that some projects generate. And there are unanswered questions: What if the efficiency gain of open data results in job losses in local government? Who is empowered by open data: is it the already empowered, or do initiatives live up to their responsibility to enable all?

Jarmo Eskelinen (Forum Virium, Helsinki) told the story so far of Helsinki Region Infoshare (http://www.hel.fi/wps/portal/Helsinki_en/Artikkeli?WCM_GLOBAL_CONTEXT=/Helsinki/en/news/helsinki+region+infoshare+opens+up+public+databases) along the ‘3 M’ of Open Data: mandate, mindset and method. He briefly established that government has a mandate to open data, stipulated in the EU Public Sector Information (PSI) Directive (2003; http://ec.europa.eu/information_society/policy/psi/actions_eu/policy_actions/index_en.htm), notably with the intention of stimulating a market of 27 billion euros per year, the estimated business value of public sector information created in the EU each year. Again, Eskelinen finds barriers to the implementation of that mandate in the mindset of government officials, which could briefly be summarised as ‘say yes, do no’. It is the familiar fears that hinder opening up data, but also the belief that whoever collected the data knows best what to do with it. And then there is a third barrier: the method. Helsinki chose to establish a central Infoshare, a place that would collect and harmonise publicly held datasets (without copying them to a new repository, that is) and distribute them to end users through a portal, to be launched in early 2011. To make this happen, Eskelinen adopted a three-way strategy. One, find a local champion who goes through the whole process and can then tell peers: ‘I’ve done it, and it is so good’. Two, convince internal data holders that they actually have the same goals with their data as external users have: access, collaboration, creation of new services, participation, commitment, savings, improving data quality, and feedback. Three, create quick wins, as Helsinki did with the contest ‘apps4finland: doing good with open data’ (http://www.verkkodemokratia.fi/apps4finland), which resulted in some 40 submissions, with winners to be announced on 7 October.

It was Ton Zijlstra who reminded government officials who pride themselves on opening up data that this simply ‘is the law, stupid’, a remark he kindly attributed to an activist’s online comment on such perceived achievements (http://www.slideshare.net/TonZijlstra/picnic-open-data-session). Ton declared himself an optimistic radical, despite the fact that ‘open government data’, according to him, is an oversimplification as a notion. Open is currently interpreted in the most restrictive ways, such as having to pay a EUR 6000 annual fee to access ‘open’ data. Government is a conglomerate of various agencies with very diverse areas of responsibility and influence, both geographically and hierarchically. And data ‘is not a simple one thing’ either, according to Zijlstra. The same holds for the potential users of government data: ‘The field of application is extremely varied.’ While it might be the hackers and grass-roots communities that are most vocal in requesting open data, and commercial IT service providers who are most eager to secure their share of an information distribution and enhancement market, it might be government bodies themselves who could benefit most quickly and most substantially from open data. Imagine fire fighters who need information on the current state of road works to plan their route, and all sorts of details on the site of the emergency they are attending, such as hazardous material they could encounter on site, hidden obstacles under water, etc. This information is held in government repositories, and it would currently take the fire chief several weeks to obtain it. ‘Don’t forget that government could reuse their own data’ was one of Zijlstra’s important messages. The other message was that we all need to make use of (open) data and keep sharing our stories, as in the example of ‘Ambtenaar 2.0’ (http://www.ambtenaar20.nl/), a Netherlands-based social network for anybody working on bringing government to level 2.0.
‘Keep it simple, do it together and reach out to every stakeholder’ was Zijlstra’s message to governments.

Open government data in France comes with a sophisticated pricing scheme, as Daniel Kaplan (Fing, Paris) demonstrated in his presentation (http://www.slideshare.net/slidesharefing/open-public-data-future-scenarios). Central government data is sold through an agency called APIE (Agence du Patrimoine Immatériel de l’Etat, the State’s Intangible Heritage Agency, https://www.apiefrance.com/, note the .com domain), which also manages the respective licences. On a local level, however, there are a number of initiatives that come closer to the dominant story of open government data as being freely available in machine-readable form to everybody. Yet is this scenario really the only one? It might be the situation in France that triggered this question for Kaplan. So he went on to explore what the consequences of opening up government data could be. And indeed the dominant story is not the only scenario. Another scenario would be that the provision of open government data is carried out by privatised organisations, leading to better services for those who are able to pay more, an unequal distribution of power among those who are able to interpret data, data research driven by funders’ agendas, and eventually citizen frustration and a loss of trust and participation. Such a scenario might be just as likely as the dominant story. So Kaplan highlighted a number of tensions, the most important being the role of government as facilitator or service provider, empowerment of the empowered or the disenfranchised, creating opportunities for the public or for corporations, privatisation of services versus data openness, and the role of government as primary data source or primary data user.
Time did not allow Kaplan to discuss the possible levers, particularly the ones that rely on a political choice, such as the promotion of a ‘culture of data’, specific actions against a ‘data divide’, the extension of open data to some private data, data crowd sourcing, hard thinking on ‘transparency’ PSI and specific public regulatory and proactive roles on PSI.

The session closed with recommendations of the speakers to the Dutch government—how to go about implementing Open Government Data—a video of which is available here: http://www.youtube.com/watch?v=c0MgSbgpe_E or here: http://www.vimeo.com/15318389 (and because I was too slow to realise what was happening, the video is missing Ton Zijlstra’s statement, apologies).

In the afternoon, the bar camp discussed a wide range of topics around open data. Upstream aspects included ‘how do I start’, ‘how to mobilise those who have data to unlock it’, sources other than government, and the provenance and quality of (open) data. Downstream aspects included creating a community of smart data users, energising collaboration among government, education, business and citizens, the role of citizens, and ‘how to make young people aware of open data’. Ecosystem aspects included ‘what is the ecosystem?’, agile data to get rid of heavyweight semantic definitions, and the search for striking examples. Space and time don’t allow reporting all the results (which are available at http://opendatawiki.net), so here is a selection of some of the most striking contributions.

Under the header ‘agile data’ participants advocated to ‘just throw anything out there. Zero structure.’ This should enable quick re-use by grass-roots initiatives. However, participants also recognised that some kind of minimal structure would be needed. And since examples like Wikipedia or OpenStreetMap (OSM) were cited, it is quite obvious that this minimal structure can be quite rigid, like Wikipedia’s editorial policies and their enforcement or the structures of geodata in the case of OSM. The argument seemed to run along the lines of the folksonomy-taxonomy discussion; for more details see e.g. Speller, Edith (2007): Collaborative tagging, folksonomies, distributed classification or ethnoclassification: a literature review (http://www.librarystudentjournal.org/index.php/lsj/article/viewArticle/45/58) and Reamy, Tom (2009): Folksonomy folktales (http://www.kmworld.com/Articles/Editorial/Feature/Folksonomy-folktales-56210.aspx). Yet the argument articulates the urge of users not to wait until some committee has agreed on a standard for publishing data before data is actually made available. And it asks governing bodies to watch and promote grass-roots structural annotation.

At least two other discussions highlighted the issue of crowd sourcing. On a more general level, the ecosystem discussion group concluded, ‘that the Open Data ecosystem needs a “healing” process. Here we mean a way to have colliding and colluding datasets refine and enhance the quality and value of data. Example: By combining datasets they are validated (or invalidated). The quality of resulting services will be a driver to improve underlying datasets.’ However, the discussion on provenance and data quality noted: ‘Self cleaning/corrective processes of crowds—how do beliefs/value systems/biases influence—can’t assume the “crowd” will eventually converge on the “truth”’.
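
The idea of datasets validating (or invalidating) each other can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two road-works datasets, their record keys and field names are invented and do not come from any of the projects discussed at the session. The sketch only shows the mechanics of comparing records that two sources hold about the same thing.

```python
def cross_validate(dataset_a, dataset_b, fields):
    """Compare records sharing a key and sort the keys into three groups.

    Hypothetical helper: 'confirmed' means both sources agree on every
    compared field, 'conflicting' lists the fields that differ, and
    'unmatched' holds keys that appear in only one source.
    """
    report = {"confirmed": [], "conflicting": [], "unmatched": []}
    for key in sorted(set(dataset_a) | set(dataset_b)):
        if key not in dataset_a or key not in dataset_b:
            report["unmatched"].append(key)
            continue
        record_a, record_b = dataset_a[key], dataset_b[key]
        differing = [f for f in fields if record_a.get(f) != record_b.get(f)]
        if differing:
            report["conflicting"].append((key, differing))
        else:
            report["confirmed"].append(key)
    return report


# Invented sample data: road works as published by two different agencies.
roadworks_city = {
    "RW-1": {"street": "Main Street", "closed": True},
    "RW-2": {"street": "Harbour Road", "closed": False},
    "RW-3": {"street": "Station Lane", "closed": True},
}
roadworks_transit = {
    "RW-1": {"street": "Main Street", "closed": True},
    "RW-2": {"street": "Harbour Road", "closed": True},
    "RW-4": {"street": "Mill Lane", "closed": False},
}

report = cross_validate(roadworks_city, roadworks_transit, ["street", "closed"])
print(report)
```

Note that agreement between two sources only shows that they are consistent, not that they are correct, which is the caveat the provenance discussion raised about crowds and ‘truth’.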

Under ‘striking examples’, finally, at least two are worth mentioning. One is the Helsinki travel planner at http://www.reittiopas.fi, which generates multimodal advice on how to get from A to B and includes a carbon footprint calculator. It is said to be the most popular website in Helsinki and is probably the best candidate for a ‘striking example’. Another case was how allowing members of the public to take a close look at budget spending in Canada uncovered a large charity fraud, which saved the budget C$3.2 billion (http://eaves.ca/2010/04/14/case-study-open-data-and-the-public-purse/).

The Open Data Special @ PICNIC’10 was a day packed with information, questions and discussions on open government data (and beyond). It attracted a knowledgeable and interested crowd; some were invited to share their experience, others came along to bootstrap their own initiatives or to find out how to hook into this developing ecosystem. Most prominently absent were the real hackers … but then PICNIC is not the place where these people tend to hang out.