The open data delusion: can we find meaning in the data?

The open data delusion is the belief that increasing the amount of data published or made accessible will, by itself, increase the public’s understanding of the issues.

What the financial crisis shows is that the data was out in the marketplace for those with the knowledge, insight, and resources to look at and analyse it. Unless the firms had their own information, which they had created themselves, the information was in the market.* As such, putting the firms under FOIA, at least for market information, would not change the amount of information already available. In some ways, it would confuse rather than clarify, because it would represent only the financial institution’s understanding of events or the external market.**

The problem with open data is that it is a second-order issue. The true problem is more complex. At one level, it is to understand and use the data. Data by itself is meaningless; it has to be used and applied to hold meaning. Therein lies the problem: most people do not know how to extract the meaning from the information, so they are reliant upon those who do. Even an “application” cannot do that for the person. Understanding comes from education, and that, in turn, requires a focus on critical thinking skills. The danger of the open data delusion is that it creates the belief that open data is the solution. The reality is that open data is only the first step. Moreover, depending on the issue to which open data is being applied, it may need further refinement. However, that only shifts the issue from the question of access to the data to the question of who is processing or organising the data. In some cases, the open data’s meaning is already clear, such as crime statistics showing the location, date and time of an incident. Other data, though, still requires work, and that means the public can still be kept from understanding the data.

The second level of the issue is whether the public can see the intent or search out the data to make the necessary link to the issues they face. A mother may fear the crime problems in her neighbourhood. She knows there is crime, so she does not need a list of dates, times, and locations. Instead, she needs to know if it is going up or down, and if it is changing from burglary to drug dealing. Moreover, if her concern is about gangs, will the crime data tell her what she needs to know about the existence of gangs and how they work within the area? Can open data answer that for her? Even if it could, would she know how to use that information and apply it to her situation?


Instead, the open data debate is more about access for communities and sub-communities within society, based upon the belief or argument that these communities will use that data to the benefit of society. Leaving aside the market logic arguments, in which they will want to make money from the data and so will create applications for the public, we still have an interface between society and the information. The relationship is no different from that between the public and the press when dealing with the reporting of news or stories. In one sense, the issue is only devolved. Instead of the News of the World mediating the stories and pronouncements from the government, it will be a cohort of bloggers or analysts mediating them. Is this problematic? Not necessarily, but the issue does exist, and it makes the public dependent upon sources that are less accountable, in some ways, and potentially less reliable, in much the same way that people rely upon the popular media to understand a government initiative.

Are the public simply accepting what the media are saying without considering whether it is being “spun” by the media or the government? Are they taking a critical view and working to understand the underlying principles within the argument or issue so that they can develop their own understanding? The first path is easier, and it reflects the cost of accumulating the knowledge needed to be a discerning citizen or reader. Yet the second path is what is needed to be a citizen. Otherwise, the citizen becomes dependent upon the sources of the information and is unable to think or act for themselves. One might object that this argument suggests every citizen has to be an economist, a political scientist, and a statistician to take part. On the contrary, it makes a more modest proposal: it asks only that the citizen be educated enough to be discerning in their judgements. Democracy’s soul is at stake. It would appear that the open data movement, while well intentioned, carries the seeds of democracy’s destruction. The more the public lose their ability to discern political messages and acts and need greater mediation, the more they become dependent on those mediating forces and, ultimately, on the government.

The citizen has to be able to find meaning within the data. The meaning allows a citizen to turn information into knowledge so that they can develop informed political opinions. In doing so, they become engaged in the political process. Without informed opinions, the citizen is left disengaged and unable to take part in the political realm. Therefore, the open data movement will only succeed if it can help create meaning for citizens. We need to encourage and develop discerning opinions about the political realm and thereby reinvigorate the public space. For Americans, it is the choice between continuing as we are and have been, or seeking to renew the public space.

*The point here is less about an efficient market theory, which is not a strong or sustainable theory, but rather that the market was acting and deciding in ways that could show, to those looking for it, what was happening. The follow-up question, though, is who would be thinking of such questions or even looking at such data for the malignant possibilities within it. The public were not aware of the emerging bubble or even interested in it, because they did not understand its effect. The issue is not predicting the outcome but rather identifying the emerging trend. The underlying question for this post is whether open data would have changed that issue.


** For the sheer scale of the issue, the reader is urged to look at pages 30-35 of the first volume of the filing. Then again, how many people will have read the report, or even the executive summary, to understand its issues? Perhaps that raises the question of how much knowledge is needed, at all, to decide on an issue. Lehman Brothers had over 2,600 software systems and applications operating when it went bankrupt. The amount of data on its systems was estimated at three petabytes or, roughly, 350 billion pages.




About lawrence serewicz

An American living and working in the UK trying to understand the American idea and explain it to others. The views in this blog are my own for better or worse.
This entry was posted in linked data, open data, public sector, republicanism.

4 Responses to The open data delusion: can we find meaning in the data?

  1. Peter Krantz says:

    Interesting discussion but I am not sure about the conclusion. Setting aside effects on innovation and productivity I agree that the benefit of data appears when it is used, understood and acted upon.

    The current situation, where government and traditional media make the selection and choose the way to present it, makes it difficult to validate decisions made based on it. We also miss out on topics that may be too narrow to be interesting for the typically broad readership of a newspaper.

    The main difference with open data is that we now have at least a possibility of having multiple people make the selection of domains and topics to cover. As it becomes cheaper to reach and engage a large group of people (via e.g. social media) chances are that we will see more data being used to engage citizens in the things that affect them.

    Open data may not be the solution for all things in a democracy and there will probably be plenty of unused datasets published with poorly defined semantics making them very hard to use. But it is reasonable to expect that it will be a key factor in holding government accountable in areas where traditional media will fail.

    • lawrence serewicz says:

      I agree about the open data and accountability. The issue though is whether open data is leading to that end. Moreover do we only have open data for government statistics? Truly open data would see large global companies opening up their data to similar apps or development opportunities.

      It remains to be seen, although early results are positive, whether open data benefits the common good.

  2. bmwelby says:

    Thanks for pointing me towards this in response to my own initial thoughts on the role of open data within local government, it’s pretty deep stuff!

    I really like your connecting of engagement with open data into engagement with the state and therefore a greater democratic awareness. For me it’s really important that there is a narrative to go alongside open data from the agency to whom it belongs.

    Publishing data without context or explanation will look like it’s been published without purpose and you could justifiably ask why anybody, apart from a few passionate individuals, would get involved? If we think that open data can be a tool to support democracy then it has to be partnered with opportunities for people to get involved with the rest of democracy, rather than simply as a stick with which to beat it?

    Hope you enjoy the rest of my posts on open data, unusually for me they’re not really going to be talking about the public a great deal…

    • lawrence serewicz says:

      Thanks for the positive response. I am glad you liked it. The open data movement will work best when the public are engaged in it. At the same time, it will require people to “translate” the open data movement into their understanding. It is a fascinating time in local and central government with the opportunities that are emerging. I look forward to future posts.
