The open data delusion is a phenomenon in which we believe that increasing the amount of data published or accessible will increase the public understanding of the issues.
What the financial crisis shows is that the data was out in the marketplace for those with the knowledge, insight, and resources to look and analyse. Unless the firms had their own information, which they had created themselves, the information was in the market.* As such, putting the firms under FOIA, at least for market information, would not change the amount of information already available. In some ways, it would only confuse and not clarify because it would represent the financial institution’s understanding of the events or the external market.**
The problem with open data is that it is a second-order issue. The true problem is more complex. At one level, it is to understand and use the data. Data by itself is meaningless; it has to be used and applied to hold meaning. Therein lies the problem: most people do not know how to extract the meaning from the information, so they are reliant upon those who do. Even an “application” cannot give that to the person. Understanding, though, comes from education and that, in turn, requires a focus on critical thinking skills. The danger of the open data delusion is that it creates the belief that open data is the solution. The reality is that open data is only the first step. Moreover, depending on the issue to which open data is being applied, it may need further refinement. However, that only transfers the issue from the question of access to the data to the question of who is processing or organising the data. In some cases, the open data’s meaning is already clear, such as crime statistics showing the location, date and time of the incident. Other data, though, still requires work, and that means the public can still be kept from understanding the data.
The second level of the issue is whether the public can see the intent or search out the data to make the necessary link to the issues they face. A mother may fear the crime problems in her neighbourhood. She knows there is crime, so she does not need a list of dates, times, and locations. Instead, she needs to know if it is going up or down, and if it is changing from burglary to drug dealing. Moreover, if her concern is about gangs, will the crime data tell her what she needs to know about the existence of gangs and how they work within the area? Can open data answer that for her? Even if it could, would she know to use that information and apply it to her situation?
Instead, the open data debate is more about access for communities and sub-communities within society, based upon the belief or argument that these communities will use that data to the benefit of society. Leaving aside the market logic arguments, in that they will want to make money from the data so that they will create applications for the public, we still have an interface between society and the information. The relationship is no different from that between the public and the press when dealing with the reporting of news or stories. In one sense, the issue is only devolved. Instead of the News of the World mediating the stories and pronouncements from the government, it will be a cohort of bloggers or analysts mediating it. Is this problematic? Not necessarily, but the issue does exist and it makes the public dependent upon sources that are less accountable, in some ways, and potentially less reliable, in much the same way that people rely upon the popular media to understand a government initiative.
Are the public simply accepting what the media are saying without considering whether it is being “spun” by the media or the government? Are they taking a critical view and working to understand the underlying principles within the argument or issue so that they can develop their own understanding? The first path is easier and it reflects the cost of accumulating the knowledge needed to be a discerning citizen or reader. Yet the second path is what is needed to be a citizen. Otherwise, the citizen becomes dependent upon the sources of the information and is unable to think or act for themselves. One might object that this argument suggests that every citizen has to be an economist, a political scientist, and a statistician to take part. On the contrary, it makes a more modest proposal: it asks only that the citizen be educated enough to be discerning in their judgements. Democracy’s soul is at stake. It would appear that the open data movement, while well intentioned, carries the seeds of democracy’s destruction. As the public lose their ability to discern political messages and acts and need greater mediation, the more they become dependent on those mediating forces and, ultimately, on the government.
The citizen has to be able to find meaning within the data. The meaning allows a citizen to turn information into knowledge so that they can develop informed political opinions. In doing so, they become engaged in the political process. Without informed opinions, the citizen is left unable to engage in the political realm. Therefore, the open data movement will only succeed if it can help create meaning for citizens. We need to encourage and develop discerning opinions about the political realm and thereby reinvigorate the public space. For Americans, it is the choice between continuing as we are and have been, or seeking to renew the public space.
*The point here is less about an efficient market theory, which is not a strong or sustainable theory, but rather that the market was acting and deciding in ways that could show, to those looking for it, what was happening. The follow-up question, though, is who would be thinking of such questions or even looking at such data for the malignant possibilities within it. The public were not aware of the emerging bubble or even interested in it, because they did not understand its effect. The issue is not predicting the outcome but rather identifying the emerging trend. The underlying question for this post is whether open data would have changed that issue.
** For the sheer scale of the issue, the reader is urged to look at pages 30-35 of the first volume of the filing. (http://lehmanreport.jenner.com/VOLUME%201.pdf) Then again, how many people will have read the report, or even the executive summary to understand its issues? Perhaps that raises the question of how much knowledge is needed, at all, to decide on an issue. Lehman Brothers had over 2600 software systems and applications operating when it went bankrupt. The amount of data on its systems was estimated at three petabytes or, roughly, 350 billion pages.