Open data and transparency: are we buying a vacuum cleaner?

Consider the following story: When vacuum cleaners were just hitting the wider market in the 1930s, a salesman visited a home in rural Texas. The new vacuum cleaner salesman knocked on the door of the first house on the street.

A woman answered the door. Before she could speak, the enthusiastic salesman barged into the living room and said he would prove his vacuum's worth by cleaning the already spotless living room carpet. The woman refused, saying that she did not think it would work.

The salesman insisted that his vacuum was the best new invention for household chores. To prove his point, he took out a bag of dust and dirt and poured it on her floor. The woman grew visibly upset because she had spent hours cleaning the carpet and would now have to clean up the mess.

The salesman reassured her: “Madam, if I cannot clean this up with this powerful new vacuum cleaner, I will do it for you.” With that promise, the woman relented.

The salesman then pulled out the electric cord and began looking for an outlet, only to become visibly upset himself. At that point, the woman explained: “There’s no electricity in the house…”

Is the movement to open data and linked data like the vacuum cleaner that still needs electricity?

For most public organisations, the data sets and the infrastructure may not yet exist. The key question, then, is who is developing the electricity? Where is the electricity?

Some public organisations, the pathfinders, may already have the electricity so they can use the vacuum cleaner. However, I would argue that for most public organisations the challenge is to create the electricity needed to run the vacuum cleaner.

I believe in the open data philosophy, which is closely related to the principles behind the Freedom of Information Act, but I wonder if we are putting the vacuum cleaner before the electricity.

What do I mean? First, public sector organisations may not have data sets that are valuable or reusable. I recall that one of the early data sets put on the open data site related to the annual shopping-trolley-in-the-river survey. By itself, this was interesting, but it was neither developed nor presented in a way that would allow it to be linked or exploited beyond mere interest. This is not a criticism of the council that developed the list; rather, it is to point out that more is needed than raw data.

Another early example of an open data list was the list of GPs in an area. This information is already known and accessible in so many ways that a public sector organisation does not need to offer it as its example of open data. We could get into a situation where we are, to paraphrase Chairman Mao, letting a thousand flowers bloom without any thought as to how we are going to harvest them.

Second, there is a public sector cultural issue to be considered. Open data advocates need to consider the hidden challenge of culture. How does one convince a public sector organisation that is facing reduced funding that it has to spend time organising its data sets, undertaking the data quality controls, and then publishing them? In other words, what is needed is maximum compliance, but the minimum may be all that is achievable unless organisations can see the immediate cashable savings. To put it directly, promises do not pay the bills.

Third, the main source of the data sets is central government rather than local government, which suggests a fundamental disconnect between the vision and the reality. This is like opera. In London, there are many opera houses. The farther you go from London, the fewer opera houses you will find. People are still interested in opera, but they do not have the same infrastructure or market to sustain it. Central government has more resources to obtain data, process it, and present it. The previous government spent a lot of time and money creating an infrastructure for that purpose, with the goal of developing common standards across local government and improving service delivery for the public. If local government is to do the same, the results may not be consistent or coherent across the sector. What this reflects is that organisations, because of their local circumstances, are going to do things differently and get different results.

In some cases, the data is not the issue; the issue is having the tools (the applications) to use it for analytical purposes or wider value development. For example, the census data under the Output Area Classification has been available to everyone for a long time. Yet only a few large companies have made extensive use of it. The emerging issue is not accessibility but analysis. How many councils or even eager individuals have the time or the resources to create applications that may unlock something (still undefined) from the data?

There still appears to be a growing gap between what is being envisioned and what is being delivered. The data may have some interest, but is it useful? Is it being used to its full effect? Are we seeing increased transparency as a result?

Organisations may be publishing more data and the public may even be looking at it differently, but is it leading to changed behaviours? Are those changed behaviours perverse outcomes, or are they leading to more efficiency? Take, for example, the apocryphal story that someone was going to avoid disclosing a large payment under the £500 spends agenda by breaking it down into £499 chunks. I am not sure that type of perverse outcome emerged, but I do wonder about the other side. Are we seeing a change in behaviour?

I think we are, but I would like to see more research into this development because the promise and the reality may be two different things. If more efficiency were being achieved, it would be good to see organisations showing how that has happened and connecting the promise to the performance.

I think we are developing the electricity, but it will be some time before we see the vacuum cleaner, at least the one beyond the beta version.


