Making all-you-can-eat data tastier

Dr Eric Crampton
The National Business Review
23 June, 2017

In Damon Knight’s classic science-fiction short story, helpful aliens provide Earth with unlimited energy and plenty of food. The aliens even have a manual titled, “How to Serve Man.” It all seems too good to be true – until a translator works out the book’s first paragraph and sees it is not your traditional alien butler’s training guide.

Sir Peter Gluckman’s discussion paper released this week, “Using evidence to inform social policy: The role of citizen-based analytics,” also provided a bit of a surprise.

I expected to read about open data and the citizens who might analyse it. Statistics New Zealand has been working toward more open data. Opening data so that ordinary citizens can not just access it but also use and build on it has incredible potential.

But the chief science adviser’s discussion paper was not about citizen-analysts doing science on open data. The title had led me to expect the wrong thing.

The translator in Knight’s story who figured out that “How to Serve Man” was the aliens’ cookbook had a far worse surprise, but this one still jarred. Citizens were the objects of the analysis rather than the ones undertaking it: citizen-based analysis meant analysing data gathered over a citizen’s lifetime.

Sir Peter’s essay covers important ground. The government’s investment approach leans heavily on available data linked through Statistics New Zealand’s Integrated Data Infrastructure. That data, properly analysed, can help the government find out whether social programmes are achieving their desired goals. The government can then prioritise expenditure on programmes that work.

Better policy outcomes
A stronger evidence base makes it easier for better policy to be good politics – or at least non-suicidal politics. And it can help encourage a shift from spending-based political discourse, where the party that cares the most promises to spend the most, to a focus on outcomes.

All of that requires public confidence in the underlying statistics, in the analysis and in the security of data.

While Statistics New Zealand does publish a great deal of official data, the published summary tables are rather rigid. If you want to look at the data from an angle that differs from what was in the published tables, things quickly become difficult.

One solution would allow citizens access to anonymised samples of the data. Users could then explore the data as suits their needs.

The US makes anonymised samples of its main surveys easily and widely available. Anyone in the world with a web browser can download small representative samples of the Census, even for states as tiny as Wyoming – as well as other government data.

Investment in official statistics is then highly leveraged: other countries’ governments, including New Zealand’s, pay academics to analyse American data. The University of Minnesota maintains one archive of that data, along with a superb interface for accessing and analysing it. The university’s website also lists about 7000 academic papers citing its archive.

Better approach
Statistics New Zealand’s Confidentialised Unit Record Files (CURFs) are held behind guarded walls, application procedures and burdensome restrictions on real-world use. Because use of the data is then difficult, many New Zealand academics instead turn to more easily available American data.

A better approach would allow open access to those anonymised datasets but penalise improper use of them. The risk in broadening access to the CURFs is low, but penalising misuse would help to mitigate any remaining issues.

Sir Peter suggests making it a criminal offence to re-identify individuals in anonymised datasets. The suggestion follows similar recommendations by the New Zealand Privacy Commissioner and policy moves by Australia.

Criminalising such misuse of anonymised data could increase public confidence in the security of government-collected data – and go a long way toward enabling citizen-based analysis of citizen-based data.

When analysis is restricted to a relatively small cadre of government-recognised experts, the amount we can learn from official data is limited by that choke-point: the relatively small number of analysts with access to the data. But when data is opened, academics and citizen-analysts around the country and around the world can chip in to help.

More open access
Government Statistician Liz MacPherson and I participated in an open-data panel discussion hosted by Koordinates earlier this month. When asked about more open access to the CURFs, the Government Statistician encouragingly replied, “Watch this space.”

And quietly, in the background, Statistics New Zealand has been thinking about better ways of enabling citizen access to citizen-based data.

A host of innovative techniques are available for ensuring the confidentiality of personal data while still allowing analysis. It may take some time to get there, as well as some back-end system upgrades to handle it all, but the potential is large.

The kind of citizen-based analysis recommended by Sir Peter should be something more than policy analysts tucked away in secure data labs, poring over citizens’ data. Opening that data in meaningful, secure ways enables more citizens to do their own analysis.

When more analysts help to crunch the numbers, we have a better chance of finding out which policies work to serve man, and which work to serve man.
