FOI Audit: the curse of the PDF

Governments across Canada are embracing open data like never before, but it’s still a halfhearted commitment.

There are more open data sites than ever, at all levels of government. But the 2015 Newspapers Canada FOI Audit has found a huge discrepancy between the language of data openness and the realities of trying to obtain data through freedom-of-information requests.

There are those who would say we should be patient, that it’s better to work with governments to encourage greater openness than to shout from the rooftops when they fail to perform. I think we need to do both.

The FOI audit, for which I am the project leader, is one of the ways we can keep the pressure on, and once again, there is lots to talk about. In particular, government officials are proving hard to wean off that most data-unfriendly of file formats, the PDF.

I don’t really understand why there is such allegiance to the PDF for data. The format is meant for publishing documents so that they look the same on screen as they would in hard copy. That’s great for anything that might otherwise be released on paper: a budget document, a press release, documents released under FOI. But data, if it’s to be real data, needs to be in a format that can be imported into an analysis application without further conversion. PDFs are full of formatting information and other junk, and extracting data from them requires error-prone conversion. PDFs that are actually embedded images are even worse. They are quite literally pictures of data.
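
To make the contrast concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the department names and dollar figures are invented, and the "PDF" text is only a guess at what a simple table might look like after its text has been pulled out of a PDF. Real PDF output is usually messier than this.

```python
import csv
import io
import re

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = (
    "department,year,amount\n"
    "Parks,2014,1250000\n"
    "Public Works,2014,3400000\n"
)

# The same table as it might look after text extraction from a PDF:
# columns separated only by runs of spaces, thousands separators in the
# numbers, and a page footer mixed in with the rows.
PDF_TEXT = """Department      Year   Amount
Parks           2014   1,250,000
Public Works    2014   3,400,000
Page 1 of 1
"""


def read_csv(text):
    """Machine-readable data: the structure is already there."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["amount"] = int(row["amount"])
    return rows


def read_pdf_text(text):
    """PDF-derived text: the structure has to be guessed back.

    The pattern assumes a name, at least two spaces, a four-digit year,
    then an amount. Any line that does not fit is silently dropped,
    which is exactly where errors creep in.
    """
    pattern = re.compile(r"^(.+?)\s{2,}(\d{4})\s+([\d,]+)$")
    rows = []
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if match is None:
            continue  # header, footer, or a row the guess did not cover
        rows.append({
            "department": match.group(1),
            "year": match.group(2),
            "amount": int(match.group(3).replace(",", "")),
        })
    return rows


csv_rows = read_csv(CSV_TEXT)
pdf_rows = read_pdf_text(PDF_TEXT)
```

Both functions recover the same two rows here, but only because the pattern was written to fit this particular layout. A department name with a double space in it, a row that wraps onto a second line, or a table that continues on another page would all need more guesswork, and a scanned image would need character recognition before any of this could start.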

Public bodies give various reasons for using PDFs for data. They say they have to, because the software they use to process FOI requests produces PDF as its output format. Or they cite internal policy. Or they say they’re afraid a requester would alter the data (really?). Sometimes they give no reason at all.

I think it’s likely that some of the FOI officers believe what they are saying; they’ve been trained to do things a certain way, so they do it. But at other times, it seems to be a deliberate ploy to avoid making machine-readable data available.

If open data is going to have any meaning, as we have said in the audit, it has to be open not only when officials have carefully reviewed and manicured it, but when citizens ask for it. Otherwise, the whole idea of open data is hollow.

We have a particularly long way to go with the federal government, which denied, at least in part, six in ten audit requests for data. In the audit, if data is released in PDF or on paper, it is recorded as denied in part.

You can read all the details of how Canadian municipalities, the provinces and the federal government responded to data requests in the full audit report.