A National Data Agency?

Over at Eager Eyes, Robert Kosara has a suggestion for the Obama Administration. He points out that making the government's raw data available to the public could enhance governmental transparency* and lead to some new ways of looking at the country's problems:
    The challenge is not only data availability. A lot of data is, in fact, available. The US is the most transparent nation in the world – to an extent that can be frightening to an outsider (think pay data for state employees, property tax data, etc.).

    The challenge is that a lot of data is published in a format that is human-readable, not machine-readable. This might sound like a good thing, but it's not. Machine-readable data can be processed and transformed into any number of human-readable forms; that direction is trivial. Making human-readable data accessible to a machine is much more difficult, error-prone, and expensive.

    What we need is a National Data Agency (NDA). This agency would be tasked with collecting data that all other agencies collect and produce, and making it available in a central place and in electronic, machine-readable form. There could and should be a reasonable data presentation on its website, perhaps even a National Data Dashboard (showing data of interest like debt, spending, jobless rate, etc.). But the bulk of data analysis would be left to third parties: analysts, journalists, citizens (and also aliens like me). Easily available data would make for more insightful reporting, more informed decisions, and endless business opportunities.
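
Kosara's asymmetry is easy to make concrete. Here's a minimal Python sketch (the agencies, figures, and report sentence are purely illustrative, not official numbers): turning a CSV into a readable table takes a few mechanical lines, while recovering the same numbers from prose means hand-writing a pattern that breaks the moment the wording changes.

    # A minimal sketch of the asymmetry, using only Python's standard library.
    # The agencies and dollar figures are hypothetical sample data.
    import csv
    import io
    import re

    # Machine-readable input: spending lines as CSV.
    machine_readable = (
        "agency,year,spending_usd\n"
        "HHS,2008,700500000000\n"
        "DoD,2008,594700000000\n"
    )

    # Machine-readable -> human-readable: parse, then format. Trivial.
    for row in csv.DictReader(io.StringIO(machine_readable)):
        print(f"{row['agency']:>4}  {row['year']}  ${int(row['spending_usd']):>15,}")

    # Human-readable -> machine-readable: guess the layout and hope.
    report = "In FY 2008, HHS spent $700.5 billion and DoD spent $594.7 billion."
    pattern = re.compile(r"(\w+) spent \$([\d.]+) billion")
    data = {name: float(amount) * 1e9 for name, amount in pattern.findall(report)}
    print(data)  # drops records silently if the prose shifts to "nearly $700 billion"

Multiply that second half across every agency's PDF and HTML reports, each with its own layout, and you have the expense Kosara is describing.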
He even has a seal for the NDA already made up. Where do I apply?

* Frankly, I have a feeling we'll all be very upset once we actually see what's going on.
