
Let us now praise data dumps

As someone who deeply appreciates raw data and admires those who wrangle it, I have to give a shoutout to my fellow travelers out there:
  • Big bunches of ripe bananas to the primates over at Infochimps, "a community to assemble and interconnect a giant free almanac, with tables on everything you can put in a table—things like a century of hourly weather, every major league baseball game, decades of stock prices, or every US patent filing." Check out the (still small but interesting) visualization gallery.
  • Swivel, which aims to "make it easy for everyone to collaborate and explore data together — because better informed people make better decisions: in voting booths, in corporate boardrooms and at neighborhood meetings." A worthy undertaking, and the site's getting a lot of press. (Sadly, the graphics it offers are still pretty rudimentary.)
Now that I'm an aspiring Tufte, I'll definitely be keeping an eye out for more of these troves.


Popular posts from this blog

Recommended: a new review "zoo"

"A Tour Through the Visualization Zoo" is a fantastic introduction to some attractive and sophisticated new visualization formats. The article and illos were put together by Stanford's Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky. Heer is an HCI/visualization genius whose journal articles I've been following with interest; Bostock is the whiz behind the D3 archive of JavaScript code for visualization.

Run, don't walk. It's great.

Blast from the past: a 1974 data treatise by Edward Tufte

Back in 1974, Yale poli-sci professor Edward Tufte published a slim volume called Data Analysis for Politics and Policy (Prentice-Hall, $3.95). The book in its entirety is available for free download (PDFs) at Tufte's website, accompanied by a contemporary review from the Journal of the American Statistical Association. More than 30 years later, the review amuses me with its restrained praise of the perspective that would eventually make Tufte a Major Figure (and a minor fortune):
Tufte puts residual plots to good use to gain understanding of a data set, and he shows how finding outliers gives the analyst hints about the inadequacy of a statistical model... The discussion of graphical techniques in general is quite good... A brief but compelling discussion of the "value of data as evidence," with regard to the interpretation of nonrandom samples, is presented.

If you happen to have a spare 48MB lying about, DAPP's worth a download.

[via Sofa Papa]