Amazing repository of population data

Data nerds like me will enjoy wallowing in the Integrated Public Use Microdata Series, a massive collection of U.S. Census microdata that's been made available to anyone for social and economic research. From the website:
IPUMS-USA is a project dedicated to collecting and distributing United States census data. Its goals are to:
  • Collect and preserve data and documentation
  • Harmonize data
  • Disseminate the data absolutely free!
We are cautioned to "use it for GOOD -- never for EVIL." 'Nuff said.

The project is funded by the National Science Foundation, Sun Microsystems, the University of Minnesota and the National Institutes of Health.

Popular posts from this blog

Recommended: a new review "zoo"

"A Tour Through the Visualization Zoo" is a fantastic introduction to some attractive and sophisticated new visualization formats. The article and illos were put together by Stanford's Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky. Heer is an HCI/visualization genius whose journal articles I've been following with interest; Bostock is the whiz behind the D3 archive of JavaScript code for visualization.

Run, don't walk. It's great.

Blast from the past: a 1974 data treatise by Edward Tufte

Back in 1974, Yale poli-sci professor Edward Tufte published a slim volume called Data Analysis for Politics and Policy (Prentice-Hall, $3.95). The book in its entirety is available for free download (PDFs) at Tufte's website, accompanied by a contemporary review from the Journal of the American Statistical Association. More than 30 years later, the review amuses me with its restrained praise of the perspective that would eventually make Tufte a Major Figure (and a minor fortune):
Tufte puts residual plots to good use to gain understanding of a data set, and he shows how finding outliers gives the analyst hints about the inadequacy of a statistical model... The discussion of graphical techniques in general is quite good... A brief but compelling discussion of the "value of data as evidence," with regard to the interpretation of nonrandom samples, is presented.
If you happen to have a spare 48MB lying about, DAPP's worth a download.

[via Sofa Papa]