A Study In Absurdity

Nate Silver takes economist Veronique de Rugy of George Mason University to task for some shoddy data work:
The study (pdf) ... claims that congressional districts which elected a Democrat to the Congress received a larger amount of stimulus funds by a margin which is statistically significant even after controlling for certain other effects like the unemployment rate. However, the study does not control for at least one other variable that is overwhelmingly important in determining the dispensation of stimulus funds.

The variable in question is in fact pretty obvious if you simply look at the districts that have received the largest amount of stimulus money, according to de Rugy's dataset.

The district that received the largest amount of stimulus funding in the 4th Quarter of 2009, according to de Rugy's tally, is California's 5th Congressional District. Is there anything notable about the 5th Congressional? Well, it is home to the state capital, Sacramento. Let's keep that in mind.

Next on the list is New York's 21st Congressional District. The largest city in the 21st is the state capital of New York, Albany.

Third is the 21st Congressional District of Texas. It contains parts of Texas' state capital, the wonderful city of Austin.


That de Rugy has testified before Congress on the basis of her evidence, and never paused to consider why the top five congressional districts on her list overlap with Sacramento, Albany, Austin, Tallahassee and Harrisburg, is mind-boggling. The presence of a state capital is the overwhelmingly dominant factor in predicting the dispensation of stimulus funds. This could have been discerned in literally five minutes if she had bothered to look at the apparent outliers in her dataset and considered whether they had anything in common -- a practice that should be among the first things that any researcher does when evaluating any dataset.
Read the whole thing. It's worth it.
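
For the curious, here is roughly what that five-minute outlier check might look like. This is only a sketch on made-up numbers, not de Rugy's dataset or Silver's analysis: the district labels, party tags, capital flags, dollar figures, and the helper names `top_recipients` and `party_gap` are all hypothetical. It sorts districts by funding, asks whether the top recipients share a trait, and then compares the party gap with and without that group.

```python
from statistics import mean

# Entirely made-up toy rows, NOT real stimulus data:
# (district label, party, contains state capital?, funds in $ millions)
districts = [
    ("A-1", "D", True, 900.0),
    ("B-2", "D", True, 850.0),
    ("C-3", "R", True, 800.0),
    ("D-4", "D", False, 60.0),
    ("E-5", "R", False, 55.0),
    ("F-6", "D", False, 50.0),
    ("G-7", "R", False, 65.0),
    ("H-8", "R", False, 58.0),
]


def top_recipients(rows, n=3):
    """Return the n rows with the largest funding, biggest first."""
    return sorted(rows, key=lambda row: row[3], reverse=True)[:n]


def party_gap(rows):
    """Mean funds for 'D' districts minus mean funds for 'R' districts."""
    dem = mean(row[3] for row in rows if row[1] == "D")
    rep = mean(row[3] for row in rows if row[1] == "R")
    return dem - rep


# Step 1: look at the apparent outliers and ask what they have in common.
top = top_recipients(districts)
all_top_are_capitals = all(row[2] for row in top)

# Step 2: compare the party gap on all districts with the gap on
# districts that do not contain a state capital.
raw_gap = party_gap(districts)
non_capital_gap = party_gap([row for row in districts if not row[2]])

print("Top recipients:", [row[0] for row in top])
print("All contain a state capital:", all_top_are_capitals)
print("Party gap, all districts:", round(raw_gap, 1))
print("Party gap, non-capital districts:", round(non_capital_gap, 1))
```

In this toy data the large raw gap between parties shrinks to almost nothing once the capital districts are set aside, which is the kind of pattern an omitted variable produces. A real analysis would add a state-capital indicator to the regression rather than simply dropping rows.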