Tuesday, August 19, 2008

Random Thought on Data Collection

Government data collection is tremendously important: the limits of government data sometimes dictate the limits of our knowledge. Not surprisingly, though, government data collection is consistently overlooked. Here's a very rough idea for how to improve it: if a group of private citizens is willing to pay for a reasonable addition to the government's data collection, the group should be able to compel the government to make the addition.

Here's how it would work. The Consumer Expenditure Survey (CES) reports the amount consumers spend on "apparel and services," but doesn't break down consumers' spending on apparel any further. Suppose a handful of economists are studying the use of t-shirts, and they really wish the CES broke down consumer spending on apparel into spending on t-shirts vs. pants. If it would cost the Bureau of Labor Statistics an additional $50,000 to collect the data on consumer spending on t-shirts vs. pants, and the economists are willing to pay the BLS for the increased costs, then the economists should be able to compel the BLS to break down spending on apparel into spending on t-shirts vs. pants.

Obviously, the government would have to set limits on the kinds of additions to its data collection practices that it's willing to make. For example, a private group couldn't pay the government to include on the Time Use Survey a question about the respondent's sexual encounters. But if it would be appropriate for the government to make a particular addition completely voluntarily, then private groups should be able to pay the government to make the addition.

A lot of data collection can be done privately, of course. But there are economies of scale in data collection that make it more efficient for the government to collect certain additional data. Moreover, accuracy concerns tilt the scales in favor of having the government collect certain additional data. As long as the government deems the additional data collection appropriate, why shouldn't private groups be able to pay the government to conduct it?

I don't know how you would solve the collective action problem, that is, how to stop one of the economists studying t-shirts from refusing to help pay for the addition to the CES but then using the addition in his research. The idea of granting the paying parties a temporary monopoly on the use of the additional data, as in intellectual property law, makes me very uneasy. Even with the collective action problem, though, I think my plan would improve government data collection at no (monetary) cost to the taxpayer.

This is an incomplete thought, so feel free to offer your input.


ei said...

Can I just point out that the word "privacy" makes no appearance in this post? ;-)

One can make the case that privacy is a solution to a second-best problem. In a world with perfect markets and perfect government, more public information is always better.

But what if government is less than perfect?

Economics of Contempt said...

I thought the inclusion of privacy concerns was implicit in my requirement that the government deem the additional data collection appropriate. It is odd that I never actually used the word "privacy" though.

Ideally, the public's privacy rights would be completely unchanged under the policy. If privacy concerns would prevent the government from collecting the additional data now, then they would also prevent the additional data collection under my policy.

GFW said...

Here's a simple objection. The data we gather can affect the type of questions we ask of the data as much as the other way around. Thus, if you allow well-funded groups to influence (even if only by addition) the content of the dataset, they may be able to influence the framing of the debate that is informed by that dataset.

Sure, your t-shirt example is completely innocuous, but there have got to be other surveys where the data gathered would substantially affect the conclusions drawn.

Taking it further, wouldn't the government have an incentive to drop the "unsponsored" data elements?

Economics of Contempt said...
On your first objection, I agree that some additional data would be used to try to influence the framing of certain debates. But however undesirable that may be, it's not a legitimate reason to withhold the additional data. Basically, we'd be saying to people who want the additional data, "It's entirely appropriate for you to have this information and you're willing to pay the bill, but we don't think you'll use this information wisely, so you can't have it." People misuse existing government-collected data all the time, but the solution isn't to stop collecting the data being misused.

On your second objection, I have no response. You're right, and I can't think of any way to solve that problem. We can't, for example, require government agencies to keep collecting all the data they collect with taxpayer money, because the government is constantly changing what data it collects.

And taking your objection even further, my policy might give the government an incentive to drop so much unsponsored data (or drop such important unsponsored data) that the policy ends up hurting the quality of government data collection.

That might be a deal-breaker. Damn, I thought I had a good idea.


