How many Americans are poor? How big is the gap in economic well-being between different groups? Do government programs help? What holes in the safety net remain?

Our understanding of these central questions about economic well-being in the United States is based on error-ridden data. Government surveys, used for official poverty and income statistics, suffer from a large and growing amount of misreporting of income. Some researchers have turned to administrative data because of the widely recognized problems with surveys. But administrative data sources on their own do not capture the full set of resources available to individuals, and they do not contain the rich demographic information available in surveys that enables a focus on vulnerable groups.

The unfortunate consequence is that our understanding of economic well-being in the United States is biased and incomplete. As a result, policymakers are forced to address poverty and economic disparities in the dark.

Our mission is to build the most accurate dataset on economic well-being ever created for the United States.

We are building the Comprehensive Income Dataset (CID, pronounced the same as “kid”) to address the inaccuracies in our basic understanding of economic well-being in the United States. The CID is based on the fundamental insight that no single data source on its own can provide a full or accurate measure of economic well-being. But when multiple data sources are linked together, the strengths of each can be harnessed to overcome the limitations of the others. We link together several national household surveys, an extensive set of tax records, and numerous federal and state administrative program data on government benefits. We conduct rigorous research and apply cutting-edge statistical techniques to combine these data sources in a way that maximizes the accuracy of our comprehensive income measures.

We are using the Comprehensive Income Dataset to produce a highly accurate understanding of deprivation and economic disparities in the United States.

The CID is transforming our understanding of the most important questions about economic well-being in the United States. The CID enables us to estimate accurate measures of poverty, deep poverty, and extreme poverty in a given year and over multiple decades. We can also estimate income disparities between different groups and the extent of income inequality in the United States. And because our goal is to provide the most complete possible understanding of economic well-being for the entire U.S. population, we are developing new methods and linking additional data sources that allow us to understand populations under-covered or not covered at all by major household surveys, including people experiencing homelessness.

We are creating a new evidence base that will inform the next generation of policies that seek to improve the well-being of the most disadvantaged members of society.

Policymakers require an accurate and complete understanding of the economic well-being of the U.S. population in order to target assistance to those most in need and address disparities. The CID allows us to accurately assess how existing programs are addressing deprivation and where holes remain. It will also make it possible to examine the effects of changes in government policies and to simulate potential new policies. This transformative evidence base will enable policymakers to make highly informed decisions about some of the most pressing public policy questions today.

The Comprehensive Income Dataset will become the preeminent tool for other researchers seeking to understand economic well-being in the United States.

The CID will ultimately be made available to a broader community of researchers, enabling them to study the impact of policies on accurately measured economic well-being in the United States. It will also inform efforts by the Census Bureau and other government agencies to improve official income and poverty statistics. We will facilitate this goal by making public the “blueprints” for the CID. Careful documentation and rigorous empirical evidence form the basis of our decisions about how to combine the data sources. Making this research public will allow other researchers to understand why the CID was constructed the way it was and to make different decisions in their own research.

