Written by: Director Robert Groves
As prior blogs have noted, there are 3 ways to evaluate the quality of a census: 1) process indicators that describe the operations, 2) comparisons with other estimates of population size, and 3) a sample survey matched to the census, often called a “post-enumeration survey.” The first two sets of evaluations suggest that the 2010 Census was a good one. On Tuesday, May 22, we’ll report the results of the statistical estimates of coverage of the 2010 Census based on our post-enumeration survey, labeled “Census Coverage Measurement” or CCM.
For the first time, we will also break the estimates into components of coverage. These include correct and erroneous enumerations in the census, tallies of the number of people for whom all characteristics were imputed (inserted statistically), and estimates of how many people were missed in the census.
We will provide the estimates for the nation; for major demographic groups (by race, by age and sex, and for owners and renters); for important census operations (mailout/mailback, update/leave, and interviewing in person); and for the 50 states, the District of Columbia, and large counties and places.
These results will help us as we plan and conduct research to improve the 2020 Census. However, results from the CCM have limitations. The estimates are subject to sampling error and are vulnerable to violations of their underlying assumptions.
Net Coverage Error and Components of Coverage
The net coverage error of a census is defined as the difference between the true population size and the census count. Although the net coverage error provides valuable information about the census count, it is a single number that summarizes the results of various census operations and actual events. For example, the estimated net coverage error for the 2000 Census was very small (we estimated a net overcount of 0.49%, with a standard error of 0.20%). However, evaluations showed that a large number of erroneous enumerations offset a large number of omissions.
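To make the definition concrete, here is a minimal Python sketch. The function name and the population figures are hypothetical, chosen only for illustration; they are not CCM estimates. The sketch follows the definition above (true population minus census count), so a negative value signals a net overcount, and it assumes the percentage is taken relative to the true population.

```python
def net_coverage_error(true_population, census_count):
    """Return (error, percent), where error = true population - census count.

    Positive values indicate a net undercount, negative values a net overcount.
    The percentage is relative to the true population (an assumption of this sketch).
    """
    error = true_population - census_count
    percent = 100.0 * error / true_population
    return error, percent


# Hypothetical figures, for illustration only (not CCM results).
error, percent = net_coverage_error(true_population=1_000_000,
                                    census_count=1_005_000)
print(error, round(percent, 2))  # a negative result means a net overcount
```

In practice the true population is unknown, so the first argument would itself be an estimate with its own sampling error.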
To address this, in addition to measuring net coverage error, the CCM program separately estimated the components of census coverage. First, we divided the number of census records into estimates of its three components: correct enumerations, erroneous enumerations, and misses filled in by whole-person imputations (people for whom all characteristics were imputed). “Erroneous enumerations” are census records that should not have been included, such as duplicates, fictitious people, people who died before Census Day (April 1, 2010), and people born after Census Day. Second, we split the estimated population into those people deemed to be correct enumerations and those who were missed in the census.
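The two splits described above can be sketched as simple bookkeeping. The class name, field names, and all numbers below are hypothetical and purely illustrative. The sketch also assumes that anyone not counted as a correct enumeration falls under omissions, which is only one possible treatment of whole-person imputations.

```python
from dataclasses import dataclass


@dataclass
class CoverageComponents:
    correct: int    # correct enumerations
    erroneous: int  # erroneous enumerations (duplicates, fictitious people, ...)
    imputed: int    # whole-person imputations
    omissions: int  # people missed in the census

    @property
    def census_count(self):
        # First split: census records = correct + erroneous + imputed.
        return self.correct + self.erroneous + self.imputed

    @property
    def population(self):
        # Second split: estimated population = correct + omissions.
        return self.correct + self.omissions

    @property
    def net_error(self):
        # True population minus census count; negative means net overcount.
        return self.population - self.census_count


# Hypothetical figures, for illustration only (not CCM results).
c = CoverageComponents(correct=950_000, erroneous=30_000,
                       imputed=20_000, omissions=48_000)
print(c.census_count, c.population, c.net_error)
```

With these made-up numbers the net error is only 2,000 people out of roughly a million, even though 30,000 erroneous enumerations and 48,000 omissions lie behind it. This is the kind of offsetting that a single net figure hides.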
Estimating the number of omissions (those missed in the census) is complicated by practical and conceptual issues. As the missing component of the true population size, omissions cannot be collected and analyzed directly, but can be estimated by deduction through dual system estimation. As with correct and erroneous enumerations, defining what should qualify as an omission is difficult. For example, for some housing units captured in the census, the records of the entire household are whole-person imputations, that is, treated as missing data and filled in statistically from another housing unit. Should such records be considered omissions from the census? Valid arguments on each side of the issue render it a difficult one, with no solution that satisfies every intended use of the data. We will present estimates in as transparent a way as possible to facilitate such arguments.
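For readers unfamiliar with dual system estimation, the following Python sketch shows its simplest textbook form (the capture-recapture or Lincoln-Petersen estimator). This is not the CCM production method, which this post does not describe; the function name, argument names, and numbers are hypothetical. It assumes the census and the survey capture people independently and that records can be matched without error.

```python
def dual_system_estimate(census_correct, survey_total, matched):
    """Textbook dual system (Lincoln-Petersen) population estimate.

    census_correct: correct census enumerations in the sample area
    survey_total:   people found by the independent survey in the same area
    matched:        people found by both the census and the survey
    """
    if matched <= 0:
        raise ValueError("at least one matched person is required")
    return census_correct * survey_total / matched


# Hypothetical sample-area figures, for illustration only.
census_correct = 900
population = dual_system_estimate(census_correct, survey_total=500, matched=450)

# Omissions are obtained by deduction: estimated population minus
# the people the census enumerated correctly.
omissions = population - census_correct
print(population, omissions)
```

The intuition is that the survey found 450 of its 500 people in the census, so the census is estimated to have captured 90 percent of the population; dividing the 900 correct enumerations by 0.9 gives the population estimate.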
The coverage estimates from the post-enumeration survey should be used to build a better 2020 Census. By examining which operations or procedures suffered the most erroneous enumerations or omissions, we can construct targeting features for 2020. By knowing how successive coverage improvement operations affect the estimated coverage, we can make better cost-quality tradeoff decisions for 2020.