The Silver Lining of the 2020 Census

Every census produces not just one data product, but various products that serve different purposes and are delivered to different audiences over a period of months.

The first polished data product the census produces is both its simplest and its most consequential. The “reapportionment data file” is normally supplied toward the end of the census year. This file includes the number of people resident in each state and their household address. Nothing more. But the aggregate numbers are used to allocate seats in the House of Representatives and Electoral College votes in proportion to the size of each state’s population.
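For readers curious about the arithmetic behind “proportionate,” here is a minimal Python sketch of the method of equal proportions (also known as Huntington-Hill), which has been used to apportion House seats since the 1940s. Every state starts with one seat, and each remaining seat goes to the state with the highest priority value, its population divided by the square root of n(n+1), where n is the number of seats it already holds. The function name and the three-state populations are hypothetical and chosen only for illustration.

```python
import heapq
from math import sqrt


def apportion(populations, seats):
    """Allocate `seats` among states by the method of equal proportions."""
    if seats < len(populations):
        raise ValueError("need at least one seat per state")

    # Every state is guaranteed one seat before any priorities are compared.
    allocation = {state: 1 for state in populations}

    # heapq is a min-heap, so priorities are stored negated.
    # A state holding n seats has priority population / sqrt(n * (n + 1)).
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))

    return allocation


# Hypothetical populations, not census figures.
example = {"A": 9_000_000, "B": 5_000_000, "C": 1_000_000}
print(apportion(example, 10))  # {'A': 6, 'B': 3, 'C': 1}
```

Because each seat is awarded by comparing priority values, a modest error in one state’s count can decide who receives the last seat, which is why the accuracy of the state totals matters so much.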

With this responsibility in mind, the Census Bureau defines its accuracy target as counting every resident once (no undercount) and only once (no overcount) at the right address (no location errors).

In a normal census, field operations take three to four months, divided between mailing forms out and getting them back, and follow-up with non-responders by trained enumerators. (There are a few other procedures to count overseas military, foreign service officials, the homeless, etc., though none take very long.)

In any case, these raw census numbers invariably have errors – a lot of them. Mistakes have been a constant feature of census-taking since 1790.

President Washington complained about the lazy, incompetent enumerators who conducted the first census, and about the people who dodged the enumerators for fear that census information could be used against them (perhaps to collect taxes). He instructed Jefferson to estimate the undercount, which boosted the count from just under four million to just over that number – which, reasoned Washington, would signal vitality and strength to the British, and warn them not to even think of reasserting colonial rule.

The census, and its multiple uses, have come a long way since 1790, but never has the initial enumeration been error-free. Never has its first-cut count been released to the public.

What has always followed the raw enumeration phase is a “find and fix the errors” phase, which has steadily – even dramatically – improved across the history of census taking.

For the 2020 reapportionment data, the error correction phase of the process was expected to take about five months. For more complicated data products – redistricting data with race, ethnicity, gender, age, and household composition characteristics presented with block-level granularity – it was expected to take another several months. Some quality checks take a year or two to complete: for example, measuring the undercount (to estimate how many people were missed by enumerators) and the overcount (to eliminate people who were counted twice by mistake).

In 1998, when I first found my way to census headquarters in Suitland, Maryland – as its very naïve and untested Director – my earliest discovery was that hundreds of its employees spent a vast amount of time and effort correcting mistakes in the many surveys underway, just as they would soon be doing for the 2000 decennial census. The staff liked doing this work. In fact, nothing better describes the culture of the Bureau than its relentless attention to errors and the satisfaction felt when it invents new, improved techniques for catching, and correcting, mistakes.

As everyone knows by now, the 2020 census has been abnormal in two important ways.

First, no previous decennial census in American history had to deal with a pandemic. Major field work was ramping up in mid-March 2020, but so was Covid-19. Field work was put on hold for two months. As a result, the census was immediately off schedule.

Then there was the second abnormality: an unprecedented stream of lawsuits and political directives from the White House, generating uncertainty about big issues. Would a citizenship question be added to the census form itself? Would undocumented residents be included in the apportionment count? (The answer is still unclear.) At the same time, the White House was adding political appointees to senior census staff, creating still more confusion and conflict over who was actually calling the shots.

In recent weeks, the most disconcerting development has been the forced rush by the Department of Commerce (the Bureau’s immediate boss) to complete the census process.

The Census Bureau has announced that 99.9% of the country has been counted. That is a great achievement, but also misleading. These raw data have not yet been cleaned of their many errors. In this most basic sense, the census is not over. But the White House has denied the Bureau the time and resources necessary to produce numbers at the customary level of accuracy, which is much higher than that of the initial raw count. The White House wants delivery, irrespective of accuracy, for political reasons.

Consider one of dozens of quality procedures customarily completed before data reports are released. In every census, some people are erroneously counted twice. It takes months to track those errors down, but – given time – the Bureau can do it. This is among the clean-up tasks that occur in the five or so months between field work and the end-of-year release of reapportionment counts.

In late March of 2020, just as the census was getting underway in full force, the pandemic sent thousands of college students home. The proper place to count them was their college town. But some number, probably large, were also counted by their parents because on April 1, census day, that is where they were. The “some number, probably large” has to be a guess unless and until the Bureau is given time to find and fix double-counting. Another example: millions of well-off Americans have a permanent residence and a weekend place, and this spring, because of the pandemic, many of them fled the former for the latter at the height of the census process – another source of potential double counts.
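To give a sense of what finding double counts involves, here is a toy Python sketch. It assumes each response is a record with hypothetical `name`, `birth_date`, and `address` fields, and it flags anyone who appears at more than one address. The field names, the function, and the sample records are all invented for illustration; the Bureau’s actual matching procedures are far more elaborate and are not described here.

```python
from collections import defaultdict


def find_possible_duplicates(records):
    """Return people who appear at more than one address.

    The result maps (normalized name, birth date) to a sorted list of the
    addresses at which that person was reported.
    """
    addresses_by_person = defaultdict(set)
    for rec in records:
        # Normalize case and spacing so trivial differences do not hide a match.
        name = " ".join(rec["name"].lower().split())
        addresses_by_person[(name, rec["birth_date"])].add(rec["address"])

    return {
        person: sorted(addresses)
        for person, addresses in addresses_by_person.items()
        if len(addresses) > 1
    }


# Invented records: a student reported both at college and by a parent.
records = [
    {"name": "Jane Doe", "birth_date": "2001-05-14", "address": "Dorm 3, College Town"},
    {"name": "jane  doe", "birth_date": "2001-05-14", "address": "12 Elm St, Hometown"},
    {"name": "John Roe", "birth_date": "1970-01-02", "address": "12 Elm St, Hometown"},
]
for person, addresses in find_possible_duplicates(records).items():
    print(person, addresses)
```

Even this toy version shows why the work takes time: a flagged match is only a candidate, and someone still has to decide which address is the right one.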

The list is long, which prompted a small group of ex-census employees (including three ex-directors) and other census experts to produce a report outlining the full array of census quality indicators, published this month as the 2020 Census Quality Indicators Task Force, A Report from the American Statistical Association.

This report has two target audiences. First are those in a position to influence the quality of the 2020 census, even at this late date – namely, the Census Bureau leadership, the Commerce Department, Congress, the Supreme Court, key advocacy groups, and, if Biden is elected president, his transition team.

One example of what is still possible is captured in this passage of the Report:

“The Census Bureau has measured the quality of decennial censuses for decades. Many of these indicators have been used in the past but have only been released to the public at the national level. However, the indicators we are recommending are different in two ways. Because of the truncated timeframe and the effects of the pandemic and multiple natural disasters, we believe it is important for the Census Bureau to make the quality assessment results available to the public at the census tract level in order to ascertain the extent to which some areas may have been counted more accurately than others and determine the data’s fitness for various uses. In addition, many of the indicators from the field processes are newly available this decade due to the automation of NRFU [Nonresponse Follow-up Operation]. Daily processing and assignment of the NRFU cases produce a wealth of data to evaluate the quality and progress of NRFU that was not available in previous censuses. The Census Bureau’s current plan for quality assessment is unknown, and the compressed schedule has eliminated many quality-control steps that the Bureau would have included before releasing the apportionment data. However, the 99 percent completion rate by state publicly released to date is insufficient to measure quality.”

Elsewhere, the Report recommends that experts not on the Census Bureau payroll be granted access to the quality indicators specifically linked to the reapportionment and redistricting data, in hopes of “restoring public trust.” These data are currently being litigated and drawn into partisan debates. Media coverage raises questions about the impartiality of the data and whether the Bureau is sufficiently insulated from political interference. This is generally unfair to the Bureau, but it is unavoidable when a highly charged election overlaps with census data releases central to reapportionment and redistricting.

Independent assessment of quality indicators is unusual, but so are the current conditions. It is in the interest of the Census Bureau and the Report audiences noted above that any reasonable step that can build public trust be taken. This Report, incidentally, is one among many efforts to protect the census by emphasizing quality. The Census Quality Reinforcement initiative, based at Georgetown University, is broader in scope and, though not yet publicly available, will soon be a rich extension of and complement to the American Statistical Association report.

I noted above that there are two audiences, the second of which explains the “silver lining” in my title: It is the American public circa 2025, as the 2030 census design comes into view.

This is not the place to outline that design beyond saying that it is already under active discussion, inside the Census Bureau and in various nonpartisan settings intent on protecting the census from the damaging political treatment it is being subjected to in 2020. The census has always been and always will be political. However, it has always been and always will be scientific. The latter explains why quality indicators have always had and always will have a key role in census-taking.

The 2020 census marks a turning point in the visibility of quality indicators; this is the silver lining in my title. The White House, by cutting short a key step in the census process, has inadvertently helped publicize that this census, like all those that preceded it, has errors – and that the Bureau is adept at finding and fixing them. This has attracted widespread media attention.

A census without ample time and staff to execute quality checks is a flawed census. Although no census is perfect, the difference between one where quality controls are fully applied and one where they are cut short is the difference between a census that is fit for purpose and one that is not.

This year’s experience with the census has highlighted the critical importance of making sure that the census is of the highest possible quality. The need for quality indicators will be discussed and debated in the press, and studied by academics, who will doubtless find ways to improve still further the quality of our census data. Census data users will demand such quality indicators, and will insist that they be protected by regulation. Congress will hold hearings that feature error correction, perhaps as soon as 2021. As a result, the quality indicators of the census will, in time, become as highly visible as the enumeration itself.

This adds up to a significant step forward for the Census Bureau and its army of error finders.

They have too long gone unnoticed. The time has come to celebrate their work with as much enthusiasm as we celebrate the work of enumerators.

Kenneth Prewitt is Carnegie Professor, Columbia University, and ex-Director of the Census Bureau.