Symposium 2001/10

6 July 2001

 

                                                                                                           English only

 

 

Symposium on Global Review of 2000 Round of

Population and Housing Censuses: 

Mid-Decade Assessment and Future Prospects

Statistics Division

Department of Economic and Social Affairs

United Nations Secretariat

New York, 7-10 August 2001

 

 

 

 

 

 

 

 

 

 

Post-enumeration surveys (PES’s): are they worth it?*

David C. Whitford and Jeremiah P. Banda **


Contents

 

Summary
A. Introduction
1. Purpose of a post-enumeration survey
2. Problems and constraints associated with post-enumeration surveys
B. Design and methodological issues
1. The P or population sample
2. The E sample
3. Frames
4. Sample design
5. Listing
6. Interviewing
7. Matching
8. Reconciliation
9. Estimation
C. Country practices
1. Cambodia
2. Occupied Palestinian Territory
3. Zambia
4. Mongolia
5. Burundi and Rwanda
6. Namibia
7. United States of America
D. Use of PES results
E. Lessons learned from country examples
F. Conclusions and recommendations
References

 


Summary

Post-Enumeration Surveys: Are They Worth It or Not?

 

The post-enumeration survey (PES) is a method for evaluating the results of a census. As censuses become more complicated, and as the results of censuses are used for more and more policy and planning purposes, it is important to examine the quality and limitations of census data and to understand the types and extent of inaccuracies that occur. Several methods are available to evaluate censuses, including demographic analysis, comparison of census results with data from other sources and matching census responses with responses from interviews conducted during a PES. In many developing countries, alternative sources of population data are not available, so the PES is the major tool for evaluating the census.

 

Basically, a PES is an independent survey that replicates a census. The survey results are compared with census results, permitting estimates to be made of coverage and content errors. Coverage errors refer to people missed in the census or erroneously included, whereas content errors concern the quality of responses to selected questions. The PES allows census organizations to uncover deficiencies in the methodology of the census and make adjustments for future censuses. PES results can also be used to adjust census results, although this is as likely to be a political decision as a technical one.

 

1. Ideally, to ensure independence, the PES would be undertaken by staff who have not worked on the census. In practice, the PES generally uses the most qualified census workers available and ensures that, if they also worked on the census, they are assigned to different enumeration areas (EAs) in the PES. As with all survey work, if the PES methodology is flawed (for example, if it uses poor sample design and incomplete frames), the results may not be reliable. The PES draws a sample of the population, which can be chosen in several stages. When the sample is established, addresses of all housing units are listed, and interviewing begins. The PES normally takes place near enough to the census to ensure that people remember who was in the household on census day. This is particularly important in a country that takes a census on a de facto basis. The next steps are matching the census and PES data and reconciling discrepancies.

 

2.                  Results from a PES have been useful to many countries. In Zambia, for example, the PES found that age reporting was more accurate than anticipated, and it helped analysts to notice at an early stage the effects of the HIV/AIDS epidemic on age structure. The main objective of Cambodia’s PES was to provide national-level estimates of coverage and content errors in the census. The PES was conducted in March 1998, two weeks after the census. Mongolia conducted a limited PES to evaluate census coverage; lack of funds precluded a more elaborate survey. In some EAs, the census and the PES were not independent operations, as evidenced by high agreement between census and PES results. Namibia faced some operational problems in its PES, including confusion over boundaries of EAs; failure to pre-list housing units or pre-test questionnaires; and lack of reconciliation procedures. In the United States of America, use of laptop computers for the PES interviews and an automated software system for matching improved the speed and quality of the work.

 

3.                  Post-enumeration surveys are worth conducting if they are carefully planned and well implemented. The PES methodology is adaptable to many circumstances, and the fact that the PES is carried out immediately after the census means that overhead costs may be greatly reduced. For the PES to succeed, its planners should develop good area frames with well-defined EAs; design plausible probability samples; adopt efficient but realistic matching rules; attempt to maintain independence between the census and the PES; use the same definitions and concepts in both the census and the PES; use well-trained field staff; carry out pre-tests for the PES and reconciliation; allocate adequate funds for the PES; include relevant and useful items for matching purposes; and keep the PES as simple as possible and set objectives that are attainable.


A. Introduction

4.                  Census taking is improving constantly throughout the world.  As censuses improve and people get used to using information from them, censuses undergo more and more scrutiny. For instance, given one census, the science of demographic analysis can produce independent estimates of population size for comparison with the results of the following census. Differences between the demographic analysis predictions and the actual results of the new census are inevitable. These differences can take the form of unreasonable sex ratios or wild disparities in age cohorts.

 

5.                  We reiterate that as censuses improve, they are used more and examined more. A country’s last census may not have undergone much scrutiny, but its next one could be a bombshell waiting to explode. As time passes, the need for self-evaluation of census results inevitably increases.

 

6.                  It should also be noted that a population census is the most extensive and expensive data-collection exercise any country can undertake. With vast amounts of resources spent, there is usually tremendous pressure on census takers to ensure that census results are accurate. As a result of the massive nature of the census operation, it is inevitable that some inaccuracies arise from deficiencies, including errors of coverage and response. The major difference among countries is the extent of such errors. This, however, does not diminish the importance of the census as long as users understand the limitations of the data and the errors do not affect the major uses of the data (Cambodia, National Institute of Statistics, 1999).

 

7. It is against this background that a number of methods for evaluating censuses have been developed. Such methods include demographic analysis, comparison of census totals with figures from other sources and matching census returns with those from interviews selected on the basis of a probability sample in a post-enumeration survey (PES). This paper focuses on the PES as a method of evaluation. The paper discusses the purpose of PES’s; problems and constraints associated with the PES; design and methodological issues; country experiences; uses of PES’s; and suggestions to improve PES programmes.

1. Purpose of a post-enumeration survey

8. While a number of methods have been developed to evaluate census data, for many developing countries the PES seems to be the most suitable, owing to the paucity of appropriate data to support the effective use of other methods. The lack or incompleteness of registration systems and the absence of regular population and demographic surveys contribute to the lack of use or limited use of other methods of census evaluation. In general, a number of countries have relied primarily on PES methodology to evaluate the census undercount (Biemer et al., 2001).

 

9.                  Post-enumeration surveys are an accepted census self-evaluation tool. Typically, a PES is an independent survey that replicates a census. The survey and the census results are then compared (matched). The results of the comparison are used to measure the coverage and/or errors in content of the census. Estimates of net coverage, the number of people omitted in the census, the number erroneously enumerated and content error rates for specific questions are typical products of a PES.

 

10.              Additionally, these estimates can be broken down further into their component parts. One can design the survey so that reliable estimates of undercount or overcount can be obtained for the entire census, for geographic areas of interest in the census, and for any of a host of demographic characteristics, such as age, race and sex, for which one might desire census coverage statistics.

 

11. The survey results also enable one to uncover census methodologies or operations that, when implemented, produced less than desirable results. Suppose, for instance, that a high census omission rate was observed in rural areas. One might then use specific PES results to examine whether the rural errors were due to the omission of whole housing units. If so, this might well imply an incomplete census frame and cause one to re-examine the methodology for building an address list in rural areas.

 

12.              PES results can be used to adjust census results. Using a carefully designed survey, under- or overcounts can be converted into adjustment factors and the census population increased or decreased accordingly by these factors. Later in this paper we will discuss post-stratification and the need to ensure homogeneity within each adjustment cell. It has been reported that in some African countries, PES results have been used to support or defend census results when the accuracy of the census is challenged (Onsembe, 1999).
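The conversion of an under- or overcount into an adjustment factor can be sketched as follows. This is a minimal illustration, not a procedure prescribed by this paper: the function names and the figures in the example are hypothetical, and the PES-based population estimate for the adjustment cell is assumed to be available already.

```python
def adjustment_factor(census_count: float, estimated_population: float) -> float:
    """Factor by which a census count is scaled to reach the PES-based estimate."""
    if census_count <= 0:
        raise ValueError("census_count must be positive")
    return estimated_population / census_count


def net_undercount_rate(census_count: float, estimated_population: float) -> float:
    """Net undercount as a share of the estimated population.

    A negative value indicates a net overcount.
    """
    if estimated_population <= 0:
        raise ValueError("estimated_population must be positive")
    return (estimated_population - census_count) / estimated_population


# Hypothetical adjustment cell: figures are illustrative only.
census = 48_000
estimate = 50_000
factor = adjustment_factor(census, estimate)
adjusted_count = census * factor                 # scaled up to the estimate
rate = net_undercount_rate(census, estimate)     # share of the estimate missed
```

A factor above 1 increases the census count for the cell and a factor below 1 decreases it. The factor is only meaningful if the people in the cell have broadly similar chances of being counted.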

 

13. In addition, censuses are used for many other purposes, such as updating population estimates; developing and updating sampling frames; correcting and updating population registers; and establishing and updating key components of a Geographic Information System (GIS). These many uses suggest a need for an objective method of assessing coverage and content errors as a crucial step in concluding a census operation (Abu-Libdeh, 1999). Quality assurance alone, introduced at various stages of census operations, cannot ensure a complete evaluation of the qualitative and quantitative accuracy of census data (UN Economic Commission for Africa, 1975).

 

14. In summary, post-enumeration surveys serve many good purposes. Above all, they inform users about the quality of the census data. As stated earlier, making the limitations of published census data known increases the confidence of informed users in such data. On the other hand, there are distinct limitations and constraints in managing and implementing the evaluation survey.

2. Problems and constraints associated with post-enumeration surveys

15. Although a PES can be an important component of a census programme and can contribute to the process of building confidence in the census results, a poorly designed and executed survey can do considerable damage to the legitimacy of the census. We list below some of the problems and constraints associated with PES’s:

 

·        Planning and management of a PES, ideally, have to be undertaken by a staff that is separate from the census staff. This is not usually the case in many countries;

 

·        The design of the survey—especially the matching step—is relatively complex. For example, in the United States planners continue to find design flaws in the matching system. However, as corporate experience grows, these flaws become more and more minor;

 

·        The survey must be independent of the census. In the survey’s sample areas, census results must not be biased by the implementation of the PES;

 

·        The PES interview itself is demanding. Usually it incorporates questions to determine if the respondent should really be counted at the residence in question. Also, the PES interview usually transpires after the census interview, at which point the respondent may feel overburdened and not be as forthcoming with accurate information;

 

·        Some of the developing countries lack technical personnel with experience and skills in survey methodology in general and PES in particular;

 

·        Past failures in some countries in conducting PES’s discourage such countries and others from conducting PES’s in the subsequent rounds;

 

·        Some of the countries, such as the United States of America, which have conducted PES’s, have not used the results to adjust population census figures. In such cases questions have been raised about the rationale for conducting PES’s;

 

·        In some countries, census planners feel it is enough to institute good quality-assurance procedures at various stages of census activities; therefore they see no need for a PES (UNSD/SADC, 2001).

 

16.              Lastly, some of the countries are ambivalent about conducting a PES because a census is usually a grueling and taxing operation, which saps the energy of those involved. The general fatigue it generates may be sufficient to discourage the conduct of the survey. Additionally, by the time the census enumeration is completed there is usually a feeling of accomplishment among census planners. They may, therefore, not see the need for conducting a PES, which, after all, may just expose glaring discrepancies between census and PES results to the detriment of the reputation of the census or statistical organization.

B. Design and methodological issues

17.              Considering whether a post-enumeration survey is worth it or not leads immediately to some decisions that have to be made regarding the design of the survey. These decisions revolve around what goals one has for the survey and what answers best suit the individual situation in which the survey will be conducted.

 

18.              We will assume in this paper that the goal of the PES interview is to establish carefully who lived in the subject housing unit on the day the census was officially taken. In the next step we match the results from the interview to appropriate census forms in a well-defined area around that subject housing unit.[1]

1. The P or population sample

19.              The central facet of a PES is measurement of census omissions. PES methodology calls the sample used to measure omissions the P sample or population sample. Roughly, one interviews this sample and compares (matches) it to the census results.

 

The resulting tallies can be represented in a two-by-two table:

                 In census     Out of census     Total
In PES           $N_{11}$      $N_{12}$          $N_{1+}$
Out of PES       $N_{21}$      $N_{22}$          $N_{2+}$
Total            $N_{+1}$      $N_{+2}$          $N_{++}$

where

$N_{11}$ is the estimate of the number of people counted in both the census and the survey,

$N_{12}$ is the estimate of the number of people counted in only the survey,

$N_{21}$ is the estimate of the number of people counted in only the census,

$N_{22}$ is the estimate of the number of people missed by both the census and the survey,

$N_{1+}$ is the total estimate of the number of people counted in the survey,

$N_{+1}$ is the total number of people counted correctly in the census, and

$N_{++}$ is the estimate of the total number of people.

20. The dual-system estimation model assumes independence between inclusion in the census and in the PES. (We elaborate on how independence is implemented later in this paper.) The dual-system estimate (DSE) of the total population is given by

$$N_{++} = N_{+1} \, \frac{N_{1+}}{N_{11}}$$

21. Simply stated, the DSE raises the census total by the ratio of the total estimate of the number of people in the PES divided by the estimate of the number that matched to the census.
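The calculation just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function and argument names are hypothetical, the figures are invented, and in practice each input would be a weighted sample estimate rather than a raw count.

```python
def dual_system_estimate(matched: float, pes_total: float, census_correct: float) -> float:
    """Census total raised by the ratio of the PES total to the matched count."""
    if matched <= 0:
        raise ValueError("matched must be positive")
    return census_correct * pes_total / matched


def missed_by_both(matched: float, pes_total: float, census_correct: float) -> float:
    """People missed by both systems, as implied by the independence assumption."""
    only_pes = pes_total - matched          # counted in the survey only
    only_census = census_correct - matched  # counted in the census only
    return only_pes * only_census / matched


# Invented figures: 1,000 people in the PES, 950 correctly in the census,
# 900 found in both.
total = dual_system_estimate(matched=900, pes_total=1000, census_correct=950)
unseen = missed_by_both(matched=900, pes_total=1000, census_correct=950)
```

The second function makes the role of independence explicit: the estimate of people missed by both systems is not observed but follows from the other three cells, and the four cells add up to the dual-system estimate.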

 

22.              In the section above on purposes of a PES, we discussed breaking down the DSE estimates by geographic areas and for any of a host of demographic characteristics, such as age, race and sex, for which one might desire census coverage statistics.  If direct estimates are desired for any of these breakdowns, one might post-stratify the sample results into the categories desired. The objective of post-stratification is to include in each dual-system estimate people who have similar capture probabilities in the census. 
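Post-stratification can be sketched as computing a separate dual-system estimate in each cell and summing the results. The sketch below is a minimal illustration under stated assumptions: the cell names and figures are hypothetical, and each cell is assumed to supply the matched count, the PES total and the count of correct census enumerations.

```python
from typing import Dict, Tuple

# (matched, pes_total, census_correct) for one post-stratum
PostStratum = Tuple[float, float, float]


def post_stratified_dse(cells: Dict[str, PostStratum]) -> Dict[str, float]:
    """Dual-system estimate computed separately within each post-stratum."""
    estimates = {}
    for name, (matched, pes_total, census_correct) in cells.items():
        if matched <= 0:
            raise ValueError(f"post-stratum {name!r} has no matched cases")
        estimates[name] = census_correct * pes_total / matched
    return estimates


# Invented figures for two hypothetical post-strata.
cells = {
    "urban": (450.0, 500.0, 480.0),
    "rural": (400.0, 500.0, 430.0),
}
by_cell = post_stratified_dse(cells)
stratified_total = sum(by_cell.values())

# Pooling the same figures into a single estimate gives a different answer,
# because the two cells have different match rates.
pooled_total = (480.0 + 430.0) * (500.0 + 500.0) / (450.0 + 400.0)
```

When capture probabilities differ between groups, the pooled and the post-stratified totals diverge, which is why each cell should contain people with similar capture probabilities.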

2. The E sample

23.              Omissions are not the whole story in evaluating a census. Errors can be made in the census itself that affect the overall under- and overcount measurement:

 

·        The census can contain duplicate or multiple enumerations;

·        The census could have people or housing units ascribed to the wrong geographic location (and thus not matching the PES interview);

·        People could be less than perfectly enumerated—that is, there could be insufficient information for matching to the PES interview;

·        The census could have erroneously enumerated someone who should have been enumerated elsewhere or the enumerator could have made up a fictitious person.

 

24.              So, if one is interested in quantifying these errors and their effect on census coverage, a sample of the census enumerations has to be checked to tally the number of times these types of errors, called erroneous enumerations, occurred. In PES parlance, this sample is called the E sample. The section below on estimation explains how the quantification of these errors is incorporated into the dual-system estimation formula.
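Tallying erroneous enumerations in an E sample might look like the sketch below. It is an assumption-laden illustration: the status labels are hypothetical names for the error types listed above, each checked record is assumed to carry a sampling weight, and the sketch does not show how the rates enter the dual-system formula.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical labels for the types of error listed above.
ERROR_TYPES = {
    "duplicate",
    "wrong_location",
    "insufficient_information",
    "other_erroneous",
}


def erroneous_enumeration_rates(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Weighted share of E-sample enumerations falling in each error type.

    Each record is (status, weight), where status is "correct" or one of
    ERROR_TYPES.
    """
    weighted = Counter()
    for status, weight in records:
        if status != "correct" and status not in ERROR_TYPES:
            raise ValueError(f"unknown status: {status}")
        weighted[status] += weight
    total = sum(weighted.values())
    if total <= 0:
        raise ValueError("the E sample has no weighted records")
    return {status: weighted[status] / total for status in sorted(ERROR_TYPES)}


# Invented E sample: 20 records with equal weights, two of them in error.
records = [("correct", 10.0)] * 18 + [("duplicate", 10.0), ("wrong_location", 10.0)]
rates = erroneous_enumeration_rates(records)
overall_rate = sum(rates.values())
```

Keeping the error types separate, rather than reporting only an overall rate, helps show which census operation produced the errors.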

 

25.              A desirable option for the E sample is to draw it directly from the census for the sampled areas used in the P sample. This facilitates matching and helps ensure that the survey is balanced—that is, that one is searching for omissions in the exact same area where one is searching for erroneous enumerations. The area one searches for omissions and erroneous enumerations is called the search area. 

 

26. Some E-sample units and the people in them will not match any of the PES interviews. They might have been missed in the P-sample frame or truly erroneously enumerated; for instance, the person might have lived elsewhere for the rest of the year. Since these people have not been asked the battery of questions that determines whether a person should actually have been counted at the particular housing unit on census day, a follow-up operation is needed to determine whether the unmatched people in the E-sample unit were erroneously enumerated, that is, whether one of the census errors listed above occurred. More about this follow-up interview is presented in the section below on reconciliation.

 

27.              Two other design decisions have to be made: What is the primary sampling unit for the survey? and What is the definition of cases to be included in the survey?  This leads into our next topic, the sampling frame of the survey.

3. Frames

28. A popular choice of sampling frame for the coverage measurement survey is an area frame. The primary sampling unit can be the block. Blocks are land areas bounded by visible geographic features such as roads and streams. Building the frame therefore consists of creating a universe of blocks in the country and dividing it into blocks, or clusters of blocks, that a single interviewer can cover within the allotted time.

 

29. Another option is to use an existing survey that is being conducted around the time of the census. This has the large advantage of using an existing organization to manage the PES. It also has several disadvantages:

 

·        The existing survey may not be large enough, and supplementing it may be as complex as creating a specially designed survey;

·        Procedures may have to be augmented with the result that the quality of the existing survey and the PES suffers;

·        The ultimate sampling units may not lend themselves to being an efficient erroneous enumeration sample, where duplication, geocoding errors and so forth need to be easily discernible.

4. Sample design

30.              Above, we mentioned the option of using block clusters as a frame for the survey.  One might want to design the sample in several stages by first choosing a group of these clusters and then optimizing the sample by subsampling. For instance, the housing unit totals from a previous census might be used to choose the initial sample of clusters; then, after the addresses of all of the housing units in the sample clusters are listed, the block clusters might be divided into small, medium and large sampling strata, where

 

·        Small clusters might consist of 0-2 housing units;

·        Medium clusters might consist of 3-79 housing units; and

·        Large clusters might consist of 80 or more housing units.
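The size classification above can be expressed as a small function. This is a sketch only: the cut-offs are the illustrative ones given in the list, while the function name, cluster identifiers and listed counts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def size_stratum(housing_units: int) -> str:
    """Assign a block cluster to a sampling stratum by its listed housing units."""
    if housing_units < 0:
        raise ValueError("housing_units cannot be negative")
    if housing_units <= 2:
        return "small"
    if housing_units <= 79:
        return "medium"
    return "large"


# Hypothetical listing results: cluster identifier -> housing units listed.
listed_counts = {"C001": 0, "C002": 2, "C003": 3, "C004": 79, "C005": 80, "C006": 412}

clusters_by_stratum: Dict[str, List[str]] = defaultdict(list)
for cluster_id, units in listed_counts.items():
    clusters_by_stratum[size_stratum(units)].append(cluster_id)
```

Grouping clusters this way after listing lets each stratum be subsampled at its own rate.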

 

31.              The next step could be to subsample some of these clusters:

 

·        Medium and large clusters might be subsampled whenever their actual counts of housing units differed significantly from what was expected (from the previous census);