Symposium 2001/10
6 July 2001
English only

Symposium on Global Review of 2000 Round of Population and Housing Censuses: Mid-Decade Assessment and Future Prospects
Statistics Division, Department of Economic and Social Affairs, United Nations Secretariat
New York, 7-10 August 2001

David C. Whitford and Jeremiah P. Banda **
1. Purpose of a post-enumeration survey
2. Problems and constraints associated with post-enumeration surveys
B. Design and methodological issues
2. Occupied Palestinian Territory
E. Lessons learned from country examples
F. Conclusions and recommendations
Post-Enumeration Surveys: Are They Worth It or Not?
The
post-enumeration survey (PES) is a method for evaluating the results of a
census. As censuses become more complicated, and as the results of censuses are
used for more and more policy and planning purposes, it is important to examine
the quality and limitations of census data and to understand the types and
extent of inaccuracies that occur. Several methods are available to evaluate
censuses, including demographic analysis, comparison of census results with
data from other sources and matching census responses with responses from
interviews conducted during a PES. In many developing countries, alternative
sources of population data are not available, so the PES is the major tool for
evaluating the census.
Basically,
a PES is an independent survey that replicates a census. The survey results are
compared with census results, permitting estimates to be made of coverage and content
errors. Coverage errors refer to people missed in the census or erroneously
included, whereas content errors are inaccuracies in the responses to selected
questions. The PES allows census organizations to uncover deficiencies in the
methodology of the census and make adjustments for future censuses. PES results
can also be used to adjust census results, although this is as likely to be a
political decision as a technical one.
1.
Ideally, to
ensure independence, the PES would be undertaken by staff who have not worked
on the census. In practice, the PES generally uses the most qualified census
workers available and ensures that they work in different enumeration areas
(EAs) in the PES if they also worked in the census. As with any survey, if
the PES methodology is flawed (for example, if it uses a poor sample design or
incomplete frames), the results may not be reliable. The PES draws a sample of
the population, which can be chosen in several stages. When the sample is
established, addresses of all housing units are listed, and interviewing
begins. The PES normally takes place near enough to the census to ensure that
people remember who was in the household on census day. This is especially
important in a country that takes its census on a de facto basis, that is,
counting people where they are present on census day. The next steps
are matching the census and PES data and reconciling discrepancies.
2.
Results
from a PES have been useful to many countries. In Zambia, for example, the PES
found that age reporting was more accurate than anticipated, and it helped
analysts to notice at an early stage the effects of the HIV/AIDS epidemic on
age structure. The main objective of Cambodia’s PES was to provide
national-level estimates of coverage and content errors in the census. The PES
was conducted in March 1998, two weeks after the census. Mongolia conducted a
limited PES to evaluate census coverage; lack of funds precluded a more
elaborate survey. In some EAs, the census and the PES were not independent
operations, as evidenced by high agreement between census and PES results.
Namibia faced some operational problems in its PES, including confusion over
boundaries of EAs; failure to pre-list housing units or pre-test
questionnaires; and lack of reconciliation procedures. In the United States of
America, use of laptop computers for the PES interviews and an automated
software system for matching improved the speed and quality of the work.
3.
Post-enumeration
surveys are worth conducting if they are carefully planned and well
implemented. The PES methodology is adaptable to many circumstances, and the
fact that the PES is carried out immediately after the census means that
overhead costs may be greatly reduced. For the PES to succeed, its planners
should develop good area frames with well-defined EAs; design plausible
probability samples; adopt efficient but realistic matching rules; attempt to
maintain independence between the census and the PES; use the same definitions
and concepts in both the census and the PES; use well-trained field staff;
carry out pre-tests for the PES and reconciliation; allocate adequate funds for
the PES; include relevant and useful items for matching purposes; and keep the
PES as simple as possible and set objectives that are attainable.
4.
Census
taking is improving constantly throughout the world. As censuses improve and people get used to using information from
them, censuses undergo more and more scrutiny. For instance, given one census,
the science of demographic analysis can produce independent estimates of
population size for comparison with the results of the following census.
Differences between the demographic analysis predictions and the actual results
of the new census are inevitable. These differences can take the form of
unreasonable sex ratios or wild disparities in age cohorts.
5.
We
reiterate that as censuses improve, they are used more and examined more. A
country’s last census may not have undergone much scrutiny, but its next one
could be a bombshell waiting to explode. As time passes, the need for
self-evaluation of census results inevitably increases.
6.
It should
also be noted that a population census is the most extensive and expensive
data-collection exercise any country can undertake. With vast amounts of
resources spent, there is usually tremendous pressure on census takers to
ensure that census results are accurate. As a result of the massive nature of
the census operation, it is inevitable that some inaccuracies arise from
deficiencies, including errors of coverage and response. The major difference
among countries is the extent of such errors. This, however, does not diminish
the importance of the census as long as users understand the limitations of the
data and the errors do not affect the major uses of the data (Cambodia,
National Institute of Statistics, 1999).
7.
It is
against this background that a number of methods for evaluating censuses have
been developed. Such methods include demographic analysis, comparison of census
totals with figures from other sources and matching census returns with those
from interviews selected on the basis of a probability sample in a
post-enumeration survey (PES). This paper focuses on the PES as a method of
evaluation. The paper discusses the purpose of PES’s; problems and constraints
associated with the PES; design and methodological issues; country experiences;
uses of PES’s; and suggestions to improve PES programmes.
8.
While a
number of methods have been developed to evaluate census data, for many
developing countries the PES seems to be the most suitable, owing to the paucity
of the data that other methods require. The lack or
incompleteness of registration systems and the absence of regular population and
demographic surveys contribute to the limited use of other
methods of census evaluation. In general, a number of countries have relied
primarily on PES methodology to evaluate the census undercount (Biemer et al.,
2001).
9.
Post-enumeration
surveys are an accepted census self-evaluation tool. Typically, a PES is an
independent survey that replicates a census. The survey and the census results
are then compared (matched). The results of the comparison are used to measure
the coverage and/or errors in content of the census. Estimates of net coverage,
the number of people omitted in the census, the number erroneously enumerated
and content error rates for specific questions are typical products of a PES.
10.
Additionally,
these estimates can be broken down further into their component parts. One can
design the survey so that reliable estimates of undercount or overcount can be
obtained for the entire census, for geographic areas of interest in the census,
and for any of a host of demographic characteristics, such as age, race and
sex, for which one might desire census coverage statistics.
11.
The survey
results also enable one to uncover census methodologies or
operations that, when implemented, produced less than desirable results.
Suppose, for instance, that a high census omission rate was observed in rural areas.
One might then use specific PES results to examine whether the rural errors
were due to the omission of whole housing units. If so, this might well imply
an incomplete census frame and cause one to re-examine the methodology for
building an address list in rural areas.
12.
PES results
can be used to adjust census results. Using a carefully designed survey, under-
or overcounts can be converted into adjustment factors and the census
population increased or decreased accordingly by these factors. Later in this
paper we will discuss post-stratification and the need to ensure homogeneity
within each adjustment cell. It has been reported that in some African
countries, PES results have been used to support or defend census results when
the accuracy of the census is challenged (Onsembe, 1999).
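The conversion of under- or overcounts into adjustment factors can be sketched as follows. This is only an illustration under stated assumptions: the cell names, the district and all figures are hypothetical, and the sketch assumes that the factor estimated for an adjustment cell is applied uniformly to every small-area census count in that cell, which is one possible convention rather than a method prescribed by this paper.

```python
def adjustment_factor(estimated_population: float, census_count: float) -> float:
    """Ratio of the PES-based population estimate to the census count for one cell.

    A factor above 1 indicates a net undercount; below 1, a net overcount.
    """
    if census_count <= 0:
        raise ValueError("census_count must be positive")
    return estimated_population / census_count


# Hypothetical adjustment cells: (PES-based estimate, census count)
cell_figures = {
    "cell_A": (4_200, 4_000),  # net undercount
    "cell_B": (4_900, 5_000),  # net overcount
}
factors = {
    cell: adjustment_factor(estimate, census)
    for cell, (estimate, census) in cell_figures.items()
}

# Hypothetical census counts for one district, broken down by adjustment cell
district_counts = {"cell_A": 1_200, "cell_B": 800}

# Assumption: each count is raised or lowered by the factor of its own cell
adjusted_district_total = sum(
    count * factors[cell] for cell, count in district_counts.items()
)
print(factors, round(adjusted_district_total))
```

Applying a cell-level factor to smaller areas is only defensible if coverage is similar throughout the cell, which is why homogeneity within each adjustment cell matters.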
13.
In
addition, censuses are used for many other purposes, such as updating
population estimates; developing and updating sampling frames; correcting and
updating population registers; and establishing and updating key
components of the Geographic Information System (GIS). These many uses suggest
that there is a need to use an objective method for assessing coverage and
content errors as a crucial step for concluding a census operation (Abu-Libdeh,
1999). Quality assurance alone, introduced at various stages of census
operations, cannot ensure a complete evaluation of the qualitative and
quantitative accuracy of census data (UN Economic Commission for Africa, 1975).
14.
In
summary, post-enumeration surveys serve many purposes. Fundamentally, they
inform users about the quality of the census data. As stated earlier,
disclosing the limitations of published census data increases the confidence of
informed users in such data. On the other hand, there are distinct limitations and
constraints in managing and implementing the evaluation survey.
15.
Although a
PES can be an important component of a census programme and can contribute to
the process of building confidence in the census results, a poorly designed and
executed survey can inflict considerable damage on census legitimacy. We list
below some of the problems and constraints associated with PES’s:
· Ideally, the planning and management of a PES are undertaken by staff who are separate from the census staff. In many countries this is not the case;
· The design of the survey, especially the matching step, is relatively complex. For example, in the United States planners continue to find design flaws in the matching system. However, as institutional experience grows, these flaws become increasingly minor;
· The survey must be independent of the census. In the survey’s sample areas, census results must not be biased by the implementation of the PES;
· The PES interview itself is demanding. Usually it incorporates questions to determine if the respondent should “really” be counted at the residence in question. Also, the PES interview usually takes place after the census interview, at which point the respondent may feel overburdened and be less forthcoming with accurate information;
· Some developing countries lack technical personnel with experience and skills in survey methodology in general and in the PES in particular;
· Past failures in conducting PES’s discourage the countries concerned, and others, from conducting PES’s in subsequent rounds;
· Some countries that have conducted PES’s, such as the United States of America, have not used the results to adjust population census figures. In such cases, questions have been raised about the rationale for conducting PES’s;
· In some countries, census planners feel it is enough to institute good quality-assurance procedures at various stages of census activities; therefore they see no need for a PES (UNSD/SADC, 2001).
16.
Lastly,
some countries are ambivalent about conducting a PES because a census is
usually a grueling and taxing operation, which saps the energy of those
involved. The general fatigue it generates may be sufficient to discourage the
conduct of the survey. Additionally, by the time the census enumeration is
completed there is usually a feeling of accomplishment among census planners.
They may, therefore, not see the need for conducting a PES, which, after all, may
just expose glaring discrepancies between census and PES results to the
detriment of the reputation of the census or statistical organization.
17.
Considering
whether a post-enumeration survey is worth it or not leads immediately to some
decisions that have to be made regarding the design of the survey. These
decisions revolve around what goals one has for the survey and what answers
best suit the individual situation in which the survey will be conducted.
18.
We will
assume in this paper that the goal of the PES interview is to establish
carefully who lived in the subject housing unit on the day the census was
officially taken. In the next step we match the results from the interview to
appropriate census forms in a well-defined area around that subject housing
unit.[1]
19.
The central
facet of a PES is measurement of census omissions. PES methodology calls the
sample used to measure omissions the P sample or population sample.
Roughly, one interviews this sample and compares (matches) it to the census
results.
The resulting tallies can be represented in a two-by-two table:

|            | In census | Out of census | Total    |
|------------|-----------|---------------|----------|
| In PES     | $N_{11}$  | $N_{12}$      | $N_{1+}$ |
| Out of PES | $N_{21}$  | $N_{22}$      | $N_{2+}$ |
| Total      | $N_{+1}$  | $N_{+2}$      | $N_{++}$ |

where
$N_{11}$ is the estimate of the number of people counted in both the census and the survey,
$N_{12}$ is the estimate of the number of people counted in only the survey,
$N_{21}$ is the estimate of the number of people counted in only the census,
$N_{22}$ is the estimate of the number of people missed by both the census and the survey,
$N_{1+}$ is the total estimate of the number of people counted in the survey,
$N_{+1}$ is the total number of people counted in the census, and
$N_{++}$ is the estimate of the total number of people.
20.
The
dual-system estimation model assumes independence between inclusion in the
census and in the PES. (We elaborate on how independence is implemented later in
this paper.) The dual-system estimate (DSE) of the total population is given by
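$$N_{++} = N_{+1} \times \frac{N_{1+}}{N_{11}}$$

with $N_{+1}$ the census total, $N_{1+}$ the estimated PES total and $N_{11}$ the estimated number of people counted in both.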
21.
Simply
stated, the DSE raises the census total by the ratio of the total estimate of the
number of people in the PES divided by the estimate of the number
that matched to the
census.
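A minimal sketch of this calculation follows. The function name and all figures are hypothetical and chosen only so that the arithmetic is easy to check; the net undercount rate shown at the end is a common derived measure and is an addition for illustration, not something defined in the text above.

```python
def dual_system_estimate(census_total: float, pes_total: float, matched: float) -> float:
    """Dual-system estimate of the total population.

    census_total: number of people counted in the census
    pes_total:    estimated number of people counted in the PES
    matched:      estimated number of PES people who matched to the census
    """
    if matched <= 0:
        raise ValueError("matched must be positive")
    # Raise the census total by the ratio of the PES total to the matched total
    return census_total * pes_total / matched


# Hypothetical figures for illustration only
census_total = 9_500
pes_total = 1_000
matched = 950

estimate = dual_system_estimate(census_total, pes_total, matched)

# Share of the estimated population that the census failed to count (net)
net_undercount_rate = (estimate - census_total) / estimate
print(round(estimate), round(net_undercount_rate, 3))
```

In this example 95 per cent of the PES people matched to the census, so the census total of 9,500 is divided by 0.95 to give an estimate of 10,000, implying a net undercount of 5 per cent.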
22.
In the
section above on purposes of a PES, we discussed breaking down the DSE
estimates by geographic areas and for any of a host of demographic
characteristics, such as age, race and sex, for which one might desire census
coverage statistics. If direct
estimates are desired for any of these breakdowns, one might post-stratify the
sample results into the categories desired. The objective of
post-stratification is to include in each dual-system estimate people who have
similar capture probabilities in the census.
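Post-stratified estimation can be sketched as below: a separate dual-system estimate is computed for each post-stratum and the results are summed. The two post-strata ("urban" and "rural") and all figures are hypothetical, and the comparison with a pooled estimate is included only to show that the two approaches differ when match rates differ across strata.

```python
from dataclasses import dataclass


@dataclass
class PostStratum:
    name: str
    census_total: float  # people counted in the census
    pes_total: float     # estimated people counted in the PES
    matched: float       # estimated PES people matched to the census


def dual_system_estimate(census_total: float, pes_total: float, matched: float) -> float:
    """Census total raised by the ratio of the PES total to the matched total."""
    if matched <= 0:
        raise ValueError("matched must be positive")
    return census_total * pes_total / matched


# Hypothetical post-strata with different match rates (95% and 87.5%)
strata = [
    PostStratum("urban", census_total=6_000, pes_total=600, matched=570),
    PostStratum("rural", census_total=3_500, pes_total=400, matched=350),
]

# One dual-system estimate per post-stratum, then summed
by_stratum = {
    s.name: dual_system_estimate(s.census_total, s.pes_total, s.matched)
    for s in strata
}
post_stratified_total = sum(by_stratum.values())

# Pooled estimate that ignores the post-strata, for comparison
pooled_total = dual_system_estimate(
    sum(s.census_total for s in strata),
    sum(s.pes_total for s in strata),
    sum(s.matched for s in strata),
)
print(by_stratum, post_stratified_total, pooled_total)
```

The pooled figure differs from the post-stratified one because people with unlike capture probabilities are combined in a single estimate, which is what post-stratification is meant to avoid.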
23.
Omissions
are not the whole story in evaluating a census. Errors can be made in the
census itself that affect the overall under- and overcount measurement:
· The census can contain duplicate or multiple enumerations;
· The census could have people or housing units ascribed to the wrong geographic location (and thus not matching the PES interview);
· People could be less than perfectly enumerated; that is, there could be insufficient information for matching to the PES interview;
· The census could have erroneously enumerated someone who should have been enumerated elsewhere, or the enumerator could have made up a fictitious person.
24.
So, if one
is interested in quantifying these errors and their effect on census coverage,
a sample of the census enumerations has to be checked to tally the number of
times these types of errors, called erroneous enumerations, occurred. In PES parlance,
this sample is called the E sample. The section below on estimation
explains how the quantification of these errors is incorporated into the
dual-system estimation formula.
25.
A desirable
option for the E sample is to draw it directly from the census for the
sampled areas used in the P sample. This facilitates matching and helps
ensure that the survey is balanced—that is, that one is searching for omissions
in the exact same area where one is searching for erroneous enumerations. The
area one searches for omissions and erroneous enumerations is called the search
area.
26.
For
instance, the person might have lived elsewhere for the rest of the year. Some
E‑sample units and the people in them will not match any of the PES
interviews. They might have been missed in the P‑sample frame or truly
erroneously enumerated. Since these people have not been asked the battery of
questions to determine if a person should have actually been counted at the
particular housing unit on census day, a follow-up operation is needed to
determine if the unmatched people in the E‑sample unit were or were not
erroneously enumerated—that is, whether one of the census errors listed above
occurred or did not occur. More about
this follow-up interview is presented in the section below on reconciliation.
27.
Two other
design decisions have to be made: what is the primary sampling unit for the
survey, and what is the definition of cases to be included in the survey? This leads to our next topic, the sampling
frame of the survey.
28.
A popular
choice is to use an area sample for the coverage
measurement survey. The primary
sampling unit can be the block. Blocks
are land areas bounded by visible geographic features such as roads and
streams. Building the frame therefore consists of creating a universe of blocks in the
country and dividing it into sets of blocks (or clusters of blocks), each of which can
be interviewed by a single interviewer within the allotted time.
29.
Another option
is to use an existing survey that is being conducted around the time
of the census. This has the major
advantage of using an existing organization to manage the PES. It also has several disadvantages:
· The existing survey may not be large enough, and supplementing it may be as complex as creating a specially designed survey;
· Procedures may have to be augmented, with the result that the quality of the existing survey and the PES suffers;
· The ultimate sampling units may not lend themselves to being an efficient erroneous enumeration sample, where duplication, geocoding errors and so forth need to be easily discernible.
30.
Above, we
mentioned the option of using block clusters as a frame for the survey. One might want to design the sample in
several stages by first choosing a group of these clusters and then optimizing
the sample by subsampling. For instance, the housing unit totals from a
previous census might be used to choose the initial sample of clusters; then,
after the addresses of all of the housing units in the sample clusters are
listed, the block clusters might be divided into small, medium and large
sampling strata, where
· Small clusters might consist of 0-2 housing units;
· Medium clusters might consist of 3-79 housing units; and
· Large clusters might consist of 80 or more housing units.
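The size stratification above can be expressed as a simple classification rule. The function name and the listed cluster counts are hypothetical; the cut-offs are the illustrative ones given in the list, which a survey designer might well set differently.

```python
from collections import Counter


def size_stratum(housing_units: int) -> str:
    """Assign a block cluster to a sampling stratum by its listed housing-unit count."""
    if housing_units < 0:
        raise ValueError("housing_units cannot be negative")
    if housing_units <= 2:
        return "small"
    if housing_units <= 79:
        return "medium"
    return "large"


# Hypothetical housing-unit counts obtained from listing the sample clusters
listed_counts = {"cluster_01": 0, "cluster_02": 2, "cluster_03": 3,
                 "cluster_04": 79, "cluster_05": 80, "cluster_06": 412}

strata = {cluster: size_stratum(units) for cluster, units in listed_counts.items()}

# Number of clusters falling in each stratum
stratum_sizes = Counter(strata.values())
print(strata, stratum_sizes)
```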
31.
The next
step could be to subsample some of these clusters:
· Medium and large clusters might be subsampled whenever their actual counts of housing units differ significantly from what was expected (from the previous census);