In the course of a far-reaching restructuring process of Statistics Netherlands (the so-called TEMPO-operation), which began in 1992 and was finalised around 1995, several studies were made on different aspects of the statistical process. One of them involved an in-depth internal review of our surveying and data collection policies. This review revealed a number of embarrassing facts. Most of these were not entirely new, but this was the first time that what many people suspected was reported explicitly and systematically.
First of all, it appeared that, although data collection and data editing account for roughly half the cost of Statistics Netherlands, there was relatively little systematic knowledge and documentation about the data capture and data editing processes, in particular in the area of institutional and business statistics. Most methodological efforts and innovations (sophisticated sampling techniques, questionnaire design and testing, controlled editing, computer-assisted data capture) had been concentrated in the area of household surveys.
Secondly, surveying was shown to be a rather scattered activity. There was relatively little exchange of information and experience between different subject matter departments. As a result, businesses received questionnaires from several departments, the same questions were sometimes asked more than once (and occasionally in slightly different forms), and matters such as sending out reminders, when and how to involve field staff and how to deal with non-response were handled in different ways.
At the same time, an external review was held of survey practices as they were perceived by businesses. A management consultancy firm (Ernst & Young) was asked to interview a number of businesses about how they appreciated the surveys of Statistics Netherlands. Although the general level of acceptance was more positive than expected, several points of criticism were reported, as well as suggestions on how survey practices could be improved. Some of these were rather trivial and can be implemented easily; others will take more time, because they require the development of new technology and/or organisational restructuring.
It was noted, for example, that respondents desire a clear explanation of what a survey is for. This may seem obvious, but it appeared that some of our introductory letters were unclear in this respect. For about half of the questions asked, information could not easily be extracted from the company's own records. Sometimes, more or different detail was asked than companies need for their own management or for other purposes, such as the statutory annual company accounts which are required for taxation purposes.
Some companies expressed their desire to receive an early announcement of the surveys they may expect in a certain calendar year. Mention was made of the need for appropriate timing of surveys and for realistic completion deadlines: surveys should not be held too early (e.g. before annual accounts are normally completed), but not too late either, and the time given to complete questionnaires should be neither too short nor too long. Consistency in the way respondents were urged to cooperate (reminders, visits by field staff, any application of legal procedures) could be improved. Pre-printing of information that is already available from previous surveys was appreciated.
About one quarter of the interviewed companies perceived some form of duplication between different questionnaires. Many thought that better co-operation between different public agencies could substantially reduce the form-filling burden. Finally, many companies appeared to have a positive attitude towards computerisation of data capture, provided this was done intelligently. In later paragraphs of this paper, the possibilities and complications of electronic data capture will be explored in more detail.
Possibilities to reduce the response burden
The interest of the respondents in a reduction of the response burden is obvious: they profit most from a minimum approach to sample size and size of the statistical questionnaire, combined with maximum correspondence between the content of questionnaires and bookkeeping records. It is clear, however, that minimising the response burden is not only in the interest of respondents. It serves the statistician as well: data quality will be higher, response will be quicker and response rates higher, while collection costs fall.
The concept of 'response burden' has two dimensions, a quantitative and a qualitative one. The first refers to cost in terms of time and money and the second to how the response burden is perceived by the respondents. A response burden policy should explicitly involve the latter aspect, because in the end it is perception which, more than workload, determines the willingness of respondents to cooperate.
When considering ways to reduce or minimise response burden, a distinction may be made between on the one hand a corporate strategy, which relates to the totality of business surveys of a national statistical agency, and on the other hand aspects relating to individual surveys. For both points of view a number of issues and policies will be considered.
Corporate survey strategy
A corporate survey strategy assumes the presence of a 'survey control centre' within the NSI, where there is an overview of all business surveys and the domains they cover. In addition, effective knowledge and control require a central database showing, for each individual business, in which surveys it is actually involved. Several instruments for carrying out a corporate strategy, directed towards reduction of the response burden, are briefly discussed below.
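As an illustration only, the central database described above can be sketched as an index from business to surveys. The function names, the business identifiers and the survey names below are hypothetical and do not describe any actual system of Statistics Netherlands.

```python
from collections import defaultdict


def build_participation_index(assignments):
    """Return {business_id: sorted survey names} from (business_id, survey) pairs."""
    index = defaultdict(set)
    for business_id, survey in assignments:
        index[business_id].add(survey)
    return {business: sorted(surveys) for business, surveys in index.items()}


def heavily_surveyed(index, max_surveys):
    """List the businesses that take part in more than max_surveys surveys."""
    return sorted(b for b, surveys in index.items() if len(surveys) > max_surveys)


# Made-up example data: (business id, survey name).
assignments = [
    ("B001", "production"),
    ("B001", "investment"),
    ("B001", "labour cost"),
    ("B002", "production"),
    ("B003", "investment"),
    ("B003", "production"),
]
index = build_participation_index(assignments)
print(index["B001"])
print(heavily_surveyed(index, max_surveys=2))
```

A survey control centre could use such an overview to spot businesses that are approached unusually often before a new survey is added.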
Co-ordination of data collection
Respondents generally dislike receiving questionnaires from different departments within an NSI. Moreover, when surveying departments act more or less independently, there is a risk of overlaps and redundancy in questionnaires. Therefore, individual surveys should be developed within the context of a co-ordinated survey strategy. Such a strategy aims at minimising the number of questionnaires and contact points. For a new survey this means that one first has to consider whether the required information can be collected by incorporating the questions in a survey that already exists. Integration of questionnaires and clustering of surveys not only reduce the (perceived) response burden, but also contribute to the consistency of reported data and thus to the quality of statistics.
Apart from this possibility, the number of contact points for the respondents can be reduced by centralising surveying activities in such a way that one particular respondent only communicates with one department within the statistical agency. Of course, there can be good reasons to deviate from this ideal construction. Still, even when respondents are approached from different parts of the organisation, contacts can be streamlined by appointing an 'account manager' who is responsible for a harmonised approach to a particular respondent or group of respondents.
Co-ordinated delineation of sampling populations
Survey populations are often delineated according to field of economic activity: there are separate production statistics for separate SIC groupings. Care should be taken that a particular unit is not classified in different groupings at the same time. This risk can be reduced by drawing samples for all such surveys from one unequivocal source: a centrally maintained business register. However, a true guarantee against overlaps requires more: the different surveys covering different SIC groupings should apply the same type of statistical unit, as well as a uniform method and moment for selecting their respective sample sub-populations from the comprehensive sampling frame.
Co-ordinated sampling
In areas covered by several simultaneous sample surveys, cumulation of response load for individual respondents can be avoided by applying special sampling methods. Two approaches occur. With rotating samples, a unit drawn in survey A for period t can be excluded for the next n periods; with constant samples, a unit drawn in survey A for period t can be excluded from survey B in the same period, or its selection probability for survey B can be reduced. As the response burden is primarily a matter of perception, it is important that respondents are explicitly informed about the existence and application of such a method. Respondents to sample surveys often feel that they are hit unevenly compared to their competitors. At Statistics Netherlands, a special kind of automated sampling methodology (the so-called EDS-system: Enquête Druk Systeem, in English RBS: Response Burden System) has been developed to spread the response burden as evenly as possible across enterprises. This is achieved by co-ordinated sampling and by attaching accumulated response burden markers to the enterprises in the sampling frame. In other words: the probability that an individual enterprise is sampled for a certain survey is reduced to the extent that this enterprise has already been sampled in previous surveys in the course of a certain period of time. It should be noted that EDS does not reduce the overall amount of response burden and also that, in practice, it only affects small and medium-sized enterprises, because larger enterprises are normally part of all samples.
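The idea of burden markers that lower the selection probability can be sketched as follows. This is not the actual EDS algorithm, whose details are not given here: the weight rule 1 / (1 + penalty * burden), the function names and the example frame are all assumptions made for illustration. The draw itself uses the Efraimidis-Spirakis key method for weighted sampling without replacement.

```python
import random


def selection_weight(burden, penalty=1.0):
    """Hypothetical rule: the weight shrinks as the accumulated burden grows."""
    return 1.0 / (1.0 + penalty * burden)


def draw_sample(frame, n, rng, take_all=()):
    """Draw n units from frame, a dict {unit: accumulated burden marker}.

    Units listed in take_all (e.g. large enterprises) are always included.
    The burden markers of the sampled units are increased in place.
    """
    certain = [u for u in frame if u in take_all]
    candidates = [u for u in frame if u not in take_all]
    k = max(0, n - len(certain))
    # Efraimidis-Spirakis: give each unit the key u ** (1 / weight) with
    # u uniform on [0, 1) and keep the k units with the largest keys.
    ranked = sorted(
        candidates,
        key=lambda u: rng.random() ** (1.0 / selection_weight(frame[u])),
        reverse=True,
    )
    sample = certain + ranked[:k]
    for unit in sample:
        frame[unit] += 1
    return sample


frame = {"large": 5, "small_a": 0, "small_b": 0, "small_c": 3}
sample = draw_sample(frame, n=2, rng=random.Random(1), take_all={"large"})
print(sample, frame)
```

Because the inclusion probabilities now differ between units, any estimator built on such a sample would have to account for them; that step is outside this sketch.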
Survey strategy for individual surveys
Taking into account that the final product of a survey is the sum of the efforts of respondents and statisticians, the survey designer should aim at a design in which the share of the respondent is limited to the necessary minimum. The central issue here is the notion that the intended statistical output does not automatically determine what the respondent's input should be. This is true for all the elements of the final product. Given the contents and the quality standards of the output, a number of measures apply.
Population
This is the most trivial aspect and the message is simple: make samples as small as possible. The use of advanced sampling techniques and high quality sampling frames contributes to this goal. In addition, one should make maximum use of auxiliary information, which may be derived from other surveys or from administrative sources. An important aspect of respondent-friendly behaviour is that the survey strategy is flexible enough to leave room for varying sample sizes over time. For a monthly statistic, for example, one might consider using larger samples every third month and smaller samples or even cut-offs in the months between, while still achieving acceptable quality, e.g. by interpolation.
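The cut-off variant with interpolation can be illustrated with a minimal sketch. It assumes that full-sample estimates exist only for every third month and that the months between are filled in linearly; the function name and the figures are invented, and a real application would normally also use whatever reduced-sample data are available.

```python
def interpolate_between_benchmarks(benchmarks):
    """Fill the months between full-sample months by linear interpolation.

    benchmarks maps a month index to the estimate from a full-sample month.
    Returns a dict covering every month from the first to the last benchmark.
    """
    if not benchmarks:
        raise ValueError("at least one benchmark month is required")
    months = sorted(benchmarks)
    series = {}
    for start, end in zip(months, months[1:]):
        span = end - start
        step_total = benchmarks[end] - benchmarks[start]
        for k in range(span):
            series[start + k] = benchmarks[start] + step_total * k / span
    series[months[-1]] = benchmarks[months[-1]]
    return series


# Invented estimates for months 0, 3 and 6.
series = interpolate_between_benchmarks({0: 100.0, 3: 130.0, 6: 124.0})
print(series)
```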
Statistical units
A first demand in this respect is that the statistical unit is defined in such a way that the respondent can recognise himself as a real transactor in the economy rather than an artificial construction. This can be realised by stressing the requirements of 'autonomy' and 'data availability' in operational definitions of statistical units, while accepting a certain degree of 'heterogeneity'. There will, however, always remain situations where the respondent is not able or willing to report data on the level of the envisaged statistical unit. Then the 'reporting unit' will deviate from the statistical unit. In case of a 1:n relationship the statistician will have to allocate the data reported, while in the inverted case consolidation is necessary.
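The two cases can be sketched in a few lines. The sketch assumes, purely for illustration, that allocation is done in proportion to an auxiliary variable such as the number of employees, and that consolidation sums the reports and removes deliveries between the reporting units; neither rule is prescribed by the text, and all names and figures are hypothetical.

```python
def allocate(reported_total, auxiliary):
    """Split one reported total over several statistical units (1:n case).

    auxiliary maps each statistical unit to an auxiliary value, for example
    the number of employees, used here as the assumed allocation key.
    """
    total_aux = sum(auxiliary.values())
    if total_aux <= 0:
        raise ValueError("auxiliary values must sum to a positive number")
    return {unit: reported_total * value / total_aux
            for unit, value in auxiliary.items()}


def consolidate(reports, internal_flows=0.0):
    """Combine several reports into one statistical unit (n:1 case).

    internal_flows is the value of deliveries between the reporting units,
    which would otherwise be counted twice.
    """
    return sum(reports) - internal_flows


# One reporting unit covering two statistical units with 30 and 10 employees.
shares = allocate(1000.0, {"unit_A": 30, "unit_B": 10})
# Two reporting units forming one statistical unit, with 50 of internal sales.
turnover = consolidate([400.0, 600.0], internal_flows=50.0)
print(shares, turnover)
```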
Variables: concepts and definitions
Whenever statistical concepts deviate from accounting concepts, this should be considered to be a problem for the statistician, not for the respondent. This means that questionnaires should be designed in such a way that they can be completed directly from bookkeeping records, and that it is, again, up to the statistician to bridge the gap between questionnaire concepts and statistical output concepts.
Variables: number and detail
As with sample sizes, it may be justified to vary the contents of questionnaires. Once the 'maximum' questionnaire has been designed, one should seriously consider whether it is necessary to apply it 'full size' for each respondent and each reporting period. For smaller businesses, asking less detail should be considered. Whether it is wise to vary the number and detail of variables over time, e.g. by asking detail only at intervals of three months, depends on the situation. In certain situations, in particular when respondents have taken special measures to generate the requested data automatically from their computer systems, it is better to maintain a constant rhythm.
Couleur locale
When a survey covers distinct SIC-areas, accounting practices and vocabulary may differ among industry branches. This may call for different questionnaires for different groups of respondents.
Introduction and presentation
Response burden is not only determined by the time it takes to answer questions, but also by the time and effort needed to read and understand questions, introductory letters and explanatory notes. These should not only be brief and clear but, above all, applicable. Instructions that barely apply or do not apply at all, and extensive, irrelevant explanations, are very irritating. This is another justification for tailoring questionnaires to homogeneous groups of respondents. SIC-code is an important parameter for identifying such groups, as are data reported in related or previous surveys. Pre-printing of product specifications is an example of the use of the latter.
Two stage sampling
An effective way to avoid overkill is two stage sampling. This requires the conduct of two consecutive surveys. First a few simple questions are put to a large number of enterprises, e.g. whether or not a business carries out research and development activities and, if so, whether the expenditure exceeds a certain threshold. The findings of the first stage are then used to narrow down the target population for the second stage. In addition, the first stage may generate data which can be used as estimators in the second stage, making it possible to reduce the sample size.
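A minimal sketch of the two stages, using the research and development example: the first stage records two yes/no answers per business, and only businesses answering yes to both enter the frame for the second stage. The data layout, the function names and the use of a simple random sample in the second stage are assumptions for illustration.

```python
import random


def screen(first_stage):
    """Return the businesses that qualify for the second stage.

    first_stage maps a business id to a pair of answers:
    (carries out R&D, expenditure exceeds the threshold).
    """
    return sorted(business
                  for business, (does_rd, above_threshold) in first_stage.items()
                  if does_rd and above_threshold)


def draw_second_stage(eligible, n, rng):
    """Draw a simple random sample of at most n eligible businesses."""
    return sorted(rng.sample(eligible, min(n, len(eligible))))


# Invented first-stage answers.
first_stage = {
    "B1": (False, False),
    "B2": (True, True),
    "B3": (True, False),
    "B4": (True, True),
    "B5": (True, True),
}
eligible = screen(first_stage)
second_stage = draw_second_stage(eligible, n=2, rng=random.Random(0))
print(eligible, second_stage)
```

Only the businesses in the second-stage sample receive the detailed questionnaire; the others have answered two short questions and are not approached again.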
Feedback of results
Providing respondents with some statistical results of the survey is a measure which, of course, only affects perception of burden. The information to be given should be relevant and readable and must therefore be carefully edited. It makes sense to test whether the statistics are of real interest to the businessman. If not, the effect on his perception may well be negative, instead of positive.
Measurement of the response burden
Although, in a quantitative sense, this increases the response burden, adding a question about the time and cost required to complete the questionnaire may have a positive effect on the perception of the response burden. If properly introduced, respondents may consider such a question as an indication that the NSI cares about their burden and is keen on doing something about it. In addition, businesses that report a disproportionate burden may be counselled on how to reduce the completion time.
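How 'disproportionate' might be made operational is not specified above. The sketch below assumes, for illustration only, that a business is flagged when its reported completion time exceeds a chosen multiple of the median for the same questionnaire; the factor of 3, the function name and the figures are hypothetical.

```python
from statistics import median


def flag_disproportionate(minutes_by_business, factor=3.0):
    """Return businesses whose reported time exceeds factor times the median."""
    typical = median(minutes_by_business.values())
    return sorted(business
                  for business, minutes in minutes_by_business.items()
                  if minutes > factor * typical)


# Invented completion times in minutes for one questionnaire.
reported = {"B1": 30, "B2": 40, "B3": 35, "B4": 200}
flagged = flag_disproportionate(reported)
print(flagged)
```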