Confidentiality Protocol

STATISTICS NEW ZEALAND

Contents

1 Introduction
2 Legal obligations and policy
3 Restricting use of information to statistical purposes
4 Protecting confidential information
5 Rules for avoiding disclosure of confidential information in outputs
6 Microdata
7 Integrated data

Appendix A How Disclosures can Occur
Appendix B Aggregated Data Release Rules
Appendix C Microdata Release Rules


Other related documents

1 United Nations Fundamental Principles of Statistics
2 Privacy Principles
3 Confidentiality Principles of the Statistics Act 1975
4 Publishing by Consent under Section 37(4)(a) of the Statistics Act
5 Release of Business Frame Unit Record Data to Outside Organisations
6 Meat and Wool Board policy (to come)
7 Archiving Policy (draft)
8 Protocol for Use of IRD Data
9 Standard Confidentiality Statement (to come)
10 Security for Completed Questionnaires
11 Incident Management and Disaster Recovery (to come)
12 Retention and Destruction of Survey Records
13 Data Custodianship
14 Protocols for Official Statistics
15 Microdata Access Protocols
16 Data Integration Protocols (draft)
17 Population Census Record Linkage


1 INTRODUCTION

1 Statistics New Zealand relies on the co-operation and goodwill of New Zealanders and New Zealand businesses for its job of producing official statistics, as they provide the information which is the foundation of our outputs. One important way of maintaining such co-operation and goodwill is by protecting the confidentiality of information collected from respondents through ensuring that identifiable information is securely maintained, is only used for statistical purposes and is not revealed in any outputs.

2 Confidentiality of information obtained from respondents is protected by staff following a code of practice which sets out how such information is to be protected. This code of practice applies to the custody of information obtained from other agencies, as well as that obtained through Statistics New Zealand's own statistical collections.

3 The code is based on legal obligations under the Statistics Act, as well as operational and ethical requirements. It is also consistent with the requirement of the United Nations Fundamental Principles of Official Statistics that individual data collected by statistical agencies for statistical compilation are to be strictly confidential and used exclusively for statistical purposes. Applying the code of practice consistently in all aspects of the work of Statistics New Zealand leads to trust in the integrity of its statistical processes.

4 Statistics New Zealand follows the code of practice by ensuring that:
- respondents and staff know of the guarantees provided to respondents by the Statistics Act;
- all guarantees given to respondents and clients are met;
- the confidence in the organisation held by independent guardians of the public interest (such as The Ombudsman and the Privacy Commissioner) is maintained;
- the distinction between statistical activities and regulatory activities such as those administered by the Department of Social Welfare and Inland Revenue Department is always reinforced;
- it is open about how it handles rare situations;
- its commitment to other laws on confidentiality and privacy is unqualified;
- it advises against putting into law any exceptions which conflict with well-established confidentiality protection practices;
- those affected are consulted when solutions are being sought to difficult situations involving confidentiality of statistical records;
- other confidentiality, privacy and security matters in Statistics New Zealand are consistent with how it handles statistical data; and
- any perceived breach of confidentiality is handled with fair and open enquiry, and our responses recognise the seriousness with which confidentiality protection is maintained.

5 Maintaining the confidentiality of respondent information is an important element in addressing privacy concerns. All statistical collections represent a degree of intrusion into people's lives and business activities. Privacy concerns should be addressed by ensuring that the need for any information is justified in terms of use, that the Privacy Principles as applied to statistical information are followed and that confidentiality undertakings are honoured.

6 Conflicting with the desire to protect the confidentiality of the respondent information is the need to obtain maximum statistical use of data collected at considerable expense and respondent effort. Users of data often require detailed information for their studies. The more detailed the data requested the more difficult it becomes to ensure the confidentiality of respondent information is preserved. In providing useful statistical information Statistics New Zealand must ensure that the responses of individuals are protected.


2 LEGAL OBLIGATIONS AND POLICY

7 The Statistics Act 1975 provides the legal authority for the collection of information from persons and businesses. The Act also requires that information so collected be maintained securely and only used in ways prescribed by the Act, with a guarantee of confidentiality. It is recommended that those dealing with the collection, processing and release of information from Statistics New Zealand's statistical collections familiarise themselves with the Act, particularly Sections 21 (Declaration of Secrecy), 37 (Security of Information), 38 (Information is Privileged) and 40 (Offences and Penalties). The key elements are:

37(1) Information furnished to the department under the Statistics Act shall only be used for statistical purposes.

21(1) Every employee of Statistics New Zealand, before entering on his duties, shall take and subscribe a statutory declaration of secrecy.

37(2) No person other than an employee of Statistics New Zealand who has made the Declaration of Secrecy specified in s.21 of the Statistics Act shall be permitted to see any individual schedule or any answer to any question put under the Statistics Act.

37E The department shall take such steps as are necessary to ensure that the security provisions of the Act are complied with in respect of information from individual schedules that is copied or recorded for the processing, storage or reproduction of particulars.

40(c) Every person employed in the execution of any duty or the exercise of any power or function under this Act commits an offence who knowingly fails to keep inviolate the secrecy of the information gathered or entered on the schedules collected pursuant to this Act and, except as allowed in this Act, divulges the contents of any schedule filled in or any information furnished to the department under this Act.

37(3) No information contained in any individual schedule and no answer to any question put for the purposes of this Act shall be separately published or disclosed to any undertaking or to any person not being an employee of the department who has made the statutory declaration of secrecy specified in s.21.

37(4) All statistical information published by the department shall be arranged in such a manner as to prevent any particulars published from being identifiable by any person (other than the person by whom those particulars were supplied) as particulars relating to any particular person or undertaking.
The full requirements of the Statistics Act, and explicit situations when information might be released, are detailed in this document.

8 The Government Statistician makes office rules to implement the security requirements of the Statistics Act. The rules prevent the release of identified information (i.e. information with names and addresses), and ensure that all reasonable steps are taken to avoid the release of identifiable information (i.e. information which can be indirectly used to identify the provider) in outputs. They are set out in Section 5 and Appendix B.

9 There are, however, some explicit situations where information on individuals or businesses might be released or provided for access outside Statistics New Zealand. These are tightly prescribed by the Statistics Act to cover such situations as:
- the individual concerned approving disclosure, outlined in Section 37(4)(a) or Section 37A(a);
- the information being collected as a joint survey with another government department, local authority or statutory body, and the individual concerned having no objection to the sharing of information with that other agency;
- access to individual information, without names and addresses, by another government department, under specified conditions, for bona fide research or statistical purposes pursuant to the functions and duties of that department; or
- the provision of names and addresses of farmers to the Meat and Wool Boards for electoral purposes (policy to come).
When access is provided outside Statistics New Zealand, the security provisions of the Statistics Act pertain and no identifiable information can be released by the recipients.

10 Furthermore, under Section 37A(b)-(f) the Government Statistician has the discretion to release certain types of information, such as that already publicly available, overseas trade information, and some business register-type information on names and addresses of businesses. The current policy is not to release names and addresses of businesses other than for the purpose of conducting major statistical surveys of national importance; see the Business Frame policy.

11 The use of the exception provisions requires specific authorisation by the Government Statistician.

12 In summary, the Statistics Act requires that:
- information collected under the Statistics Act is not to be used for other than statistical purposes;
- information collected under the Statistics Act is to be maintained securely; and
- identifiable information is not to be published or otherwise disclosed.


3 RESTRICTING USE OF INFORMATION TO STATISTICAL PURPOSES

13 Information provided by respondents is normally coded and captured electronically for aggregation into summary statistics that are made generally available. Identified or identifiable individual forms or electronic records are potentially of interest beyond the statistical purpose for which they have been supplied in confidence by respondents. To maintain respondent trust, consistent with the requirements of the Statistics Act, any access to individual forms with name and address removed (called microdata), in addition to the purpose of production of official statistics, is restricted to government departments for research or statistical purposes, and is at the discretion of the Government Statistician - see Section 6 on Microdata.

14 The Statistics Act, however, specifies two circumstances when individual information can be used for other than statistical purposes. These are:
1. names and addresses of farmers and wool growers may be supplied to the Meat and Wool Boards respectively for electoral purposes (policy to come); and
2. for historical research, if archived in accordance with Section 37D.

15 Information provided under the Act is privileged and cannot be used in a court other than for prosecutions under the Statistics Act. In addition, the Official Information Act and the Privacy Act grant access to the Ombudsman in limited circumstances. The Government Statistician is to approve all such access to information.


4 PROTECTING CONFIDENTIAL INFORMATION

16 Information provided by respondents should be kept secure, access to information should be restricted to employees of Statistics New Zealand who need such access in order to perform their official duties, and all releases of statistics should avoid disclosing identifiable information.

17 The following practices must be followed to protect confidential information:
Every employee of Statistics New Zealand must sign a statutory Declaration of Secrecy.

18 All employees of Statistics New Zealand, regardless of their duties and status of employment, must sign a statutory Declaration of Secrecy on commencement of employment with Statistics New Zealand. This includes contractors employed by Statistics New Zealand. The declaration of secrecy requires that employees do not disclose information acquired during their employment in Statistics New Zealand. The declaration continues to apply after ceasing employment.
Employees of another government agency granted access to microdata or forms under provisions of the Statistics Act must sign a statutory Declaration of Secrecy.

19 In exceptional cases specified in the Act and authorised by the Government Statistician, employees and contractors working for other government departments are able to access microdata for bona fide research or statistical purposes - see Section 6. In addition, forms from a joint collection conducted with another government department, local authority or statutory body may be accessed by employees of such agencies for statistical purposes.

20 All employees of and contractors to other agencies granted access to microdata or forms in these circumstances are required to sign a Declaration of Secrecy. For the purpose of the security provisions of the Act, such persons are considered to be employees of Statistics New Zealand.

21 Only persons approved to have access and who have made the Declaration of Secrecy specified in s.21 of the Statistics Act shall be permitted to see any unidentified microdata.
Access to identified individual records of information supplied by respondents should be restricted to those staff who need such access for the production of statistics.

22 Access to identified information, whether in paper or electronic form, should be controlled and restricted to employees of Statistics New Zealand who need such access in order to perform their official duties. Access lists should be kept up to date.

23 Records obtained from other departments require particular care as identifiers tend to be kept with sensitive information. Access should be restricted to employees of Statistics New Zealand who need such access in order to perform their official duties. For Inland Revenue Department data, approved staff are required to sign an IRD Declaration of Fidelity and Secrecy. All such information should be securely stored in the same way as other Statistics New Zealand confidential information. See the Protocol for Use of IRD Data.
It is important that public statements about confidentiality are simple, clear, accurate and consistent.

24 There is a standard confidentiality statement which should be used on all self-administered forms.
Staff involved with the collection and processing of forms must maintain the security of forms and be careful in dealings with respondents not to divulge information collected in confidence from other respondents.

25 Safeguards should be in place in work areas to ensure the security of forms during collection, transport, processing, storage and destruction. Particular care should be taken when querying respondent data as part of editing, and when supplying past data to assist respondents with the supply of current data. The use of faxes and email requires special security arrangements - see documents about the security of completed questionnaires.

26 Also see the department's Incident Management and Disaster Recovery policy (to come).
Names and addresses, which provide easy identification of respondents, should not be kept with response data.

27 Statistics New Zealand does not keep names and addresses except as frame and sample information for the purpose of conducting collections (e.g. Business Frame, proposed Address Register). Names and addresses collected in surveys of businesses are to be used only for collection control and editing purposes, and to update the Business Frame. Once quality checks have been completed, names and addresses are to be separated and not stored electronically along with the information supplied. Use of Business Register numbers will allow for any further analysis such as later quality checks or longitudinal analyses.

28 For persons and households, names and addresses obtained on forms are not recorded electronically along with information supplied beyond their use in processing and quality checking, except for any approved archiving. Once they have been used for the purpose of the collection (e.g. to facilitate contact, for editing and coding) and statistical information captured and quality checked, names and addresses should be destroyed.
Collection forms should be destroyed after processing.

29 Once required data have been extracted, and forms are no longer required for verification, editing, quality studies or archiving, the forms are to be destroyed in a secure manner. Exceptions, such as samples kept for quality studies, must be authorised by the Government Statistician. More information is available in the policy on the retention and destruction of survey records, for household surveys and business surveys.
Extreme care must be taken with respondent identifiers.

30 For the purpose of respondent liaison or quality checking most survey units will have an internal respondent identifier. Often these are attached to unit record files. While many of the identifiers have no real structure and thus cannot in themselves be used to identify the respondent, the combination of Statistics New Zealand respondent identifiers and their name and address must be carefully protected.

31 The data custodian for each collection should ensure, amongst other duties, that individual information is appropriately secured, in accordance with these protocols. See the policy on Data Custodianship.
Unit record data must not be removed from Statistics New Zealand premises except if approved by the Government Statistician.

32 Unit record data and identified information such as forms must remain on Statistics New Zealand premises. The only exceptions are for transportation between Statistics New Zealand offices and for destruction under security, and for the use of records, without names and addresses, by other government departments for research or statistical purposes as approved by the Government Statistician (see Section 6).
All releases of statistics must avoid disclosing identifiable information. A plan must be developed for each collection and followed to avoid disclosing identifiable information.

33 The Act requires the Government Statistician to make office rules considered necessary to avoid the release of identifiable information in outputs, other than where exceptions of the Act are relevant. These rules are outlined in Section 5 and Appendix B. The rules should be the basis of the release plans to be developed for each collection and modified if necessary to ensure identifiable information is not disclosed in outputs. The plans are the responsibility of the relevant output manager.


5 RULES FOR AVOIDING DISCLOSURE OF CONFIDENTIAL INFORMATION IN OUTPUTS

34 In publishing statistics, Statistics New Zealand must take all reasonable steps to ensure that particulars relating to an individual business, household or person are neither divulged in, nor able to be deduced from, published statistics. The Government Statistician is required to implement 'office rules' to prevent the disclosure of individual information in published statistics. These rules should also be followed by persons publishing output produced from access to microdata in accordance with the Statistics Act. For an explanation of how disclosure can occur in outputs, see Appendix A.
Except for information approved for disclosure in accordance with the Statistics Act and relevant departmental policy, the minimum standard outlined in Appendix B must be applied to all statistical tables published and disseminated by Statistics New Zealand to avoid the possible disclosure of identifiable information.

35 Because of their different characteristics, which impact on the likelihood of releasing identifiable information, separate rules are to be applied to population census data, sample data relating to households and persons, and business data. Additional rules may be needed for some outputs such as those created from integrated datasets. The rules are set out in Appendix B.

36 Note that in the case of household sample survey data, minimum cell-size standards are set to avoid releasing unreliable data (i.e. with high sampling error) rather than being needed to prevent the release of identifiable data.
All cells which are defined as likely to disclose identifiable information must be suppressed or altered in any release of output.

37 The following basic techniques can be applied to outputs likely to contain cells which should be kept confidential:

a) slightly altering outputs so results from analysis are insignificantly affected yet the original values cannot be known with certainty (e.g. random rounding). This method is usually adopted for count data such as for the population census.
b) limiting the detail available (e.g. collapsing detail in classifications, combining cells).
c) suppressing information (e.g. cell suppression, non-release of data).
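
As an illustration of technique (a), the following is a minimal Python sketch of unbiased random rounding of counts. The rounding base of 3, the function name random_round and the seeded generator are assumptions made for this example only; they are not taken from this protocol or from the rules in Appendix B.

```python
import random

def random_round(count, base=3, rng=random):
    """Randomly round a non-negative integer count to a multiple of `base`.

    Counts already on a multiple of the base are left unchanged. Any other
    count is rounded up with probability remainder/base and down with
    probability 1 - remainder/base, so the expected value of the rounded
    count equals the original count.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base   # round up to the next multiple
    return lower              # round down to the previous multiple

# Illustrative use with a seeded generator (hypothetical counts).
rng = random.Random(0)
originals = [0, 1, 2, 3, 7, 100]
rounded = [random_round(c, base=3, rng=rng) for c in originals]
```

Each published count is then a multiple of the base and differs from the true count by less than the base, so the original value cannot be known with certainty while totals remain approximately correct.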

38 Cell suppression should not be confined to the primary cells (which failed the confidentiality rules) but should be applied to enough other cells in the particular table to ensure that the suppressed primary cell cannot be derived by subtraction. This is known as secondary cell suppression but is also referred to as complementary or consequential cell suppression. In particular, when cells in tables of business survey magnitude data are identified for suppression by the (n,k) concentration rule, secondary suppression should be applied to other cells as is necessary to avoid disclosing confidential information.
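
The check that a suppressed cell cannot be derived by subtraction can be sketched as follows. This is a minimal Python illustration, assuming a two-way table whose row and column totals are all published; the function name recoverable_cells and the example patterns are hypothetical. It tests only for exact recovery by subtraction and does not assess how narrowly a suppressed value could be bounded.

```python
def recoverable_cells(suppressed):
    """Return the suppressed cells that can be derived by subtraction.

    `suppressed` is a list of rows of booleans (True = cell suppressed).
    Row and column totals are assumed to be published. A suppressed cell
    is recoverable when it is the only unknown left in its row or in its
    column. A recovered cell counts as known, so the check is repeated
    until no further cells can be derived.
    """
    unknown = {(r, c)
               for r, row in enumerate(suppressed)
               for c, flag in enumerate(row) if flag}
    recovered = set()
    changed = True
    while changed:
        changed = False
        for cell in sorted(unknown):
            r, c = cell
            in_row = sum(1 for u in unknown if u[0] == r)
            in_col = sum(1 for u in unknown if u[1] == c)
            if in_row == 1 or in_col == 1:
                unknown.discard(cell)
                recovered.add(cell)
                changed = True
    return recovered

# Primary suppression only: the single suppressed cell is recoverable.
primary_only = [[False, False, False],
                [False, True,  False],
                [False, False, False]]

# A 'square' of four suppressions: nothing is recoverable by subtraction.
square = [[True,  True,  False],
          [True,  True,  False],
          [False, False, False]]

exposed_primary = recoverable_cells(primary_only)
exposed_square = recoverable_cells(square)
```

An empty result means that no suppressed cell can be recalculated exactly from the published totals; a non-empty result lists the cells that need further secondary suppression.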

39 Secondary cell suppression is not required for cells that are suppressed for reliability, rather than confidentiality, reasons. Cell suppressions in household sample survey data are usually done for reliability reasons.
Any derived outputs (e.g. averages, percentages, rates, movements) must be based on confidentialised data even if not released in conjunction with the data.

40 It is possible for derived outputs that are based on unconfidentialised data to provide a disclosure. Also, if a cell in a magnitude table is suppressed then no derived outputs can use that cell value.
Statistics New Zealand confidentiality rules apply to all data collected by Statistics New Zealand under the Statistics Act, as well as data from other organisations (e.g. vitals data from Internal Affairs, tax data from the Inland Revenue Department) supplied to Statistics New Zealand for purposes of producing statistical outputs. Any confidentiality rules specified by the supplying organisation must also be followed.

41 Statistics New Zealand rules apply to all identifiable information supplied to the Department to ensure that the excellent confidentiality reputation of Statistics New Zealand is maintained. Data supplied to us is done on the basis of trust. We must honour any undertakings made by supplying organisations as well as those given by us.
Statistics New Zealand applies its confidentiality protection to all data of any age.

42 The Statistics Act and undertakings made to respondents require information provided by them to be protected for the life of the data. Rules for protecting the release of identifiable data in outputs must be followed for all data regardless of age, but with due regard to any reduction in likelihood of identification as the data ages.
If there is any uncertainty as to the likelihood of disclosure of identifiable information in data to be released then it should be withheld or further confidentialised.

43 Confidentiality protection is a fundamental component of Statistics New Zealand’s output methodology. Where any uncertainty exists about the legality of a particular release practice, an "abundance of caution" philosophy should be applied and the statistics not released until advice has been obtained from the Analytical Support Division or the Policy and Planning Division.
The most current confidentiality recommendations must be applied to all data releases.

44 Techniques for confidentiality protection change over time in response to requests for greater amounts of data and the computer power available to our users. Methods of release or access may change. Therefore, from time to time, the confidentiality protocols and protection techniques will be reviewed and may be altered. This may mean that data that was previously available is now regarded as possibly allowing identification and hence further such release should cease. To ensure consistency in all outputs at any one time, and to prevent the inadvertent release of identifiable data under an old protocol, all data releases must conform to the current confidentiality protocols.
The name or address of any person or business selected in a sample survey should not be released, nor should details of which areas are sampled.

45 Releasing details of who is in a sample is an infringement of privacy as well as increasing the likelihood of identifiable information being released. For area-based surveys, releasing information on specific areas selected in a sample can increase the likelihood of disclosure of confidential information; however, this must be balanced against the need to provide information to support field work.
The parameters in any algorithm used to suppress identifiable information should not be released.

46 Knowledge of the parameters (i.e. the n and k of the (n,k) rule) can assist attempts to identify information in outputs, and details should not be published.
Analytical Support Division will provide expert advice and judgement on matters of disclosure protection.

47 The responsibility for ensuring identifiable information is not released in outputs rests with output managers. Analytical Support Division is available to provide advice and assistance in the application of the confidentiality protection procedures outlined in this document. The Division is constantly reviewing overseas practices and procedures for effective and efficient techniques for protecting confidential information in outputs while trying to provide the maximum detail required by users. Any matter of doubt should be referred to the Division for their advice.


6 MICRODATA

48 The Statistics Act does not allow for microdata (unit record or low level aggregate data) to be released for public use. However, microdata can be accessed for statistical purposes in several ways.

49 First, Section 37C of the Statistics Act allows for access to information from individual questionnaires by government departments for bona fide research or statistical purposes under certain conditions. The key conditions are:

a) names and addresses must be removed;
b) it must be for research or statistical purposes pursuant to the functions of the department and any published results must not divulge any information which the Statistician could not divulge;
c) it can be provided to employees of government departments and their contractors only, and they are responsible for the security of the data while it is in their custody; and
d) a declaration of secrecy must be made by all persons who have access to the data.

50 This provision is enabling and there is no obligation on the Government Statistician to provide access to microdata to requesting departments. The decision on whether or not to provide access, and in what manner, will depend on a number of factors such as public acceptability (e.g. would there be public concern if information is provided to a regulatory section of an agency), as well as the reputation of the agency itself, particularly on security and privacy matters. Where there may be special privacy or confidentiality concerns access would be restricted to use in the Data Laboratory on Statistics New Zealand premises.

51 In addition, it is the policy of the Government Statistician that a request for access to microdata by a government department not abiding by the Protocols for Official Statistics will be provided only through Statistics New Zealand's Data Laboratory. Access to microdata from some surveys such as the Population Census will be provided only through the Data Laboratory because of the special guarantees that are provided at the time of collection.

52 Researchers contracted to government departments can access data through the sponsoring agency if suitable arrangements can be made with that agency. Microdata provided to a government department must not leave the custody of the department without the consent of Statistics New Zealand. The researchers would also be required to make a declaration of secrecy. When appropriate, arrangements will be made for them to access the data via the Statistics New Zealand Data Laboratory.

53 The Government Statistician may also allow independent researchers to use microdata if they are working on a statistical research project that is considered by Statistics New Zealand to provide significant public benefit. This would normally be provided through the Data Laboratory on Statistics New Zealand premises. If approved, they would be engaged as an independent contractor to the Department, and would make the required declaration of secrecy. The criteria for access to microdata are set out in the Microdata Access Protocols.

54 The decisions on whether and how to provide access to microdata should be considered on a case-by-case basis based on the criteria in the Microdata Access protocols. Decisions on individual proposals must be made by the Government Statistician. Ways should be sought to help achieve access, unless there could be a risk from an adverse public reaction to the particular use of the data.

55 See Appendix C for guidelines for reducing the risk of identification in microdata.


7 INTEGRATED DATA

56 Statistical analysis of integrated datasets is possible; however, any matching or linking of datasets must be undertaken by Statistics New Zealand and the resulting datasets should not leave Statistics New Zealand. See the Data Integration protocol for the process that is required to integrate datasets (new policy to come - draft). Access to the integrated datasets can be provided to approved researchers in the Data Laboratory consistent with the policy on access to microdata (see Section 6).

57 To maintain widespread trust and co-operation with the population census, special rules apply for any integration of that dataset with other microdata. See the policy on Population Census Record Linkage.
Note - Attachments to these appendices give the extra detail required for the application and justification of the standard confidentiality rules. See for a directory of the attachments.


Appendix A. How Disclosures can Occur

Adapted from Willenborg and de Waal (1)
(For a more detailed, technical discussion of how disclosures can happen, refer to ).

Tables contain aggregated data in cells. Because they contain aggregated data, and not data on individual entities such as persons, households or enterprises, it may appear that no information about individual entities can be revealed. But consider the following table:




At first sight Table 1 seems perfectly acceptable for publication. It contains only aggregated data. But suppose that there is only one enterprise in Region B and Industry 2. Then Table 1 reveals that the turnover of this enterprise is 15 million dollars. So, Table 1 can't be published if we want to protect the turnover of the only enterprise in Region B and Industry 2.

So - tables with cells that contain the data of only one entity should not be published.

Suppose that there are not one but two enterprises in Region B and Industry 2. In this case each of the two enterprises can disclose the turnover of the other by subtracting its own turnover from the cell value in Table 1.

So - tables with cells that contain the data of two or fewer entities should not be published.

Sometimes it might be judged necessary to set the minimum number of contributors to a cell even higher than three, to protect against 'coalitions' of, say, two contributors combining to disclose the value of the third contributor.
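
The subtraction behind these small-cell disclosures can be sketched in a few lines of Python. This is only an illustration: the function names and the individual turnovers of 9 and 4 million dollars are hypothetical, and the default minimum of three contributors simply restates the rule above.

```python
def residual_value(cell_total, known_contributions):
    """Value left in a cell once the known contributions are subtracted."""
    return cell_total - sum(known_contributions)


def has_enough_contributors(n_contributors, minimum=3):
    """True if the cell meets the minimum number of contributors."""
    return n_contributors >= minimum


# Two enterprises share the published cell total of 15 (million dollars).
# Hypothetical: one of them knows that its own turnover is 9.
other_turnover = residual_value(15, [9])

# Coalition case: with three contributors, two acting together
# (hypothetical turnovers 9 and 4) can work out the third the same way.
third_turnover = residual_value(15, [9, 4])
```

Raising `minimum` above three is the code equivalent of guarding against larger coalitions.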

Suppose that there are actually 10 enterprises in Region B and Industry 2, and that one of these enterprises contributes 95% of the total value of the cell. In this case anyone who knows that its contribution is high can make a reliable estimate of the turnover of this enterprise. So - a cell should not be dominated by the contribution of an individual entity. As before, we can extend this and require that the cell should not be dominated by the combined contribution of two or more entities, so that none of them can estimate the contribution of the remaining contributors too accurately.
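
To see how tight such an estimate can be, the following Python sketch computes the range in which a dominant contributor's value must lie. It assumes only that the contributor accounts for at least a given share of the published cell total; the function name is hypothetical. For the example above (a cell value of 15 million dollars and a share of at least 95%), the range runs from 14.25 to 15 million dollars.

```python
def dominant_value_bounds(cell_total, min_share_percent):
    """Range in which the dominant contributor's value must lie.

    The contributor cannot exceed the cell total and, by assumption,
    accounts for at least min_share_percent of it.
    """
    lower = cell_total * min_share_percent / 100
    return lower, cell_total


lower, upper = dominant_value_bounds(15, 95)

# Width of the range relative to the published total, in percent.
uncertainty_percent = (upper - lower) / upper * 100
```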

A common way of protecting a table with sensitive cells is to delete all the values of sensitive cells. This is called primary cell suppression. Suppose that the total turnover of enterprises in Region B and Industry 2 is the only cell value that can't be published in Table 1. In that case we can delete this value from the table to get Table 2:




But it is easy to see that Table 2 is not protected at all. The suppressed cell can easily be calculated by using the marginal totals to get 93 - 31 - 47 = 49 - 33 - 1 = 15. So, when marginal totals (i.e. row or column totals) are given in a table, additional cell values will usually have to be suppressed to prevent recalculation. This is called secondary cell suppression (2). For each primary suppression in a table with both row and column marginal totals, three other suppressions are required to protect it from recalculation (a 'square' of suppressions is required). So, one way of preventing the recalculation of the suppressed cell is shown in Table 3:
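
The recalculation described in this paragraph, and the reason a 'square' of suppressions resists it, can be sketched in Python as follows. The function names, the 3 x 3 table shape and the choice of which extra cells to suppress are assumptions made for illustration; only the figures 93, 31, 47, 49, 33 and 1 come from the text.

```python
def recover_suppressed_cell(marginal_total, published_cells):
    """Recompute a single suppressed cell from a published marginal total."""
    return marginal_total - sum(published_cells)


def exposed_lines(suppressed):
    """Find rows and columns that contain exactly one suppressed cell.

    `suppressed` is a list of rows of booleans (True = suppressed). Any row
    or column returned here can be recalculated by simple subtraction when
    the marginal totals are published.
    """
    rows = [i for i, row in enumerate(suppressed) if sum(row) == 1]
    cols = [j for j, col in enumerate(zip(*suppressed)) if sum(col) == 1]
    return rows, cols


# The two recalculations quoted in the text.
from_one_margin = recover_suppressed_cell(93, [31, 47])
from_other_margin = recover_suppressed_cell(49, [33, 1])

# A lone primary suppression is exposed in both its row and its column.
primary_only = [[False, False, False],
                [False, True, False],
                [False, False, False]]

# A 'square' of four suppressions (hypothetical choice of cells).
square = [[True, True, False],
          [True, True, False],
          [False, False, False]]
```

Here `exposed_lines(primary_only)` reports the row and column of the lone suppression, while `exposed_lines(square)` reports nothing. Passing this check is a necessary condition only: it rules out exact recalculation by one subtraction, not every way of narrowing down a suppressed value.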




Instead of suppressing cells in tables one could also consider redesigning the table, i.e. by combining rows and/or columns. In the example being considered, one could protect the cell corresponding to Region B and Industry 2 by combining Regions B and C, or Industries 1 and 2 - whatever produces a safe table. Table 4 shows the result of combining Regions B and C.




_______________________________

1. Willenborg L and de Waal T (1996), Statistical Disclosure Control in Practice, Springer-Verlag, New York.
2. Secondary cell suppression is also known as 'consequential' or 'complementary' cell suppression.


Appendix B. Aggregated Data (1) Rules

There are different levels of risk corresponding to different types of sample designs. Census data is the most at risk, followed by business sample survey data (due to full-coverage strata for large businesses), then household sample survey data. Different rules apply to these three cases.

Population census data

Small Domain Release policy: Except for simple population counts, tables cannot be released if either
the population they correspond to is less than 100, or
the average cell size is 4 or less.

All released census data, other than simple population counts, must be random rounded to base 3.
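
A minimal Python sketch of these census rules follows. The protocol does not spell out the rounding probabilities, so the sketch assumes the common unbiased scheme in which a count moves to the nearer multiple of 3 with probability 2/3 and to the farther one with probability 1/3. It also assumes that 'average cell size' means the mean of the table's cells. The function names are hypothetical.

```python
import random


def random_round_base3(count, rng=random):
    """Randomly round a non-negative count to a multiple of 3.

    Assumed scheme (unbiased random rounding): multiples of 3 are unchanged;
    otherwise the count moves to the nearer multiple with probability 2/3
    and to the farther one with probability 1/3.
    """
    remainder = count % 3
    if remainder == 0:
        return count
    lower = count - remainder
    # Round up with probability remainder/3, so the expected value is `count`.
    if rng.random() < remainder / 3:
        return lower + 3
    return lower


def passes_small_domain_policy(population, cell_counts):
    """Check the two small-domain conditions for a census table.

    Assumes 'average cell size' is the mean of the supplied cell counts.
    """
    average_cell_size = sum(cell_counts) / len(cell_counts)
    return population >= 100 and average_cell_size > 4
```

Because each rounded value is drawn at random, rounded cells will not always add up to a rounded total; that is a property of the technique rather than a defect of this sketch.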


Business surveys

Business sample surveys usually have full coverage strata, and so sampling alone cannot be relied on for confidentiality protection because common classification variables such as industry or region may allow the identification of prominent businesses. So the confidentiality rules required for business sample surveys are more stringent:

count data: apply random rounding or suppress cells with fewer than 3 contributors (2)

magnitude data: apply the (n,k) concentration rule. This rule specifies that a cell should be suppressed if n or fewer respondents contribute k% or more of the cell total. The current parameters for this rule are available here, and should be kept confidential.
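
The (n,k) concentration rule can be sketched in Python as below. The actual parameters are confidential and are not reproduced here: the values n = 1 and k = 90 are purely illustrative, as are the function name and the contribution figures. The sketch assumes that all contributions are non-negative, so that checking the n largest contributors is enough.

```python
def violates_nk_rule(contributions, n, k):
    """True if the n largest contributions make up k% or more of the cell."""
    total = sum(contributions)
    if total <= 0:
        return False  # nothing to dominate in an empty or zero cell
    top_n = sum(sorted(contributions, reverse=True)[:n])
    return top_n * 100 >= k * total


# Ten contributors, one of which supplies 95% of the cell total
# (hypothetical figures: 950 out of 1000).
dominated_cell = [950, 10, 10, 10, 5, 5, 4, 3, 2, 1]

# Ten equal contributors.
balanced_cell = [100] * 10

# Illustrative parameters only; the real n and k are confidential.
suppress_dominated = violates_nk_rule(dominated_cell, n=1, k=90)
suppress_balanced = violates_nk_rule(balanced_cell, n=1, k=90)
```

With these illustrative parameters the dominated cell would be suppressed and the balanced cell would not.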

Household surveys

Protection is generally provided by the samples used as there are no full coverage strata for large units. To provide additional protection:
unweighted data should not be released, and sample weights should not be released
data custodians must ensure that spontaneous recognition of unusual individuals cannot occur, by ensuring that there are no tables released containing cross-tabulations corresponding to known population-uniques

Reliability Rules

In addition to confidentiality rules, reliability rules are usually applied to the output from sample surveys. These tend to specify a minimum cell size. For household sample surveys this minimum cell size is usually 10, but sometimes the reliability rules will specify a maximum relative sampling error (RSE) instead of a minimum cell size. Reliability rules are applied differently than confidentiality rules, in that secondary cell suppressions are not required to prevent recalculation of the suppressed value. See for a discussion of this.

The Survey Methods division should be consulted about the appropriate reliability rules.

Accuracy Rules
Data may be rounded to reflect the accuracy of collection and to aid ease of reading. For example, export volumes may be expressed in millions of tonnes. This is different from random rounding, which is motivated by confidentiality reasons, and is a method for adding a small amount of 'noise' to the data to protect individual responses.

Derived Data
Derived data from any type of survey (i.e. census, business or household) must be based on confidentialised data. See for some examples of why data derived from unconfidentialised data can pose a disclosure risk.

Trade Data
Currently, trade data is not confidentialised, unless a specific request to do so is received from the respondents. If a request is received, an (n,k) rule is applied to determine the confidentiality risk and, if the risk is significant, the data is suppressed for three months (and corresponding totals are deflated by the suppressed amount during the three month period). These rules are currently under review (see - no conclusions yet, but the document will be updated once decisions have been reached).
______________________________

1. Aggregated data is also known as 'output data', 'summary data', 'macrodata' or 'tabular data'.
2. Secondary suppression must also be performed to protect the primary suppressions


Appendix C. Microdata (1) Rules

Under the Statistics Act, microdata cannot be released for public use (refer to section 6 of this Confidentiality Protocol). The following rules apply to microdata provided for access through the Data Laboratory by researchers or government departments.

All direct identifiers must be removed (name, address, IRD number, business name, reference number, telephone number).
Regional variables corresponding to populations of fewer than 100,000 (2) persons must be collapsed or removed.
Date of Birth should be removed. Age in months should not be included unless shown to be necessary to the research.
Data should be randomly resorted if the ordering of the data corresponds in any way to the sample design (because datasets are often sorted by region, which can lead to disclosure) (3).
In situations where it has been deemed suitable to release sample design variables to facilitate design-based analysis (4), the following rules apply:

(i) sample design variables (typically PSU and stratum) should be randomly sorted and assigned a sequential integer code (or other recoding scheme which explicitly guards against any geographic or other association embedded in sample design variables); and
(ii) no use should be made of sample design variables beyond sampling error and variance estimation. In particular, sample design variables must never be used as a proxy geographic region variable.
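
The random resorting and recoding rules above might be implemented along the following lines. This is a Python sketch under stated assumptions: the function names are hypothetical, and it applies the 'randomly sorted and assigned a sequential integer code' option from rule (i) rather than any other recoding scheme.

```python
import random


def recode_design_variable(values, rng=random):
    """Replace design codes (e.g. PSU or stratum) with random integer codes.

    The distinct original codes are shuffled and then numbered 1, 2, 3, ...,
    so the new codes carry no geographic or other ordering.
    """
    distinct = sorted(set(values))
    rng.shuffle(distinct)
    mapping = {old: new for new, old in enumerate(distinct, start=1)}
    return [mapping[v] for v in values]


def randomly_resort(records, rng=random):
    """Return the records in random order, leaving the input untouched."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled
```

Records that shared a PSU or stratum before recoding still share one afterwards, which is what sampling error and variance estimation require; the mapping itself should not be released with the data.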

All variables on microdata files must be justified as necessary to the research, and the data must be referred to the Analytical Support Division for assessment of any further confidentialising treatment required.
______________________________

1. Microdata is also known as 'unit record data'.
2. Two exceptions to this rule are the Southland and Northland Local Government Regions (LGRs), which correspond to approximately 80,000 and 98,000 respectively.
3. This is a new rule, and as such it may take some time to become SNZ practice.
4. While requests for sample design variables will be examined on a case-by-case basis, this rule will generally provide only for release of sample design variables for household surveys where stratification is not highly disproportionate. Economic or business surveys will normally be excluded from the scope of this rule due to their typically high degree of disproportionate stratification, in particular full coverage strata which regularly exist in economic and business surveys.


Copyright © United Nations, 2007