D. Ensuring the quality of the linked data sources and of the linked/integrated data
11.15. Linking data sources: matching rates–experience of the United States of America. The United States Census Bureau collects export data at the transaction level from two main sources: the Automated Export System (AES) and the Canada Data Exchange. Transaction-level data include trade value, product codes, partner country, and the trader’s unique identifier. The trader’s unique identifier for AES records is the trader’s Employer Identification Number (EIN), issued by the United States Internal Revenue Service, and for Canada’s records, it is the trader’s name. Transaction-level data are linked to enterprise characteristics in the Census Business Register by matching the trader’s unique identifier to the EIN or company name in the Business Register. Enterprise characteristics include employment and industry classifications. While the AES linkages are fairly straightforward, the records of Canada require some complicated name-matching routines and manual matching procedures. The quality of the linked data is very good, as shown by the high match rates. The United States Census Bureau typically matches about 89 per cent of the export value to the Business Register; AES match rates exceed 94 per cent, while match rates for the records of Canada are lower, close to 74 per cent. Import transaction-level data are also matched to enterprise characteristics in the Census Business Register. Import traders’ unique identifiers are all reported as EINs, making these linkages also fairly straightforward. Initial match rates have been about 87 per cent, but are expected to improve as matching routines are refined. Both export and import linkages are used to create the Exporter and Importer Databases and the Profile of US Importing and Exporting Companies, a publication derived from the databases. The first exporting company profile was published with 1987 data, with annual publications since 1996. The importing company profile was added to the publication beginning with 2009 data.
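The match rates quoted above are expressed as shares of trade value. As a minimal illustrative sketch (not the Census Bureau's actual procedure), a value-weighted match rate by source could be computed as follows. The records, identifiers, register contents and the function name `match_rates_by_value` are all hypothetical, and exact identifier matching is assumed in place of the name-matching routines described in the text.

```python
from collections import defaultdict

# Hypothetical export transactions: (source, trader identifier, trade value).
# These are toy figures for illustration only, not Census Bureau data.
transactions = [
    ("AES", "EIN-001", 500.0),
    ("AES", "EIN-002", 300.0),
    ("AES", "EIN-999", 200.0),    # identifier absent from the register
    ("CAN", "ACME LTD", 150.0),
    ("CAN", "UNKNOWN CO", 50.0),  # name absent from the register
]

# Hypothetical set of identifiers (EINs or company names) held in the register.
register_ids = {"EIN-001", "EIN-002", "ACME LTD"}


def match_rates_by_value(transactions, register_ids):
    """Return the share of trade value matched to the register, per source and overall."""
    total = defaultdict(float)
    matched = defaultdict(float)
    for source, trader_id, value in transactions:
        # Accumulate under the record's own source and under the overall key "ALL".
        for key in (source, "ALL"):
            total[key] += value
            if trader_id in register_ids:
                matched[key] += value
    return {key: matched[key] / total[key] for key in total if total[key] > 0}


rates = match_rates_by_value(transactions, register_ids)
for key in sorted(rates):
    print(f"{key}: {rates[key]:.1%}")
```

Weighting by value rather than by number of records reflects how the rates in the text are stated: a small number of unmatched low-value traders lowers a value-based rate only slightly.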
11.16. Linking of trade operator with the statistical unit: experience of the European Union. The feasibility of linking external trade data with business registers has been tested in a series of pilot data-collection rounds. The objective of these studies was twofold: first, to investigate to what extent and under what conditions microdata linkages are possible; and second, to define new statistics which could be derived from the combined data set. At the conceptual level, the methodology can be summarized in three steps. First, a linkage is established between trade operators and legal units in business registers. Second, the trade value of each trader, by product code and partner country, is combined with the main enterprise characteristics (economic activity and number of employees) retrieved from the business registers. Third, specific indicators are calculated. The quality of statistics based on data linkages depends to a very large extent on the matching rates between source data sets. The results of the pilot data-collection rounds have shown that, in most cases, the matching rates were very high, particularly when measured in terms of trade value.[9]
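The three-step framework described above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the method used in the pilot studies: the trader codes, legal unit identifiers, activity codes, employee size classes and the function names `size_class` and `trade_by_enterprise_characteristics` are all hypothetical.

```python
from collections import defaultdict

# Step 1 (assumed input): hypothetical links from trade operators to legal units.
trader_to_legal_unit = {"T1": "LU1", "T2": "LU2"}

# Hypothetical business register: main enterprise characteristics per legal unit.
register = {
    "LU1": {"activity": "C", "employees": 12},
    "LU2": {"activity": "G", "employees": 300},
}

# Hypothetical trade microdata: (trader, product code, partner country, value).
trade = [
    ("T1", "8471", "CA", 100.0),
    ("T2", "8471", "MX", 400.0),
    ("T2", "0901", "CA", 50.0),
    ("T3", "0901", "MX", 25.0),  # trader with no link to a legal unit
]


def size_class(employees):
    """Assign an illustrative employee size class (class boundaries are an assumption)."""
    if employees < 10:
        return "0-9"
    if employees < 50:
        return "10-49"
    if employees < 250:
        return "50-249"
    return "250+"


def trade_by_enterprise_characteristics(trade, links, register):
    """Steps 2 and 3: attach enterprise characteristics, then total value by activity and size."""
    totals = defaultdict(float)
    for trader, _product, _partner, value in trade:
        unit = register.get(links.get(trader))
        if unit is None:
            # Keep unmatched trade visible so the matching rate can be assessed.
            key = ("unmatched", "unmatched")
        else:
            key = (unit["activity"], size_class(unit["employees"]))
        totals[key] += value
    return dict(totals)


result = trade_by_enterprise_characteristics(trade, trader_to_legal_unit, register)
for key, value in sorted(result.items()):
    print(key, value)
```

Reporting the unmatched value as its own category, rather than dropping it, keeps the link between the resulting indicators and the matching rates on which their quality depends.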
11.17. Business registers – experience of the European Union. Business statistics are usually derived from surveys of businesses. Business registers are normally used as a tool for the preparation and coordination of surveys. They detect and construct the active population of statistical units (enterprises, local units and enterprise groups) from administrative units (legal units) and include information on their identification, demographic, economic and stratification characteristics, the control and ownership of units, and links with other registers. Business registers are also used as a source of information for statistical analysis of the business population and its demography. Although business register data cover only a few key economic variables (e.g., employment and turnover), they can be used to obtain comprehensive data with detailed breakdowns across a full range of activities, in contrast with data collections such as those of structural business statistics which are largely based on surveys and are limited in scope. The business registers play an important role in bringing trade statistics closer to the business statistics. The links between legal units in the business registers and intra- and extra-Union trader identification codes need to be recorded in the business registers. Thus, the business registers provide a tool for linking detailed external trade microdata with the statistical units utilized in business statistics.[10]
[9] See http://epp.eurostat.ec.europa.eu/portal/page/portal/external_trade/documents/External_trade_statistics_by_enterprise_characteristics.pdf.
[10] Ibid.