To implement Decision 46/107 (c) of the UN Statistical Commission at its 46th session on international trade and economic globalization statistics (http://unstats.un.org/unsd/statcom/46th-session/documents/statcom-2015-46th-report-E.pdf), which called for the preparation of a handbook on a system of extended international and global accounts, an expert group was established in 2015. All former members of the Friends of the Chair (FOC) group reconfirmed their membership in the newly established expert group. In addition, Kazakhstan and the United Kingdom, both of which had expressed interest at the 46th session of the Commission, joined as well. Ireland was chosen as the new chair of the expert group. The first meeting of the expert group will take place on 26-28 January 2016 in New York. At that meeting the terms of reference of the group will be discussed as well as its programme of work.
Members of the United Nations Expert Group on International Trade and Economic Globalization Statistics
- Europe: Ireland (Chair), Denmark, Italy, the Netherlands and the United Kingdom
- America: Canada, Colombia, Costa Rica, Mexico and the United States of America
- Africa: Cabo Verde, Morocco, South Africa and Uganda
- Asia: China, India, Iran (Islamic Republic of), Kazakhstan, the Republic of Korea, Thailand, and Viet Nam
- Agencies: The Asian Development Bank, Eurostat, IMF, OECD, the United Nations Conference on Trade and Development, the World Trade Organization, the Economic Commission for Europe and the Statistics Division of the Department of Economic and Social Affairs of the Secretariat
This handbook should build on existing work in this area, in particular by the UNECE, the OECD and Eurostat, and address issues of micro-data linking of business and trade statistics, as well as the integration of economic, environmental and social dimensions of trade and globalization as an extension of the System of National Accounts 2008 (2008 SNA) and the System of Environmental-Economic Accounting 2012 (SEEA 2012). The Handbook will build upon the 2014 and 2015 reports of the Friends of the Chair (FOC) group on the measurement of international trade and economic globalization, the work of the UNECE Task Force on Global Production, of the OECD Expert Group on extended Supply and Use Tables, and of Eurostat on global value chains and economic globalization.
The inter-secretariat working group for international trade and economic globalization statistics (ISWG-ITEGS) was established in 2015, based on Decision 46/107 (e) of the UN Statistical Commission at its 46th session on international trade and economic globalization statistics. In addition, the UN Statistical Commission requested the working group to develop a mandate that includes coordination of work undertaken by the various international and regional organizations in the area of international trade and economic globalization, while ensuring proper cooperation regarding work programmes and activities worldwide, taking account of existing work and reducing duplication. The group should also promote the development of databases at international, regional and national level for international trade and economic globalization statistics, and coordinate and promote capacity building activities to improve these statistics at the micro-level for the better calculation of statistics at the macro-level.
Members of the ISWG-ITEGS are Eurostat, IMF, OECD, UNCTAD, UNSD and WTO. The first meeting of this group will take place on 29 January 2016 in New York. At that meeting the terms of reference of the group will be discussed as well as its programme of work.
The report of the Global Working Group (GWG) on Big Data for official statistics was prepared in accordance with Economic and Social Council decision 2015/216 and past practices. Since its inception meeting in October 2014 in Beijing, the GWG has delivered advocacy materials, papers on access and partnerships for Big Data, scoping papers on Big Data classifications, methodology and a repository of projects, a global survey on Big Data, and pilot projects involving satellite imagery data and mobile phone data. The priority list and work programme for the short term include pilot projects on Big Data for SDG indicators, use of the Big Data Sandbox for training, the launch of the Big Data repository and further progress on principles for access to data owned by the private sector. The report also includes the outcome of the 2nd global conference on Big Data for official statistics, which took place in Abu Dhabi in October 2015.
In 2015, the GWG on Big Data for official statistics delivered outputs on advocacy and communication, namely a strategy document, brochure, videos and newsletter. Also, three papers (available from the website of the GWG meeting) on access and partnerships were prepared, including draft principles for access to Big Data sources, which will be further developed in close consultation with relevant stakeholders. As a consolidated effort of two of the task teams, an inventory of Big Data projects was compiled, in which projects were also mapped to the SDGs and their targets. Also, scoping papers were drafted for a Big Data classification and a Big Data quality framework, which seeks to bring together a bottom-up (examples) and top-down (theory) approach and to test the framework, which should be adjustable to specific Big Data sources and processes.
Looking forward to 2016, the GWG will continue working on the areas of skills and training while involving more developing countries, quality frameworks for Big Data, access to Big Data by further developing principles for access to proprietary data, and guidance on Big Data methodologies and estimation methods as identified by the 2015 global survey and the conclusions from the 2nd global conference in Abu Dhabi. Furthermore, the GWG will initiate and participate in pilot projects on the use of Big Data for official statistics, including those on the measurement of selected SDG indicators.
The 3rd global conference on Big Data for official statistics is expected to take place in mid-2016 in Dublin, Ireland. The provisional agenda is organized along the same lines as the Big Data quality framework, namely Theme 1: “Data access and partnerships: the win-win scenarios”, Theme 2: “Statistical production: the role of Big Data and skills and capacity required”, and Theme 3: “Big Data for official statistics and SDG indicators”.
At its biennial meeting in May 2011 in New York, the Expert Group (EG) on International Statistical Classifications agreed to establish a technical subgroup (TSG) for the revision of the Classification of Broad Economic Categories (BEC). As further explained below, the revised BEC was not submitted to the Commission in 2014, but is now intended for submission in 2016. Whereas a first full draft of the manual of the BEC, fifth revision, was completed by June 2013, several iterations were necessary to arrive at a draft manual which was ready for global consultation.
This revision process of the draft manual moved the preparation of the questionnaire for the global consultation into 2014. The global consultation was finally conducted from July to September 2014, and a report on this global consultation was submitted to the EG at its biennial meeting in May 2015 as a background document. The timing of the global consultation did not allow enough time to finalize the BEC manual in 2014.
The technical subgroup submitted a report to the Expert Group on International Statistical Classifications giving an overview of the work undertaken in the review of BEC in 2015. The Expert Group then endorsed the manual and recommended to the UN Statistical Commission (the 47th session in 2016) that BEC Rev. 5 be approved for use as an international statistical classification and be included in the International Family of Statistical Classifications. The unedited version of the manual will be submitted as a background document.
The new classification, BEC Rev 5, has more levels than the previous classification and provides better guidance on end-use categories for analytical purposes. The new dimensions are:
With the completion of the manual, attention is now shifting to finalizing the correspondence tables between BEC Rev. 5 and HS, CPC, EBOPS and ISIC. Those correspondence tables will be posted on the website of UNSD as soon as possible.
Decision 46/101 (http://unstats.un.org/unsd/statcom/46th-session/documents/statcom-2015-46th-report-E.pdf) approved the Global Working Group on Big Data for Official Statistics to conduct a global survey on Big Data. The objective of the survey was to assess the situation regarding the steps undertaken thus far by statistical agencies in relation to Big Data. The survey was conducted in June-August 2015. The questionnaire was sent to all national statistical offices with the request to consult with the relevant stakeholders in their national statistical system. A total of 93 countries completed the questionnaire (32 OECD and 61 non-OECD countries). In addition, a reply was received from Eurostat.
As shown in Figure 1, statistical offices see ‘timelier statistics’, ‘reduction of response burden’ and ‘modernization of statistical production’ as the main benefits of using Big Data, followed by ‘new products and services’ and ‘cost reduction’. Interestingly, whereas two thirds of the developing countries see meeting the new SDG demands as a benefit, only one third of the OECD countries share that vision.
As for the urgent needs for guidance, the respondents indicated that ‘skills and training for Big Data’, ‘quality framework for Big Data’, and ‘access to Big Data’ are the three top priorities, followed by ‘estimation methods’, ‘use of web-scraping data’ and ‘use of mobile phone data’. This result is consistent with the skills that respondents indicated they need in the areas of Big Data methodologies, estimation methods and data science.
UNSD conducted a survey in 2015 that requested information from respondents on the national compilation practices related to external trade indices. In total 96 responses were received, of which 70 came from non-OECD countries and 26 from OECD countries. In terms of regional groupings, 36 responses were received from Europe, 22 from Asia, 17 each from Africa and the Americas, and 4 from Oceania.
Among the 96 respondents, 67 reported that they are currently compiling and disseminating external trade indices. According to the survey responses, 19 countries, among them one OECD country, are currently not compiling and disseminating external trade indices. Some countries replied that they are currently compiling external trade indices, but that these are only used internally and are not yet disseminated due to data quality issues.
As indicated by the figure above, the most commonly cited challenges in the compilation of external trade indices are the heterogeneity of products in the same commodity code, the frequent changes of products and the high volatility in time series of unit values. It is interesting to note that OECD and non-OECD countries encounter many of the same challenges.
The main uses of the external trade indices according to the survey results are the production of national accounts, followed by use as input to macro-economic analysis and informing policy makers on monetary and fiscal policy. Similarly, the main users are ministries, central banks and other governmental agencies, followed by international organizations, research institutes and academia.
According to the survey results, most countries see the need for support in the compilation of external trade indices by the provision of description of practices in other countries, followed by regional training courses and workshops and e-learning material. In particular, some countries expressed the need for more "exchange of good practices" in workshops rather than "training". In terms of future plans, many countries are planning to change the base-year in the coming years.
The UNSD survey conducted in 2015 requested information from respondents on national practices in linking trade statistics to the Statistical Business Register (SBR). Responses were received from 98 national statistical offices (NSOs), including 54 developing economies, 10 economies in transition, and 34 developed economies, representing 30 OECD countries and 68 non-OECD countries. Among the 98 national statistical offices that responded, 92 report that they currently have a functioning SBR. In addition, 46 percent report that they have linked international merchandise trade statistics (IMTS) to the SBR; 29 percent report linking statistics on international trade in services (SITS) to the SBR; and 28 percent report linking foreign direct investment (FDI) statistics to the SBR. Respondents that currently link these data also reported the types of trade by enterprise characteristics (TEC) data that they disseminate. Most respondents report that they disseminate value and number of enterprises by size class of the enterprise (number of employees) and by economic activity. TEC data by foreign ownership relationship of the enterprise is also commonly cited. See the figure below.
The following challenges are cited the most by the respondents when attempting to link trade statistics to the SBR:
“Matching enterprises or establishments” is the most commonly reported challenge, cited by one third of respondents. More specifically, two respondents report that enterprise groups or VAT groups are assigned one identification number, which makes it difficult to allocate them to an enterprise-level statistical unit. One respondent reports that identification numbers are sometimes missing, which necessitates the use of business name to conduct the link, which may be an imperfect procedure. Another respondent reports that the Customs authority internally assigns its own identification numbers to enterprises which cannot be linked to the SBR.
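The matching problem described above (link on a shared identification number, fall back to the business name when the number is missing) can be sketched as follows. This is only an illustration under stated assumptions: the record layout (`id`, `name`), the function names and the short list of legal suffixes are hypothetical and do not describe any respondent's actual procedure.

```python
import re

# Illustrative (hypothetical) list of legal-form suffixes to ignore in names.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "llc", "co", "sa"}

def normalize_name(name):
    """Lowercase a business name and drop punctuation and legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)

def link_to_sbr(trade_records, sbr_units):
    """Link trade records to SBR units: by id first, then by unique name."""
    by_id = {u["id"]: u for u in sbr_units if u.get("id")}
    by_name = {}
    for unit in sbr_units:
        by_name.setdefault(normalize_name(unit["name"]), []).append(unit)

    results = []
    for rec in trade_records:
        unit = by_id.get(rec.get("id"))
        method = "id" if unit is not None else "unmatched"
        if unit is None:
            candidates = by_name.get(normalize_name(rec["name"]), [])
            # Accept a name match only when it is unambiguous.
            if len(candidates) == 1:
                unit, method = candidates[0], "name"
        results.append((rec, unit, method))
    return results
```

Recording the matching method alongside each link makes it possible to review name-based matches separately, since they are the imperfect part of the procedure.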
In 2015, the SDMX-IMTS Working Group initiated a review of the proposed global SDMX-IMTS data structure definition. With regard to the content of the data structure, the respondents were asked to review the concepts and the associated code lists and, in addition, to discuss the challenges faced by national statistical offices and the capacity building needed for the implementation of SDMX. A total of 87 countries, economic areas and organizations responded. Out of the 87 respondents, 3 were international / supranational organizations (SESRIC, UNWTO, and WTO). The SDMX Secretariat Technical Working Group (TWG) provided general feedback separately. Considering the technical nature of SDMX, the response from 87 respondents indicates high interest among countries. In general, the working group received positive feedback on the proposed data structure definition.
The respondents indicated that SDMX is considered part of the information infrastructure and can facilitate external reporting, including for international reporting purposes. Around twenty-five percent of them reported plans to implement SDMX-IMTS within 1-2 years. However, concerns were raised about the lack of technical skills and about the feasibility of implementing SDMX for IMTS given the data volumes. EU countries reported that Trade by Enterprise Characteristics (TEC) data have been regularly produced and submitted to Eurostat using SDMX.
The following challenges were identified in the global consultation:
Based on the experience from the implementation of SDMX for Balance of Payments and National Accounts (NA), countries emphasized the importance of the stability of the DSD once it is in force; otherwise it would be costly to change. Furthermore, countries indicated that capacity building should focus on the SDMX tools and their usage, especially for developing countries.
More about SDMX-IMTS at http://comtrade.un.org/sdmx
An inter-agency working group comprising Eurostat, the International Trade Centre (ITC), the Organization for Economic Cooperation and Development (OECD), the United Nations Statistics Division (UNSD, as secretariat and chair of the working group), and the United Nations Conference on Trade and Development (UNCTAD) was established in 2013 to support the implementation of SDMX standards in International Merchandise Trade Statistics (IMTS). The working group, which develops these new standards, seeks to specify uniform structures, concept definitions and code lists for IMTS data and metadata which comply with the latest version of the SDMX standard (2.1), and which follow the latest recommendations for IMTS (IMTS 2010). To the extent possible, SDMX-IMTS reuses concepts and code lists that have already been agreed internationally. The design of the zero draft Data Structure Definition (DSD) for SDMX-IMTS was completed in early May 2015 and was released for global consultation and public review in order to receive broad input on the proposed DSD. As of October 2015, the draft DSD consists of 31 concepts, of which 18 are dimensions and 12 are attributes. The draft DSD is accessible from http://comtrade.un.org/sdmx.
The SDMX-IMTS global consultation (or public review) was conducted in May and June 2015 to obtain feedback on the contents, implementation plan and capacity building needs. A total of 87 countries and organizations responded. Considering the technical nature of SDMX, those responses may indicate high interest among countries. Out of the 87 respondents, 3 were international / supranational organizations (SESRIC, UNWTO, and WTO). The SDMX Secretariat Technical Working Group (TWG) provided general feedback separately. It is expected that the results of the global consultation will be analyzed and integrated into the final global SDMX-IMTS.
The working group and other relevant partner organizations have been discussing the implementation strategies and capacity building activities in countries. Some of the ideas are as follows: establishment of pilot implementation projects in countries, alignment with the WCO Data Model (SDMX-IMTS as a Derived Information Package), or creation of an SDMX-IMTS extraction module in a commonly used trade data processing system (i.e., EUROTRACE). Furthermore, raising awareness of SDMX-IMTS should be part of regular technical assistance activities.
The Chief of the International Merchandise Trade Statistics Section attended a meeting in Putrajaya, Malaysia, as part of the ADB capacity building project on “Statistical Business Registers for improved information on small, medium and large enterprises” in five project countries: Lao PDR, Cambodia, Bhutan, Sri Lanka and Malaysia. The meeting comprised a) an in-country discussion on linking trade and business statistics in Malaysia, and b) a regional meeting on the development of the Statistical Business Registers system for the other four project countries. The meetings were closely linked to the work of UNSD’s Trade Statistics Branch, as the upgraded version of the UN Comtrade database has made provision for a field on the industry sector of the enterprise, which means that linked trade by business statistics can be uploaded to UN Comtrade in the future. The regional meeting further trained participants on the guidelines and best practices for linking Statistical Business Registers and trade statistics, discussed their practical implementation in Malaysia as a case study, and considered the Asian Statistical Business Registers prototype system developed by ADB for the project. It is expected that UNSD will continue to work together with ADB on the implementation of the SBR and on linking trade and business statistics in 2016.
On 21 September 2015, the 56th Harmonized System Committee held a special half-day session to raise awareness of the importance of international trade statistics and their relationship with the HS and Customs declarations. The event was a joint effort on behalf of the members of the Task Force on International Trade Statistics (TFITS). The objectives for the special session were to find ways for Customs administrations to support trade statistics and statisticians.
Furthermore, this session was meant as a reminder to the community of Customs officials that the HS classification and Customs declarations are still important instruments and data sources for the community of official statisticians, even if measurement approaches have changed. UNSD, OECD and FAO highlighted the aspects of Customs work that are of special concern, including changes in the HS nomenclature (for instance, in the area of fisheries, the identification of second-hand goods and IT waste), the proper recording of quantity information, the recording of special trade regimes, such as inward and outward processing, and the availability to statistical offices of additional electronic filings, such as advance warning information, shipping manifests, etc.
The WCO has dedicated 2015 to promoting Coordinated Border Management (CBM) under the slogan “Coordinated Border Management - An inclusive approach for connecting stakeholders”. This stresses cooperation and communication, not only within Customs, but with other agencies as well. Noting that not every Customs administration is linked with statistical agencies, it is the hope of the WCO that Members recognize the importance of such cooperation to improve the quality of their work in this regard.
The International Trade Centre, with the support of UNSD, organized a 3-day workshop on the detection of outliers and the calculation of external trade indices. The workshop reviewed the theoretical dimensions of the calculation of trade indices, and presented the advantages and disadvantages of various statistical methodologies based on documentation previously prepared by UNSD. UNSD also presented the key findings of a recent global assessment on external trade indices, focusing on the current practices and challenges in calculating these indices in developing countries. The workshop then moved on to the technical implementation of the detection of outliers and the calculation of unit value indices using the statistical software SAS, and the participants learned through a variety of exercises how to work with the SAS programme for this purpose.
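As a rough illustration of the two steps named in the workshop description (outlier detection on unit values, then a unit value index), the sketch below uses Python rather than SAS. The specific choices are assumptions made for the example and are not taken from the workshop material: a modified z-score on log unit values based on the median absolute deviation, and a Laspeyres-type index with base-period quantities as weights.

```python
import math
from statistics import median

def unit_value(value, quantity):
    """Unit value of a trade flow: trade value divided by quantity."""
    return value / quantity

def flag_outliers(unit_values, threshold=3.5):
    """Flag unit values whose log is far from the median (modified z-score).

    Assumption: the MAD-based rule with a 3.5 cut-off is one common
    robust choice, used here purely for illustration.
    """
    logs = [math.log(u) for u in unit_values]
    med = median(logs)
    mad = median(abs(x - med) for x in logs)
    if mad == 0:
        return [False] * len(logs)
    return [abs(0.6745 * (x - med) / mad) > threshold for x in logs]

def laspeyres_unit_value_index(base, current):
    """Laspeyres-type unit value index (base period = 100).

    base, current: dict mapping commodity code -> (value, quantity).
    Only commodities present in both periods are used.
    """
    common = base.keys() & current.keys()
    numerator = sum(unit_value(*current[c]) * base[c][1] for c in common)
    denominator = sum(base[c][0] for c in common)  # uv_0 * q_0 = value_0
    return 100.0 * numerator / denominator
```

In practice flagged transactions would be reviewed or excluded before the index is computed, since a few extreme unit values within one commodity code can dominate the result.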
UNSD and INEGI jointly organized an international workshop on Economic Census, Statistical Business Registers and Integrated Economic Statistics in Aguascalientes, Mexico from 29 September to 1 October 2015. The objective of the workshop was for participating countries to share experience and knowledge in maintaining and linking the SBR to economic statistics for creating an integrated economic statistics programme, in running an Economic Census, and in using administrative data.
The workshop allowed for discussion and sharing of examples, knowledge and best practices on the following themes:
For more information, visit the workshop website.
On 20-22 October 2015, the Second Global Conference on Big Data for Official Statistics took place in Abu Dhabi, UAE, and was attended by about 250 statisticians from around the world. The Conference was opened by the UAE Minister of Culture, Youth and Community Development, who placed the relevance of Big Data in the context of the 2030 Agenda for Sustainable Development and made reference to the "Better Data. Better Lives" theme of World Statistics Day, which coincided with the opening of the Conference. The Conference showed advances in Big Data projects using mobile phone data, social media data and satellite data for a variety of statistical applications. Progress was also shown on topics of capacity building, data access and partnerships, quality and methodology, and on how to better communicate the value of Big Data. High among the priorities of work on Big Data is the link to the indicators for monitoring progress on the Sustainable Development Goals. UNSD was the main organizer of this event in cooperation with the Australian Bureau of Statistics (Chair of the GWG) and the host country.
For more information, visit the conference website.
In 2015, UNSD and Eurostat started to explore possible ways for closer collaboration and cooperation between the two agencies in the areas of economic globalization, trade in goods and services, and big data, from substantive methodological issues to technical challenges such as data exchange. The international trade statistics related topics under discussion are as follows:
As part of our exchange programme with Eurostat, Albrecht Wirthmann, one of the leaders of the Big Data team at Eurostat, worked at the Trade Statistics Branch of UNSD for three weeks in December 2015. During that time he supported the finalization of the Secretary-General's report to the UN Statistical Commission, "Big data for official statistics", and other documents, and he also worked on the restructuring of the Task Teams within the Global Working Group on Big Data for Official Statistics. He gave a presentation entitled "Big Data for official statistics at Eurostat and Big Data collaboration between Eurostat and UNSD" on 16 December with the participation of DESA colleagues.
About 1800 statisticians gathered in Rio for the 60th World Statistics Congress, among them a large number of official statisticians. The congress contained around 200 sessions over the course of the week, many of them on topics related to official statistics. The Chief of the Trade Statistics Branch represented the Statistics Division as speaker in a number of these sessions, namely in a special session on the Data Revolution (organized by Enrico Giovannini, co-chair of the IEAG on Data Revolution), as well as in two sessions on Big Data and in sessions on administrative data, the SDG indicator framework and measuring Global Value Chains. Moreover, he chaired a session on measuring well-being and sustainable development, and a session on measuring sustainable tourism at sub-national level. Throughout the week, Sustainable Development Goals and Big Data were recurring themes in many of the sessions related to official statistics.
Learn more about the Congress at http://www.isi2015.org.
On 9 September 2015, Statistics Denmark together with the University of Copenhagen organized a conference on how Big Data can be important for the prosperity of society as a whole. The event was opened by the Danish Minister of the Interior and Social Affairs, Ms. Karen Ellemann, who spoke about how the use of Big Data can contribute to increasing growth and policy space and create a more focused and efficient public service. Other speakers included Hal Varian, the Chief Economist of Google, who showed how relevant and timely analyses can be created on the basis of Google searches, and the Chair of the Danish Industry Association, who presented a number of products developed using Big Data. The Chief of UNSD’s Trade Statistics Branch described the work of the United Nations Global Working Group on Big Data for Official Statistics, as well as the various layers of a country’s statistical data infrastructure, including traditional data sources such as censuses and surveys, administrative data sources such as registers and other official records, and Big Data sources such as mobile phone, social media and satellite imagery data.
The Task Force on International Trade Statistics (TFITS) held its regular meeting, hosted by OECD, with the participation of member international organizations and countries. The Task Force discussed the most current issues in the area of international trade in goods and services statistics. UNSD contributed to selected items, including the asymmetries in trade statistics, the TFITS institutional arrangements, SDMX for IMTS, the Broad Economic Categories classification and the current status of the data collection in international trade in goods and services for the 2014 data collection cycle. The Task Force decided to focus on some recommendations for countries to work towards the reduction of bilateral asymmetries and to promote the importance of countries providing more detailed metadata.
Detailed agenda and work programme: http://unstats.un.org/unsd/trade/taskforce/meetings.asp.
On 19 October, the UN Global Working Group (GWG) met in Abu Dhabi to discuss progress on its work through 8 different Task Teams, guided by a global consultation on needs and advances in Big Data management and Big Data projects in national statistical systems. The Task Teams delivered papers on access to data, classification and repository of Big Data, a paper on quality and methodology of Big Data, and a promotional brochure and video. Through the global survey, information on 115 Big Data projects was collected, which will be made available through the repository. Big Data projects are often based on multi-disciplinary teams with involvement from the private sector, research institutes and possibly others. The GWG works on compelling business cases that show why these partnerships are win-win situations for all involved. Going forward, the GWG will focus on the application of Big Data for the SDG indicators and for developing countries. In this sense, the opportunities offered through the Sandbox in Ireland and the Global Pulse labs in Jakarta and Uganda will be fully taken advantage of. The meeting was attended by 34 representatives from countries and international organizations. UNSD serves as the secretariat of the GWG.
The Conference was organized by the Puerto Rico Tourism Company within the framework of InRouTe and in collaboration with UNWTO. The Conference focused on the following themes: tourism and territory, tourism and the environmental dimension (particularly water use and energy use by the tourism industries), Tourism Satellite Accounts (TSA) and Regional TSA. The draft document “Subnational Tourism: Basic Glossary” was presented and discussed during the Conference. Presentations were made by representatives of several countries and international organizations, including UNSD, underlining the importance of producing tourism statistics on three levels: local, regional and national. UNSD also participated in the 1st meeting of the UNWTO Working Group on Measuring Tourism for Sustainable Development, which was organized on 20 November, discussing issues related to the three SDG targets covering aspects of sustainable tourism.
The latest newsletter of the Interagency Task Force on International Trade Statistics was released in December 2015. The newsletter contains the latest developments in the area of international trade and economic globalization statistics (including tourism statistics). The 11th issue of the newsletter focuses on the topics of Sustainable Tourism Targets in the 2030 Agenda, Reconciling Bilateral Merchandise Trade Asymmetries, Global SDMX IMTS Data Structure Definition, TF-ITS Trade in Services Metadata Questionnaire, Classification of Broad Economic Categories, Rev. 5, and ICT Services and ICT-Enabled Services. In addition, it describes current and past technical assistance activities and the latest publications and databases.
For details, visit the website of the Interagency Task Force on International Trade Statistics.
The Big Data Sandbox project was initiated in 2014 in order to provide a shared platform to facilitate groups and individuals to collaborate on evaluating and testing new tools, techniques and sources which have the potential to be of use in modern statistical analysis. This project has proved very successful and has extended into 2015 with an improved sandbox infrastructure and addressing more ambitious and challenging goals. In 2015, four task teams were established to work on specific projects: 1) Trade Network Analysis using Comtrade data, 2) Job vacancy through enterprise web scraping, 3) Users sentiment using Twitter data, and 4) Analysis of World Heritage Sites, Cities, and Beaches through Wikistats.
In brief, three substantive goals were defined for Comtrade Project:
Substantive goals 1 and 2 were achieved, but due to time constraints the goal 3 was not able to be implemented. In addition, we are pleased to report that it is possible to use big data tools and technologies (Hadoop, Rhadoop, Pig, Hive, Spark and Gethpi) to process and analyse huge volume of trade data. The easiness of setting up the data environment, powerful computing power and availability of built-in libraries to analyse networks may change the way trade analysts work, and how UN Comtrade will offer services to users in the future (in addition to purely data services, UN Comtrade might also offer an analytical platform through the use of Hadoop, Pig, Hive, Spark, and other big data technologies).
Distributions of the degrees of the networks between 2000 and 2013: mean, standard deviation and maximum of degrees (top panel), indegrees (middle panel) and outdegrees (bottom panel)
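The summary statistics named in the caption above can be illustrated with a minimal Python sketch. It is not the project's code (the project computed network properties with Spark); it assumes a directed network given as hypothetical (exporter, importer) pairs, takes a node's total degree as indegree plus outdegree, and uses the population standard deviation. All three choices are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def degree_stats(edges):
    """Summarize in-, out- and total degrees of a directed trade network.

    `edges` is an iterable of (exporter, importer) pairs (hypothetical
    input format). Repeated pairs count once.
    """
    out_nbrs = defaultdict(set)
    in_nbrs = defaultdict(set)
    nodes = set()
    for exporter, importer in edges:
        out_nbrs[exporter].add(importer)
        in_nbrs[importer].add(exporter)
        nodes.update((exporter, importer))

    def summarize(values):
        # Population standard deviation: an assumption, not from the report.
        return {"mean": mean(values), "st_dev": pstdev(values), "max": max(values)}

    ordered = sorted(nodes)
    outdeg = [len(out_nbrs[n]) for n in ordered]
    indeg = [len(in_nbrs[n]) for n in ordered]
    # Total degree defined here as indegree + outdegree (assumption).
    total = [i + o for i, o in zip(indeg, outdeg)]
    return {
        "degree": summarize(total),
        "indegree": summarize(indeg),
        "outdegree": summarize(outdeg),
    }


# Toy network with three hypothetical economies.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
stats = degree_stats(edges)
for name, summary in stats.items():
    print(name, summary)
```

Running the function once per year on that year's edge list would give the kind of yearly series the figure summarizes.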
Big data tools used in the project and findings of their use
HADOOP: Trade data are characterized by their huge volume. They have many variables, but are highly structured. This combination of characteristics fits Hadoop well: data in raw text format can be processed and analysed easily, without importing the files into a specialized database. Used as data storage.
PIG: This scripting language is suitable for cleaning incoming data files (updating the text qualifiers). Used in the data cleaning phase.
HIVE: This tool makes it easy for SQL specialists accustomed to traditional relational database management systems (RDBMS) to query and analyse data. Complex SQL queries with sub-queries and nested statements are supported, which enables complex calculations (such as data estimation). In terms of performance, SQL queries run in the sandbox perform quite well, and probably better than on a traditional RDBMS (with a well-designed data structure supplemented by indexes). Used to perform data preparation.
RHADOOP-MAP/REDUCE: An open source R package that allows users to write and trigger MapReduce jobs in R. It proved convenient for writing the MapReduce script for symmetry analysis directly in RStudio, and for retrieving datasets from HDFS into R to produce visualizations. Its drawbacks are general instability and the lack of coherent documentation. However, the practical experience from this project could become part of training material. Used to carry out the analysis of bilateral asymmetry.
SPARK: An open source cluster computing framework. In contrast to Hadoop's two-stage disk-based MapReduce paradigm, Spark's multi-stage in-memory primitives provide performance up to 100 times faster for certain applications. Used to calculate trade network properties and to visualize them.
GEPHI: This software provides an interactive visualization and exploration platform for the easy creation of social data connectors to map community organizations and networks. It runs on Windows, Linux and Mac OS X and is based on NetBeans UI. Used to visualize network structures.
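To illustrate the MapReduce pattern mentioned above for the analysis of bilateral asymmetry, here is a minimal sketch in plain Python rather than RHadoop. It is not the task team's script. The record layout `(reporter, partner, flow, year, value)` and the asymmetry measure (absolute difference divided by the midpoint of the two reported values) are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_record(record):
    """Map step: key each record by (exporter, importer, year)."""
    reporter, partner, flow, year, value = record
    if flow == "export":
        yield (reporter, partner, year), ("export", value)
    elif flow == "import":
        # An import reported by the destination mirrors the origin's export.
        yield (partner, reporter, year), ("import", value)


def reduce_pair(key, values):
    """Reduce step: compare the two reported sides of one bilateral flow."""
    sides = dict(values)
    if "export" not in sides or "import" not in sides:
        return None  # no mirror record, so nothing to compare
    exported, imported = sides["export"], sides["import"]
    midpoint = (exported + imported) / 2
    # Relative asymmetry (assumed measure, not taken from the report).
    asymmetry = abs(exported - imported) / midpoint if midpoint else 0.0
    return key, asymmetry


def run(records):
    """Group mapped values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            groups[key].append(value)
    reduced = (reduce_pair(key, values) for key, values in groups.items())
    return dict(item for item in reduced if item is not None)


# Hypothetical records: A reports exports to B, B reports imports from A.
records = [
    ("A", "B", "export", 2013, 100.0),
    ("B", "A", "import", 2013, 110.0),
    ("A", "C", "export", 2013, 50.0),  # no mirror record from C
]
asymmetries = run(records)
print(asymmetries)
```

On a cluster, the grouping done by `run` would be handled by the framework's shuffle phase; only the map and reduce functions would need to be supplied.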
In 2016, UNSD and partner organisations are expected to continue working on this project, taking into account the conclusions and findings of 2015. Planned work includes using big data tools to detect trade clusters automatically, refining the procedures for acquiring raw data (now through the API) and streamlining data cleansing activities. In addition, other groups may be interested in using Comtrade in the Sandbox for other purposes (e.g., analysis of unit values or analysis of bilateral asymmetries).
Finally, we would like to express our sincere thanks and appreciation to the task team members for their active participation and contribution.
Task team members:
Michael Behrman (un)
Stéphanie Combes (fr)
Markie Muryawan (un)
Pilar Rey-del-Castillo (es)
Toni Virgillito (it)
Read more at Report on Sandbox Comtrade Project (forthcoming).
UNSD welcomes the initiative of AUC and Eurostat to improve the features, performance, scalability and reliability of Eurotrace, a trade statistics data processing and dissemination system, within the Pan African Statistics (PAS) programme in 2016 and beyond. As the custodian both of trade statistics concepts and definitions and of the official global repository of international trade statistics (UN Comtrade), UNSD expects to benefit greatly from the improvement of Eurotrace. In that context, given the importance of good coordination and alignment between the needs of countries and the global programme on trade statistics, UNSD is taking the lead in coordinating the project on the improvement of Eurotrace.
The first activity in 2016 is a global survey of existing and potential Eurotrace users to take stock of training needs and to catalogue the features and functionalities that are relevant and important for countries. The results of the survey will be used as input to design training modules and to set priorities for improving and/or adding Eurotrace modules, in order to ensure high re-usability and avoid duplication of effort.
Based on the preliminary assessment with the main stakeholders of Eurotrace, the following modules were identified as the main features and functionalities relevant to users:
Download Eurotrace from the Eurotrace software package site.
Thanks to the UN Comtrade Public API, more and more institutions are developing advanced and innovative data visualizations using Comtrade data. These visualizations are being catalogued and made publicly available at UN Comtrade Labs, a place to showcase innovative and experimental uses of UN Comtrade data. Several trade data visualization initiatives were added to Comtrade Labs in the period July to December 2015:
BIS International Trade in Goods Visualization, by the UK Department for Business, Innovation & Skills (BIS)
The UK Department for Business, Innovation & Skills has created this interactive visualization, which reflects the very latest data available in UN Comtrade. Countries’ exports, imports and trade balances are displayed in a user-friendly, color-coded world map, along with downloadable time-series data and information on the top 10 trading partners and traded commodities. The visualizations can be further customized by selecting specific trading partners and/or commodities. The tool uses the UN Comtrade Application Programming Interface (API), which currently allows up to 100 requests per hour.
The Globe of Economic Complexity, by the Center for International Development at Harvard University
The Globe of Economic Complexity dynamically maps all countries’ traded commodities, their volumes and export destinations, and allows users to navigate export networks and the intricate connections between products. The tool uses novel web technologies (WebGL) and design to visualize trade flows as 3-D “confetti”, with the ability to morph into bar charts, compounded country textures and node-link diagrams.
Asia Pacific Energy Portal, by the UN Economic and Social Commission for Asia and the Pacific (ESCAP)
The Asia Pacific Energy Portal provides interactive visualization of energy-related data and policy information for ESCAP member and associate member States, including information on pricing, international trade, investment, environment, access and related policies. It also provides maps and Sankey diagrams of energy imports and exports based on data from the UN Comtrade Public Data API.
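Tools built on the API have to stay within the request limit quoted above (up to 100 requests per hour). One way to do so is a client-side throttle, sketched below in Python. This is an illustrative sketch only: the class name `HourlyThrottle` and its parameters are hypothetical, it is not part of any UN Comtrade client library, and the server's actual enforcement rules may differ from the sliding window assumed here.

```python
import time
from collections import deque


class HourlyThrottle:
    """Sliding-window limiter: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests=100, window_seconds=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._sent = deque()     # timestamps of requests still in the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if none)."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        delay = self.wait_time()
        while delay > 0:
            self._sleep(delay)
            delay = self.wait_time()
        self._sent.append(self._clock())


# Usage: call acquire() immediately before each API request.
throttle = HourlyThrottle()
print(throttle.wait_time())  # no requests sent yet, so no wait is needed
```

Injecting the clock and sleep functions keeps the limiter testable without waiting in real time.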
Recognising the importance of sharing experiences in the use and analysis of Comtrade data among UN Comtrade users, UNSD, jointly with the UN Sales and Marketing Section of DPI, is planning to organize an event for UN Comtrade users later in 2016. The event will focus on the use and analysis of Comtrade data, including data visualization.
Tell us what you think about this event at https://www.surveymonkey.com/r/comtrade_user_group_2016.
The 2014 International Trade Statistics Yearbook: Volume II - Trade by Product provides an overview of the latest trends in trade in goods and services, showing international trade for 257 individual commodities (3-digit SITC groups) and for the 11 main Extended Balance of Payments Services (EBOPS) categories. The publication is aimed at both specialist trade data users and the general public. The data, charts and analyses presented will benefit policy makers, government agencies, non-government organizations, civil society organizations, journalists, academics, researchers, students, businesses and anyone interested in trade issues. The figures provided in this publication are based on data reported directly by countries to the UN Statistics Division.
The main content of the yearbook is divided into three parts. Part 1 consists of 11 detailed world data tables on merchandise trade. Part 2 contains the commodity trade profiles for 257 individual commodities. Part 3 contains profiles of service trade for the 11 main EBOPS categories. The profiles offer an insight into the trends in individual commodities and service categories by means of brief descriptive text, concise data tables and charts using the latest available data. The information on commodity trade in this year’s edition of the yearbook is based on data provided by 148 countries (areas), representing 95.9% of world trade in 2014; the information on service trade for 2013 is based on data provided by 161 countries (areas). For further information on data availability, please see the sources section of this Introduction.
UN Comtrade attracted 400,000 users in 2015, almost double the figure for 2010. On average, 1,172 users download 270 million Comtrade data records each day. The trend of increasing use of UN Comtrade can be seen in the graph below. As part of the ongoing upgrade of UN Comtrade, the Bulk Data API feature was released in November 2015. It gives users access to full data sets in a compressed CSV format, making it easy to download whole data sets. As an example of how the Comtrade team of UNSD supports specific users, UNSD is assisting the independent panel of experts of the Security Council, which investigates trade between UN Member States and North Korea in specific commodities and years, with reference to UNSC resolution 1874. This investigation is carried out on the basis of UN Comtrade data.
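A compressed CSV download of the kind offered by the Bulk Data API can be processed without unpacking it to disk first. The Python sketch below shows one way to do this. It assumes the compression is a ZIP archive and uses hypothetical column names (`reporter`, `partner`, `flow`, `value`); the actual archive format and column layout should be checked against the UN Comtrade documentation. The sample archive is built in memory, so no download is performed.

```python
import csv
import io
import zipfile


def iter_csv_rows_from_zip(zip_bytes):
    """Yield each row of every CSV file in a ZIP archive as a dict."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Decode on the fly so large files are never fully in memory.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text)


def make_sample_zip():
    """Build a tiny archive standing in for a real bulk download."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            "sample.csv",
            "reporter,partner,flow,value\nA,B,export,100\nB,A,import,110\n",
        )
    return buffer.getvalue()


rows = list(iter_csv_rows_from_zip(make_sample_zip()))
total_value = sum(float(row["value"]) for row in rows)
print(len(rows), total_value)
```

Because the rows are yielded one at a time, the same function can feed aggregations over whole data sets without loading them completely.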
New UN Comtrade features: http://comtrade.un.org/data/doc/releasenotes
In December 2015, as part of an exchange programme with Eurostat, Albrecht Wirthmann, one of the leaders of the Big Data team at Eurostat, worked at the Trade Statistics Branch of UNSD for three weeks. Although his stay was short, it was a pleasure to have him as a colleague in our branch!