Background document for the Note by the Secretary-General transmitting the report of the Global Working Group on Big Data for Official Statistics (E/CN.3/2020/24)


Statistical Commission
Fifty-first session

3–6 March 2020
Item 3 (t) of the provisional agenda*
Items for discussion and decision: Big data




Background document
Available in English only









Global assessment of institutional readiness for the use of big data in official statistics

Executive summary


This report outlines findings from Project 1 of the UN GWG Task Team on Training, Competencies and Capacity Development: an assessment of NSO readiness for the use of big data in official statistics. The results show that a large proportion of the NSOs that responded to the survey are already embracing big data / data science:


  • Key point 1 - Strategic coordination: Strategic coordination capacities are fairly well established. Many NSOs are actively engaged in big data projects. Ethics and quality frameworks are fairly well established. Most NSOs view coordination with Big Data source owners inside their NSS as their smallest challenge.

  • Key point 2 – Legal framework: Overall, respondents are aware of the fundamental role legal frameworks play in establishing big data projects. Many NSOs appear to have well-developed legal frameworks that penalize data disclosures and allow accredited parties to access their data.

  • Key point 3 – IT infrastructure: The analysis shows a heterogeneous picture of IT infrastructure. NSOs stated that basic IT infrastructure, such as power supply and air-conditioning, mostly met their needs, but they reported struggles with storage facilities and computing power.

  • Key point 4 – Human Resources: NSOs recruit significantly more analysts than data scientists and prioritize up-skilling over hiring external staff to perform big data/data science techniques.


Overall, the findings present a positive picture in terms of ensuring that the required foundations are in place, and they illustrate the ambition across NSOs to incorporate big data / data science into their core business. There are nevertheless areas in which NSOs may require further information, guidance, development and knowledge to ensure that barriers to working with big data are removed:


  • Key point 1 - Strategic coordination: Only a third of all NSOs have overarching big data strategies in place and Chief Data Officers only exist in some NSOs. The biggest challenge for NSOs is the collaboration with Big Data source owners outside the government.

  • Key point 2 – Legal framework: Legal frameworks are still insufficient to regulate big data applications. Only a small share of NSOs rely on legal frameworks that guarantee access to big data.

  • Key point 3 – IT infrastructure: IT infrastructure appears to be a central barrier to developing big data capacity; onsite and offsite storage capacity needs improvement for many NSOs. Only a few NSOs consider cloud storage a relevant option.

  • Key point 4 – Human Resources: Most NSOs lack a competency framework to develop new skills to cope with big data (mobile phone, geospatial data) and new methodologies (machine learning).


Recommendations


Based on the analysis conducted by the Task Team, the following recommendations can guide the work of international organizations, development cooperation partners and national statistical offices to adapt to big data requirements:


Strategic Coordination:

  • Promote the sharing of training resources on the United Nations Global Platform

  • Promote the exchange of Big Data projects from all regions through the United Nations Global Platform

  • Advocate big data strategies as one pillar of National Statistical Development Strategies (NSDS)

  • Facilitate partnerships and exchange with data owners outside the NSS

Legal frameworks:

  • Develop legal frameworks that include data sharing agreements between NSOs and private sector data owners

  • Advocate for the importance of data privacy and data protection laws

IT Infrastructure:

  • Advocate for cloud storage facilities in countries with necessary pre-requisites

Human Resources:

  • Develop an overarching competency framework for big data skills development and HR strategies

  • Investigate the potential for defining a data scientist pathway

  • Foster partnerships with higher education institutes to design skill profiles for future employees

  • Identify training pathways, in collaboration with academic institutes, for up-skilling staff who can then share their knowledge within their teams


Background


Big Data is, by definition, different from the traditional data sources used by National Statistical Organisations (NSOs). The new data sources pose new challenges across a range of expert areas, including methodology, quality assurance, technology, security, privacy, legal matters and skills. The breadth of these challenges adds to the complexity of incorporating big data into regular research or organisational operations, and makes the transition to their use difficult for many NSOs. The UN GWG Task Team on Training, Competencies and Capacity Development is tasked with delivering projects in five specific areas:

A. Assessment of institutional readiness for big data in official statistics;

B. Development of a Competency Framework for new data sources in official statistics;

C. Analysis of the supply and profiles of specialists in areas related to the analysis of new data sources and big data;

D. Development of a curriculum and associated training courses;

E. Capacity building and sharing experiences through innovation centres via a global network.

This report presents results from the first project, (A) an assessment of institutional readiness. The project aims to explore and understand the readiness of NSOs for the use of big data in official statistics, and to gather useful insights that might feed into project strands (B) – (E). For the purpose of this project, an institution’s “readiness” is defined by its maturity within four strategic areas:

  1. Strategic Data Science Coordination: The presence of, or future plans for, strategic data science coordination within the NSO and across the National Statistical Service (NSS). This also covers the budgetary requirements for financing big data analytics within the organisation.

  2. Legal Framework: The presence of, or future plans for, a legal framework for data access and data sharing within the NSO, the NSS, and potentially wider.

  3. IT Infrastructure: The extent of, or future plans for, the IT infrastructure to enable big data analytics in a secure environment.

  4. Human Resources: The number of data science posts within the NSO/NSS, the skills gaps and the future plans for recruitment and growth. This includes the skills needed to develop and maintain appropriate methodologies.

Data collection was undertaken via a questionnaire designed to collect data across these four areas and so enable an assessment of institutional readiness. The questionnaire was issued to 160 NSOs and was open from 4 October 2019 to 15 November 2019. Responses were received from 109 statistical organisations. After data cleaning (removal of incomplete responses and of larger non-national organisations), 100 National Statistical Organisations (NSOs) were retained for analysis. The overall response rate was 63%.
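
As a rough check on these figures, the sketch below recomputes the response rate from the counts given in the paragraph above (160 questionnaires issued, 109 responses received, 100 retained). The function and variable names are illustrative and not part of the Task Team's analysis. It is an assumption that the reported 63% is the retained count divided by the number of questionnaires issued, rounded half up.

```python
import math

ISSUED = 160     # questionnaires sent to NSOs (from the report)
RECEIVED = 109   # responses received (from the report)
RETAINED = 100   # NSOs kept after data cleaning (from the report)


def rate(numerator: int, denominator: int) -> float:
    """Return numerator as a percentage of denominator."""
    return 100 * numerator / denominator


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with .5 rounding up.

    Python's built-in round() rounds halves to the nearest even number,
    so round(62.5) gives 62 rather than 63.
    """
    return math.floor(value + 0.5)


raw_rate = rate(RECEIVED, ISSUED)      # based on all responses received
usable_rate = rate(RETAINED, ISSUED)   # based on responses retained for analysis

print(f"Raw response rate:    {raw_rate:.1f}%")
print(f"Usable response rate: {usable_rate:.1f}% (about {round_half_up(usable_rate)}%)")
```

On this reading, the 63% corresponds to the 100 retained NSOs (62.5% of 160) rather than to the 109 responses originally received.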

In order to support the work of other UN Task Teams, the results of the analysis will be fed across the UN network, to ensure that important findings from the data are shared for constructive use by others.

Survey respondents:

  • Africa: Botswana, Burkina Faso, Burundi, Cabo Verde, Guinea, Mauritius, Morocco, Mozambique, Senegal, Sierra Leone, Somalia, South Africa, Sudan, Tunisia, Zimbabwe

  • Americas: Antigua and Barbuda, Bolivia (Plurinational State of), Brazil, Canada, Chile, Colombia, Cuba, Ecuador, Mexico, Curaçao, Panama, Paraguay, Peru, Saint Kitts and Nevis, Suriname, Montserrat, United States of America

  • Asia and Pacific: Afghanistan, Armenia, Azerbaijan, Bahrain, Bangladesh, Brunei Darussalam, Cambodia, Macao, Hong Kong, China, Cyprus, Georgia, Indonesia, Iran (Islamic Republic of), Iraq, Israel, Japan, Jordan, Kuwait, Maldives, Mongolia, Myanmar, Nepal, State of Palestine, Philippines, Qatar, Republic of Korea, Saudi Arabia, Singapore, Thailand, Turkey, United Arab Emirates, Uzbekistan, Vanuatu, Viet Nam, Yemen

  • Europe: Albania, Austria, Belarus, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Czechia, Estonia, Finland, Germany, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Montenegro, Netherlands, North Macedonia, Portugal, Republic of Moldova, Romania, Russian Federation, Slovakia, Slovenia, Spain, Sweden, Switzerland, Ukraine, United Kingdom of Great Britain and Northern Ireland

Main report

Strategic Data Science Coordination


The Strategic Data Science Coordination section of the questionnaire aimed to assess the establishment of (or plans for) strategic data science coordination within NSOs and more widely (for example, across their NSS).

1. Big data/data science projects established

Many NSOs provided qualitative information about the type of big data / data science projects that have been established at their organisation. Some of the more common projects involve alternative data sources, for example, web scraped data, mobile phone data and scanner data. More information on projects can be found in the following inventory: https://unstats.un.org/bigdata/inventory/.
Almost half of the respondents undertake big data or data science projects: 47% of NSOs currently do so, 32% do not yet but are trying to establish such projects, and around 21% do not plan to undertake them at all.

2. Big data/data science strategy in place

28 NSOs (35% of respondents) indicate that they have a big data/data science strategy in place.
Almost two thirds (60%) of respondents report that they are trying to establish a big data strategy in their NSO. The remaining 5% of respondents have no strategy established.

3. Chief Data Officer/Data Science Lead available

20 NSOs (25% of respondents) indicate that they have a designated Chief Data Officer/Data Science Lead in place.
About 42% are trying to establish this post, while about 30% do not plan to do so.
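
Items 2 and 3 report both a count of NSOs and a percentage (28 NSOs and 35%; 20 NSOs and 25%), but the report does not state the number of respondents on which these percentages are based. The sketch below is a back-of-envelope calculation, not part of the Task Team's analysis: it finds the whole-number bases that are consistent with each count and percentage pair, under the assumption that the percentages were rounded to the nearest whole number. The function name is hypothetical.

```python
import math


def implied_base_range(count: int, pct: int) -> tuple[int, int]:
    """Return the smallest and largest whole-number base for which
    `count` out of that base rounds to `pct` percent.

    Assumes the reported percentage was rounded to the nearest whole
    number, i.e. pct - 0.5 <= 100 * count / base < pct + 0.5.
    """
    # The share must stay below pct + 0.5, so the base must be strictly
    # larger than this bound.
    low = math.floor(count * 100 / (pct + 0.5)) + 1
    # The share must be at least pct - 0.5, so the base may not exceed
    # this bound.
    high = math.floor(count * 100 / (pct - 0.5))
    return low, high


# Count and percentage pairs taken from items 2 and 3.
for label, count, pct in [
    ("Strategy in place", 28, 35),
    ("Chief Data Officer in place", 20, 25),
]:
    low, high = implied_base_range(count, pct)
    print(f"{label}: {count} NSOs at {pct}% implies a base of {low} to {high}")
```

Both items give a range of 79 to 81, which suggests that these shares were computed on fewer than the 100 NSOs retained for analysis. Since the report does not state the base, this remains an inference.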