The Statistical Commission agreed at its 45th session to create the Global Working Group (GWG) on Big Data for Official Statistics to further investigate the benefits and challenges of Big Data, including the potential for monitoring and reporting on the sustainable development goals. In this context, the GWG and the greater official statistical community recognize the need to adequately address issues pertaining to methodology, quality, technology, data access, legislation, privacy, management and finance, and provide adequate cost-benefit analyses on the use of Big Data.
4th UN Conference on Big Data
Bogota, Colombia 8 - 10 Nov 2017
Data collaboratives are a new challenge and a new opportunity for the official statistics community in relation to Big Data, to the SDGs, and to the sharing of data, services, technologies, and know-how.
- How can statistical offices, technology companies and data owners collaborate in a mutually beneficial way in a changing world, in which data are seen as the most important source for creating wealth and development for all?
- What are the experiences and lessons learned from existing data collaboratives in relation to coverage, inclusion (and exclusion) of partners, activities, management and financing?
- How can we share micro-data and other sensitive data in a federated cloud environment given regulatory frameworks for data privacy and statistical laws protecting confidentiality?
- How could we effectively and collaboratively use modern tools and services, such as data lakes, integrated geo-spatial data and statistics, or the open-source Elastic Stack, while adapting job profiles and skill sets in the statistical office?
These and similar questions will be addressed at the 4th UN Conference on Big Data.
The GWG has compiled an Inventory of Big Data projects (including exploratory research, feasibility studies, pilot projects and projects currently in production) that have implications for compiling official statistics and/or supporting the measurement of the SDG indicators. The aim is to share broad information about potential Big Data projects in the statistical community and share specific information about partnerships, data sources, and tools. The Inventory includes information such as the objective of the project; the Big Data source used; data access and the use of partnerships; applicability to specific domain(s) of official statistics and/or SDG indicators; methods and technology used; and assessment of quality, among others. The GWG collected this information from the statistical community in two surveys conducted in 2014 and 2015.
Access to Big Data sources, and partnerships with other public and private organisations in order to work with Big Data, are becoming ever more important to national statistical systems (NSS) for fulfilling their mission in society. The NSS should collaborate rather than compete with the private sector, in order to advance the potential of official statistics. At the same time, the NSS should remain impartial and independent, and invest in communicating the advantages of exploiting the wealth of available digital data to the benefit of the people. Building public trust will be the key to success. The objectives of the task team are to facilitate access to Big Data sources for official statistics and to facilitate partnerships with other public and private organisations in order to work with Big Data.
The recent report of the Independent Expert Advisory Group (IEAG) on the Data Revolution for Sustainable Development defines the data revolution for sustainable development as the integration of data coming from new technologies with traditional data, in order to produce relevant high-quality information, with more detail and at higher frequencies to foster and monitor sustainable development. This revolution also entails the increase in accessibility to data through much more openness and transparency, and ultimately more empowered people for better policies, better decisions and greater participation and accountability, leading to better outcomes for the people and the planet.
Mobile phone data has surfaced in recent years as one of the most promising Big Data sources. Given the high penetration rates of mobile phones, these data are expected to help fill data gaps, especially in developing countries. In its 2014 'Measuring the Information Society Report', ITU shows that the average mobile subscription rate is 96.4 per 100 inhabitants worldwide, with lower averages in Asia (89.2) and Africa (69.3). Even so, these numbers show how pervasive mobile phone use is. ITU notes that rural areas still lag behind urban areas, which should be considered in studies using mobile phone data, but it is clear that the coverage of these data is global: almost every person in the world lives within reach of a mobile-cellular signal.
The demand for more diversified, sophisticated and rapid statistical services could be met by leveraging the emerging sources of Big Data, such as those relating to remote sensing imagery, transactional and social media data and mobile device data. Satellite imagery has significant potential to provide more timely statistical outputs, to reduce the frequency of surveys, to reduce respondent burden and other costs and to provide data at a more disaggregated level for informed decision making. The Task Team on Satellite Imagery and Geo-Spatial Data aims to provide strategic vision, direction and development of a global work plan on utilising satellite imagery and geo-spatial data for official statistics and indicators for post-2015 development goals. We are building on precedents to innovatively solve the many challenges facing the use of satellite imagery and geo-spatial data sources.
Scanner data is a Big Data source increasingly used in national statistical systems for the calculation of price indices, as statistical offices explore ways to meet society's expectations for enhanced products and improved, more efficient ways of working. Many of the price measurement issues and methods for scanner data from supermarket chains and other retailers also apply to other Big Data sources (for example, online prices obtained from web scraping). This task team plans to deliver: 1. An open source application for analysis, monitoring and index estimation from cleaned and classified prices' big data; 2. Accompanying training and instructional material; and 3. Accompanying methodological guidance, including recommendations and a catalogue of good practice.
Big Data is by definition different from traditional data sources used by national statistical systems (NSSs). This implies that new methodologies need to be developed to work with Big Data. Big Data sources pose challenges both in how to process and analyse them and in the technology needed to handle them. This means that new skill sets are necessary to work successfully with the new Big Data sources. Some of these skill sets could be hired temporarily; others will need to become an integral part of the institution. It is up to senior management to decide what will be done by the institute itself and what will be outsourced. An additional complication is that there is not just one kind of Big Data source, and each kind may have different requirements as far as new skill sets are concerned. We therefore need to develop tools to identify and assess the needs for new skills.
Building on the best practices of public and private Big Data initiatives, and offering the technology infrastructure and a network for data innovation to the official statistical community, the Global Platform could address the need for (a) a global hub for official statisticians, data scientists and domain experts from the public and private sector to exchange ideas and methods for processing, analysing and visualising Big Data; (b) a global hub for storing Big Data, together with the related processing, analysis and visualisation methodology, and services and applications for continuous development and re-use; (c) a global hub for demonstrating, by means of pilots and case studies, the value of Big Data for better decision making through official statistics; and (d) a global resource hub for training materials and workshops on Big Data for capability building.
- Korea, Republic of
- United Arab Emirates
- United Kingdom
- United States
- UN Global Pulse
- World Bank