4th International Conference on
Big Data for Official Statistics
Organized by the UN Global Working Group, DANE and the Colombian Ministry for ICT

Overview

Although the statistical community has made good progress in using Big Data, many questions and challenges remain. Statistical offices still need to deliver timelier, more frequent and more granular data; the private sector remains a formidable competitor; and more engagement and collaboration with the private sector and other partners is necessary so that each side can benefit from the other's strengths.

This means that additional steps have to be taken: results from the work of the existing Task Teams should become usable for the statistical community; Big Data, administrative data and traditional statistical data sources should be treated together in a multi-source approach; the challenge is not overcoming the problems of using Big Data as such, but achieving trusted data collaboration; and in that respect the UN Global Working Group on Big Data needs a collaborative Global Platform for the overall organization of its work.

In short, the strength of the community of official statistics is trusted data, and the real need of the community is close collaboration with the private sector, academia, the research community and civil society.

Conference theme: Data collaboratives and trusted data

Data collaboratives are a new challenge and a new opportunity for the community of official statistics – in relation to Big Data, to the SDGs, and to the sharing of data, services, technologies and know-how.

  • How can statistical offices, technology companies and data owners collaborate in a mutually beneficial way in a changing world, in which data are seen as the most important source for creating wealth and development for all?
  • What are the experiences and lessons learned from existing data collaboratives in relation to coverage, inclusion (and exclusion) of partners, activities, management and financing?
  • How can we share micro-data and other sensitive data in a federated cloud environment, given regulatory frameworks for data privacy and statistical laws protecting confidentiality?
  • How could we effectively and collaboratively use modern tools and services, such as data lakes, integrated geo-spatial data and statistics, or the open-source Elastic Stack, while adapting job profiles and skill sets in the statistical office?

These and similar questions will be addressed at the 4th UN Conference on Big Data in Bogota, Colombia from November 8th to 10th – organised jointly by the UN Global Working Group, DANE and the Colombian Ministry for ICT.





Wednesday, 8 November 2017

08:00
Registration — Coffee

Opening Session
09:00
Welcome Address
  • Mr. Mauricio Perfetti del Corral, Director-General of DANE, Colombia
  • Mr. David Luna Sánchez, Minister of Information and Communication Technologies, Colombia
  • Ms. Alicia Bárcena, Executive Director, UNECLAC (video message)
  • Mr. Niels Ploug, Chair of the UN GWG on Big Data
  • Mr. Rafael de la Cruz, Inter-American Development Bank
  • Mr. Martin Santiago, UN Resident Representative in Colombia
Innovation and modernization of national statistics systems through “trusted data collaboratives”

The Cape Town Global Action Plan [Annex I] for Sustainable Development Data emphasized, among other things, the strengthening of innovation and modernization of national statistical systems. This innovation effort calls for a rethinking of the partnerships of the community of official statistics with the private sector, academia and civil society through an interconnected ecosystem of data and technology collaborations at national, regional and global levels. In this context, “Trusted Data Collaboratives” are a new way of working together, with a proper definition of the interests of the various stakeholders, a proper definition of responsibilities and access, and appropriate protocols to safeguard confidentiality.

09:50
Keynote address
Mr. Alexandre Barbosa, Head of the Regional Center for Studies on the Development of the Information Society
10:15
High-level Panel discussion
Moderator: Mr. Niels Ploug, Statistics Denmark
  • Mr. Carlos Prada, Deputy Director of DANE, Colombia
  • Ms. Heather Savory, Deputy National Statistician, United Kingdom
  • Ms. Sylvie Michaud, Deputy Chief Statistician, Statistics Canada
  • Mr. Gerardo Leyva, INEGI, Mexico
  • Mr. Albrecht Wirthmann, Eurostat
11:25
Coffee break
 
Defining the context – the 2030 Agenda, "leaving no one behind" and official statistics

On 25 September 2015 world leaders committed to the 2030 Agenda for Sustainable Development, including many ambitious goals and targets to be achieved by 2030. The statistical community was charged with defining appropriate indicators to monitor progress towards these targets. It was stressed that differentiation by population groups, sub-national location and smaller time intervals (“Leaving no one behind”) would make the information base more useful for policy decisions. The 2030 Agenda explicitly calls for enhancing capacity building to support national plans to implement the sustainable development goals. Along with the SDG monitoring, the underlying microdata should also become accessible through national, regional and international open data platforms and become discoverable through standard metadata documentation. These platforms should apply open data protocols for the creation and use of interoperable APIs based on ISO standards to allow for trusted data collaboratives, improved governance, citizen engagement, and inclusive development and innovation (see the Open Data Charter).

11:45
High-level Panel discussion
Moderator: Mr. Ivo Havinga, UNSD
  • Ms. Diana Nova, SDG Working Group, DANE
  • Ms. Alla Morrison, World Bank
  • Mr. Fessou Lawson, African Development Bank
  • Mr. Philipp Schönrock, CEPEI, Colombia
  • Mr. Aditya Agrawal, Global Partnership for Sustainable Development Data
  • Mr. Emmanuel Letouzé, Director and Co-Founder, Data-Pop Alliance
13:00
Lunch

New ways of working together
14:30
Session A: Local hubs and City Data

The World Council on City Data (WCCD) is the global leader in standardized city data - creating smart, sustainable, resilient, and prosperous cities. WCCD hosts a network of innovative cities committed to improving services and quality of life with open city data and provides a consistent and comprehensive platform for standardized urban metrics. WCCD is a global hub for creative learning partnerships across cities, international organizations, corporate partners, and academia to further innovation, envision alternative futures, and build better and more liveable cities. WCCD is implementing ISO 37120 Sustainable Development of Communities: Indicators for City Services and Quality of Life.

Moderator: Mr. Ivo Havinga, UNSD
Panel:
  • Mr. James Patava, World Council on City Data
  • Mr. Antion Avendano, City of Bogota
  • Ms. Sylvie Michaud, Statistics Canada
  • City of Buenos Aires (tbc)
Session B: Data collaboratives on SDG monitoring, Part I

The most promising way forward in compiling data for SDG monitoring is to integrate Big Data from new technologies with traditional data, in order to produce relevant, high-quality information with more detail and at higher frequencies to foster and monitor sustainable development. This also implies increased accessibility of data through much greater openness and transparency, which should ultimately empower people to achieve better policies, better decisions and greater participation and accountability, leading to better outcomes for people and planet.

Moderator: Mr. Ronald Jansen, UNSD
Panel:
  • Mr. Joaquim Barris, Climate Change indicators, UNFCCC
  • Ms. Alla Morrison, Collaborative Data Innovations for Sustainable Development, World Bank
  • Mr. Misha Lokshin, Household Survey Network, World Bank
  • Mr. Manuel Francisco Lemos, ESRI, Open Data Portals for SDG monitoring
16:00
Coffee break
16:20
Session A: National Data Centers involving official statistics

Digital services are becoming increasingly important in our lives. Services such as cloud, mobile apps, and other digital applications are facilitated by data centers. They have become the main enabler of the digital economy by facilitating a wide range of activities across government, business and society. Data centers are therefore an important part of the national critical infrastructure. As enablers of the digital economy, data centers play an important role when it comes to trust. Data should not only be accessible and available 24/7; secure data storage and privacy must also be guaranteed. Data centers provide a platform for organizations to compute, run and store their services and data. In the Netherlands, municipalities join forces with Statistics Netherlands in urban data centers to use data more effectively in local administration.

Moderator: Mr. Bert Kroese, Statistics Netherlands
Panel:
  • Mr. Jorge Caldas Gallo, Alianza Caoba
  • Ms. Heather Savory, ONS, United Kingdom
  • Ms. Sylvie Michaud, Statistics Canada
  • Mr. Setia Pramana, BPS Indonesia, Jakarta
Session B: Data collaboratives on SDG monitoring, Part II

The most promising way forward in compiling data for SDG monitoring is to integrate Big Data from new technologies with traditional data, in order to produce relevant, high-quality information with more detail and at higher frequencies to foster and monitor sustainable development. This also implies increased accessibility of data through much greater openness and transparency, which should ultimately empower people to achieve better policies, better decisions and greater participation and accountability, leading to better outcomes for people and planet.

Moderator: Mr. Misha Lokshin, World Bank
Panel:
  • Mr. Alexandre Barbosa, CETIC/NIC - Improving data availability for SDGs
  • Ms. Dias Rahwidiati, UN Global Pulse
  • Mr. Joao Azevedo, World Bank – Poverty measurement
  • Mr. Esteban Pelaez Gomez, Fundación Corona
  • Ms. Ana Lucía Martínez & Mr. Carlos Mazariegos, OPAL Project, Colombia
17:30
Closing day 1

Thursday, 9 November 2017

09:00
Opening
Mr. Niels Ploug, Chair of the UN GWG on Big Data

Standards for Trusted Data Collaboratives

Within the community of official statistics, “trusted data” is defined in terms of ‘compliance with quality standards’. There are national quality assurance frameworks, statistical codes of practice and compliance with international standards, such as the System of National Accounts. The IMF developed a Special Data Dissemination Standard and a General Data Dissemination Standard, which can generally be used as a measure of “trusted data”. At a technical level, the statistical community has defined protocols for data exchange and interoperability of data systems, such as SDMX and DDI. More broadly, the private sector has defined security standards and associated protocols for data transmission, data storage and the like. These are taken up as ISO standards. In the business world, assurance is given through the certification that something is ISO compliant.

What kind of certification do we need for a “trusted data collaborative”?

09:10
Panel discussion
Moderator: Mr. Brant Zwiefel, Principal Architect, Microsoft
11:00
Coffee break

Use cases for Data, Services and Applications
11:15
Earth observation data and official statistics

The GWG task team on satellite imagery, geospatial data and remote sensing developed a handbook that contains information on sources of Earth observation data, methodologies for producing crop statistics and other statistics through the use of satellite imagery data, outlines of pilot projects and guidance for national statistical offices in exploring the use of Earth observation data for the first time. The pilot projects include an application of satellite imagery data in the production of agricultural statistics. This session also discusses a hands-on course to teach methods for using Earth observation data in generating agricultural crop statistics and in monitoring the SDGs.

Moderator: Ms. Sylvie Michaud, Statistics Canada
Panel:
  • Mr. Brant Zwiefel, Microsoft
  • Mr. Paulo Cunha, Amazon Web Services
  • Mr. Manuel Francisco Lemos, ESRI
  • Ms. Kerrie Mengersen, Queensland University of Technology, Australia
  • Mr. Zhou Wei, NBS, China
  • Ms. Yineth Acosta & Ms. Sandra Liliana Moreno, DANE, Colombia
13:00
Lunch
14:30
Scanner data and on-line data and official statistics

This session will provide information on how to access online data, select data sources, prepare raw data for use, and classify and process data for use in the CPI. It also discusses different methodologies and describes the status of scanner data implementation in different countries. Further, it gives a status update on scanner and online data integration projects and provides guidance for NSOs considering using this data source for the first time.

Moderator: Mr. Ivo Havinga, UNSD
Panel:
  • Mr. Michael Holt, Statistics Australia
  • Mr. Niels Ploug, Statistics Denmark
  • Mr. Jonathan Wylie, Statistics Canada
  • Mr. Manuel Bertolotto, Pricestat.com
  • Ms. Dana Childerhose, Nielsen
16:00
Coffee break

Use cases for Data, Services and Applications
16:15
Mobile phone data and official statistics

This session gives an overview of the data generated by mobile communication technologies and of the choices involved, clarifying the trade-offs between size, complexity and usefulness. The session will discuss the importance of understanding stakeholders and partnership models for mobile data projects, and the logical order of steps in the process of data extraction. It will also discuss how to calculate tourism statistics and how to identify tourism indicators, as well as calibration and inference.

Moderator: Mr. Margus Tiru, Positium
Panel:
  • Ms. Julieth Solano, DANE, Colombia
  • Mr. Juan David Olarte, MINTIC, Colombia
  • Mr. Jose Luis Fajardo, CLARO, Colombia
  • Mr. Fernando Reis, Eurostat
  • Mr. Dan Bogdanov, Cybernetica, Estonia
17:45
Closing day 2

Friday, 10 November 2017

09:00
Opening
Mr. Niels Ploug, Chair of the UN GWG on Big Data

Use cases for Data, Services and Applications
09:10
Session A: Multi-source data for official statistics

Trusted Data Collaboratives are about the use of big data and its integration with administrative sources, geospatial information and traditional survey and census data. The use of multi-source data requires collaboration among a number of stakeholders, such as the statistical office, government agencies, research institutes, civil society and the private sector. Assuring the quality of the outcome therefore requires quality assessment at all levels of the collaborative.

Moderator: Mr. Niels Ploug, Statistics Denmark
Panel:
  • Ms. Mara Brigitte Bravo & Mr. Javier Mauricio Jacome, DANE, Colombia
  • Ms. Margarita Ramirez, DANE, Colombia
  • Ms. Cornelia Hammer, IMF
  • Mr. Marcelo Pitta, NIC, Brazil - Digital economy
Session B: Enterprise and Trade Data Lake

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data. Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags. When a business question arises, the data lake can be queried for relevant data, and that smaller set of data can then be analyzed to help answer the question. The term data lake is often associated with Hadoop-oriented object storage. In such a scenario, an organization's data is first loaded into the Hadoop platform, and then business analytics and data mining tools are applied to the data where it resides on Hadoop's cluster nodes of commodity computers. Like big data, the term data lake is sometimes disparaged as being simply a marketing label for a product that supports Hadoop. Increasingly, however, the term is being accepted as a way to describe any large data pool in which the schema and data requirements are not defined until the data is queried.
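
As a rough illustration of the ideas in this description (flat storage, a unique identifier and metadata tags per element, and a schema applied only at query time), a minimal in-memory sketch in Python could look as follows. The class `TinyDataLake`, its methods, the tag names and the sample records are all hypothetical and invented for this example; they do not describe any system presented in the session.

```python
import csv
import io
import json
import uuid


class TinyDataLake:
    """Flat, in-memory stand-in for a data lake (hypothetical, illustrative only)."""

    def __init__(self):
        # Flat architecture: a single mapping, no folders or hierarchy.
        self._objects = {}

    def put(self, raw, **tags):
        """Store raw bytes unchanged and return a new unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = {"raw": raw, "tags": tags}
        return object_id

    def find(self, **wanted):
        """Return identifiers of objects whose tags match every wanted pair."""
        return [
            object_id
            for object_id, obj in self._objects.items()
            if all(obj["tags"].get(key) == value for key, value in wanted.items())
        ]

    def read(self, object_id, parser):
        """Interpret the raw bytes only now, at query time (schema on read)."""
        return parser(self._objects[object_id]["raw"])


def parse_csv(raw):
    """Read raw CSV bytes into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))


def parse_json(raw):
    """Read raw JSON bytes into Python objects."""
    return json.loads(raw.decode("utf-8"))


# Made-up sample data in two native formats, stored side by side.
lake = TinyDataLake()
csv_id = lake.put(b"country,exports\nNL,10\nCO,7\n", domain="trade", format="csv")
json_id = lake.put(b'{"enterprise": "A", "employees": 12}', domain="enterprise", format="json")

# A question arises: select the relevant subset by tag, then apply a schema.
trade_ids = lake.find(domain="trade")
trade_rows = lake.read(trade_ids[0], parse_csv)
total_exports = sum(int(row["exports"]) for row in trade_rows)
print(total_exports)
```

In this sketch nothing is parsed or validated when data is stored; the choice of parser is deferred until a question is asked, which is the property the description refers to when it says that schema and data requirements are not defined until the data is queried.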

Moderator: Mr. Setia Pramana, BPS Indonesia, Jakarta
Panel:
  • Enterprise Data Lake
    • Ms. Irene Salemink, Statistics Netherlands
  • Trade and Transport Lake
    • Mr. Ronald Jansen, UNSD
    • Mr. José Anson, UPU (via Skype)
    • Mr. Sai Ananthanarayan, ICAO
    • Mr. Tim de Jong, Statistics Netherlands
10:40
Coffee break
11:00
Modernizing official statistics in Latin America and the Caribbean – the way forward
Moderator: Mr. Gerardo Leyva, INEGI, Mexico
  • Mr. Carlos Prada, DANE
  • Mr. José Antonio Mejia, IADB
  • Mr. Gian Marcos Aguilar, Statistical Institute of Belize
  • Mr. Statchel Edwards, Antigua and Barbuda
  • Mr. César Vicuña, INEC, Ecuador
  • Ms. Philomen Harrison, CARICOM
13:00
Lunch
14:30
Proofs of Concept for Collaboratives for Data, Services and Applications
Moderator: Ms. Heather Savory, UK ONS
  • Mr. Mark Craddock, UK ONS - Global Platform on Data, Services and Applications
  • Mr. Louis Kouakou, African Development Bank - Regional Platform: African Information Highway
16:00
Coffee break
16:20
Bogota Statement on Trusted Data Collaborative
Moderator: Mr. Niels Ploug, Chair of GWG
  • Ms. Heather Savory, UK ONS
  • Ms. Sylvie Michaud, Statistics Canada
  • Ms. Philomen Harrison, CARICOM
  • Mr. Bert Kroese, Statistics Netherlands
  • Mr. Mauricio Perfetti, DANE
  • Mr. Ivo Havinga, UNSD
  • Mr. Georgy Oksenoyt, Rosstat
17:30
Closing of the Conference
