SPRINT ON
Artificial Intelligence and Data Science
for Economic Statistics


WEBINAR 1 - 7 Nov 2024
WEBINAR 2 - 12 Dec 2024
SYMPOSIUM - 20 - 22 Jan 2025 - Dubai, United Arab Emirates

WEBINAR 1 - 7 Nov 2024, 07:00 - 10:00 am (GMT-4)

This AI and Data Science Sprint for Economic Statistics plays an important role in the statistical community's ongoing efforts to integrate advanced data science techniques within national and international statistical systems. By highlighting successful use cases and discussing the strategic integration of new technologies, the Sprint will help shape the future of economic statistics production. It is also an important step in the further development of the Data Science Leadership Network playbook and seeks to empower national statistical offices (NSOs) with cutting-edge tools and methodologies for statistical production.

The sprint has four main objectives: (1) Present a series of impactful AI and data science projects in statistical production utilizing reproducible analytical pipelines; (2) Present AI and data science projects for the compilation of economic indicators to inform economic policies; (3) Present the possible use of generative AI technologies for the dissemination and interpretation of statistics; and (4) Address strategic and cross-cutting issues.


Session 3

Reproducible Analytical Pipelines (RAP) strategy and implementation at ONS
Martin Ralphs, Head of Analysis Standards and Pipelines, Methodology and Quality Directorate, Office for National Statistics (ONS), United Kingdom

Summary: Reproducible Analytical Pipelines (RAP) bring good software engineering practices into statistical production. In this talk we'll discuss our approach to embedding RAP in ONS to build statistical outputs, focusing on three themes: capability, culture and tools. We'll cover how we're building and embedding capability, the different models of RAP support we're using, and the challenges we have faced and the lessons we're learning as we scale up our use of RAP.


RAPs for the Swiss Federal Pension Fund
Christopher Sulkowski, Data Scientist, Data Science Competence Center (DSCC), FSO Switzerland

Summary: The Data Science Competence Center (DSCC) implemented a fully reproducible and versioned pipeline architecture for the Swiss Federal Pension Fund, both to process the data of its clients and to support analytical work on that data. This presentation will focus on the technologies and methods used to ensure the replicability of the procedures and to provide access to this data across the pension fund.


RAPs example - PortWatch and UNGP
Mario Saraiva and Alessandra Sozzi, IMF

Summary: PortWatch is an open platform for monitoring and simulating disruptions to maritime trade flows, enabling policymakers and the public to assess the impacts of trade shocks, such as those caused by natural disasters and conflicts. The presentation will showcase the analytical pipeline that takes real-time AIS data from the UN Global Platform and produces daily port activity and trade estimates for over 1,600 ports and 24 significant maritime passages worldwide, while ensuring cost-efficient data handling and reliable analytical outputs.


Utilization of AIS Data for Transportation Statistics and Greenhouse Gas Emission
Setia Pramana, BPS Indonesia

Summary: This study uses AIS data from the United Nations Global Platform and ship register data from IHS Markit to track vessel movements. In addition to these two sources, the study draws on the World Port Index, BPS maritime statistics, and the Fourth IMO Greenhouse Gas Study (4th IMO GHG) for emission factors. Data pre-processing involved removing duplicate AIS records, matching them with IHS data, retaining specific vessel types, calculating voyage durations, and filtering out inactive or noisy messages, including those carrying default values for speed and draught.
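The pre-processing steps listed in this summary could look roughly like the sketch below. It is an illustration only, not the BPS pipeline: the field names (`mmsi`, `timestamp`, `sog`, `draught`, `vessel_type`), the use of MMSI as the join key to the ship register, the retained vessel types, and the sample records are all assumptions made for the example. The "not available" markers (102.3 knots for speed over ground, 0 for draught) follow the usual AIS convention, and the inactivity filter is left out because the summary does not define it.

```python
from datetime import datetime

# Conventional AIS "not available" defaults (treated here as noise).
SOG_NOT_AVAILABLE = 102.3      # knots
DRAUGHT_NOT_AVAILABLE = 0.0    # metres
KEEP_TYPES = {"Cargo", "Tanker"}  # hypothetical selection of vessel types


def preprocess(messages, register):
    """Deduplicate AIS messages, join the ship register, filter types and noise."""
    seen = set()
    cleaned = []
    for msg in messages:
        key = (msg["mmsi"], msg["timestamp"])  # assumed duplicate definition
        if key in seen:
            continue
        seen.add(key)
        ship = register.get(msg["mmsi"])       # match with register data
        if ship is None or ship["vessel_type"] not in KEEP_TYPES:
            continue
        if msg["sog"] >= SOG_NOT_AVAILABLE or msg["draught"] <= DRAUGHT_NOT_AVAILABLE:
            continue                           # default values: drop as noise
        cleaned.append({**msg, "vessel_type": ship["vessel_type"]})
    return cleaned


def voyage_hours(messages):
    """Hours between the first and last retained message of each vessel."""
    spans = {}
    for msg in messages:
        t = datetime.fromisoformat(msg["timestamp"])
        first, last = spans.get(msg["mmsi"], (t, t))
        spans[msg["mmsi"]] = (min(first, t), max(last, t))
    return {m: (last - first).total_seconds() / 3600 for m, (first, last) in spans.items()}


# Made-up sample records, for illustration only.
register = {1: {"vessel_type": "Cargo"}, 2: {"vessel_type": "Tanker"},
            3: {"vessel_type": "Fishing"}}
messages = [
    {"mmsi": 1, "timestamp": "2024-01-01T00:00:00", "sog": 12.0, "draught": 8.5},
    {"mmsi": 1, "timestamp": "2024-01-01T00:00:00", "sog": 12.0, "draught": 8.5},
    {"mmsi": 1, "timestamp": "2024-01-01T03:00:00", "sog": 102.3, "draught": 8.5},
    {"mmsi": 1, "timestamp": "2024-01-01T06:00:00", "sog": 11.0, "draught": 8.5},
    {"mmsi": 2, "timestamp": "2024-01-01T01:00:00", "sog": 9.0, "draught": 0.0},
    {"mmsi": 3, "timestamp": "2024-01-01T01:00:00", "sog": 4.0, "draught": 3.0},
    {"mmsi": 4, "timestamp": "2024-01-01T01:00:00", "sog": 7.0, "draught": 5.0},
]

cleaned = preprocess(messages, register)
durations = voyage_hours(cleaned)
print(len(cleaned), durations)
```

In a production setting the same steps would normally run on a distributed engine over the full AIS archive; the plain-Python version is only meant to make the order of operations concrete.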


Session 4

Earth Observations and ML for Estimating Agricultural Activity
Abel Coronado Iruegas, Deputy Director of Research in Data Science, INEGI, Mexico

Summary: This presentation will demonstrate how INEGI is leveraging satellite imagery and remote sensing data, in conjunction with machine learning algorithms, to accurately classify crop types. The primary focus will be on how this approach delivers timely and precise information regarding crop distribution and land use, which is indispensable for economic planning, food security, and effective policy-making.


AI/ML for estimating firm-level supply chain network
Gert Buiten, Statistics Netherlands

Summary: Supply chains have become very important in recent decades, boosting economic growth, employment and income in many countries around the world. At the same time, the world has become vulnerable to supply chain shocks. But supply chains may also help us tackle major challenges such as the energy transition and the improvement of productivity and innovation. In recent years, this has led to a lot of research on creating and analyzing firm-level supply chain datasets. However, observed data on buyer-supplier relations for all firms are available for only a limited number of countries, and in a number of those cases only to specific institutions, e.g. the central bank. This situation gave rise to the development of several network reconstruction techniques, some of which are based on Machine Learning (ML). Work package 11 (WP11) of the AIML4OS project aims to develop an ML-based model that will allow all EU NSIs to reconstruct a firm-level supply chain network dataset with a basic quality level, and that can be run on input data available at all EU NSIs. The starting point is an ML model developed recently at the Institute for New Economic Thinking (INET) at Oxford University. WP11 will enhance and improve the Oxford model to make it more generally applicable, create a firm-level network dataset based on Portuguese administrative data, and use that dataset to train the ML models. The resulting network reconstruction models will then be run on data from other NSIs, e.g. Statistics Netherlands, to check their applicability.


Solar Panel Detection using Ortho-imagery and Deep Learning
Nina Hofer, Statistics Austria

Summary: Over the last year, Statistics Austria has seen broad interest in geodata on solar panels; however, no such data are available to us in sufficient quality and with enough spatial detail. Filling this data gap is especially interesting for enriching energy statistics and our building register, not to mention the public interest in renewable energy and possible data integration for science and politics. We are therefore evaluating whether high-resolution ortho-imagery from Austria and sophisticated deep learning models can be used to detect solar panels on building rooftops, which methodological approaches are necessary, and what the resulting quality and limitations are. Besides the methodological background and first practical results, we will present our lessons learned and future prospects for generating a nationwide dataset of geolocated solar panels.


AI-Driven Economic Indicators: Charting a Course for Responsible Implementation in Economic Statistics
Hadi Susanto, BPS Statistics Indonesia

Summary: The application of AI/ML to the analysis and estimation of economic indicators will disrupt the paradigm of economic statistics. The presentation is designed to highlight the changes that AI-enabled methods can bring to the accuracy, speed and depth of economic estimation and analysis. We will discuss the benefits of better real-time and high-frequency data processing, along with the ability to extract deeper insight from large volumes of information, while critically reviewing what may plague this technological trend. The challenges include data-related problems such as quality and bias, a lack of transparency in the models being used, and privacy concerns over who has access to sensitive personal or business-related economic information.

To help chart a responsible implementation course, this presentation introduces a multi-pronged approach to mitigating each of these challenges. We support the creation of formalized data governance mechanisms, increased efforts on model transparency and interpretability, and the implementation of ethical frameworks with regulatory standards for AI applications in economic statistics. Finally, we show that interdisciplinary collaboration among economists, data scientists, and ethicists is needed more than ever to foster a fairer future in AI-driven economic analysis, and we stress the importance of investing in cross-disciplinary education and skill-building. We hope that by looking closely at both the potential achievements and the pitfalls, we can inform a balanced approach to applying AI technologies, both in labor production and in building robust economic indices, so that these state-of-the-art tools reinforce rather than undermine the integrity and relevance of the statistical material used to shape policy decisions.


Shedding light on Economic Growth: GDP Nowcasting with Satellite Data
Iyke Maduako, IMF

Summary: Nowcasting economic activity in many developing economies is very challenging due to the lack of timely, high-frequency macroeconomic indicators. To address this challenge, we employ an approach that goes beyond conventional methodologies. This presentation outlines our method, which uses Machine Learning (ML) to nowcast countries' GDP based on a combination of traditional and non-traditional (satellite data) indicators. The study further investigates the advantages and disadvantages of several existing ML methods, with a focus on Decision Tree-based models, Model Stacking and Dynamic Factor methods. Our approach seeks to address the critical issue of data gaps in GDP nowcasting.
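As a rough illustration of the model stacking idea mentioned in this summary, the sketch below combines one base model fed by a traditional indicator with one fed by a satellite-derived indicator. It is not the IMF method: the base learners are simple one-variable regressions rather than tree-based models, the meta-learner is a single blending weight constrained to [0, 1], and all series (`survey`, `lights`, `gdp`) are synthetic numbers invented for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - slope * mx, slope


def out_of_fold(xs, ys, k=4):
    """Predict each point with a model fitted on the other folds only."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = a + b * xs[i]
    return preds


def blend_weight(p1, p2, ys):
    """Meta-learner: weight w in [0, 1] minimising squared error of w*p1 + (1-w)*p2."""
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return min(1.0, max(0.0, num / den)) if den else 0.5


def sse(preds, ys):
    """Sum of squared errors."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys))


# Synthetic quarterly data, for illustration only.
survey = [0.2, 0.5, 0.1, 0.8, 0.6, 0.9, 0.4, 0.7]   # traditional indicator
lights = [1.0, 1.3, 0.9, 1.2, 1.6, 1.5, 1.1, 1.8]   # satellite-derived indicator
gdp = [0.9, 1.4, 0.7, 1.6, 1.7, 1.9, 1.1, 2.0]      # target: GDP growth

# Level 0: out-of-fold predictions from each base model.
oof_survey = out_of_fold(survey, gdp)
oof_lights = out_of_fold(lights, gdp)

# Level 1: learn how to combine the base models.
w = blend_weight(oof_survey, oof_lights, gdp)
stacked_oof = [w * a + (1 - w) * b for a, b in zip(oof_survey, oof_lights)]

# Refit base models on all data and nowcast a new quarter (hypothetical inputs).
a1, b1 = fit_line(survey, gdp)
a2, b2 = fit_line(lights, gdp)
nowcast = w * (a1 + b1 * 0.65) + (1 - w) * (a2 + b2 * 1.4)
print(f"weight on survey model: {w:.2f}, nowcast: {nowcast:.2f}")
```

The interleaved folds used here ignore the time ordering of the data; for an actual nowcasting exercise an expanding-window split would be the more defensible choice.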