Big Data Project Inventory



The GWG Big Data Inventory is a catalog of Big Data projects that are relevant for official statistics, SDG indicators and other statistics needed for decision-making on public policies, as well as for management and monitoring of public sector programs/projects. This inventory is a joint product of the World Bank and the United Nations Statistics Division (UNSD) put together on behalf of the UN Global Working Group (GWG) on Big Data for Official Statistics. The tasks related to the content of the inventory are led by the World Bank and UNSD, and the technical side is serviced by the UNSD technical team.



If you are working on a project that you would like to be considered for inclusion in this Inventory, even if the project is in an initial phase, please fill out this application form.

Please note that the project should use Big Data sources and/or Big Data techniques, and ideally have some relevance or implications for official statistics, SDG indicators or other statistics needed for decision-making on public policies. The Global Working Group will review submissions and include those projects that meet these criteria, or possibly contact you for further information. Please note that the information you submit, once approved, will be made public on the GWG Big Data Project Inventory website.

Country/Area: Belgium
Institute / Dept: Belgium - Statistics Belgium
    Data sources:
  • Satellite imagery or aerial imagery data
Project description:

Study the feasibility of using geographical data from web services, either open (e.g. Nominatim, OpenStreetMap) or proprietary (e.g. Google Maps), for the geocoding of static objects not covered by other sources (such as the Registry Office or Population Register). The objective is improved geographical localization of statistical units (for linking) and maximally detailed geographical breakdowns in a wide range of statistical domains.
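
As an illustration of what geocoding through an open web service can look like, the following minimal sketch builds a request URL for the public Nominatim search endpoint and reads coordinates from a decoded JSON response. It is not the project's implementation: the helper names, the example address and the default country code are assumptions made for illustration, and no network request is sent. Any real use of the public service would need to respect its usage policy.

```python
from urllib.parse import urlencode

# Public Nominatim search endpoint (OpenStreetMap data).
NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def build_geocode_url(address, country_code="be"):
    """Return a Nominatim search URL for a free-text address.

    Hypothetical helper: it only builds the URL and does not send a request.
    """
    params = {
        "q": address,                  # free-text query
        "format": "jsonv2",            # JSON output format
        "countrycodes": country_code,  # restrict results to one country
        "limit": 1,                    # keep only the best match
    }
    return f"{NOMINATIM_SEARCH_URL}?{urlencode(params)}"


def first_coordinates(results):
    """Extract (lat, lon) from a decoded JSON result list, or None if empty."""
    if not results:
        return None
    best = results[0]
    # Nominatim returns latitude and longitude as strings.
    return float(best["lat"]), float(best["lon"])


if __name__ == "__main__":
    print(build_geocode_url("Rue de la Loi 16, Brussels"))
```

Keeping URL construction separate from response parsing makes it straightforward to swap in another geocoding service for comparison.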

Country/Area: Belgium
Institute / Dept: Belgium - Statistics Belgium
    Data sources:
  • Mobile phone data
Project description:

Assess the feasibility of using mobile phone data to supplement or even replace data sources for statistical products, mainly in the areas of tourism or transport statistics. Explore the possibility of recording phenomena not yet accessible through traditional methods.

Country/Area: Belgium
Institute / Dept: Belgium - Statistics Belgium
    Data sources:
  • Web scraping data
Project description:

Explore the possibility of using web scraping as input for labor market indicators, focusing on job vacancies but not excluding other potentially interesting information. Statistics Belgium has some experience and know-how in web scraping: it already scrapes online prices of several products and services for the calculation of CPIs.

Country/Area: Belgium
Institute / Dept: Belgium - Statistics Belgium
    Data sources:
  • Smart meter electricity data
Project description:

Explore the possibility of using electricity, gas and water smart meter data for energy, environment and other statistics (e.g. non-occupancy rates, household consumption). As smart meters are not yet widespread (but expected to become so in the near future), the study will be an intermediate step and will also focus on using 'dumb meter' data available from distributor companies (many customers already input meter readings online).

Country/Area: Austria
Institute / Dept: Austria - Statistics Austria
    Data sources:
  • Scanner data
Project description:

Scanner data from big retail chains can be used to collect prices and quantities. Currently a pilot with a small data snapshot is being conducted. Negotiations with retail chains are ongoing.

Country/Area: Austria
Institute / Dept: Austria - Statistics Austria
    Data sources:
  • Web scraping data
Project description:

Development and implementation of automatic price collection procedures using Web Crawlers for price index compilations. Assessment of automatically-collected data in terms of quality and efficiency gains and their applicability for other statistics.

Country/Area: Belgium
Institute / Dept: Belgium - Statistics Belgium
    Data sources:
  • Scanner data
Project description:

Using very detailed scanner data from the major supermarket chains as input for monthly consumer prices (national CPI and very soon European HICP), particularly for food and personal hygiene products.

Country/Area: Cameroon
Institute / Dept: Cameroon - National Institute of Statistics
    Data sources:
  • Other
Project description:

Capacity building to develop national skills in processing Big Data for official statistics. Share experience and benchmark with other countries.

Country/Area: Cameroon
Institute / Dept: Cameroon - National Institute of Statistics
    Data sources:
  • Other
Project description:

As a developing country, Cameroon needs to build capacities and skills in this new domain of statistics. We are exploring these opportunities to learn and adapt methodologies and processing. We expect it to be less costly than classic surveys, but the challenges appear substantial in terms of coverage and public willingness to collaborate.

Country/Area: Canada
Institute / Dept: Canada - Statistics Canada
    Data sources:
  • Other
Project description:

We completed a feasibility study for the development of a Canadian inventory of non-residential buildings (commercial, industrial, government and institutional buildings). Two prototypes were developed (one for a municipality; the second for a typology of buildings). In its most complete form, it is envisioned that the inventory will be used for storage and query, analysis and dissemination, with specific attention to open data considerations and the integration of big data in a spatial framework. It is expected that the inventory can be used to generate statistics from big data sources.

Country/Area: Canada
Institute / Dept: Canada - Statistics Canada
    Data sources:
  • Other
Project description:

The goal of the project is to implement a set of indices of physical accessibility to services and a general index of remoteness. These indices could be used in research, policy analysis and program delivery. The methodology used in this project integrates non-official data sources with official statistical sources. The implementation showed that, for this specific application, the non-official data source was more suitable than comparable data from official statistical sources.

Country/Area: Canada
Institute / Dept: Canada - Statistics Canada
    Data sources:
  • Smart meter electricity data
Project description:

In fall 2014, Statistics Canada funded its first big data pilot project. The two key objectives of the project were: 1) to use smart meter data as an example of big data, to explore what is and isn't feasible, as well as the tools and skills required, and the potential benefits and pitfalls of utilizing data of that magnitude at Statistics Canada, and 2) to test the feasibility of replacing and/or supplementing Statistics Canada's residential electricity consumption survey data with smart meter data. Currently, there are several surveys and statistical programs at Statistics Canada that either collect or utilize residential electricity consumption data or data related to residential electricity consumption; these include the Electricity Disposition - Quarterly Sector Survey, Electricity Supply Disposition Annual Survey, Survey of Household Spending, Quarterly Household Final Consumption Expenditure, Detailed Household Final Consumption Expenditure, Consumer Price Index, Purchasing Power Parities, InterCity Indexes of Price Differentials, and the Census. These nine surveys could potentially benefit from smart meter data and each represents an opportunity that could be explored. The potential of smart meter data is that in the future, instead of surveying individual utilities or households, we could collect the data directly from the smart meter entities, which would greatly reduce response burden while increasing relevance, timeliness and accuracy.

Country/Area: China
Institute / Dept: China - National Bureau of Statistics
    Data sources:
  • Web scraping data
Project description:

Crawl price data for particular cellphones with a crawler program and compile a daily price index as a reference for the monthly price data.
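
The description does not say how the daily index is compiled. As one possible illustration, the sketch below computes a Jevons-type elementary index (the geometric mean of price relatives) between a base day and a current day, using only the products observed on both days. The index formula, the function name and the example product identifiers are assumptions, not the method used by the National Bureau of Statistics.

```python
import math


def jevons_index(base_prices, current_prices):
    """Geometric mean of price relatives for matched products, base = 100.

    Both arguments map a product identifier to a positive price.
    Products missing on either day are ignored (matched-model approach).
    """
    matched = base_prices.keys() & current_prices.keys()
    if not matched:
        raise ValueError("no products observed on both days")
    log_sum = sum(
        math.log(current_prices[p] / base_prices[p]) for p in matched
    )
    return 100.0 * math.exp(log_sum / len(matched))


if __name__ == "__main__":
    # Hypothetical scraped prices for three phone models.
    base = {"phone-a": 1000.0, "phone-b": 2000.0, "phone-c": 1500.0}
    today = {"phone-a": 1100.0, "phone-b": 2200.0}  # phone-c not found today
    print(round(jevons_index(base, today), 2))
```

Working with matched products only keeps the index from reacting to models that simply appear or disappear from the website between two crawls.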

Country/Area: China
Institute / Dept: China - National Bureau of Statistics
    Data sources:
  • Satellite imagery or aerial imagery data
Project description:

Build the spatial sampling frame using data from land use surveys and the agricultural census, then update the sampling frame using satellite and aerial remote sensing. Using samples selected by a spatial sampling method, we estimate the crop planting area and output every season.

Country/Area: China
Institute / Dept: China - National Bureau of Statistics
    Data sources:
  • Credit card data
Project description:

We obtain the year-on-year data and the chain data on credit card transaction amounts in different industries from the headquarters of UnionPay on a monthly basis. We then use the data to verify the growth trends of retail sales.
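
To make the two growth measures concrete, the sketch below derives them from a monthly series of transaction amounts. It assumes that "chain data" means growth relative to the previous month, which the description does not state explicitly; the function name and the figures are hypothetical.

```python
def growth_rates(amounts, lag):
    """Percentage growth of each value relative to the value `lag` periods earlier.

    Returns None where no comparison value exists (or where it is zero).
    With monthly data, lag=12 gives year-on-year growth and lag=1 gives
    month-on-month (assumed here to correspond to "chain") growth.
    """
    rates = []
    for i, current in enumerate(amounts):
        if i < lag or amounts[i - lag] == 0:
            rates.append(None)
        else:
            rates.append((current / amounts[i - lag] - 1.0) * 100.0)
    return rates


if __name__ == "__main__":
    # Hypothetical monthly transaction amounts for 13 consecutive months.
    monthly = [100.0 + i for i in range(13)]
    print(growth_rates(monthly, 12)[-1])  # year-on-year growth, last month
    print(growth_rates(monthly, 1)[-1])   # month-on-month growth, last month
```

Comparing these rates with the growth of the retail sales series is then a matter of aligning the two by month.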

Country/Area: China
Institute / Dept: China - National Bureau of Statistics
    Data sources:
  • Road sensor data
  • Ships identification data
Project description:

In 2014, the Joint Transport Ministry studied the networks of the toll highway system and marine visa system. They found a way to apply the massive administrative records data of these two systems to the highway and waterway transport statistics, and the method has now been applied in most of the country on a trial basis.

Country/Area: Czech Republic
Institute / Dept: Czech Republic - Statistical Office
    Data sources:
  • Scanner data
Project description:

Data sharing of scanner data with the most important retail chains. The project aims at creating a new data source for price statistics and foresees potential secondary use for national accounts, household income and expenditure or business statistics.

Country/Area: Czech Republic
Institute / Dept: Czech Republic - Statistical Office
    Data sources:
  • Mobile phone data
Project description:

This pilot project aims at assessing the feasibility of using mobile positioning data for generating statistics on inbound and domestic tourism flows in the Czech Republic. Six localities have been selected where the records from a mobile operator (positioning data) will be compared with the number of visitors gathered by face-to-face survey. The results will be used for setting the methodology of the inbound (and domestic) tourism survey in the future (2016 or 2017).

Country/Area: Denmark
Institute / Dept: Denmark - Statistics Denmark
    Data sources:
  • Scanner data
Project description:

The purpose of this project is to test whether scanner data from the two major supermarket chains in Denmark can be used in the production of the CPI.

Country/Area: Ecuador
Institute / Dept: Ecuador - National Institute of Statistics and Censuses
    Data sources:
  • Social media data
Project description:

Develop an index of happiness based on data from social networks.

Country/Area: Ecuador
Institute / Dept: Ecuador - National Institute of Statistics and Censuses
    Data sources:
  • Web scraping data
Project description:

Use prices published on websites to carry out various technological and methodological exercises that can support different types of analysis and the development of indicators or indices, such as the consumer price index, based on information posted on the web.

Country/Area: Global
Institute / Dept: UN - Economic and Social Commission for Asia and the Pacific
    Data sources:
  • Other
Project description:

To use 'big data' in producing official statistics, statisticians and managers in national and local statistical systems need to closely examine and understand the potential use of various types of big data as well as the issues and limitations of their usage for producing official statistics and indicators. Statisticians and managers also need to gain and/or improve their knowledge and skills for working with such data and integrating such work in the standard statistical business processes. This project aims at developing a training curriculum to address capacity-building requirements on understanding and assessing the potentials for utilizing big data for official statistics, particularly in developing statistical systems of Asia and the Pacific, based on an assessment of knowledge and skill levels of their human resources. The curriculum will serve as an integrating framework for developing and conducting training modules focusing on big data utilization in specific domains of official statistics.

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Mobile phone data
Project description:

Use call detail records (CDRs) for population, mobility and urban statistics estimation and inferences

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Other
Project description:

Exploring Wikipedia page view counts as a source for official statistics.

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Other
Project description:

Assessment of the potential of the Google Trends product

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Other
Project description:

Explore the potential of flight reservation systems data for official statistics

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Scanner data
Project description:

Eurostat supports a number of projects aimed at integrating different price statistics and using collected prices for multiple purposes. The projects fall under a broader project, 'Multipurpose consumer price statistics'. One sub-project is particularly relevant in this context: the work on obtaining scanner data and implementing its use in consumer price statistics in the Member States. Eurostat supports Member States in this work and also supports further methodological development on the use of this data source. Scanner data is transaction data generated by retailers at point-of-sale terminals. This data is considered big data because it describes all supermarket transactions for a given retailer and period of time, is very detailed, and is delivered frequently (often weekly). Several Member States use scanner data in their consumer price statistics. These are The Netherlands, Sweden, Norway and Switzerland. Eurostat currently supports a further 17 Member States in obtaining and testing scanner data. Ten of these receive scanner data in some form. Eurostat is responsible, together with the Member States, for producing the HICP (Harmonized Index of Consumer Prices). Given this responsibility, Eurostat is keen to ensure that the introduction of scanner data does not jeopardize the comparability of the national HICPs. To this end, Eurostat is in the process of drafting guidelines on obtaining and using scanner data.
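
Because scanner data records both turnover and quantities sold, a price per item can be derived as a unit value (turnover divided by quantity) for each item and period. The sketch below illustrates that calculation under simplifying assumptions: the record layout, the item codes and the week labels are hypothetical, and the guidelines mentioned above may prescribe a different treatment.

```python
from collections import defaultdict


def unit_values(transactions):
    """Return {(item_code, week): unit value} from scanner records.

    Each record is (item_code, week, turnover, quantity). Records for the
    same item and week are pooled before dividing turnover by quantity.
    """
    turnover = defaultdict(float)
    quantity = defaultdict(float)
    for item_code, week, revenue, units in transactions:
        key = (item_code, week)
        turnover[key] += revenue
        quantity[key] += units
    # Skip item-weeks with no units sold to avoid dividing by zero.
    return {
        key: turnover[key] / quantity[key]
        for key in turnover
        if quantity[key] > 0
    }


if __name__ == "__main__":
    records = [
        ("milk-1l", "2015-W01", 10.0, 10),
        ("milk-1l", "2015-W01", 11.0, 10),
        ("soap", "2015-W01", 6.0, 3),
    ]
    for key, value in sorted(unit_values(records).items()):
        print(key, round(value, 2))
```

Unit values of this kind can then feed an index formula in place of the single shelf price collected in a traditional price survey.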

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Web scraping data
Project description:

The aim of this project was to assess the feasibility of employing modern and enhanced methodologies and indicators for collecting high-quality statistics from non-traditional data sources such as the Internet or Big Data sources. Concrete objectives were to elaborate an up-to-date conceptual framework of ICT statistics and related indicators from data collected using the Internet as a data source or from big data repositories. The framework should take into account methodological considerations such as definition of the universe, definition of observation and statistical units, selection of observation units and definition of characteristics or indicators, in order to explore the feasibility of extending Eurostat's methods and indicators for collecting official ICT statistics from non-traditional sources. The feasibility of collecting data directly through the user should be explored, and the collection of statistical data from enterprises' websites should be elaborated.
Another objective was to implement and test the methods and indicators developed for the two different approaches as a proof of concept and to collect information on a possible large-scale implementation. The project also aimed to explore the feasibility of using Big Data repositories as official data sources and to examine their potential for either supplementing or even completely replacing official statistics indicators, in particular by describing five data repositories related to statistical domains in the form of use cases and assessing their feasibility across the technical, organizational, methodological, cost-benefit, legal and socio-political dimensions.
A further objective was to elaborate an accreditation procedure to analyze and assess the required quality aspects of a broad range of possible resources in order to qualify statistical data as official statistics. The accreditation procedure should be in line with Eurostat's principles and guidelines on quality assurance as well as the European Statistics Code of Practice.

Country/Area: Europe
Institute / Dept: European Commission - Eurostat
    Data sources:
  • Mobile phone data
Project description:

The aim of the current study was to assess the feasibility of using mobile positioning data for generating statistics on domestic, outbound and inbound tourism flows, and to address the strengths and weaknesses related to access, trust, cost, and the technological and methodological challenges inherent in the use of such a new data source. The international consortium that conducted the study concentrated on the various aspects involved in the use of mobile positioning data in terms of tourism statistics and other domains: an overview of the situation involving the use of mobile data; accessibility to the data from the legal, technological, financial and business aspects, including possible cost and burden implications; methodological principles of statistical data collection and compilation, including evaluation by using different quality aspects and comparing the results against existing traditional methods; opportunities offered by, as well as limitations inherent in, the use of the data source. The outcomes can to a large extent be generalized beyond tourism statistics (i.e. applicable for other areas of statistics) and beyond the source of mobile positioning data (i.e. applicable for other 'big data' sources) in particular the comprehensive discussion on feasibility of access.

Country/Area: Finland
Institute / Dept: Finland - Statistics Finland
    Data sources:
  • Web scraping data
  • Scanner data
Project description:

Ongoing pilot project on consumer prices, exploring opportunities for the use of web scraping and scanner data.

Country/Area: Finland
Institute / Dept: Finland - Statistics Finland
    Data sources:
  • Road sensor data
  • Public transport usage data
Project description:

The project produced new commuting time estimates based on many data sources. Models are based on comparative studies and built for both commuting time and distance by driving, cycling and using public transport within the national route network.

Country/Area: Germany
Institute / Dept: Germany - Federal Statistical Office
    Data sources:
  • Web scraping data
Project description:

Country/Area: Hungary
Institute / Dept: Hungary - Central Statistical Office
    Data sources:
  • Web scraping data
Project description:

The Hungarian Central Statistical Office (HCSO) uses web scraping to obtain price data from a general retailer's website. The project is at the exploration stage, but in the future the HCSO plans to integrate this data source into the price statistics business processes. Price information other than the general retailer's data is already being considered for use in official statistics (e.g. information on flight prices). The search for suitable IT tools is ongoing. The HCSO is also currently investigating the use of road sensor data (from the National Toll Payment Services Plc) and scanner data (from the National Tax and Customs Administration of Hungary), but as the HCSO receives monthly aggregated datasets for the current methodological studies, these are not considered 'traditional' Big Data uses.

Country/Area: Hungary
Institute / Dept: Hungary - Central Statistical Office
    Data sources:
  • Web scraping data
  • Scanner data
Project description:

The main objective is to support the estimation of job vacancies in Hungary. Using this additional online information is expected to increase the accuracy of the job vacancy estimates. According to the current plans, this additional data source would provide an auxiliary variable for the production of statistics. This auxiliary information is also planned to be used for validation purposes. Web scraping tools are planned to be used to collect the necessary information from the web.

Country/Area: Hungary
Institute / Dept: Hungary - Central Statistical Office
    Data sources:
  • Other
Project description:

The main objective is to support the estimation of retail sales statistics. Using this additional online information is expected to increase the accuracy of the retail trade statistics. In the long term, the project aims at the production of statistics based fully on this Big Data source. For the upcoming years, production of statistics will run in parallel (based on the 'traditional' data collection and the online cash register data). We are currently in consultation with NTCA to receive monthly data from online cash registers to be used for our retail trade statistics and possibly other purposes. Depending on the outcome of current and future discussions, it will become a real big data source if more detailed (e.g. daily) datasets are received in the future.

Country/Area: Hungary
Institute / Dept: Hungary - Central Statistical Office
    Data sources:
  • Road sensor data
Project description:

The main objective is to support the estimation of the incoming and outgoing traffic at the borders of Hungary. Using this additional online information is expected to increase the accuracy of the tourism statistics. According to the current plans, this additional data source would provide an auxiliary variable for the production of statistics. The location of the cameras is currently an open issue; therefore, this auxiliary information is currently foreseen for validation purposes only. Data will be received from the National Toll Payment Services Plc. Acquired data on vehicles will be used for the analysis of the traffic.

Country/Area: Albania
Institute / Dept: World Bank Group
    Data sources:
Project description:

The project aims to improve the transparency of information systems used to manage road safety and maintenance activities in Albania, which has the highest rate of road-related fatalities in SE Europe. We seek to create a two-way virtual data platform whereby citizens can submit feedback on the road network and also receive information on road conditions. This will enable decision-makers to assess the efficiency of past interventions and respond quickly to emerging road maintenance and safety issues. The result will be a cost-effective tool for improving transparency in road management operations and for informing and empowering citizens, while improving the condition and safety of the road network.

Country/Area: Colombia, Chile
Institute / Dept: World Bank Group
    Data sources:
Project description:

Just-in-time analysis in disaster risk management (DRM) uses the technological platform of big data and analytics to address constantly changing data and "reality". This project aims to create a pilot platform based on the recommendations of an earlier study, which provided a needs assessment and a high-level system architecture design for using big data in just-in-time analysis. The proposed pilot countries are Colombia and Chile (to be confirmed).

Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Nighttime lights satellite data have proven to be a useful proxy for human development in numerous contexts, including urban development, population, and gross domestic product. In many data-poor regions of the world, the data derived from these satellite images are unparalleled in scope, accuracy, and coverage. The consistency of the data across the globe makes the human development estimations derived from the nighttime lights satellite data comparable between countries, which is often not the case when comparing other measures of human development. Currently, the nighttime lights data are delivered as yearly composites of the DMSP-OLS satellite (and are freely downloadable online). These products are useful, but suffer from a coarse spatial resolution (a single cell covers ~ one square kilometer) and are subject to several important sources of measurement error, emanating from the problem of sensor saturation and the phenomenon of over-glow. A new sensor launched in 2011, the Visible Infrared Imaging Radiometer Suite (VIIRS), is remedying this situation with a much finer spatial resolution (~ 300 m), as well as a more frequent data delivery schedule and more advanced, multi-spectral imaging capabilities. This proposal is to collaborate with scientists at the National Geospatial Data Centre.

Country/Area: Colombia, Haiti
Institute / Dept: World Bank Group
    Data sources:
Project description:

Recent research in Europe and Africa has shown that the attenuation of the electromagnetic signal between cell phone towers caused by rainfall can be used for measuring precipitation, and is especially useful in areas where few or no radars or rain gauges are available. The objective of this work is to modify the algorithm elaborated in other countries in order to make it applicable to the Caribbean context and to integrate the obtained information with the measured data from the weather stations. This data shall then be used in the lending operations for designing risk reduction measures such as early warning systems, flood mitigation measures, and the design of bridges, culverts and drainage systems.
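
The description does not specify the algorithm. A relation commonly used for this kind of retrieval is the power law k = a * R^b between specific attenuation k (dB/km) and rain rate R (mm/h), where the coefficients a and b depend on signal frequency and polarization. The sketch below inverts that relation for a single link. It is only an illustration: the function name and the coefficient values in the example are hypothetical, and it ignores effects that an operational algorithm would have to handle, such as baseline estimation and wet antennas.

```python
def rain_rate_from_attenuation(attenuation_db, link_length_km, a, b):
    """Path-averaged rain rate (mm/h) from rain-induced attenuation on one link.

    Inverts the power law k = a * R**b, where k is the specific attenuation
    in dB/km. The coefficients a and b must be supplied by the caller.
    """
    if link_length_km <= 0 or a <= 0 or b <= 0:
        raise ValueError("link length and coefficients must be positive")
    # Negative values (noise around the dry baseline) are treated as no rain.
    specific_attenuation = max(attenuation_db, 0.0) / link_length_km
    return (specific_attenuation / a) ** (1.0 / b)


if __name__ == "__main__":
    # Hypothetical coefficients and link: 2 km path, 3 dB of rain attenuation.
    print(rain_rate_from_attenuation(3.0, 2.0, a=0.12, b=1.1))
```

Estimates from many links can then be compared with, or combined with, rain gauge measurements from the weather stations.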

Country/Area: Ireland
Institute / Dept: Ireland - Central Statistics Office
    Data sources:
  • Mobile phone data
Project description:

The project will use roaming records to produce tourist information statistics.

Country/Area: Ireland
Institute / Dept: Ireland - Central Statistics Office
    Data sources:
  • Smart meter electricity data
Project description:

The Central Statistics Office (CSO) obtained access to data from the electricity provider. CSO worked with University College Dublin researchers in a project to estimate household occupancy and composition.

Country/Area: Ireland
Institute / Dept: Ireland - Central Statistics Office
    Data sources:
  • Social media data
Project description:

The office is currently investigating the statistical information that wikistats can provide, such as sleep patterns. It is looking for correlations with current statistics.

Country/Area: Israel
Institute / Dept: Israel - Central Bureau of Statistics
    Data sources:
  • Web scraping data
  • Scanner data
Project description:

We are in the execution phase of the project on scanner data for price statistics.

Country/Area: Israel
Institute / Dept: Israel - Central Bureau of Statistics
    Data sources:
  • Mobile phone data
  • Road sensor data
Project description:

The project has not been planned yet.

Country/Area: Italy
Institute / Dept: Italy - National Institute of Statistics
    Data sources:
  • Scanner data
Project description:

Use of scanner data provided by the largest supermarket chains for estimating the consumer price index. Scanner data should be integrated with survey data collected in small outlets. So far, ISTAT has received two years of data on food and grocery products for some provinces.

Country/Area: Italy
Institute / Dept: Italy - National Institute of Statistics
    Data sources:
  • Web scraping data
Project description:

Explore the possibility of using web scraping techniques, with text and data mining algorithms applied in the estimation phase, with the aim of replacing traditional instruments of data collection and estimation or combining them in an integrated approach. Aim: produce information, in particular on the use of the Internet and other networks for various purposes (e-commerce, e-skills, e-business, social media, e-government, etc.).

Country/Area: Italy
Institute / Dept: Italy - National Institute of Statistics
    Data sources:
  • Other
Project description:

Currently, flash statistics on the unemployment rate are produced after one month. The project aims to reduce this time lag.

Country/Area: Italy
Institute / Dept: Italy - National Institute of Statistics
    Data sources:
  • Mobile phone data
Project description:

1. The project focuses on the production of the origin/destination matrix of daily mobility for the purposes of work and study at the spatial granularity of municipalities.
2. In particular, it aims to produce statistics on the so-called city users, that is "Standing resident", "Embedded city users", "Daily city users" and "Free city users".
3. The project has the objective of comparing two approaches to mobility profile estimation, namely: (i) estimation based on mobile phone data and (ii) estimation based on administrative archives.
4. Production purposes.
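
An origin/destination matrix of the kind mentioned in the description can be represented as counts of people per (home municipality, destination municipality) pair. The sketch below shows that data structure only. The input layout and the municipality names are hypothetical, and deriving home and work or study locations from mobile phone data or administrative archives is the hard part that is not shown here.

```python
from collections import Counter


def origin_destination_matrix(commuters):
    """Count people per (home municipality, destination municipality) pair.

    `commuters` is an iterable of (home, destination) tuples, one per person.
    The result is a sparse matrix stored as a Counter keyed by the pair.
    """
    return Counter((home, destination) for home, destination in commuters)


if __name__ == "__main__":
    people = [
        ("Roma", "Roma"),
        ("Tivoli", "Roma"),
        ("Tivoli", "Roma"),
        ("Roma", "Fiumicino"),
    ]
    matrix = origin_destination_matrix(people)
    for (home, destination), count in sorted(matrix.items()):
        print(home, "->", destination, count)
```

Storing only the observed pairs keeps the structure compact, since most pairs of municipalities have no daily flows between them.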

Country/Area: Japan
Institute / Dept: Japan - Ministry of Internal Affairs and Communications
    Data sources:
  • Web scraping data
  • Scanner data
Project description:

Price collection using web scraping. Quality adjustment using scanner data for compiling the price index.

Country/Area: Luxembourg
Institute / Dept: Luxembourg - National Institute of Statistics
    Data sources:
  • Scanner data
Project description:

The project consists of receiving scanner data from retailers for use in the compilation of the Consumer Price Index (CPI).

Country/Area: Mexico
Institute / Dept: Mexico - National Institute of Statistics and Geography
    Data sources:
  • Social media data
Project description:

Exploration of different topics to review the feasibility of using information from Twitter to produce statistical and geographical information

Country/Area: Mexico
Institute / Dept: Mexico - National Institute of Statistics and Geography
    Data sources:
  • Satellite imagery or aerial imagery data
Project description:

Today in Mexico we use different types of satellite imagery to produce several kinds of data: topographical, geological, land use and geostatistical cartography

Country/Area: Netherlands
Institute / Dept: Netherlands - Statistics Netherlands
    Data sources:
  • Road sensor data
Project description:

On the basis of road sensor data for Dutch motorways, monthly traffic intensities are calculated for each road per COROP region. The project results in official statistics.
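
The aggregation step described above can be pictured as grouping sensor counts by road, region and month. The sketch below computes the average daily vehicle count for each group. It is a simplified illustration, not the method of Statistics Netherlands: the record layout, the road and region codes, and the choice of a plain daily average as the intensity measure are all assumptions.

```python
from collections import defaultdict


def monthly_intensity(daily_counts):
    """Average daily vehicle count per (road, region, month).

    Each record is (road, region, date, vehicles), with the date given as an
    ISO string "YYYY-MM-DD"; its first seven characters identify the month.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [vehicle total, day count]
    for road, region, date_iso, vehicles in daily_counts:
        key = (road, region, date_iso[:7])
        totals[key][0] += vehicles
        totals[key][1] += 1
    return {key: total / days for key, (total, days) in totals.items()}


if __name__ == "__main__":
    records = [
        ("A2", "region-17", "2015-01-01", 1000),
        ("A2", "region-17", "2015-01-02", 3000),
        ("A2", "region-17", "2015-02-01", 500),
    ]
    for key, value in sorted(monthly_intensity(records).items()):
        print(key, value)
```

In practice, raw sensor data would first need cleaning for missing or faulty readings before any such aggregation is meaningful.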

Country/Area: Netherlands
Institute / Dept: Netherlands - Statistics Netherlands
    Data sources:
  • Mobile phone data
Project description:

The aim of the project is to produce statistics on the spatial distribution of the Dutch population during the day, as opposed to the spatial distribution as registered by the municipality of residence. This will be based on mobile phone location data. The mobility aspect remains out of scope for the time being, but it is envisaged to use the same data source for an enrichment of tourism statistics in coming years. Before the start of the project, research had been done on the feasibility and methodology for these statistics.

Read More
Country/Area: Netherlands
Institute / Dept: Netherlands - Statistics Netherlands
    Data sources:
  • Social media data
Project description:

This is a research project that explores the usability of public social media messages for selected statistics, the methodology to be used for such statistics, and the reliability of the results. At the beginning of this research, an important research question was whether the results of the existing survey-based consumer confidence index could be replicated using only this source, while reducing its production time. The research contributes to the body of methodological knowledge and expands it beyond sampling theory. This data source is widely considered to have a huge potential for shedding light on a range of social phenomena. The current research has turned to indicators for social coherence. However, success does not automatically mean the application to official statistics, which requires an assessment that goes beyond the question of technical feasibility.

Read More
Country/Area: Netherlands
Institute / Dept: Netherlands - Statistics Netherlands
    Data sources:
  • Web scraping data
  • Scanner data
Project description:

For the CPI and other price indices, several sources may be used. Apart from price observation in shops, Statistics Netherlands uses scanner data from retail businesses. In addition, price information may be available on the websites of retail businesses or on websites of third parties that provide price comparisons. Price information for specific products is already collected manually from websites, and increasingly by internet robots. The aim of the project is to systematically collect price information by internet robots for a limited number of retail chains, so that the observation in the shops can be stopped.

Read More
Country/Area: Norway
Institute / Dept: Norway - Statistics Norway
    Data sources:
  • Web scraping data
Project description:

Identify areas where internet trade is significant and where prices are obtainable. Develop possible automated processes for collecting data, both to replace the present manual work and to expand into new areas. A discussion on how to identify scope and how to sample. Analyze collected online data. Evaluation of the results.

Read More
Country/Area: China
Institute / Dept: China - National Bureau of Statistics
    Data sources:
  • Web scraping data
Project description:

The project is for research purposes at present and has not yet been applied to statistical production.

Read More
Country/Area: China
Institute / Dept: China - National Bureau of Statistics
    Data sources:
  • Other
Project description:

The project is mainly for research purposes, offering references for researchers in economic and statistical production.

Read More
Country/Area: Poland
Institute / Dept: Poland - Central Statistical Office
    Data sources:
  • Web scraping data
Project description:

The goal of the experimental project is to decide whether data on job offers (labor market statistics) can be gathered from websites. In the pilot project, the largest portals were used as data sources for web scraping.

Read More
Country/Area: Poland
Institute / Dept: Poland - Central Statistical Office
    Data sources:
  • Web scraping data
Project description:

The goal is to use web scraping to prepare the survey frame for training institutions. At the moment there is no reliable frame for a new survey that will cover several aspects of training institutions' activities.

Read More
Country/Area: Poland
Institute / Dept: Poland - Central Statistical Office
    Data sources:
  • Mobile phone data
Project description:

Migration statistics based on data from mobile operators

Read More
Country/Area: Poland
Institute / Dept: Poland - Central Statistical Office
    Data sources:
  • Web scraping data
Project description:

Collaboration in the big data sandbox on web scraping of enterprise data. Evaluation of technologies and methodologies for web-scraped enterprise data.

Read More
Country/Area: Poland
Institute / Dept: Poland - Central Statistical Office
    Data sources:
  • Other
Project description:

The Central Statistical Office is currently identifying the sources of Big Data and the entities that can provide this type of data, with a view to their use in the production of statistics.

Read More
Country/Area: Korea Republic of
Institute / Dept: Korea Republic of - Statistics Korea
    Data sources:
  • Web scraping data
Project description:

The project aims to compile a price index using price data collected from websites.

Read More
Country/Area: Korea Republic of
Institute / Dept: Korea Republic of - Statistics Korea
    Data sources:
  • Mobile phone data
Project description:

Analysis of the situation of migration to the city, the district, etc. using mobile call detail record (CDR) data of three provinces

Read More
Country/Area: Romania
Institute / Dept: Romania - National Statistics Institute
    Data sources:
  • Scanner data
Project description:

We intend to use scanner data to improve price statistics and other economic statistics indicators. The project is in the conception phase, and the results will be used for developing new statistical techniques, monitoring new products, and making comparisons between different regions.

Read More
Country/Area: Singapore
Institute / Dept: Singapore - Department of Statistics
    Data sources:
  • Other
Project description:

Singapore's National Environment Agency (NEA)'s Integrated Environment System (IES) is an integrated platform that enables the centralised collection of various environmental and weather data for processing, analysis, visualization and distribution. It aims to harness environmental sensing systems coupled with data analytics and modelling to provide real-time environmental information and thus improve predictive capabilities in order to provide early warning, pre-emptive and sensing capabilities in support of field operations for NEA officers (e.g., preventive measures can be put in place prior to the arrival of the trans-boundary haze).

Read More
Country/Area: Singapore
Institute / Dept: Singapore - Department of Statistics
    Data sources:
  • Other
Project description:

Singapore adopted the de jure concept for Singapore's population estimates based on a person's place of usual residence. Under the de jure concept of "usual residence", Singapore residents (citizens and permanent residents) with valid local addresses and who were not away from Singapore for a continuous period of 12 months or longer were included in the total population count. Non-residents comprising foreigners who were working, studying or living in Singapore but not granted permanent residence were also included in the total population. The transient population, such as tourists and short-term visitors, are excluded. Basic characteristics of Singapore's population estimates are compiled using administrative records from multiple sources. The merged administrative records provided the basic population count and characteristics such as age, sex, type of dwelling and geographic distribution in Singapore.

Read More
Country/Area: Slovenia
Institute / Dept: Slovenia - Statistical Office of the Republic of Slovenia
    Data sources:
  • Web scraping data
  • Scanner data
Project description:

The aims of the project are: modernizing the collection of price data; testing web scraping tools for price data; and using scanner price data in the regular production of price statistics.

Read More
Country/Area: Slovenia
Institute / Dept: Slovenia - Statistical Office of the Republic of Slovenia
    Data sources:
  • Mobile phone data
Project description:

The aim of the project is to test the use of mobile data for geospatial statistics.

Read More
Country/Area: South Africa
Institute / Dept: South Africa - Statistics South Africa
    Data sources:
  • Scanner data
Project description:

Assessing the transactional data of large retail chains with the aim of determining their suitability for transformation into data for the Consumer Price Index. They will also be assessed for suitability for generating sales values for statistics on the retail trade sector.

Read More
Country/Area: Spain
Institute / Dept: Spain - National Statistical Institute
    Data sources:
  • Mobile phone data
Project description:

This project aims to construct origin-destination mobility matrices using mobile phone data. The project has very recently begun by contacting an operator company to obtain access to data and to set up an adequate methodology jointly with them. However, no analysis has been conducted yet, since access to data has not yet been granted.

Read More
Country/Area: Spain
Institute / Dept: Spain - National Statistical Institute
    Data sources:
  • Web scraping data
Project description:

This project aims at collecting price data from the web for specific articles in the CPI whose availability is progressively more difficult.

Read More
Country/Area: Sweden
Institute / Dept: Sweden - Statistics Sweden
    Data sources:
  • Ships identification data
Project description:

Pilot study to determine whether the data can be used for new or improved statistics, by means of linking AIS to geocoded data.

Read More
Country/Area: Sweden
Institute / Dept: Sweden - Statistics Sweden
    Data sources:
  • Web scraping data
Project description:

Identifying sources and usage. Just beginning to look at it, and partly in connection with UNECE Sandbox.

Read More
Country/Area: Sweden
Institute / Dept: Sweden - Statistics Sweden
    Data sources:
  • Credit card data
  • Scanner data
Project description:

Starting up work to explore what sources and methods can be used to improve HBS

Read More
Country/Area: Switzerland
Institute / Dept: Switzerland - Federal Statistical Office
    Data sources:
  • Scanner data
Project description:

Some of the biggest convenience stores give us aggregated data in the food sector.

Read More
Country/Area: Switzerland
Institute / Dept: Switzerland - Federal Statistical Office
    Data sources:
  • Scanner data
Project description:

Since July 2008 in addition to traditionally collected prices, the Swiss Federal Statistical Office (FSO) has also been using scanner data for the consumer price index calculation of the commodity groups food and near-food (products for personal care, washing and cleaning products as well as animal food). The FSO thus aims to achieve an improvement in the quality of data, savings on collection costs and a reduction in the administrative burden on retail chains. Scanner data not only contain extremely detailed information about prices and sales, they are also available after a very short period of time. Basically, completely new perspectives are opened up. However, as price collection using scanner data already represents a major challenge in itself, scanner data in the consumer price index should, for the time being, be used exclusively for improving the existing price collection system. This is why currently the same collection and index calculation methods are used as for the traditional collection in retail outlets. This concerns in particular the calculation and weighting of the indices, the number of items in the sample and the procedure for missing prices. Therefore for the time being scanner data is considered more as an alternative data source with improved quality than as a new collection method.

Read More
Country/Area: Global
Institute / Dept: UN - Economic and Social Commission for Asia and the Pacific
    Data sources:
  • Social media data
Project description:

The link between broadband access and socio-economic development has been examined from a variety of approaches. However, most approaches do not consider the delivered broadband quality as experienced at the consumer level. Because the quality and performance of internet connectivity vary greatly based on local conditions, a robust analytical approach should incorporate these nuances into research exercises. This research activity draws upon large-scale user data to derive industry-standard indicators, such as jitter, lag, packet loss and delivered speeds. These indicators enable the researcher to draw upon this quantitative data to gain a more detailed picture of ICT connectivity, which enhances the accuracy of analytical exercises.

Read More
Country/Area: Tunisia
Institute / Dept: Tunisia - National Institute of Statistics
    Data sources:
  • Web scraping data
Project description:

Unlock the potential of Big Data to strengthen the monitoring of at least one SDG indicator. We will start by using social media and web content.

Read More
Country/Area: Turkey
Institute / Dept: Turkey - Statistical Institute
    Data sources:
  • Other
Project description:

We are currently analyzing some logs being produced by different resources by using big data technologies such as Hadoop. In doing this, our goal is to become more familiar with big data technologies to get ready for possible prospective big data scenarios.

Read More
Country/Area: United Kingdom
Institute / Dept: United Kingdom - Office for National Statistics
    Data sources:
  • Other
Project description:

Counts of individuals by age band and sex were obtained from the data provider Experian. The counts were based on their commercial marketing database - a foundation of edited electoral roll plus various other data sources including large scale continuous surveys fielded by Experian. The counts were compared with Census data.

Read More
Country/Area: United Kingdom
Institute / Dept: United Kingdom - Office for National Statistics
    Data sources:
  • Mobile phone data
Project description:

To source aggregated data from one of the main mobile phone providers for comparison with worker flow estimates from Census. The aggregated data will be based on the movement patterns of the provider's customers. Specific areas of the country will be selected for comparison - and segmentation by age, sex and main mode of transport will be requested. If comparison is successful, then further research will probably be required to assess the potential of using such data for more timely estimates of worker flows.

Read More
Country/Area: United Kingdom
Institute / Dept: United Kingdom - Office for National Statistics
    Data sources:
  • Smart meter electricity data
Project description:

This is exploratory research, commissioned out to academia, into the potential of electricity smart meter type data to identify household structure and size. A second objective is to research models to see if probability of occupancy by time of day might be derived. Smart meter data will be collected on all households in England by 2020. The minimum specification is energy usage every 30 minutes per meter. Data will be centralized and might be available for research (details/legislation still to be formally agreed). This research is being conducted on data from trials of energy use.

Read More
Country/Area: United Kingdom
Institute / Dept: United Kingdom - Office for National Statistics
    Data sources:
  • Smart meter electricity data
Project description:

Very much exploratory research. The Office for National Statistics (ONS) has acquired electricity smart meter data from trials of energy usage. This data has various potential uses within official statistics; the focus of our work is currently on occupancy. Another objective is to familiarize ONS with the methods and technologies needed to handle this type of data. Awareness of long-term vacant properties would help within survey sampling: knowing which areas of the country have high levels of unoccupied housing would benefit field force logistics as well as sample designs. An extension of this research, collaboratively across government, might be to compare estimates derived from smart meter data with the alternative official sources of data on vacant properties, to highlight whether smart meter data has any advantage in terms of cost, timeliness, geography or accuracy.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Scanner data
  • Other
Project description:

Purchase data from secondary sources.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Health records
Project description:

Collecting data on physicians' use of EHRs and considering how EHRs might be used to obtain data on patients more efficiently.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Health records
  • Scanner data
Project description:

Data purchased from IRI are used in food economics research. The data provide seven years of U.S. coverage, with location and date. The retail sales data total over 40 billion records.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Credit card data
Project description:

Comparing estimates from credit card spending to estimates of consumer spending by state.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Multiple uses of publicly available/administrative data.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Satellite imagery or aerial imagery data
  • Scanner data
Project description:

Research and mapping of weather and climate data, and use of satellite imagery, to support research in rural and resource economics and in markets and trade economics.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Health records
Project description:

Develop spending and prices by disease for health care in the U.S.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Web scraping data
Project description:

Explore the possibility of scraping websites to obtain relevant data.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Collect large amounts of data from physical activity monitors

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Technology classifications assigned to US patents are extracted, and automatically indexed to one of 13,141 mainline subclass categories. With this information, we can track the combinatorial history of technological evolution.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Autocoding job titles to allow receipt of electronic files with job titles and wage levels for each employee that can be converted to the US Standard Occupational Classification system. The goal is for firms to provide data in their format from their payroll / HR records and significantly reduce response burden

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

The project obtains, parses and standardizes computerized criminal history records (CCHR) and transforms the relatively unstructured record of arrest and prosecution, which varies widely among the 50 States, into a standardized statistical research database. The project has created a comprehensive set of rules that reshape the data files in each State into a common format, and it has also compiled a set of rules for translating information contained in the CCHR into a common format for variables such as type of offense, disposition, and other elements in the CCHR.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Collect genetic samples from survey respondents

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

This project uses high-frequency market trading data to study information shocks and resulting price, volatility, and market quality effects in important agricultural markets.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Investments in science have become central to U.S. innovation and competitiveness policy. Indeed, legislative efforts such as the Frontiers in Innovation, Research, Science, and Technology (FIRST) Act and the America Competes Act explicitly link funding in science to the national innovation agenda. Yet as Ben Bernanke has pointed out, scholars are not able to explain how federal support for R&D affects economic activity. The nation has no data to guide it on the direct impact of science on the economy, so it is not known either how much money to spend on science or what kind of science. Policy makers are forced to rely on expenditure multipliers, rather than true measures of impact. The Census Bureau can fill the current information gap by expanding the resources dedicated to measuring the micro foundations of scientific and technical knowledge production and their impact on the broader economy (the Bureau does jointly perform the Business R&D and Innovation Survey along with NCSES at NSF). This initiative provides a low-cost approach to fill this gap by producing a new series - the Innovation Measurement Indicators. These indicators provide direct information about the effect of research on the economy by tracing the links between research training and the subsequent earnings and career trajectories of trainees as well as the characteristics of the firms where they work and that they start up. The indicators also provide direct information about the links between federal research funding and the characteristics of firms supplying the research inputs.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

The project uses a combination of open source (e.g., Google Alerts) and direct survey methods to capture all deaths that occur in contact with police officers. The project is a pilot test that runs from June 2015 through December 2015 in which the open source data will be used to nominate events that will be verified and confirmed through direct survey with police agencies, medical examiners' offices, and State bureaus of investigation.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Working with BLS respondents to provide corporate data.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

This project matches national compensation survey (NCS) data with publicly available Form 5500 filings to augment information not collected as part of the NCS, to compare information collected in both sources, and to evaluate whether respondent burden can be reduced through the use of Form 5500 records. Additionally, we are now in the initial stages of trying to apply the matching approaches we have developed to new NCS Survey data as it is collected.

Read More
Country/Area: United States
Institute / Dept: U.S. Office of Management and Budget
    Data sources:
  • Other
Project description:

Two case studies: 1. Assess data quality for alternative sources of data for measuring housing quality and other characteristics about housing units. 2. Methodological challenges in using state-level data to examine education success and education-to-career trajectories.

Read More
Country/Area: Latin America & the Caribbean
Institute / Dept: World Bank Group
    Data sources:
Project description:

The objective is twofold: Explore the potential of Big Data to address development challenges and promote cooperation and discussion on the topic of Big Data for Development. Big Data exploration (Big Data stocktaking + 3 pilots using satellite data, internet data, and social network data). Encouraging partnership and cooperation, both internally and externally (collaboration with ICT Beam, WB Finances, DECDG, United Nations Global Pulse). Promote knowledge exchange (presentation to LAC RLT + 3 BBLs). Client: World Bank staff interested in the potential of Big Data. External clients might also benefit from the deliverables produced (stocktaking report, pilots).

Read More
Country/Area: Indonesia, Brazil, Morocco
Institute / Dept: World Bank Group
    Data sources:
Project description:

This project introduces substantive technical developments upon current proof-of-concept applications to human mobility. The CDR trace of road freight (urban delivery truck, long distance, port drayage) is distinct from that of a pedestrian or a taxi driver, and can be automatically classified (combining algorithmic and field knowledge on transportation). Hence such critical project and reform information as freight O-D matrices can be estimated from Big Data, instead of from costly field surveys that cannot be replicated on a regular basis. The team will pilot the concept in countries where we have a strong engagement in the policy areas (transportation infrastructure, logistics, urban transport) and where we have arrangements, or are likely, to get the data: Indonesia (Jakarta), Brazil (Rio) and potentially Morocco.

Read More
Country/Area: Colombia
Institute / Dept: World Bank Group
    Data sources:
Project description:

Official measures of poverty and inequality are currently produced with a multi-year time lag and have varying levels of coverage across countries. This project aims to evaluate techniques that use Call Detail Records (CDRs) to offer more timely and complete estimates of these variables. With the support of this innovation grant, the project will then explore if and how these new techniques can be incorporated into the routine work of agencies in a client country government, in this case, that of Colombia. More timely and disaggregated socio-economic measures are vital to responsive policy design and implementation.

Read More
Country/Area: China
Institute / Dept: World Bank Group
    Data sources:
Project description:

We propose to implement a big data analytics prototype platform that specializes in analyzing medical insurance data to monitor insurance costs and potentially detect insurance fraud. The prototype platform includes a) a medical insurance metadata repository resulting from data integration and categorization, b) tailored predictive modeling algorithm software, and c) big data visualization tools. Through the proof-of-concept platform, we will demonstrate methodologies for discovering utilization patterns in insurance data that could lead to cost monitoring and control.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Can we find a methodology to analyze data to identify indicators of corruption, fraud or collusion in Bank-financed projects or public spending? Can we find enough data to feed the methodology? INT will use this methodology to identify investigative leads and risk areas. We will share this methodology with partners inside and outside the Bank. The aim is to proactively detect fraud, corruption and collusion risks in Bank-financed projects. Client: INT and partners within the Bank, and external anti-corruption authorities with similar objectives. The team is currently working with data scientists at the University of Cincinnati on a project done in conjunction with OPSOR on Entity Resolution that will be integrated into the new World Bank procurement process, and with the Data Science for Social Good Fellowship at the University of Chicago on modeling of risk using procurement data and case history data.

Read More
Country/Area: Bangladesh
Institute / Dept: World Bank Group
    Data sources:
Project description:

The proposed activity draws from modeling data readily available on the Google cloud platform, including elevation, satellite imagery, and census data to dynamically refine a surface of risk within a flood prediction zone produced by weather services. The research aims at better identifying the population most at risk in case of flood, based on geographical and socio-economic data, in order to better define emergency/DRM planning.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Test the applicability of new/big data techniques to address long-standing challenges in Bank operations. Build a community of internal and external data practitioners with a shared interest in new/big data for development. Collate skills/techniques necessary to apply new/big data at the Bank. 3 test questions (macro, aspirational): Is it possible to predict which projects in the portfolio are likely to succeed? Is it possible to measure poverty faster, at a more local level, and relatively less expensively? Is it possible to predict fraud/corruption in Bank projects? Clients: Bank management, Research teams, Operations teams.

Read More
Country/Area: Colombia
Institute / Dept: World Bank Group
    Data sources:
Project description:

This project aims to understand the relationships between urban infrastructure characteristics and six different types of crime in Bogotá, Colombia. (Crimes: Homicide, Assaults, Theft to persons, Automobile thefts, Motorcycle thefts, Residential property burglaries, Commercial property burglaries). 1. Mapping crime in space-time and correlating it with urban characteristics, land use, city equipment, and social variables. 2. Evaluating the effect of the development of Bus Rapid Transit System routes on crime. 3. Evaluating survey-collected social variables and BRT use and its effect on crime. Client: City of Bogotá, BRT System. The Bogotá Chamber of Commerce has been a partner.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Comparing partner data stores of national/sub-national aid distributions against opened country system disbursement data to track: Improved visualizations and custom reporting structures for data-driven decision-making. Follow the money campaigns, complementary to data from country systems on aid [client OAP]. Relationship between conflict and aid (Sierra Leone, Colombia, Afghanistan, Horn of Africa).

Read More
Country/Area: Brazil
Institute / Dept: World Bank Group
    Data sources:
Project description:

Vendor lock-in with proprietary systems and lack of access to their bus fleet AVL data. Implementation of Onebusaway software to replace existing software for estimating bus arrival in real-time. Analyze historic data and produce performance indicators using BI. Respond to citizens' protests and clamor by providing transparency and accountability in the transport sector. Client: Sao Paulo's public bus company - Transit Agencies.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Improve procurement outcomes, the quality of service delivery, and private sector development by enhancing the ability of governments, civil society, and the private sector to access data relating to procurement, analyze data relating to procurement, and alter practices and behaviors around procurement.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

The delivery of the program components will create an environment that helps to address constraints that inhibit the effective use of big data. The enabling environment will provide expertise and accelerate learning to build capacity in big data analytics.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Increase understanding among WBG staff and clients of opportunities and challenges related to advanced data analytics for urban energy management and electricity system optimization in emerging and developing countries. The work includes a desk study, documentation of four small pilot projects, and an online questionnaire.

Read More
Country/Area: Jamaica
Institute / Dept: World Bank Group
    Data sources:
Project description:

Increase energy efficiency and security through the implementation of the Borrower's National Energy Policy. In relation to the component working on the reduction of non-fuel costs, I am leading a Bank team that is collaborating with the electricity utility in Jamaica. Together we are capturing smart meter data to design a dedicated algorithm that analyzes electricity consumption patterns among large commercial and industrial customers, and to train the model with machine learning in order to improve the detection rates of non-technical losses that contribute to the prohibitively high non-fuel costs of the electricity system in Jamaica.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Supporting the IFC EDGE team to demonstrate examples of building energy efficiency interventions in Green Certified Buildings by equipping target groups (EDGE certified buildings) and control groups (non-certified buildings) with smart meters and home energy monitors to capture energy consumption data. In addition, specific nudges would be developed and the participating households would be exposed to different types of messaging to see whether this can change behavior around electricity consumption, supporting demand response, and to measure whether a difference can be seen among the various groups. The EDGE tool is available for use in 100 countries globally.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

The overall objective of the pilot project is to study the feasibility of capturing granular high-frequency price data using modern ICT required for PPP estimates, and making these data available to better inform poverty measurements. The basket of goods and services to be surveyed will include approximately 150 items, covering the most relevant headings of household consumption. For each item, detailed specifications will be used to ensure both intra-country and cross-country comparability. The pilot countries are Brazil, Indonesia, and Nigeria. Sampling frames will cover sub-national urban areas as well as rural areas, in particular for food items, in order to arrive at representative national average prices and estimate urban/rural price differentials where applicable. The active price data capture phase will be one quarter (90 days), with the aim of establishing a reliable quarterly average price for each item. Captured data will include granular prices and related price-specific metadata, including geographical location.

Read More
Country/Area: Latin America & the Caribbean
Institute / Dept: World Bank Group
    Data sources:
Project description:

The Roadshow will engage public, private, and civil society actors in select Central American countries on a number of successful examples of the use and management of big data in the agricultural sector via a series of two-day workshops entailing presentations of leading initiatives in this area carried out by three external partners (notably, the International Center for Tropical Agriculture, the International Research Institute for Climate and Society of the Columbia University Earth Institute, and GlobalG.A.P.) and related round-table discussion. Presentations will focus on two key issues: (1) leveraging national agricultural information systems to facilitate evidence-based decision-making; and (2) leveraging food safety certification standards to facilitate market access. More broadly, the presentations will touch on themes of agricultural decision-making, agricultural productivity, and farm-level/sector-wide resilience. On a strategic level, the Roadshow aims to build a strong foundation for future dialogue and collaboration between the visited countries, the visiting organizations/firms, and the World Bank towards the potential adoption of data solutions related to information and decision support systems as well as market access and certification systems.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Read More
Country/Area: Nigeria, Brazil, Indonesia
Institute / Dept: World Bank Group
    Data sources:
  • Mobile phone data
Project description:

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
  • Satellite imagery or aerial imagery data
Project description:

Clearer sense of where population and assets are located, built-up areas and population distribution

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
  • Satellite imagery or aerial imagery data
Project description:

Standard definition of urban based on population densities allows true global coverage and comparison; spatial disaggregation allows defining what is good urban form

Read More
Country/Area: Western Balkans
Institute / Dept: World Bank Group
    Data sources:
Project description:

Presenting official gender disaggregated data on land ownership

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

The basic idea is this: The World Bank partners with UNCTAD and other organizations to make a widely used database on non-tariff measures, which is disseminated in WITS. Building data for a new country requires a consultant to sort through a large number of laws and regulations and parse them into Excel records that can be classified by product code (HS 6-digit) and type of policy measure (UNCTAD NTM classification). This can take on the order of nine months. The Big Data project will take data from previous projects and use machine learning techniques to see if the first stage of the document analysis can be substantially sped up, leaving a smaller number of matters for the consultant to exercise human judgment on. Potentially, it could reduce by two-thirds the amount of time needed to build an NTM database for a new country.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

This initiative is to develop a diagnostic tool to help WBG clients assess their cities' innovation capacity against their aspired competitors. It helps inform and prioritize investments that will improve the competitiveness and innovation capacity of a city. Clients are asking WBG: How to increase the productivity and innovation capacity of existing firms in my city so that better-paying jobs are created? How can I encourage and nurture entrepreneurship to have a healthy pipeline of innovative new firms? These questions hinge on having a reliable and up-to-date measurement of a city's innovation capacity: whether the city has an enabling environment that nurtures new and existing firms to become more competitive and productive. Currently, there is a lack of rigorous diagnostics and intuitive visuals that evaluate and compare cities' innovation capacity globally, mainly because of a lack of reliable and comparable city-level data on innovation capacity. This initiative is to develop a city innovation capacity diagnostic tool using a big data approach. The tool, data and interactive graphs are to be published on the Competitive Cities website www.worldbank.org/competitivecities, accessible by both internal and external stakeholders.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Building a database on cartel decisions and cartel characteristics is a key part of the work of the competition policy team. This data is typically collected manually from public sources (e.g. Competition Authorities, media, company websites, and annual reports). Building and maintaining such a repository "by hand" is time-consuming and can yield imperfect results. We propose to develop a tool that combines machine learning techniques and web scraping to build such a database in an efficient manner.

Read More
Country/Area: Colombia
Institute / Dept: World Bank Group
    Data sources:
Project description:

Latin America and the Caribbean (LAC) is a net importer of rice, mainly because LAC farmers lack adequate knowledge and current information to adapt their cropping systems to an increasingly variable climate. Recent climate change analytical studies by the WBG revealed that LAC could benefit from a more suitable future climate for rice. The International Center for Tropical Agriculture (CIAT) in Cali, Colombia has recently tested an innovative 'Big Data' approach to create a dynamic Decision Support System for rice farmers in Colombia. The methodology proved its capacity to generate relevant insights about the rice system by using available commercial data in Colombia. The next step is to include soil and crop management factors and to test and scale up the system for farmers in Latin America and ultimately the world.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

How good are CDR-derived measures of income and inequality, and can governments systematically use them?

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Targeting can be posed as a straightforward supervised learning problem, in the language of Machine Learning (ML). We will apply these techniques to see how we can improve on classification accuracy over existing programs. In this project we plan to use a wide array of data to attack this problem. We will ask: Are there better prediction functions that can be used on the existing survey platforms that would significantly improve targeting of poverty? A key to this proposal is the combination of two elements: machine learning tools and experimental methods. We will seek to use experiments (specifically at GiveDirectly and the historical experiment with Progresa) where individuals are randomly given transfers to independently and experimentally validate the efficacy of new targeting rules, both in targeting poverty and in targeting impact.
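
As a minimal sketch of what "classification accuracy" could mean for a targeting rule, the snippet below scores predicted poverty status against observed status. The function name, the example data, and the choice of exclusion and inclusion error rates as metrics are illustrative assumptions, not the project's actual method.

```python
def targeting_errors(predicted_poor, actually_poor):
    """Score a targeting rule (hypothetical helper, not from the project).

    Returns (accuracy, exclusion_error, inclusion_error), where
    exclusion_error is the share of poor households the rule misses and
    inclusion_error is the share of selected households that are not poor.
    """
    pairs = list(zip(predicted_poor, actually_poor))
    tp = sum(1 for p, a in pairs if p and a)          # poor and selected
    fp = sum(1 for p, a in pairs if p and not a)      # non-poor but selected
    fn = sum(1 for p, a in pairs if not p and a)      # poor but missed
    tn = sum(1 for p, a in pairs if not p and not a)  # non-poor, not selected
    accuracy = (tp + tn) / len(pairs)
    exclusion_error = fn / (tp + fn) if (tp + fn) else 0.0
    inclusion_error = fp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, exclusion_error, inclusion_error


# Made-up example: five households, rule output vs. survey-measured status.
predicted = [True, True, False, False, True]
observed = [True, False, True, False, True]
accuracy, exclusion_error, inclusion_error = targeting_errors(predicted, observed)
```

Comparing these three numbers for an existing rule and a candidate ML-based rule on the same survey sample is one simple way to express whether targeting has improved.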

Read More
Country/Area: India
Institute / Dept: World Bank Group
    Data sources:
Project description:

We propose a novel data-intensive strategy to improve the monitoring of electricity service provision to rural areas in India and across the developing world. We collect and analyze a unique historical archive of nighttime satellite imagery to track the supply and stability of electricity service at the local level spanning nearly 8,000 nights since 1993. Drawing upon this massive dataset and using computationally-intensive methods, our project is developing the ability to identify regional instability in power supply, increases in the frequency or incidence of power cuts, and other signatures that indicate problems with electrical service delivery, particularly in rural and remote regions where traditional monitoring is difficult.

Read More
Country/Area: Congo - Democratic Republic of, Cote d'Ivoire, Ghana, Uganda, Zambia
Institute / Dept: World Bank Group
    Data sources:
Project description:

Leveraging access to IFC clients, the project will collect data such as call detail records and (mobile) financial transaction data from Mobile Network Operators and Financial Institutions to understand customer profiles. Using sophisticated econometric techniques and cloud computing (e.g. Amazon Web Services), the project will crunch around 100,000 GB of data. The expected results will be used to determine which variables are significantly correlated to usage of (mobile) financial services to determine profiles of likely users and lists of concrete individuals who have a high likelihood score. This intelligence can then be used for product development and marketing by WBG partners to increase the supply of financial services to the previously unbanked. Through the integration of experimental evaluation techniques with several rounds of household surveys and big data collection, the team will measure the impact of using financial services on household expenditure as well as produce innovative poverty maps.

Read More
Country/Area: Brazil
Institute / Dept: World Bank Group
    Data sources:
Project description:

This exploratory research project, which supports the strategic priorities of the Governance Practice, aims to gain new insights into the relationship among citizens' sentiment about governance institutions, trust in Government, and civil unrest. The approach will be to conduct sentiment analysis of social media feeds over a period of one year in one country (e.g., Brazil) looking at specific institutions (e.g., transport, police, health) to gain insights into how citizens are feeling about their governance institutions, how that translates into feelings about their Governments in the political sense, and how this corresponds to observed citizen behavior.

Read More
Country/Area: Philippines
Institute / Dept: World Bank Group
    Data sources:
Project description:

Roads lie very much at the heart of an effort to double public infrastructure spending to 5% of GDP by 2016. The national government intends to fully pave/cement the national network by 2016, with the objective of poverty reduction. The major challenges to this effort are that the road assessments of the national network need to be validated with independent data through third-party monitoring, and that the data capture concerning sub-national road networks and investments needs to be rapidly improved through cost-effective means. The Governance & TICT global practice teams propose to link currently available sources of authoritative data with readily available crowd-sourced geo-coded video and image data from mobile devices to validate the state of the Philippines road network. The proposed work will show how the ballooning capture of geographically referenced image/video "big data" overlaid with Open Data through data.gov.ph can be used to close the loops for the more transparent and accountable delivery of public road infrastructure.

Read More
Country/Area: Guatemala
Institute / Dept: World Bank Group
    Data sources:
Project description:

Cellphones generate large datasets of "digital footprints" from a population, which can be analyzed using data mining and computer-learning techniques to reveal behavioral patterns that can then be used to estimate and forecast poverty and shared prosperity. This proposal presents an affordable, practical, and scalable solution for mapping poverty based on the aggregate behavioral patterns of cellphone users, which will be piloted by the Government of Guatemala. The World Bank has initiated an innovative partnership with Telefonica Research and the iSchool of the University of Maryland to develop computer algorithms based on anonymized cellphone call records gathered by Movistar Guatemala, the country's largest cellphone provider. This initiative aims to produce detailed and reliable information at a lower cost.

Read More
Country/Area: Pakistan
Institute / Dept: World Bank Group
    Data sources:
Project description:

We propose to use high resolution satellite imagery (< 1m) and detection algorithms to improve traditional poverty mapping techniques to better measure and monitor poverty in Pakistan. Specifically, we propose to acquire multi-spectral satellite data and develop an algorithm to identify the type of roofs used in local housing (Abelson et al., 2014). The main goal is to learn a) the extent to which high-resolution data improves the accuracy of poverty predictions, and b) the extent to which changes in poverty over time are captured by these satellite derived poverty predictors.

Read More
Country/Area: Nigeria
Institute / Dept: World Bank Group
    Data sources:
Project description:

Governments can act to make markets work better for the poor. The first step is to identify the size and nature of the problem. Our proposal focuses on using largely unexploited, micro-level price data, to provide policymakers with near real time capacity to assess how well markets are working for the poor. Our proposal is to combine spatially detailed and disaggregated price data with publicly available satellite lights data to geographically pinpoint markets serving poorer regions. This combined data can be used to track trends and conditions in markets and can alert policymakers of changes that may negatively affect the poor.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

This project builds on our previous open-source work, the OpenTraffic platform, which won first prize in the 2013 Philippines National E-Governance Competition. The Task Team will partner with Uber, an on-demand taxi service that generates taxi GPS data in more than 200 cities across 45 countries, to scale up the initiative so as to provide counterparts, globally, with a viable, inexpensive alternative to traditional travel time and congestion data collection and analysis. This would empower resource-constrained agencies to make better, evidence-based decisions that previously had been out of reach: decisions about traffic signal timing plans, public transit provision, roadway infrastructure needs, emergency traffic management, and travel demand management.

Read More
Country/Area: Colombia
Institute / Dept: World Bank Group
    Data sources:
Project description:

Using rich and robust data, we intend to quantify the association of crime with specific built environment characteristics measured through street audits as well as using existing infrastructure information. Using Bayesian Maximum Entropy (BME) and Risk Terrain Modeling (RTM), we will quantify, with incidence rate ratios, which specific features of the environment are associated with 6 different crimes (two against persons and four against property), as well as describe the temporal and spatial features of areas with high crime and predict which areas of the city are more likely to experience crime in the future.
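
For readers unfamiliar with the term, an incidence rate ratio compares the event rate where a feature is present with the rate where it is absent. The sketch below shows only the crude definition with made-up counts; the function name and figures are hypothetical, and the project's model-based estimates (via BME and RTM) would be derived differently.

```python
def incidence_rate_ratio(events_with, exposure_with, events_without, exposure_without):
    """Crude incidence rate ratio (illustrative helper, not the project's model).

    events_*   : number of crimes observed
    exposure_* : amount of exposure, e.g. street-segment-years observed
    """
    # (events_with / exposure_with) / (events_without / exposure_without),
    # rearranged into a single division to limit floating-point rounding.
    return (events_with * exposure_without) / (events_without * exposure_with)


# Made-up numbers: 30 thefts over 100 segment-years near a given feature,
# 20 thefts over 200 segment-years elsewhere.
irr = incidence_rate_ratio(30, 100, 20, 200)
```

A ratio above 1 indicates a higher crime rate where the feature is present; a ratio below 1 indicates a lower one.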

Read More
Country/Area: Belarus
Institute / Dept: World Bank Group
    Data sources:
Project description:

This proposal aims to pilot a technical innovation that will allow the determination of user-focused road condition indicators and road safety concerns by extracting information from big data collected through crowdsourcing among drivers and other road users. This collaborative approach provides wider coverage of road networks at frequent intervals and collects uniform data to support strategic and network level asset management decision making.

Read More
Country/Area: United Republic of Tanzania
Institute / Dept: World Bank Group
    Data sources:
Project description:

The objective of this project is to collect high-resolution and high-frequency data on intra-city movements of a randomly selected group of individuals who will be interviewed as part of a planned household survey in Dar es Salaam, Tanzania. The project will combine detailed socio-economic information solicited on individuals and households as part of the 3,000-household Measuring Living Standards within Dar es Salaam Survey (MLDS) with (i) follow-up phone interviews and (ii) sensor-embedded smartphone-based high-frequency data collection on time- and GPS-stamped intra-city movements of a randomly selected sub-sample of MLDS respondents.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Using Sophisticated Content Analytics to Learn from World Bank Project Documents.

Read More
Country/Area: India
Institute / Dept: World Bank Group
    Data sources:
Project description:

In India, there is an online job matching platform called Babajob, which connects job seekers and employers, including in the informal sector. Babajob holds a large volume of user data and job posting data, forming a Big Data source on labor demand and supply. By using this Big Data on skills demand and supply in India, the proposed project will provide real-time labor market data in a visual report format and aims to foster demand-driven skills development by better informing training providers, policy makers, employers, and job seekers.

Read More
Country/Area: Vietnam, Indonesia
Institute / Dept: World Bank Group
    Data sources:
Project description:

Accurately predicting student performance early allows mitigating interventions to be effectively designed and applied. Prediction of student achievement is therefore highly valuable to policymakers. This proposal seeks to test whether existing Learning Outcome Predicting Artificial Neural Networks (LOPANNs) can perform with the same degree of accuracy in lower-income settings as in higher-income settings. Using large data sets from Vietnam and Indonesia, it would determine whether LOPANNs could reproduce the accuracy they have achieved in the US, Belgium, and Argentina.

Read More
Country/Area: Ghana
Institute / Dept: World Bank Group
    Data sources:
Project description:

As part of the Maternal Child Health Nutrition Improvement Project (P145792; MCHNP), the Government of Ghana will pilot a community performance-based financing (CPBF) project in 4 regions where MCHN outcomes are particularly poor. An accompanying impact evaluation (IE) will measure the effectiveness and cost-effectiveness of the project (P151684). CPBF is a novel approach whereby community health teams are incentivized to improve care-seeking behavior and health outcomes of communities. Performance payments are based on monthly reported results on key MCHN indicators. Due to the inefficiency of paper-based reporting, Android-based software survey tools for smartphones are proposed to report on performance directly from the community level. This Big Data platform would circumvent the time delay, capacity constraints, and data quality challenges associated with paper-based reporting. Based on this pilot, the innovation could be scaled up nationwide.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

Borrowing from the concept of Doing Business (measuring business regulations worldwide), the "What's for Dinner" project would create an interactive 'scorecard' about how easy it is to prepare healthy dinners in different countries (including northern countries) and subnational locations around the world from the point of view of the person who cooks dinner. It is simple, would make visible the immense amount of time and work that goes into dinner every day, and would recognize the value of 'dinner-makers' around the world.

Read More
Country/Area: Chile, Colombia, Guatemala
Institute / Dept: World Bank Group
    Data sources:
Project description:

This project proposes a Big Data solution to increase the income tax base and boost public revenue. The idea is to leverage consumption data obtained from credit card transactions, ATM withdrawals and online purchases to construct an income proxy, and compare it to reported income from administrative tax records. Consumption data will be obtained through collaboration with the largest credit-card supplier and banking regulator for at least one country. We will design an algorithm for mapping consumption measures into income proxies, using non-parametric estimation and statistical machine learning. We propose to randomly notify taxpayers of discrepancies between proxied and reported taxable income to estimate the causal effect of the program on tax payments.
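
The two steps described above (flagging discrepancies, then notifying a random subset) can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the 20% tolerance, the 50% notification share, and both function names are hypothetical and do not come from the project.

```python
import random


def flag_discrepancies(records, tolerance=0.2):
    """Return IDs whose proxied income exceeds reported income by more than
    `tolerance` (a hypothetical threshold, expressed as a fraction).

    records: iterable of (taxpayer_id, proxied_income, reported_income).
    """
    return [tid for tid, proxied, reported in records
            if proxied > reported * (1 + tolerance)]


def assign_notifications(flagged_ids, share=0.5, seed=0):
    """Randomly split flagged taxpayers into notified and control groups.

    Random assignment is what allows later tax payments of the two groups
    to be compared as an estimate of the notification's causal effect.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(flagged_ids)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * share)
    return sorted(shuffled[:cut]), sorted(shuffled[cut:])


# Made-up records: (taxpayer_id, proxied_income, reported_income)
records = [("t1", 50_000, 30_000), ("t2", 41_000, 40_000),
           ("t3", 90_000, 60_000), ("t4", 20_000, 25_000),
           ("t5", 75_000, 50_000), ("t6", 64_000, 40_000)]
flagged = flag_discrepancies(records)
notified, control = assign_notifications(flagged)
```

The income proxy itself would come from the consumption-to-income mapping the project proposes to estimate; here it is simply taken as given.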

Read More
Country/Area: Mexico, Brazil
Institute / Dept: World Bank Group
    Data sources:
Project description:

Our proposal intends to use data from Easy Taxi, one of the largest e-hailing services (27 countries and 120 cities), for Mexico City, Sao Paulo and possibly Rio de Janeiro to feed an array of tools with data. We have already started negotiations with the founders, and a promising partnership is under way. We believe leveraging such difficult-to-acquire data from a private service provider can be a cost-effective solution for cities in developing countries, with the added benefit of scaling up easily to other cities.

Read More
Country/Area: Morocco
Institute / Dept: World Bank Group
    Data sources:
Project description:

The proposed project will address the shortfall of data on (i) transport congestion, (ii) commuting inefficiencies, and (iii) access to public transportation in disadvantaged neighborhoods by applying Big Data analysis to a rich dataset of cellphone and mobility data. In a novel approach, the analysis will be directly linked to survey data from commuters and also from the unemployed. The transport constraints among non-commuters will be analyzed to quantify the potential benefit from transport investments.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

75% of the world's population does not have affordable access to formal systems to register and secure their land rights. The proposal aims to apply (big) geospatial data produced during previous World Bank innovation work to develop a new methodology for recording land rights using real-time user-generated boundaries captured with the help of orthophotos, GPS-enabled mobile phones, and open source software on tablets. This information can be used to (a) update already existing cadastral/registration maps or (b) complete cadaster records in cases where such maps and land rights information do not exist.

Read More
Country/Area: Global
Institute / Dept: World Bank Group
    Data sources:
Project description:

The Global Urban Extents work (GPSURR) utilizes Big Data (high-resolution satellite earth observation, built-up area, and population data) to create a standardized map of global human settlements, based on a methodology developed by the OECD, which identifies urban areas as a function of density and settlement size. This will allow the World Bank, its clients and others to monitor and compare urban areas. Given the larger agenda to collect and process this data, we identify the opportunity to enhance our understanding of sustainable and optimal urban form.
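
To illustrate what "urban areas as a function of density and settlement size" could look like in practice, the sketch below groups adjacent high-density grid cells and keeps groups whose total population passes a size threshold. It is an assumption-laden simplification: the grid, both thresholds, the 4-neighbour adjacency rule, and the function name are hypothetical, and each cell is assumed to cover 1 km² so that its population equals its density.

```python
def urban_clusters(pop_grid, min_density, min_cluster_pop):
    """Find groups of adjacent dense cells with enough total population.

    pop_grid: 2D list of population per cell (assumed 1 km² cells).
    Returns a list of (sorted_cells, total_population) tuples.
    """
    rows, cols = len(pop_grid), len(pop_grid[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or pop_grid[r][c] < min_density:
                continue
            # Flood fill over 4-connected neighbours that pass the density test.
            stack, cells, total = [(r, c)], [], 0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                total += pop_grid[y][x]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and pop_grid[ny][nx] >= min_density):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Settlement-size criterion: drop groups that are too small.
            if total >= min_cluster_pop:
                clusters.append((sorted(cells), total))
    return clusters


# Made-up 3x3 grid; thresholds are illustrative only.
grid = [[2000, 1800, 100],
        [1600, 50, 100],
        [10, 10, 1700]]
clusters = urban_clusters(grid, min_density=1500, min_cluster_pop=5000)
```

In this example the three dense cells in the top-left corner form one cluster, while the isolated dense cell in the bottom-right corner is dropped for being too small.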

Read More
Country/Area: Sri Lanka
Institute / Dept: World Bank Group
    Data sources:
Project description:

As urban populations grow, managing urban growth in a way that fosters cities' resilience to natural hazards and the impacts of climate change requires detailed, up-to-date geographic data of the built environment. Crowdsourcing and big data such as OpenStreetMap offer an innovative, cost-effective solution. This grant will serve to equip the government of Sri Lanka with the required tools and processes to monitor new data collection in real-time and update national maps, systems that can be replicated in mapping agencies around the world.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

The project assessed the impact that aggregating mobile data to protect privacy has upon the utility of the data for transportation planning and pandemic control and prevention. The proposed methodology makes it possible to determine the minimum level of data aggregation required to adequately protect individual privacy while preserving the data's value for policy planning and crisis response. This project was done in collaboration with MIT Connection Science.
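
The trade-off described above can be illustrated with a generic aggregation step: individual trips are reduced to origin-destination counts, and any count backed by too few distinct users is suppressed. This is a sketch of a common suppression approach, not the project's methodology; the record layout, the `min_users` threshold, and the function name are assumptions.

```python
from collections import defaultdict


def aggregate_flows(trips, min_users=3):
    """Aggregate trips into origin-destination user counts.

    trips: iterable of (user_id, origin_zone, destination_zone).
    Pairs travelled by fewer than `min_users` distinct users are dropped,
    since small counts are the ones most likely to identify individuals.
    """
    users_by_pair = defaultdict(set)
    for user_id, origin, destination in trips:
        users_by_pair[(origin, destination)].add(user_id)
    return {pair: len(users)
            for pair, users in users_by_pair.items()
            if len(users) >= min_users}


# Made-up trips: three users travel A->B (one of them twice), one travels A->C.
trips = [("u1", "A", "B"), ("u2", "A", "B"), ("u3", "A", "B"),
         ("u1", "A", "B"), ("u4", "A", "C")]
flows = aggregate_flows(trips)
```

Raising `min_users` strengthens privacy protection but removes more low-volume flows, which is the utility loss such an assessment would need to measure.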

Read More
Country/Area: Uganda
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project developed interactive data visualization tools that were used during a typhoid outbreak in Uganda to analyze dynamic information about case data and risk factors in support of the national task force managing the outbreak. This project was done in collaboration with WHO and the Ugandan Ministry of Health.

Read More
Country/Area: Indonesia
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project extracted and analyzed tweets related to vaccines and immunization in Indonesia. Findings included the identification of perception trends including concerns around religious issues, disease outbreaks, side effects and the launch of a new vaccine. This project was done in collaboration with the Ministry of Development Planning and the Ministry of Health in Indonesia, UNICEF, and WHO.

Read More
Country/Area: India, Kenya, Nigeria, Pakistan
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This multicountry study analyzed perceptions about immunization from multiple social media channels and news sources in India, Kenya, Nigeria and Pakistan. The project shows how methods including sentiment analysis, topic classification and network analysis can be used to support public health workers and communication campaigns.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This study analyzed public Facebook data to gain real-time awareness of changing trends in attitudes about contraception methods and family planning discourse among Ugandans. This study was done in collaboration with UNFPA.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This study produced a baseline of Twitter activity on topics related to sanitation. Findings revealed a large proportion of conversations related to cholera and increasing public engagement around gender issues and their intersection with sanitation. This study was done in collaboration with the UN Millennium Campaign and WSSCC.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This study analyzed the temporal evolution of the content and sentiment of tweets related to a fuel subsidy reform in El Salvador. The study demonstrated that public opinion as expressed in social media could complement existing methodologies for opinion mining based on survey data. This study was done in collaboration with the World Bank.

Read More
Country/Area: Australia
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This study explored whether online search data could be analyzed to understand migration flows and produce a proxy for migration statistics, using Australia as a case study. This project was done in collaboration with UNFPA.

Read More
Country/Area: Indonesia
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project deployed data analysis and visualization tools to structure and combine data from the Indonesian national citizen complaint reporting system and a local SMS-based feedback system (representing active citizen complaints), together with public Twitter posts (representing passive opinions). This project was done in collaboration with the NTB Provincial Government in Indonesia.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project supported Uganda's Post2015 consultation process on national priorities by analyzing 3.1 million messages from UNICEF's citizen reporting platform. A data visualization tool was developed to map and classify the views of Ugandan youth around development topics. This project was done in collaboration with UNDP and UNICEF.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project developed a real-time online dashboard that shows the volume and priority topics discussed in social media around the world related to the Post2015 development agenda. This project was done in collaboration with the UN Millennium Campaign and DataSift.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This feasibility study explored the relationship between Twitter trends during major forest fires or haze events and on-the-ground events in Indonesia. The study aimed to show how social media signals could support emergency response management. This study was done in collaboration with the UN Office for REDD+ Coordination in Indonesia.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project developed a real-time social media monitor to explore online discourse about climate change in support of the United Nations Climate Summit in 2014. The publicly accessible monitor analyzed tweets in English, Spanish and French on a daily basis to show the volume and content of tweets about climate change across a range of topic areas such as economy or energy.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This feasibility study tested a method for filtering keywords from public tweets related to discrimination against women in the workplace, identifying some topics with a significant volume of discussion, such as discriminatory job requirements. This study was done in collaboration with the ILO Country Office for Indonesia and Crimson Hexagon.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This feasibility study explored how data mining of large-scale online news data could complement existing tools and information for conflict analysis and early warning. Taking Tunisia as a test case and analyzing news media archives from the period immediately prior to and following the January 2011 government transition, the study showed how tracking changes in tone and sentiment of news articles over time could offer insights about emerging conflicts. This project was done in collaboration with UNDP.

Read More
Country/Area: Mexico
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project combined the analysis of mobile phone activity data with remote sensing data during severe flooding in the Mexican state of Tabasco as a method to inform emergency management response. This project was done in collaboration with WFP, the Government of Mexico, the Universidad Politecnica de Madrid and Telefonica Research.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This feasibility study used crowdsourcing to track commodity prices in near real-time in areas where the availability of other data sources is limited. The study involved recruiting a trusted network of local citizen reporters to submit food price reports via mobile phones. This project was done in collaboration with FAO, WFP and Premise.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project quantified seasonal mobility of populations in different regions of Senegal, based on analysis of anonymized mobile phone activity data. This project was part of the Orange D4D Challenge, and done in collaboration with WFP and the Universidad Politecnica de Madrid.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This project investigated how people's self-reporting of commodity prices through Twitter could be used to provide real-time price indicators. The project showed that the prices reported on Twitter were closely correlated with official food prices. This project was done in collaboration with the Indonesian Ministry of Development Planning, WFP and the Korea Advanced Institute of Science and Technology.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

This study assessed the potential use of mobile phone data as a proxy for food security. Results showed high correlations between airtime credit purchases and survey results referring to consumption of several food items. In addition, models based on anonymized mobile phone calling patterns and airtime credit purchases were shown to accurately estimate multidimensional poverty indicators. This project was done in collaboration with WFP, Université Catholique de Louvain and Real Impact Analytics.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

The project aims to build a tool to support the response to disease outbreaks using primary data routinely collected and data collected during the outbreak, secondary data related to contextual factors, and Big Data. The tool provides data visualizations and data correlations to better understand and monitor the outbreak, build predictive models and support the response to the disease. Historical data on outbreaks and human mobility patterns are being mapped out to establish baselines as a starting point. The tool is designed to be incorporated into the Ministry of Health database (HMISH).

Read More
Country/Area: Uganda
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

In the Northern Uganda region, as in many African countries where poverty levels are high and the majority of the population is rural, a proxy indicator of poverty is the type of roof on a household's dwelling. As the household economy improves, families often upgrade their dwelling places by changing the type of roof from the traditional grass thatch to iron sheets. A tool is under construction that uses image processing software to count the roofs and identify the type of material they are constructed from. A user-friendly tool (online dashboard) that provides proxies for poverty monitoring based on household roof counting will be built. The indicators will be built on baseline data and will be automatically refreshed every week with data aggregated at the district/county level. The indicators will be calibrated with data on poverty levels from the Uganda Bureau of Statistics.

Read More
Country/Area: Global
Institute / Dept: UN - Global Pulse
    Data sources:
Project description:

National Biodiversity Strategies and Action Plans (NBSAPs) define a set of 20 time-bound, measurable targets aimed at reducing biodiversity loss at the national and global level. Representing and analyzing spatial data (data that includes geographical coordinates, that is, latitude and longitude) is crucial to formulating policies to protect ecosystems such as rainforests, mangroves, coral reefs, wetlands, drylands, and grasslands, which underpin human life on Earth. While spatial data is available online from a variety of sources, it is not always easily accessed. Pulse Lab Kampala, in collaboration with the NBSAP Forum (www.nbsapforum.net), UNDP and the Government of Zimbabwe, has created a pilot tool to support the formulation of NBSAPs. The web-based tool makes spatial data accessible in a user-friendly way for decision-making.

Read More
Country/Area: Bangladesh, Sri Lanka, Mozambique
Institute / Dept: University of Tokyo
    Data sources:
  • Mobile phone data
  • Satellite imagery
Project description:

Develop a method to create a human mobility dataset by analyzing mobile phone data together with various secondary data such as land use and transportation networks. Field surveys were conducted to collect training and validation data. The results obtained through the method, called Dynamic Census, consist of human trajectory data and gridded-map data representing the spatiotemporal distribution of both mobile phone users and non-mobile phone users. The data are labeled with predicted demographic attributes. We believe it can be a good supplement to conventional population and housing census data, adding information on population movement at high granularity and high frequency. Because the structure of mobile phone data does not vary much by region or country, the developed method should be scalable to other parts of the world. We have just finished a pilot project in Bangladesh and are preparing for Sri Lanka and Mozambique.

Read More
Country/Area: Italy
Institute / Dept: University of Washington
    Data sources:
  • Mobile phone data
  • OpenStreetMap, ISTAT census data
Project description:

As more and more people seek to live in cities, governments and other organizations face the challenge of effectively identifying the areas most in need of revitalization and intervention. One way of designing such interventions is by using “poverty maps”. Poverty maps are designed to simultaneously display the spatial distribution of welfare and different dimensions of poverty determinants. Plotting such information on maps, however, relies heavily on data collected through infrequent national household surveys and censuses. Because of the high cost associated with this type of data collection, poverty maps often fail to capture the current deprivation status accurately. In this project, we address this challenge with a methodology that relies on alternative data sources from which to derive up-to-date poverty indicators at a very fine level of spatial granularity. We validate our methodology for the city of Milano. Based on our methodology and on design requirements gathered from stakeholders, we design and implement a poverty mapping tool for policy makers.

Read More
Country/Area: United States
Institute / Dept: Columbia University
    Data sources:
  • Mobile phone data
  • Social media data
Project description:

We are working on this topic as our project for the course 'Advanced Big Data Analytics'.

Read More
Country/Area: Mongolia
Institute / Dept: National Statistical Office of Mongolia
    Data sources:
  • Satellite imagery or aerial imagery data
Project description:

NSO Mongolia has planned to conduct its first agricultural by-census in 2017. This time, we are planning to use satellite imagery to identify crop types and estimate production. In addition to this project, we are also planning to pilot the Census of Building and Housing by integrating different geospatial databases with the administrative registration database and the population database.

Read More