Big Data Project Inventory
The GWG Big Data Inventory is a catalog of Big Data projects relevant to official statistics, SDG indicators, and other statistics needed for decision-making on public policies and for the management and monitoring of public sector programs and projects. The inventory is a joint product of the World Bank and the United Nations Statistics Division (UNSD), compiled on behalf of the UN Global Working Group (GWG) on Big Data for Official Statistics. The World Bank and UNSD lead the work on the inventory's content, and the UNSD technical team provides technical support.
If you are working on a project that you would like to be considered for inclusion in this Inventory, even if the project is in an initial phase, please fill out this application form.
Please note that the project should use Big Data sources, Big Data techniques, or both, and should ideally have some relevance or implications for official statistics, SDG indicators, or other statistics needed for decision-making on public policies. The Global Working Group will review submissions and will include the projects that meet these criteria or, where needed, contact you for further information. Once approved, the information you submit will be made public on the GWG Big Data Project Inventory website.
Modernization of price collection and compilation with web and scanner data
Organization / Dept: Finland - Statistics Finland
- Web scraping data
- Scanner data
- Division: Standards and Methods
- Name: Pasi Piela
- Email: email@example.com
Ongoing pilot project on consumer prices, exploring opportunities to use web scraping and scanner data.
- Pilot intended to go to production to supplement existing data
- Pilot intended to go to production to replace existing data
- Price statistics
- Data providers:
- Intermediary big data provider
- Other partners:
- Technology partner
- Partnerships Comments: Direct data providers in certain cases.
- SDG Goals: Not Specified
- SDG Comments: Not Specified
- SDG Relevance: Not Specified
- Data Access Rights: Only for this project
- Data Access Comments: After the pilot phase, broader access rights will be negotiated.
- Intermediary: No
- Coverage Period: Previous month in question
- Data Coverage: All available data
- Coverage Geo Pop: Whole country / high % of market
- Cost Implication: Free
- Validation With Training Data: No
- Quality Framework Comments: Not yet.
- Data Quality Concerns: No
- Quality Aspects Evaluated:
- Privacy and Security
- Completeness, Usability, Time Factors
- Accuracy, including selectivity
- Coherence, including linkability to other sources
- Accessibility, Relevance
- Methods Used:
- Supervised learning
- Data visualization methods
- Traditional statistical methods
- Developed New Methods: Yes
- Technologies Used: No detail provided
- Timeframe To Produce Indicator: NA