Big Data Project Inventory


The GWG Big Data Inventory is a catalog of Big Data projects that are relevant for official statistics, SDG indicators and other statistics needed for decision-making on public policies, as well as for management and monitoring of public sector programs/projects. This inventory is a joint product of the World Bank and the United Nations Statistics Division (UNSD), put together on behalf of the UN Global Working Group (GWG) on Big Data for Official Statistics. The World Bank and UNSD lead the work on the inventory's content, and the UNSD technical team handles the technical side.


If you are working on a project that you would like to be considered for inclusion in this Inventory, even if the project is in an initial phase, please fill out this application form.

Please note that the project should use Big Data sources and/or Big Data techniques, and ideally have some relevance or implications for official statistics, SDG indicators or other statistics needed for decision-making on public policies. The Global Working Group will review submissions and include those projects that meet these criteria, or may contact you for further information. Please note that the information submitted, once approved, will be made public on the GWG Big Data Project Inventory website.

The application of Big Data for highway and waterway transport statistics

Country/Area: China
Organization / Dept: China - National Bureau of Statistics
Data sources:
  • Road sensor data
  • Ship identification data

Project description:

In 2014, the Joint Transport Ministry studied the networks of the toll highway system and the marine visa system. It found a way to apply the large volume of administrative records held in these two systems to highway and waterway transport statistics, and the method has since been applied on a trial basis in most of the country.


  • Pilot intended to go to production to replace existing data

Statistics Area:

  • Transportation statistics

  • Data providers: Not Specified
  • Other partners:
    • Technology partner
    • Government institute
  • Partnerships Comments: Not Specified

SDG Indicators
  • SDG Goals: Not Specified
  • SDG Comments: Not Specified
  • SDG Relevance: Not Specified

Data Access
  • Data Access Rights: Only for this project
  • Intermediary: No

Data Coverage
  • Coverage Period: Monthly since 2013
  • Data Coverage: Only a portion of all data
  • Coverage Geo Pop: Part of country / high % of market
  • Cost Implication: Free

Data Quality
  • Validation With Training Data: No
  • Quality Framework: Quality of source/input
  • Quality Framework Comments: Compare the internal original data with the external data. Analyze the volatility of the aggregated data and establish review periods.
  • Data Quality Concerns: No
  • Data Quality Concerns Comments: We designed a special audit platform based on the structural characteristics of the original data and the aggregated data, and set up the conditions for approval.
  • Quality Aspects Evaluated:
    • Completeness, Usability, Time Factors
    • Accuracy, including selectivity
    • Coherence, including linkability to other sources
    • Validity
    • Accessibility, Relevance
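
The two checks mentioned under Quality Framework Comments (comparing internal data with an external source, and analyzing the volatility of the aggregated data) could look roughly like the sketch below. This is an illustration only, not the project's audit platform: the function names, the 20% threshold and the sample figures are all assumptions made for the example.

```python
def relative_gap(internal_total, external_total):
    """Relative difference between an internal aggregate and an external benchmark."""
    return (internal_total - external_total) / external_total


def flag_volatile_months(monthly_totals, threshold=0.2):
    """Return (month, change) pairs whose month-over-month relative change
    exceeds the threshold in absolute value.

    monthly_totals maps 'YYYY-MM' strings to aggregated values.
    The 0.2 default is an arbitrary illustrative threshold.
    """
    flagged = []
    months = sorted(monthly_totals)  # 'YYYY-MM' sorts chronologically
    for prev, curr in zip(months, months[1:]):
        base = monthly_totals[prev]
        if base == 0:
            continue  # relative change is undefined for a zero base
        change = (monthly_totals[curr] - base) / base
        if abs(change) > threshold:
            flagged.append((curr, change))
    return flagged


# Made-up sample figures, not project data.
sample_totals = {
    "2013-01": 100.0,
    "2013-02": 105.0,
    "2013-03": 150.0,
    "2013-04": 148.0,
}
flagged_months = flag_volatile_months(sample_totals)
sample_gap = relative_gap(110.0, 100.0)
```

Months returned by `flag_volatile_months` would be candidates for manual review rather than automatic rejection, since a large change can reflect a real shift in traffic as well as a data problem.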

  • Methods Used:
    • Data visualization methods
    • Traditional statistical methods
  • Developed New Methods: No

  • Technologies Used:
    • NoSQL database
    • Column store database
    • Data mining tools
    • Data visualization tools

  • Timeframe To Produce Indicator: NA