Innovations in technology, the broad use of electronic devices and the pervasive generation of digital information bring fundamental changes to the availability of real-time information, such as data from GPS devices, mobile phones or social media. These high-volume and sometimes loosely structured data sources, commonly referred to as Big Data, differ from the conventional data sources of official statistics and pose many challenges in their application. For many statistical offices, the business case still needs to be made for whether, why and how Big Data are useful for official statistics.
The statistical community has recognized the potential use of Big Data for official statistics. In March 2014 the UN Statistical Commission therefore established a global working group mandated to provide strategic vision, direction and coordination of a global programme on Big Data for official statistics; to promote the practical use of Big Data sources for official statistics while finding solutions for their challenges; and to promote capacity building and the sharing of experiences in this respect.
So far, it is mostly developed countries that have started using various Big Data sources, and they have encountered a number of challenges in doing so. Mobile phone data, traffic loop data, Twitter data and satellite imagery have all been used successfully. Although national circumstances can be fairly unique, the intention is still to generalize the use of such Big Data sources to other countries, especially in developing parts of the world.
Day 2 started with a fascinating presentation on satellite imagery from the invited expert Mr. Siu Ming-Tam (Australian Bureau of Statistics). He explained their work on using satellite imagery data to replace surveys for measuring agricultural crop production. Following his presentation, representatives from DANE (Colombia), NBS China and INEGI (Mexico) presented three different applications of satellite imagery.
Mr. Ronald Jansen presented the key findings of the joint UNECE/UNSD Survey on Big Data projects. The survey collected information on overall Big Data strategies as well as detailed information about Big Data projects. The presentation complemented the individual presentations of Big Data projects by providing a comprehensive synopsis of the status of Big Data projects in the statistical community.
Day 2 concluded with a captivating discussion of the use of Twitter and other social media for official statistics. The invited expert, Mr. Andrew Schwartz (University of Pennsylvania), discussed the process of acquiring social media data, which he claimed form the largest dataset on human behavior, and demonstrated various projects using social media. Statistics Netherlands, INEGI (Mexico), ISTAT (Italy), NBS China and Baidu all presented projects that explore the use of social media to replace or reduce the reliance on surveys. The chair of the session, Mr. Trevor Sutton (Australian Bureau of Statistics), summarized by saying that the session showed the statistical community is making strong progress in this area, that the numerous methodological challenges are a natural part of the modernization process, and that the statistical community should work to overcome them.
The opening of the International Conference on Big Data in Beijing was inspiring, with Mr. Ma, Chief Statistician of China, Mr. Stefan Schweinfest, Director of UNSD, and Mr. Trevor Sutton, Deputy Australian Statistician, setting the tone. China has been working on Big Data since 2012 and follows a dual-track strategy, developing Big Data applications in parallel to its production work. UNSD placed Big Data in the context of the recent call by the international community for a Data Revolution for Sustainable Development, and ABS, as Chair of the UN Global Working Group (GWG) on Big Data, recognized that the GWG will need to deliver a global work programme for the coming years. Mr. Steven Landefeld (consultant to UNSD) gave a keynote speech that was cautiously upbeat about the promise of Big Data, seeing its potential especially in improving the timeliness and relevance of official statistics, not unlike the role that some administrative records play, for instance, in GDP estimates.
Mobile phone positioning data, GPS and other tracking devices were discussed in detail in the afternoon session. Mr. Margus Tiru presented the difference between active and passive mobile phone location data and showed a number of statistical applications, such as tourism, daytime mobility and the estimation of population census data. The key challenges remain public trust in the use of mobile phone data and the protection of privacy and confidentiality.