Big Data for Sustainable Development: what does it take to get to the next level?

In today's world, livelihoods, social relations and knowledge are linked to the use of digital devices. Digital devices and applications are increasingly accessible to millions of people who are poor and live in marginalized communities. To leave no one behind, we must find innovative ways of understanding data and of making the best use of it to inform, monitor and evaluate policies.

An adolescent girl living in Kisenyi, the largest slum in Kampala, might not have access to a latrine, but she will likely have access to a Facebook account to meet new friends. An elderly pastoralist living in the Karamoja region of Uganda might not have access to nutritious food, but could have access to banking services if he owns a telephone SIM card. A young refugee fleeing South Sudan might not have access to a school teacher, but can find out the results of Spanish Liga football matches from media outlets. Across socio-economic statuses, gender, age, and technical prowess, people are generating big data on a daily basis; this same data can be used to better understand and respond to their needs.

Research and inspiring small-scale projects conducted around the world in recent years show that big data sources are available, reliable and accessible, and can make a valuable contribution to achieving the Sustainable Development Goals (SDGs).

The public and private sectors, along with academia, are exploring the potential uses of these new types of data. Risk factors (e.g., tobacco, alcohol, diet and physical activity) for non-communicable diseases (e.g., cancer, diabetes, depression) can be inferred from internet searches. Citizen-generated data from social media platforms can inform humanitarian response to natural disasters, such as fire and haze events. Insights from public discussions taking place on the radio can inform early warning systems to respond to refugee crises, identify gaps in healthcare services, or gauge opinions about government performance.

If big data sources are available, reliable and accessible, what is hampering the adoption of big data by the public sector?

Data priorities

Many countries today are still working to improve their civil registration systems to produce quality vital statistics. Vital statistics refer to the number of births, deaths, and marriages in a population group. At the same time, the same countries are challenged to produce the underlying statistics to monitor and track progress on the SDGs. While big data offers alternative ways to measure processes, changes, dynamics and trends in territories and societies, many countries prioritize traditional data sources over these newer and more innovative data sources.

New types of data are challenging current frameworks, methods and jobs

New types of data demand new methods of analysis. Big data is in its essence different from small data. Small data is easy for humans to understand, access and analyse. Big data is not. The scientific community continues to face challenges in creating frameworks and methods to popularize the use of big data among scientists and data users. Since modern statistics were shaped in the mid-17th century, statistical science has been guiding the way we look at and analyse data. The term big data was coined in the 21st century (around 2005) to refer to a new type of data. Data scientists have emerged as the curators of big data in the scientific community, and many statisticians around the world are still not familiar with big data.

With the application of probability sampling in the 1930s, surveys became a standard tool for empirical research in social sciences, marketing, and official statistics. In the last century, large-scale analysis of people's voices was done with structured surveys. While digital devices such as mobile phones or iPads have replaced traditional paper questionnaires, the methodology remains the same as the one used a century ago: large-scale analysis of people's voices is still done through the aggregation of individual responses to closed questionnaires.

Today, we can analyse, at a scale larger than ever before, people's voices from social media or public radio all over the world. We can mine vast amounts of anonymised data to tackle hate speech related to an ethnic group in a country, monitor crop losses reported by farmers in communities, understand voting intentions before elections, or learn about people's views around the issue of child marriage. As with any other type of data, biases and representativeness need to be understood in order to make use of big data. The large-scale interrupted discourse of people's voices does not represent the views or opinions of a universe of analysis as a sample does, but the information it provides might be more accurate, timely and informative than the results of static surveys.

Unlocking data collected by the private sector

Right now, most big data is collected by private sector companies and used to increase revenue. While companies are starting to seize the opportunities of using big data for public good, they face challenges. Pulse Lab Kampala, with support from the Hewlett Foundation, engages in constant dialogue with the private sector in Uganda. Based on our interactions, the main challenges companies face are: a) lack of governmental legal frameworks that protect them from litigation by regulators or customers; b) fear of a confidentiality breach resulting from information leaks; c) lack of data processing capacities, including IT capabilities and technical skills; and d) lack of regulatory frameworks and tools (including types of agreements) to enable data collaboratives.

The pervasive nature of big data and the rapidly evolving capabilities of AI hold great promise for social impact and can drive transformation across many domains. Scope therefore exists to expand the use of this technology, beyond current applications, leveraging big data in new ways that help us to achieve the 2030 Agenda. This is not without risk, however, and stakeholder cooperation is crucial.

We look forward to continuing this conversation at this year's World Data Forum in Dubai in October.

Please get in touch with Pulse Lab Kampala at: