How to harness private data towards meeting the SDGs? The need for data stewards

To meet the objectives set forth in the 2030 Agenda for Sustainable Development, it is clear that we need not only new solutions but also new methods for arriving at them.

Towards that end, statistical agencies, for example, are increasingly looking to non-traditional data sources from the private and civil sectors to complement and enhance official statistics. The vast streams of data generated through technology, social media platforms and sensors, when analyzed responsibly, can offer unique insights into societal patterns and behaviors.

These insights could be harnessed to enable innovative ways to measure global progress on the SDGs and inform evidence-based policy decisions. At the same time, large datasets create their own problems — of complexity and noise, risks to privacy and security, and the potential to have a disparate impact on already vulnerable populations.

One of the greatest tasks of our era may be figuring out how to unlock and harness the value of this data to provide actionable insights for positive social and economic impacts.

This is where data collaboratives come in.

Data collaboratives are an emerging form of public-private partnership that allows for collaboration around new data sources across sectors, SDGs and geographies.

Data collaborative-type entities and initiatives have emerged in sectors as varied as agriculture, climate change mitigation, gender and health, among many others. In Estonia, anonymized mobile phone data is helping to understand the volume of tourism and foreign workers, and to tailor government services and transport options to them accordingly. In Namibia, satellite imagery and telecom businesses are sharing data to track the spread of malaria.

Yet, for all its promise, the practice of data collaboratives remains largely ad hoc and limited. In part, this stems from the lack of a well-defined, professionalized concept of data stewardship within corporations, one that includes a clear mandate to explore ways to harness the potential of their data towards positive public ends.

Consequently, the process of establishing data collaboratives and leveraging privately held data for policy making is onerous and prone to dissolution when the internal champions involved move on to other functions.

By establishing data stewardship as a corporate function, recognized and trusted within corporations, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized and less risky.

With support from the Hewlett Foundation, the GovLab has embarked on a project to identify and connect existing practices of data stewardship and to define new approaches to data responsibility.

At a kick-off event held at the Cloudera Foundation in May 2018 (attended by representatives from LinkedIn, Facebook, Uber, Mastercard, DigitalGlobe, Cognizant and Streetlight Data, among others), participants described their experiences with data collaboration, mapped key opportunities and challenges, and discussed the roles and responsibilities of "data stewards": groups or individuals within corporations responsible for the handling of data.

Four key takeaways emerged that help us to better understand both the potential and challenges of data collaboratives. These are summarized briefly below.

  • Maturity and capacity: Attendees pointed out that the field of data collaboratives is fledgling and ill-defined, posing certain challenges to more widespread adoption. In particular, public awareness is limited, leading to ambivalence or suspicion. Corporations themselves often don't appreciate the value of sharing data, which similarly limits the amount of private data shared for the public good. There is often confusion about how to pursue public good in the context of a profit-driven business. Additionally, attendees pointed out that holders and recipients of data often lack the resources to use and maximize the potential of shared information. Increasing human and technical capacity, along with awareness, is therefore critical to expanding the use of data collaboratives.
  • Transaction costs: Conference participants also pointed to the high transaction costs of data sharing. These can come from preparing data; identifying and vetting potential partners; and negotiating the legal and commercial terms of sharing. Importantly, transaction costs are especially burdensome for entities that are under-funded or under-resourced. Mitigating these transaction costs is essential to creating a broader community of data collaboratives and to ensuring a level playing field.
  • Scaling: It is often difficult to scale or replicate data collaboratives. What works in one instance is frequently less successful in another. A small-scale sharing project may have trouble incorporating new or larger volumes of data. To overcome these challenges, participants suggested platforms to help groups find new opportunities or identify new collaboration partners. Pooling of efforts, expertise and experience is essential, especially for groups that lack resources.
  • Community of practice: There was broad consensus on the need for a community in which data stewards can share experiences and resources. Such a community, formal or informal, would offer a safe environment for practitioners to trade stories, tools and lessons. Even MOUs, contract language and legal frameworks, as well as a mapping of firms pioneering data uses for public good, could be shared through the community of practice.

The UN World Data Forum provides a great opportunity to review, validate and find ways to make data collaboratives more systemic and sustainable. I hope you will join us at our panel on "data stewardship in action", which will look into options for reducing barriers and unlocking beneficial partnerships between the private and public sectors that return value to the population and directly improve quality of life.

In the meantime, if anyone is interested in joining the emerging data stewards community of practice, please get in touch with us.

Stefaan Verhulst is the Co-Founder and Chief of R&D at The GovLab, which seeks to improve people's lives by changing how we govern using advances in science and technology.