Defining data stewardship

Shaida Badiee, Open Data Watch

Dominik Rozkrut, Statistics Poland

Why is it important to develop a common definition of data stewardship?

A conceptual framework and shared language around data stewardship should help build a common understanding, across different data and statistical communities, of what it takes to establish a resilient system of data governance: one that is built on strong partnerships and balances effective data sharing with mechanisms for protecting data privacy. Such a system would help us reap the social and economic benefits of data for our wide range of users and keep up with the changing landscape of the data ecosystem.

We must develop new data governance practices that are adaptable and responsive to local contexts. At the same time, we need a common framework and language so that we can communicate the global data and statistical community's advances in good practice along the Data Value Chain, including practices to share knowledge and solutions, monitor gaps, help build capacity, establish commitments, and promote the use, impact, and value of data.

Such a common framework and definition is currently lacking. What we have found so far about existing definitions suggests that data stewardship takes on different meanings and implications depending on the local context and the specific data community where it is applied. There is still little consensus on what the role of a 'data steward' entails and who should take on that role within the broader data ecosystem.

What are the main barriers to developing a common definition of data stewardship?

To better understand data stewardship and its components, the United Nations Statistical Commission established the UN Data Stewardship Working Group to develop recommendations on data stewardship with member states, data producers, and other stakeholders. The work streams within this group are: a) data governance, covering the legal frameworks and mandates for national statistical offices (NSOs) as data stewards; b) inclusive and equitable data use; c) data sharing and collaboration; d) data stewardship at the city level; and e) the present effort to establish a conceptual framework for data stewardship.

A stocktaking of data stewardship definitions, both within the development data community and in organizations outside it, demonstrated that there is no common definition. Interpretations of the data steward's role range from complex definitions that include providing data, curating data, encouraging data use, facilitating data dissemination, and creating regulations, to others that mention only managing data, with little further description.

While it is crucial to find a common definition of data stewardship, organizations have had difficulty outlining the roles involved in 'data stewardship' because of its complexity. In addition, different organizations have different institutional arrangements and models, making it challenging to design a one-size-fits-all definition. For example, in organizations with few data producers or those where data collection efforts are still confined to a central agency, there are varied understandings of the role of data stewardship.

There are also variations in how data stewardship is translated, making it difficult for the term to be used and understood in a global context without a strong common definition. In a few languages the term can be translated directly: in French, for example, the term used is "Intendance des données," which translates directly to "data stewardship." In other languages, though, the translation is less consistent. In Spanish, the term for data stewardship is "administración de datos," which highlights the role of data management but not the other functions of data stewards. Similarly, in Polish, the term for data stewardship is "zarządzanie danymi w krajowych systemach informacyjnych," which translates to "data management in state information systems" and is again a more descriptive translation. These descriptive translations capture the local context within which each data steward operates but are difficult to apply to a global definition of data stewardship.

The way forward

The increased data demands of the 2030 Agenda have led to a need for innovation and to a widening data ecosystem with a rapidly growing range of data actors. This has called for the role of national statistical offices (NSOs) to evolve from data producers into coordinators and, finally, into "data stewards."

Efforts to understand data stewardship are paramount: data play an essential role in our lives, but their many benefits come with challenges. We must develop a common framework that accounts for the cross-cutting nature of data so that we can better communicate the important role data stewardship could play in ensuring that data add value and are used for good.

As we come together for the World Data Forum, many of these topics will be reviewed extensively. Discussions will revolve around improving trust in data and data governance. It will be critical to outline the role of data stewardship in the way forward. Join us in highlighting the importance of data stewardship in discussions throughout the Forum and beyond. Please send us any suggestions you may have for this work stream on better defining data stewardship.

Dominik Rozkrut and Shaida Badiee are currently the co-chairs of the Working Group on Data Stewardship's work stream on Establishing a Conceptual Framework for Data Stewardship.