"Leaving no one behind" - Beyond data disaggregation

Data, especially social data, play a critical role in raising awareness of the wellbeing of individuals and communities. Effectively utilized, they give visibility to the nuanced and varied living conditions of individuals. However, one of the challenges in using data to understand social wellbeing is that these data can not only perpetuate marginalization but, with the Data Revolution, could exacerbate it.

In our upcoming report "Engaging Citizens for Sustainable Development", which we will present at the Forum, we expound on the notion of data marginalization and discuss the different types and dimensions of it that exist for social indicators. The metaphor adopted in the report is that of individuals as "voices" within normative development processes that are listened to, heard, and engaged with. Marginalization in social indicators occurs as follows:

Unknown voices -- Individuals who are unknown to the data collection entities (e.g. National Statistics Offices). Two broad categories exist within this group: the unknown unknowns and the known unknowns. The existence and characteristics of the latter group are more readily defined than those of the former; it includes isolated and untouched communities, concealed individuals and, in some cases, victims of modern-day slavery.

Silent voices -- Individuals who lack, due to individual and personal factors, the ability to vocalize, so that while their objective wellbeing can be observed, their subjective wellbeing is difficult to ascertain. For example, the subjective social wellbeing of persons with disabilities might not be as easy to determine through standard data collection instruments.

Muted voices -- Population groups that are marginalized and excluded within specific societal contexts and, by extension, are often left out of data collection processes -- for example, women, children, immigrants, and individuals in "lower" social classes. In this case, the marginalization is attributable to structural factors such as social norms, societal values, and practices.

Unheard voices -- Groups that are excluded from data collection, especially through the use of contemporary digital instruments, which perpetuate unrepresentative sampling. Examples include the digitally unconnected, the illiterate, and the linguistically excluded.

Marginalization and the resultant exclusion, in their various forms, not only limit the reach of development efforts but also usurp the potential agency of individuals to realize their desired functioning and wellbeing. Distinguishing these forms of marginalization and exclusion should facilitate effective and targeted interventions to address each of them.

In light of these forms of marginalization, thinking about the 2030 Agenda for Sustainable Development and about the core guiding principle to "leave no one behind" reveals a much more complex situation that demands more than just collecting disaggregated social data. Ensuring that we "leave no one behind" is going to take much more -- it is going to require us to explicitly engage with civil society organizations and community-based organizations that are on the ground working with those left behind; it is going to take the explicit incorporation of qualitative data, perception data, thick data, small data, and microdata; it is going to take engaging with citizens more broadly and deeply through citizen-generated data, community-based monitoring systems, and crowd-sourced indicators; it is going to take creativity and innovation coupled with explicit and formal partnerships with other actors within the data ecosystem.

Around the world, various initiatives and interventions are being undertaken that highlight effective mechanisms and partnerships for addressing the forms of marginalization noted above. In our work, we are engaged with a community-based organization that works with the homeless population to understand the opportunities and challenges for collaborating on data on the homeless population; we are working on identifying and documenting data gaps for digital gender equality indicators; and we are working on developing data tools and solutions to support organizations working with trafficked and forced-labor populations. The upcoming United Nations World Data Forum in Dubai on 22-24 October 2018 has a dedicated thematic area on "Leaving no one behind" that includes a rich set of sessions. We are co-organizing one such session, "It takes a village to leave no one behind: Emerging best practices in community-based data collection", to facilitate further discussions and deliberations, beyond "data disaggregation", on mitigating marginalization in social data. The outcome of these deliberations will hopefully fuel global commitments towards ensuring that the much-heralded "Data Revolution for Sustainable Development" does not perpetuate and exacerbate data marginalization, thereby inadvertently leaving many individuals and communities behind.

Mamello Thinyane is Principal Research Fellow at United Nations University.