Data inequalities and development

Several corporations, governments, media outlets, researchers, and non-governmental organizations have invested heavily in data to advance economic and social development following discussions about the ‘data revolution’ (Cinnamon, 2020; Center for Global Development, 2014; Kitchin, 2014; UN Data Revolution Group, 2014).
However, in Kitchin’s words (2014, pp. 25–26), ‘the topic of data itself has received little critical reflection and research until recently.’ Growing and diverse research has recently led to a clear message: producing, accumulating, and analyzing data has substantial implications for democratic societies since it leads to inequality of opportunity and harm. To put it simply, data inequalities matter (Cinnamon, 2020).
It is becoming increasingly common to attribute power and agency to data as a development actor in the emerging field of ‘data for development’ (D4D), a part of the broader field of information and communications technologies for development (ICT4D). At least in theory, investing in data can improve evidence-based decision-making, strengthen transparency and accountability, help assess development progress, and improve living standards and equality.
D4D strategic and policy initiatives also raise concerns about data’s potential to widen societal inequalities. However, data inequalities are often characterized simplistically as a fundamental problem of inclusion/exclusion, based on the notion that inequalities in the diffusion of, access to, and use of data can widen development gaps between individuals, groups, and nations. Conventional understandings of digital inequalities do not adequately explain or address several causes, forms, and consequences of the inequalities emerging from the data revolution (Cinnamon, 2020). In other words, the data revolution is bringing about new forms of inequality that are difficult to capture with core concepts of digital inequality.
Cinnamon’s (2020) conceptualization of data inequalities identifies three interrelated yet divergent dimensions: access to data, representation of the world as data, and control over data flows.
Research on the digital divide aims to identify and reduce unequal patterns of opportunity and harm. The term was coined in the early 1990s to describe the growing gap between those who have access to information and communication technologies (ICTs) and those who do not. Normatively, technology is seen as an enabler of social inclusion and equality (Norris, 2001).
As Kleine (2018, p. 225) points out, digital inequalities are a direct result of social inequalities: they tend to follow existing axes of exclusion such as class, gender, education, and age, as well as divides between urban and rural areas and between income-rich and income-poor countries.
In Cinnamon’s (2020) interpretation, the information divide is reduced to an issue of ‘information poverty,’ a condition in which information inequalities accumulate over time. It depends on a variety of factors, such as cognitive ability, socially and culturally constructed information norms, and society-level political and economic factors that, based on societal power structures, determine who has access to information and who does not. Information poverty can be described as an additive process that accumulates from smaller to larger scales, with implications for individuals and groups.
In the context of the so-called data revolution, inequalities in data access have become more visible and problematic and have sparked several debates and movements. There are frequently social, economic, and geographic barriers to accessing all data, not just scientific data. Rich countries collect and analyze data from a wide range of objects and activities – including thermostats, fitness trackers, and location-based services like Foursquare – which has led to an increase in the data divide. Disaster relief and development are hampered by a lack of reliable data in poor countries (Economist, 2014).
At the individual level, data poverty is closely related to socioeconomic inequality. It negatively affects people who may go through life without leaving any official record, where no elaborate civil registration system exists to track births, deaths, and causes of death. At the community level, those without an official form of identification in the world’s poorest countries tend to belong to the most vulnerable groups (ID4D, 2016). These groups include women, children, rural and remote residents, refugees, migrants, homeless populations, nomadic populations, and those deprived of identity documents for the purposes of human trafficking. In countries with incomplete civil registration, the individuals who have been counted usually belong to higher socioeconomic classes, which gives them the privileges of citizenship. At the national and societal level, data infrastructure and data resources are often closely related to the development and strength of a nation’s civil society and democratic institutions (Szreter, 2007). Several less-developed countries do not have reliable data infrastructures for collecting and managing civil registration data, resulting in what Setel et al. (2007) term a “scandal of invisibility”. Because funders and donor agencies prefer countries that can demonstrate evidence-based need and measure the impact of interventions, data-poor countries often become trapped in a spiral of underdevelopment (Ashraf, 2005; Center for Global Development, 2014).
Data access is a divide that has been challenged directly, if not comprehensively, by the ‘open data’ movement; its goal is to make existing datasets accessible to researchers, members of the public, private companies, civil society groups, and development organizations.
As an extension of established discourses around freedom of information, and informed by level-based understandings of the digital divide, open data is championed as an approach to democratizing data access, data use, and knowledge production. Claims about how opening data might reduce inequalities vary, although it is generally assumed that it will have some effect. The expected result is societal benefit in terms of decision-making, transparent governance, economic development, and scientific discovery (Arzberger et al., 2004; Janssen, Charalabidis, & Zuiderwijk, 2012; Sieber & Johnson, 2015; Zuiderwijk & Janssen, 2014). In spite of this, open data remains an experiment: there is little evidence that simply making existing datasets available will have these effects, and a growing body of research suggests that open data may serve mainly to empower private companies and already privileged individuals (Gurstein, 2011; Kitchin, 2014). A distinction must also be made between data access and data availability; open access is meaningless if no data exist, or if the quality of those data is dubious (Jerven & Johnston, 2015). One potential solution to data poverty is to use passive data recorded as ‘exhaust’, a byproduct of daily interactions with digital systems (e.g. the Web, mobile phone networks) or sensors (e.g. remote sensing satellites, GPS), as proxy sources of socioeconomic, demographic, location, and movement information in data-poor settings. For example, data from remotely sensed nightlights are being used to estimate poverty, migration, and development patterns (Jean et al., 2016).
Kitchin (2014) explains that data should be viewed as part of a sociotechnical mixture of people, places, policies, practices, technologies, codes, laws, and standards that interact and shape data in line with political, economic, technological, and social norms. From this perspective, data cannot be viewed as having an existence of their own, nor as accurate digital representations of some objective social reality. They are constructed artifacts that reflect a set of choices made during the data production process and the worldview prevalent in a particular sociotechnical context, which is then reproduced through their use (Borgman, 2015; Bowker, 2005; Gitelman, 2013). Hargittai and Walejko (2008) described a divide in online participation according to class, race, gender, and age. While the authors focus on the existence of the gap, they also point to its future consequences: as user-generated online content ‘becomes increasingly important in setting social, political, and cultural agendas, the existence of such a participation gap will have increasing implications for social inequality’. Participation gaps produce unequal technological opportunities, yet they also enable the second type of data inequality: the uneven representation of the world as data, which is fundamentally altering the physical world and how we operate in it. Those who produce data shape them according to their own perspectives, with real effects on the world and society. As Ribes and Jackson (2013, p. 147) explain, ‘the work of producing, preserving, and sharing data reshapes the organizational, technological, and cultural worlds around [it]’.
Over a decade ago, Web 2.0 hype envisioned the new Internet as a space for true democratic participation. Setting that hype aside, actively generated data from digital crowdsourcing and user engagement projects can serve as a data production strategy to address the first dimension of the data divide, uneven access to and availability of data, especially in developing countries and countries with poor data infrastructures (Cinnamon, 2020). It is important to note, however, that only a small percentage of people actively produce user-generated data. A key component of big data, by contrast, is passive data production, which is becoming universal in many countries as people use the Internet, social media, mobile phones, ‘smart’ infrastructure (public transport, buildings), or credit and debit cards. These interactions generate data about us: our consumer activities, preferences, and desires, the people we know and interact with, and the places we live, work, and play. Passively produced personal behavioral data can describe population density, movement, sociodemographic phenomena, and crisis events at high spatial and temporal resolution. These data have been touted as a potential alternative to censuses and surveys, which can be expensive and time-consuming (Blumenstock, Cadamuro, & On, 2015; Lane, Stodden, Bender, & Nissenbaum, 2014). In this way, the data access divide can be ‘bridged’.
Through data-based representations of the world, user-generated and crowdsourced data production can contribute to material inequalities. Innovative thinking is needed to address the data representation divide. As development and humanitarian actors increasingly use user-generated data for decision-making, it may actually be necessary to limit some groups’ access to data production in order to reduce the harms of data representation inequalities, a consequence not well reflected in dominant understandings of digital divides. Concepts such as access, ordered levels, social and digital inequalities mirroring one another, and poverty accumulating across scales are useful to a certain extent for conceptualizing this divide, since individuals, groups, and countries can all find themselves on the wrong side of the new data landscape (Cinnamon, 2020).
Data inequality is an emerging and significant topic that cannot be fully addressed in this post. It is possible, however, to make the following points: addressing the data representation divide requires new thinking; data control inequalities need to be communicated urgently and unambiguously within development discourses; and some causes, forms, and consequences of the inequalities emerging from the data revolution cannot always be explained or addressed by conventional understandings of digital inequalities.

References

Arzberger, P. W., Schroeder, P., Beaulieu, A., Bowker, G. C., Casey, K., Laaksonen, L., … Wouters, P. (2004). Promoting access to public research data for scientific, economic, and social development. Data Science Journal, 3(29), 135–152.

Ashraf, H. (2005). Countries need better information to receive development aid. Bulletin of the World Health Organization, 83(8), 565–566.

Blumenstock, J., Cadamuro, G., & On, R. (2015). Predicting poverty and wealth from mobile phone metadata. Science, 350(6264), 1073–1076.

Borgman, C. L. (2015). Big data, little data, no data: Scholarship in the networked world. Cambridge, MA: MIT Press.

Bowker, G. C. (2005). Memory practices in the sciences. Cambridge, MA: MIT Press.

Center for Global Development. (2014). Delivering on the data revolution in Sub-Saharan Africa. Washington, DC: Data for African Development Working Group.

Cinnamon, J. (2020). Data inequalities and why they matter for development. Information Technology for Development, 26(2), 214–233.

The Economist. (2014). Off the map: Rich countries are deluged with data; developing ones are suffering from drought. Retrieved from https://www.economist.com/international/2014/11/13/off-the-map?fsrc=scn%2Ftw%2Fte%2Fpe%2Foffthemap

Gitelman, L. (Ed.). (2013). “Raw data” is an oxymoron. Cambridge, MA: MIT Press.

Glassman, A., Ezeh, A., McQueston, K., Brinton, J., & Ottenhoff, J. (2014). Delivering on the data revolution in Sub-Saharan Africa. Washington, DC: Center for Global Development.

Gurstein, M. B. (2011). Open data: Empowering the empowered or effective data use for everyone? First Monday, 16(2).

Hargittai, E., & Walejko, G. (2008). The participation divide: Content creation and sharing in the digital age. Information, Communication & Society, 11(2), 239–256.

ID4D. (2016). Identification for development: Strategic framework. Washington, DC: World Bank.

Janssen, M., Charalabidis, Y., & Zuiderwijk, A. (2012). Benefits, adoption barriers and myths of open data and open government. Information Systems Management, 29(4), 258–268.

Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790–794.

Jerven, M., & Johnston, D. (2015). Statistical tragedy in Africa? Evaluating the data base for African economic development. The Journal of Development Studies, 51(2), 111–115.

Kitchin, R. (2014). The data revolution: Big data, open data, data infrastructures and their consequences. Sage.

Kleine, D. (2018). Development. In J. Ash, R. Kitchin, & A. Leszczynski (Eds.), Digital geographies (pp. 225–237). London: SAGE.

Lane, J., Stodden, V., Bender, S., & Nissenbaum, H. (Eds.). (2014). Privacy, big data, and the public good: Frameworks for engagement. Cambridge: Cambridge University Press.

Norris, P. (2001). Digital divide: Civic engagement, information poverty, and the internet worldwide. New York, NY: Cambridge University Press.

Setel, P. W., Macfarlane, S. B., Szreter, S., Mikkelsen, L., Jha, P., Stout, S., & AbouZahr, C. (2007). A scandal of invisibility: Making everyone count by counting everyone. The Lancet, 370(9598), 1569–1577.

Sieber, R. E., & Johnson, P. A. (2015). Civic open data at a crossroads: Dominant models and current challenges. Government Information Quarterly, 32(3), 308–315.

Szreter, S. (2007). The right of registration: Development, identity registration, and social security—A historical perspective. World Development, 35(1), 67–86.

United Nations. (2014). A world that counts: Mobilising the data revolution for sustainable development. UN Data Revolution Report.

Zuiderwijk, A., & Janssen, M. (2014). Open data policies, their implementation and impact: A framework for comparison. Government Information Quarterly, 31(1), 17–29.