Part 01 of the learning curve about Big Data for dummies (like myself)
Strawberry Pop Tarts and Big Data – Why does it matter for development?
In this blog, I want to explore my own learning in a simpler, more fun and more visual way than is usual in academia. I am trying to understand technology, big data, data justice and the internet. My questions are how digital technology works, how it shapes our behavior and our social world, and how it connects with development.

Strawberry Pop Tarts Photo: Evan-Amos, Wikimedia Commons
What is big data?
We are all generating huge amounts of data every second by using our phones, shopping and doing things on the internet.

https://www.statista.com/chart/25443/estimated-amount-of-data-created-on-the-internet-in-one-minute/
Big data is stored and then classified using the concept of the 5 Vs, commonly listed as volume, velocity, variety, veracity and value.

The next step is to break the work down into tasks, spread those tasks across even more computers and assemble the results. This is known as parallel processing. It is followed by data analysis.
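To make the idea of breaking work into tasks and assembling the results a bit more concrete for myself, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the function names `count_items` and `parallel_count` and the toy shopping list are invented, and the worker threads on a single machine merely stand in for the many computers a real big data system would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_items(chunk):
    """One task: count how often each item appears in one chunk of records."""
    return Counter(chunk)


def parallel_count(records, workers=4):
    """Split the records into chunks, count each chunk separately, merge the results."""
    size = max(1, len(records) // workers)
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    # Each chunk is handed to a worker; threads stand in for separate computers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(count_items, chunks)
    # Assemble the partial results into one overall count.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Invented toy data, not real sales figures.
purchases = ["pop tarts", "water", "pop tarts", "batteries", "water", "pop tarts"]
print(parallel_count(purchases, workers=3))
```

The point of the design is that no single worker ever needs to see all the data: each one counts its own chunk, and only the small partial counts are merged at the end.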
The models that then classify us as shoppers, patients or loan applicants are invisible to us.
Where is the causal relationship between big data and Strawberry pop tarts?
Walmart tirelessly gathers data on each of its customers, then remodels it to arrive at individual and group preferences. This analysis revealed that when a hurricane is imminent, shoppers tend to stock up on Kellogg’s strawberry Pop Tarts, ostensibly to tide them over while marooned during the hurricane (and perhaps to give them extra sugary energy to board up their windows and doors).
Big-data correlations can point the way towards areas in which to explore causal relationships. Big data that is accurately processed and analysed can predict a hurricane five days before it hits.
So, that seems very useful for development … and for businesses such as Walmart.
…if only the right algorithms are in place!
Let me first understand what algorithms are and how they can be unfair and reflect or even cause biases.
A good way to think of algorithms is as mini instruction manuals that tell computers how to complete a given task or manipulate given data.
Computer algorithms work via input and output. They take the input and apply each step of the algorithm to that information to generate an output.
For example, a search engine is an algorithm that takes a search query as an input and searches its database for items relevant to the words in the query. It then outputs the results.
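To see the input and output idea in action, I tried writing a toy search function in Python. This is only a sketch under my own assumptions: the `search` function and the three invented "documents" are hypothetical, and a real search engine is vastly more sophisticated. It still shows the pattern: a query goes in, steps are applied, results come out.

```python
def search(query, documents):
    """Return the titles of documents that contain every word of the query."""
    words = query.lower().split()
    results = []
    for title, text in documents.items():
        text_lower = text.lower()
        # Keep the document only if all query words appear in its text.
        if all(word in text_lower for word in words):
            results.append(title)
    return results


# A tiny invented "database" standing in for a real search index.
documents = {
    "Hurricane shopping": "Shoppers stock up on strawberry pop tarts before a hurricane",
    "What is an algorithm": "An algorithm takes an input and produces an output",
    "Parallel processing": "Many computers split a big task into smaller tasks",
}

print(search("hurricane pop tarts", documents))
```

Here the query is the input, the loop over the documents is the list of steps, and the list of matching titles is the output.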
You can easily visualise algorithms as a flowchart. The input leads to steps and questions that need handling in order. When each section of the flowchart is completed, the generated result is the output.
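A flowchart like this translates almost directly into code, because every question becomes an `if`. The sketch below is my own invented example in Python, not anything Walmart actually uses: the function `stocking_decision`, its two questions and the threshold of 500 boxes are all made up for illustration.

```python
def stocking_decision(hurricane_forecast, boxes_in_stock):
    """Walk through the flowchart questions in order and return a decision."""
    # Question 1: is a hurricane forecast for the area?
    if not hurricane_forecast:
        return "keep normal stock"
    # Question 2: is the shelf already well stocked? (threshold is invented)
    if boxes_in_stock >= 500:
        return "no extra order needed"
    # All questions answered: the end of the flowchart is the output.
    return "order extra pop tarts"


print(stocking_decision(hurricane_forecast=True, boxes_in_stock=120))
```

Even in this tiny example, somebody had to choose the questions and the threshold, and those choices decide the output.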
The point is that there is no transparency about the data models used, and many companies hide the results of their models or even their existence. The models are treated as intellectual property and protected by lots of lawyers. So nobody outside the company can say whether a model works against people’s interests, is unfair, or even damages or destroys lives.
References:
https://www.statista.com/chart/25443/estimated-amount-of-data-created-on-the-internet-in-one-minute/
Big Data Success Story: Selling More Pop-Tarts | The TMG Blog
https://dzone.com/articles/big-data-trends-to-consider-in-2021
More on algorithms here: https://www.thinkautomation.com/eli5/what-is-an-algorithm-an-in-a-nutshell-explanation/
Cathy O’Neil: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, Crown Books, 2016

