data-100

An assignment index for Professor Frazier's DATA 100 class

Informal Response 2: Rob Kitchin, “Big Data, new epistemologies and paradigm shifts” and Chris Anderson, “The end of theory”

“Revolutions in science have often been preceded by revolutions in measurement.” Sinan Aral (cited in Kitchin, 2014)

In “Big Data, New Epistemologies and Paradigm Shifts,” Rob Kitchin emphasizes the importance of measurement as a significant contributing factor in the improved description and analysis of complex natural and social phenomena. How is the advent of big data serving to advance this revolution in measurement, and how is this revolution in measurement, in turn, serving to elevate data science as an interdisciplinary field of study? How is the data deluge advancing a better understanding of human movement, behavior, and relationships? How does data science contribute to our improved understanding of human development as a complex and adapting social and economic system?

Rob Kitchin is undeniably an advocate for big data – or at least, for its critical undertaking and review. His opening statements describe the recent availability of data, and the new analytic methods of the modern age, as a challenge to epistemology itself. Theorizing the origins of knowledge from data alone may sound far-fetched or even fantastical, but this is just one argument for the case of big data as the “fourth paradigm of science,” which Kitchin presents in an unbiased manner as he breaks down the future of data science and its associated fields. To crudely summarize “big data” as Kitchin uses it in his article (the term resists true paraphrase, as it exists as both a concept and a label applied to a vast variety of raw information): it is huge in volume, created in or near real-time, diverse in the topics it brings to light, exhaustive in scope, intensely relational by nature, and extremely flexible as an industrial tool of computational science. To aptly understand the results wrought by the collection of this kind of information, the development of new forms of measurement and analysis is deemed not only useful but absolutely necessary.

This “revolution in measurement” has already begun, its roots having spread from the industries of marketing, retail, and other consumerist activities into the analysis of human behavior itself in the social sciences. Amazon’s unnervingly accurate algorithm produces spot-on recommendations for shoppers who never would have thought to search for those items themselves. Walmart’s customer transaction systems generate terabytes of data on the purchasing patterns of over a million patrons every hour. The tools used to capture, store, and draw insights from such data would have seemed utterly ridiculous in scale a mere ten years ago. Rather than simply extracting summaries from scarce, static, poorly relational datasets produced by ultra-specific, non-mutable questions, Kitchin notes, today’s modes of big data measurement have created ways to gather “data-borne” insights. This is the kind of “revolution in measurement” that Kitchin claims marks the start of a “revolution in science” – a revolution that, according to him, will lead to the advent of the fourth paradigm of human thought. Transforming “how knowledge is produced, business conducted, and governance enacted” (Kitchin 2), this method of interrogating the world through big data collection seems set to radically change previously established tenets of scientific inquiry and, in doing so, to embed itself irremovably within every discipline of academic study.

Here a point of contention arises regarding the future of big data’s imminent golden age: will the next era of data promote a more empiricist approach to science, entirely free of theoretical bias, or a radical new extension of the established scientific method? As argued by Chris Anderson, author of the Wired article “The end of theory,” the “data deluge makes the scientific method obsolete” (Kitchin 3). The new age of knowledge production is nigh, and according to Anderson and his enthusiasts, its success will capitalize upon the removal of human bias and a priori models while allowing big data to speak for itself. The sheer amount of digital, widely accessible, and easily collected information generated by billions of constantly plugged-in people, Anderson says, provides both the means and the end to understanding modern behavior (see the Amazon example above). In short, Anderson believes it does not matter which factors lead to a certain result; what matters is the ability of society – or rather, of big data – to accurately predict that result. If Anderson’s proposition were to come to fruition, it would entail a massive transformation of all disciplines of research, prompted by the idea that the questions posed, patterns revealed, and answers provided by big data itself can offer an improved understanding of human development as a complex and adapting system, akin to any other scientific phenomenon.

The “end of theory” is not an infallible concept, though. No data is truly unbiased in its content, and even society’s most advanced methods of analysis are not free from traces of human influence. Believing that data can “speak for itself,” then, is like believing a person’s voice is really as loud as it sounds when amplified through a stadium megaphone. The amplifying effect of the tools scientists and development experts use to collect, compartmentalize, analyze, and present data cannot be removed – and from a certain viewpoint, it is also irreplaceable in its value. Gathering more perspectives on raw data before sampling, repurposing, or even generating it in the first place allows for the construction of more specific, carefully thought out, and, most importantly, useful datasets. Diversity of thought is necessary to address the innate diversity of data; the computational abilities of a single brain, however astounding, are not enough to fully understand a phenomenon, regardless of whether it is an electronic processor or a human mind that does the thinking. This argument for “data-driven science” – the reintroduction of theory into the analysis of big data – supports the rejuvenation of the scientific method rather than its destruction, hybridizing “abductive, inductive and deductive approaches” to understanding information (Kitchin 5). Ultimately, Kitchin’s and Anderson’s arguments strive for the same thing. Human development is an ongoing process in many regions across the world, and it is up to those who have access to high-powered data analysis tools and the right methods of data measurement to use them to help lay out pathways to social, economic, and political flourishing.

Kitchin, Rob. “Big Data, new epistemologies and paradigm shifts.” Big Data & Society, vol. 1, no. 1, April–June 2014, pp. 1–12. SAGE Journals, doi:10.1177/2053951714528481.

Anderson, Chris. “The end of theory: The data deluge makes the scientific method obsolete.” Wired, 23 June 2008, http://www.wired.com/science/discoveries/magazine/16-07/pb_theory.