I would like to thank Dr. Fisher for pointing out one possible definition of the term “big data.” However, the sentence he used from the Wikipedia page to support this definition cites an article titled “Future telescope array drives development of exabyte processing,” which appeared in a Conde Nast magazine called “Ars Technica.”* A search of this article did not find the term “big data” anywhere.
In this evolving world of very large datasets, one can find many variations on the term “big data.” For example, in their authoritative treatise, Mayer-Schonberger and Cukier1 write: “There is no rigorous definition of big data…big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value….” This is precisely what Mathis et al.2 did in their study of novel factors that influence laryngeal mask failure in children.
Jonathan Stuart Ward and Adam Barker, from the University of St. Andrews in Scotland, recently tackled this conundrum of the ambiguous definition of “big data.”† They state that “…there is no single unified definition, and various stakeholders provide diverse and often contradictory definitions….” They point out that the definition is likely some combination of size, complexity, and technology, and they refer to the Method for an Integrated Knowledge Environment (MIKE2.0) project, which states that “Big Data can be very small and not all large datasets are big.”‡ The MIKE2.0 project prefers a definition that includes a high degree of permutations and interactions within a dataset.
Finally, according to the National Institutes of Health, “The term ‘Big Data’ is meant to capture the opportunities and challenges facing all biomedical researchers in accessing, managing, analyzing, and integrating datasets of diverse data types [e.g., imaging, phenotypic, molecular (including various ‘–omics’), exposure, health, behavioral, and the many other types of biological and biomedical and behavioral data] that are increasingly larger, more diverse, and more complex, and that exceed the abilities of currently used approaches to manage and analyze effectively.”§
When I used the term “big data” to describe the analysis of laryngeal mask complications3 in the study by Mathis et al., I meant it as a general way to convey that this was a relatively large and complex dataset by most standards. There is currently no accepted definition of how large a dataset must be to earn the label “big data.”
The author declares no competing interests.
* Francis M: Future telescope array drives development of exabyte processing. Ars Technica, April 2, 2012. Available at: http://arstechnica.com/science/2012/04/future-telescope-array-drives-development-of-exabyte-processing/. Accessed March 27, 2014.
† Ward JS, Barker A: Undefined by Data: A Survey of Big Data Definitions. Available at: http://arxiv.org/abs/1309.5821. Accessed March 27, 2014.
‡ Big Data Definition–MIKE2.0, the open source methodology for Information Development. Available at: http://mike2.openmethodology.org/blogs/information-development/2012/03/18/its-time-for-a-new-definition-of-big-data/. Accessed March 27, 2014.
§ NIH Big Data to Knowledge (BD2K): What is Big Data? Available at: http://bd2k.nih.gov/about_bd2k.html#sthash.YnnSANEV.dpbs. Accessed March 27, 2014.