2 Data Science for Librarians

of these fields provide multiple opportunities for collaborations that will strengthen these disciplines' core mission. The mission of data science is "to transform raw, messy data into actionable knowledge" (Stanton, 2012, para. 6). Marchionini (2016) considers information science to be a general term "that subsumes library science and informatics and focuses on distinctions and similarities among these disciplines that each informs data science" (p. 1). The data-information-knowledge-wisdom (DIKW) hierarchy, introduced by Ackoff (1989), shows us that data transforms into information, information transforms into knowledge, and knowledge transforms into wisdom.

Data, information, and knowledge are interrelated. Buckland (1991) presents three meanings for information: information as process, information as knowledge, and information as thing. These three definitions show how data, information, and knowledge are in fact interrelated. Information as process is the act of informing: when someone becomes informed, what that person knows has changed. Information as knowledge is when information is communicated to provide intelligence or particular facts of an event. Information as thing is used attributively for objects such as data. Data carries no meaning on its own. For data to become information, it must be interpreted and take on a meaning.

What Is Data?

The Greek philosopher Anaxagoras thought physical nature was "a portion of everything in everything." Much like Anaxagoras's definition of physical nature, data can be found in practically everything. While this type of definition is ambiguous and unsatisfactory, it is precisely what one finds when searching for the definition of data.

"Data" is a very broad term used across various disciplines and organizations. Data is a collection of facts (such as words, numbers, measurements, observations, etc.) that has been translated into a convenient form that computer systems can process. Data is also typically used to identify vital information and to separate it from mere bits. Data could exist in several forms: as text or numbers on pieces of paper, as facts in a person's mind, or as bytes and bits stored in electronic memory. Since the mid-1900s, people have used the term "data" to denote computer information that is stored or transmitted.

"Data," as the plural form of the Latin datum, means "things that have been given" (Buckland, 1991, p. 353). The federal government of the United States defines data in the OMB Circular A-110: "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings."

Data can mean different things to different people, organizations, businesses, and disciplines. For example, in the third edition of the Oxford Dictionary of Statistics, data is defined by Upton and Cook (2014) to mean information, which is usually numerical or categorical. While this is a rather simple definition, it is straightforward for the field of statistics.

"Data are any and all of the digital materials that are collected and analyzed in the pursuit of scientific advances" (Bloom, Ganley, and Winker, 2014, p. 1). While this may be true to an extent in the 21st century, data can also consist of nondigital items that may be found in notebooks, letters, and paper surveys. It is important to remember that not all data is made up of digital objects. Furner (2017) defined data as "concrete instantiations of symbolic representations of descriptive propositions, informed by empirical observations, about the quantitative and qualitative properties of real-world phenomena" (p. 66).