What is Data Literacy?

This March I’ll be presenting at the ACRL 2015 conference with Christine Murray (Bates College) on teaching data literacy in the library. To help me prepare and perhaps preview our discussion, I thought I’d post a few thoughts on the blog to get the juices flowing. Let’s begin with some definitions as they appear in both the library literature and the scholarship of statistics education in order to answer the question: what is data literacy?

In Libraryland, “data literacy” seems to be the most popular term (over statistical literacy, quantitative literacy, and numeracy), and consists of two aspects: information literacy and data management. From an information literacy perspective, the emphasis is on statistics, which are considered a special form of information but one that still falls under the information literacy umbrella. For example, Schield (2004:6) describes statistical literacy as the critical consumption of statistical information when used as evidence in arguments. Similarly, Stephenson and Caravello (2007) advocate for librarians to promote statistical literacy by assisting learners to locate and evaluate authoritative statistical sources, recalling Standards 2 and 3 of the 2000 ACRL Information Literacy Standards, as well as reference classics like the annual Statistical Abstract of the United States.

From the data management perspective, the emphasis is on data rather than statistics, and focuses on the organizational skills needed to create, process, and preserve original data sets. Returning to Schield (2004:7), he defines data literacy as the ability to obtain and manipulate data, but reserves these skills for certain fields of study such as business or the social sciences. Carlson et al. (2011:631), based on interviews with faculty and GIS students, emphasize the importance of data management and curation skills required to “store, describe, organize, track, preserve, and interoperate data.” There is plenty of literature on data management, a hot topic in Libraryland fueled by interest in e-science initiatives, new data requirements for federal grants, and the creation of institutional repositories. In my experience, though, discussion of data management is often divorced from statistical literacy, perhaps due to its focus on faculty and other experts rather than data novices. Calzada Prado and Marzal (2013) do attempt to unify the information literacy and data management aspects under one rubric, although their proposal for five data literacy standards is largely derivative of the soon-to-be-sunsetted 2000 ACRL Information Literacy Standards, which doesn’t bode well for their wider adoption.

Turning away from librarianship, we find that statisticians and statistics educators typically use the term “statistical literacy” to describe the knowledge, skills, and dispositions surrounding their field. One widely cited exposition of statistical literacy is that of Iddo Gal (2002:2-3), who identifies two interrelated components: the ability to interpret and critically evaluate statistical information, as well as the ability to discuss and communicate one’s understanding, opinions, and concerns regarding such statistical information. Gal (2002:4) further describes a model of interrelated knowledge elements and dispositions that together enable statistically literate behavior. Gal’s definition will no doubt look familiar to information literacy librarians, incorporating the evaluative and communicative aspects of information literacy along with the dispositions and affective components we find highlighted under the new ACRL Framework.

But what is the nature of statistical information, the object of Gal’s model for statistical literacy? It may be helpful to consider this in terms put forth by George Cobb and David S. Moore (1997). In their oft-cited article on statistics pedagogy, they break down statistical analysis into three interrelated phases: data production, data analysis, and formal inference. Each of these phases produces statistical information requiring varying levels of contextual and mathematical knowledge.

Data production includes aspects of the research process related to designing a study, creating a data set, and preparing the data for short-term and long-term analysis. Viewed from the library, the data production phase is most closely associated with data management skills. Data analysis, next in Cobb and Moore’s schema, consists of the exploratory and descriptive phase of data-driven research. This includes examining the data set to discover trends or outliers, and using descriptive statistics to reduce large amounts of data into summary information such as measures of central tendency and variability (e.g., mean, median, mode, range, percentiles, standard deviation). Through this analysis, researchers can make hypotheses or predictions about phenomena revealed by the data. Finally, formal inference can be used to draw conclusions about a population from findings in sample data. Here we find the notorious formulas full of Greek letters such as Student’s t-test, chi-square test, ANOVA, and regression models. I’ll return to Cobb and Moore’s pedagogical advice in a future post.
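
To make the descriptive phase concrete, here is a minimal sketch in Python that reduces a small data set to the summary measures named above. The data (monthly library visits reported by ten students) and the function name `describe` are invented for illustration; only the standard-library `statistics` module is assumed.

```python
import statistics

# Hypothetical sample: library visits per month reported by ten students.
visits = [2, 3, 3, 4, 5, 5, 5, 7, 9, 12]

def describe(data):
    """Return common descriptive statistics for a list of numbers."""
    ordered = sorted(data)
    return {
        # Measures of central tendency
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        # Measures of spread
        "range": ordered[-1] - ordered[0],
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        # Cut points dividing the data into four equal-sized groups
        "quartiles": statistics.quantiles(ordered, n=4),
    }

summary = describe(visits)
for name, value in summary.items():
    print(name, value)
```

Even in this toy sample the mean (5.5) sits above the median (5) because one heavy user pulls it upward, which is the kind of pattern the exploratory phase is meant to surface before any formal inference is attempted.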

So back to the original question: what is data literacy?

I suggest librarians borrow heavily from statistics educators when trying to answer this question. To paraphrase Gal and apply his definition to Cobb and Moore’s three phases of statistical analysis, the simplest definition of data literacy is the ability to interpret, evaluate, and communicate statistical information. Central to this ability is an understanding of how statistical information is created, encompassing data production, data analysis, and formal inference. In other words, data literacy includes the ability to evaluate the modes of data production, including the underlying research design and means of sampling, and how these affect the possible findings. Data literacy also includes the ability to interpret the results of formal inference tests, including confidence intervals and the probability that findings are representative of a population rather than coincidental to the given sample. And finally, data literacy includes the ability to interpret and communicate about the descriptive statistics learners and citizens encounter every day, from unemployment rates to political polling.
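
As an illustration of what interpreting a confidence interval involves, the sketch below computes an approximate 95% interval for a polling result. The poll figures (520 of 1,000 respondents favoring a candidate) and the function name are hypothetical, and the calculation uses the common normal approximation for a proportion, which assumes a simple random sample of reasonable size.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample proportion."""
    p_hat = successes / n
    # Critical value for a two-sided interval (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: critical value times the standard error.
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

# Hypothetical poll: 520 of 1,000 respondents favor a candidate.
lower, upper, margin = proportion_confidence_interval(520, 1000)
print(f"52% plus or minus {margin:.1%} ({lower:.1%} to {upper:.1%})")
```

Under these assumptions the interval runs from roughly 49% to 55%, so it includes 50%. A data literate reader would recognize that an apparent lead of 52% in such a poll is not strong evidence of majority support.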

And what about data management? Ultimately it belongs to the data production phase of Cobb and Moore’s schema, and is perhaps one aspect of data literacy that, as Schield intimated, can be reserved for the specialists. While the data literate person can identify and evaluate the soundness of a research design and data collection methods, perhaps only trained practitioners need the specialized skills to carry out a full-fledged project involving data curation and advanced tools. And in most instances, teaching these skills is beyond the purview of librarians. Stay tuned for more on this and data literacy instruction in the library.

Calzada Prado, Javier and Miguel Ángel Marzal. 2013. “Incorporating Data Literacy into Information Literacy Programs: Core Competencies and Contents.” Libri: International Journal of Libraries & Information Services 63(2):123–34.
Carlson, Jacob, Michael Fosmire, C. C. Miller, and Megan Sapp Nelson. 2011. “Determining Data Information Literacy Needs: A Study of Students and Research Faculty.” portal: Libraries and the Academy 11(2):629–57.
Cobb, George W. and David S. Moore. 1997. “Mathematics, Statistics, and Teaching.” The American Mathematical Monthly 104(9):801–23.
Gal, Iddo. 2002. “Adults’ Statistical Literacy: Meanings, Components, Responsibilities.” International Statistical Review 70(1):1–25.
Schield, Milo. 2004. “Information Literacy, Statistical Literacy and Data Literacy.” IASSIST Quarterly 28(2):6–11.
Stephenson, Elizabeth and Patti Schifter Caravello. 2007. “Incorporating Data Literacy into Undergraduate Information Literacy Programs in the Social Sciences: A Pilot Project.” Reference Services Review 35(4):525–40.

Primary Colors: My presentation at LOEX 2014

The LOEX 2014 conference has just ended, and I wanted to make sure folks who attended my presentation could access my slides. Click the title below to download the PowerPoint, or access my slides on SlideShare.

Primary Colors: The Art of Teaching & Learning with Primary Sources in the Library

Thanks to everyone who attended my discussion of teaching with primary sources in the library! I’ll post again soon with my takeaways from this great library instruction conference.

ACRL’s Proposed Framework for Information Literacy: A View from the Disciplines

I’m feeling a bit behind the times on this, but I held off on posting about ACRL’s proposed Framework for Information Literacy. I promised to contribute a column on the Framework to ANSS Currents, the newsletter of the Anthropology & Sociology Section of ACRL, and didn’t want to preempt that with a blog post.

The spring 2014 issue of ANSS Currents has been released, and my thoughts on the new Framework from the perspective of subject liaison librarians begin on page 19. For a little context, the new Framework is intended to replace the existing ACRL information literacy standards, which were adopted in 2000. Since then, a myriad of subject-specific standards have been developed, including standards for information literacy in anthropology and sociology, and it will be interesting to see what becomes of those should the new Framework be implemented. You’ll have to read the column in ANSS Currents for my initial thoughts on the matter.

Tomorrow I head off to Grand Rapids, Michigan, for the LOEX 2014 conference, and I expect to have some great discussions on the new Framework and the future of information literacy efforts in academic libraries. I’m also presenting on my work with students using primary sources, so it’s going to be a weekend full of library awesomeness.

Doing Content Analysis in a Building Full of Content

Two weeks ago I presented at the 2013 Forum of the NOLA Information Literacy Collective, our local gathering of academic instruction librarians in the Greater New Orleans area. I shared my experience developing an active learning session at the library to drive home a sociology professor’s lesson on content analysis methodology. The slides from my presentation are available online, and this post describes the lesson plan, which readers can adopt and adapt in their own instruction.

For a little background, students in the sociology major are required to take a sequence of 3 courses early in their major career: a foundations course, a research design course, and a research analysis (a.k.a. statistics) course. Collaborating with faculty members, I’ve integrated information literacy instruction into this 3-course progression.1 In the foundations class, students learn to read an academic article and conduct basic searches in the database Sociological Abstracts in service of an annotated bibliography assignment. In research design, there are two library sessions: the first focuses on building a literature review, and the second had been an introduction to using and evaluating U.S. Census data. In the last course, the library session is about locating quantitative data sets. The lesson plan I’m about to describe replaced the second session in the research design class.

The professor wanted students in her class to be exposed to archival materials. In semesters past, her students had fixated on surveys as the preferred sociological methodology, and she wanted to promote qualitative analysis of existing materials. In order to assist in this goal, I first needed to familiarize myself with what archival materials meant in sociological research, so I consulted the sociology literature and read up on studies using content analysis methods. The professor and I corresponded and brainstormed on our goals and expectations for the session, and I came up with the following:

Before the session, students had homework: use Sociological Abstracts to find an article wherein the author uses content analysis methodology, read and evaluate it, and bring it to the library session. Students were given the hint to include “content analysis” in the Abstract field. This served both as a refresher on how to use the database, and modeled the use of content analysis by experts in the field.

To start the library session, I asked students to share two aspects of their located article: what was the content used by the author, and what sociological question did the author seek to answer? I listed on the board the types of content students identified, and the common themes among authors’ research questions. This showed students the range of content that can be analyzed, and which sociological questions are appropriate and possible with this methodological approach. The professor also had the opportunity to ask follow-up questions about students’ articles.

Then came the big reveal! I had a cart stacked with potential content: namely, print materials I had selected from the stacks. These included Caldecott award-winning children’s books, U.S. history textbooks from the 1980s and 1990s, published song lyrics from three disparate artists2, bound Life magazines from the 1950s, and recent issues of the photography magazine American Photo. I divided the students into 5 groups, assigned each group one type of content from the cart, and challenged them to work together to come up with a research proposal using content analysis on their assigned materials. Now they had the chance to practice the method they had only read about to this point, and by providing a content set to practice with, I removed the added challenge of conceptualizing and locating their own data set before they had a strong grasp of how to use it. (Watch for a future post on the relationship between finding and using information in student research assignments.)

After 10 to 12 minutes of group work, during which the professor and I circulated to offer encouragement and assistance as needed, I asked for a representative from each group to describe their content to the class and share the research proposal they came up with, emphasizing how they would apply content analysis methodology and the sociological question they intended to address. The exercise was a great success! Students were able to articulate the use of content analysis methods and generated interesting sociological research questions. The professor asked follow-up questions that related back to their initial class lecture and readings on the method, and she later reported back to me that students seemed to “get it” both in subsequent class meetings and later on the exam. Thus we were able to assess students’ learning both in their reporting back to the class in the library session, and in the formalized space of their exam responses.

The sequence of assignments and activities here was designed to provide some scaffolding for the students as they learned to do content analysis. Scaffolding, a pedagogical metaphor often attributed to Jerome Bruner, is a method of breaking down the distance between novices and experts by creating stages through which students can work, taking into account what they can do on their own, what they can do with assistance, and what might still lie beyond the learning horizon.3 In this lesson plan, we first modeled for them what content analysis is and looks like in the discipline through their class readings and lecture, and then in finding their own example of a sociologist using the methodology. Then we worked as a class to list types of content and research questions typical of the methodology. Then, working in small groups, students could rely on each other during the hands-on challenge, giving students who already grasped the concept the chance to help along those who weren’t quite there. Finally, with hands-on practice under their belts, students could demonstrate their conceptual grasp of the method and its application individually on the exam.

This lesson plan based on content analysis methodology in sociology could be applied in any disciplinary context using primary source materials. It’s also a means of showcasing interesting collections in the library while avoiding a dry show-and-tell presentation. Students will become aware of library sources by using them in service of a real learning goal, and not just for the sake of the materials themselves. This approach also demonstrates to faculty members that library instruction is more than just learning how to search, but can be a space for active engagement with library materials in support of their disciplinary content and learning goals. I encourage readers to adopt and adapt the content analysis library session, and come back to share how it went.

1. For more on this sequenced information literacy program, see my article “Beyond the One-Shot: Advantages of a Programmatic Approach to Information Literacy Instruction,” ANSS Currents 27 no. 2 (Fall 2012): 23-26 (http://anssacrl.files.wordpress.com/2010/09/anss-currents-fall-2012-1.pdf).
2. The song lyrics were three published collections: Hank Williams, Paul Simon, and John Denver.
3. See, for example, David Wood, Jerome S. Bruner, and Gail Ross, “The Role of Tutoring in Problem Solving,” Journal of Child Psychology and Psychiatry 17, no. 2 (1976): 89-100.