On April 11th, members of the Digital Scholarship Team presented initial work on analyzing the text of the Indianapolis Recorder at IUPUI Research Day. The Indianapolis Recorder is one of the nation’s oldest and most prominent African American newspapers, and the University Library has a digitized collection covering the vast majority of issues published between 1899 and 2005. The full text of more than 96,000 pages is currently available for export from the ContentDM platform as a tab-delimited or XML text file (1.3GB).
To highlight potential use for this dataset, the Digital Scholarship Team performed some basic textual analysis and visualization using VOSviewer, a software program originally designed to analyze bibliometric networks. Using VOSviewer, the team created term co-occurrence maps of the most frequently occurring and highly relevant words in the text of the Recorder. As an example, a term map of all the issues published in the 1960s is compared with a term map of all the issues published in the 1970s.
Term map created from issues published in the 1960s.
Term map created from issues published in the 1970s.
The colors indicate the density of the terms in the text, the size of each term reflects how frequently it is used, and the proximity of terms to one another indicates that they are used in the same context (Van Eck & Waltman, 2011). For example, as one would expect, the term “church” is frequently used with “minister” and “congregation.” Perhaps more interesting is the visual representation of the change from the term “negro” (in the lower left of the 1960s map) to the term “black” (in the lower left of the 1970s map). This represents a significant shift in language, one that is clearly visible in the text of the Recorder. A similar term map for the issues published in the 1980s and 1990s would likely show the emergence of the term “African American.”
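To make the two quantities behind a term map more concrete, here is a minimal Python sketch of counting how often terms appear and how often pairs of terms appear together on the same page. It is not VOSviewer's algorithm, which also selects terms by relevance and computes a layout. The sample pages, the vocabulary, and the function names are invented for illustration and are not taken from the Recorder export. Each term is counted at most once per page.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a page of text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cooccurrence(pages, vocabulary):
    """Count how many pages contain each term and each pair of terms."""
    term_counts = Counter()
    pair_counts = Counter()
    for page in pages:
        # Keep only the terms of interest; sort so each pair has one fixed order.
        present = sorted(tokenize(page) & vocabulary)
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))
    return term_counts, pair_counts


# Invented sample pages, for illustration only (not Recorder text).
pages = [
    "The minister welcomed the congregation to the church.",
    "The church choir sang for the congregation.",
    "The school board met on Tuesday.",
]
# Hypothetical list of terms to track.
vocabulary = {"church", "minister", "congregation", "school"}

term_counts, pair_counts = cooccurrence(pages, vocabulary)
print(term_counts.most_common())
print(pair_counts.most_common())
```

In a map built from counts like these, terms with higher page counts would be drawn larger, and pairs with higher co-occurrence counts would be placed closer together. Running the same counts separately on pages from each decade is one way to check a shift in vocabulary numerically.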
One question many people have when faced with visualizations is what we can learn from them: the inevitable “so what?” It is true that we must be cautious about the conclusions we draw from these types of visualizations, recognizing that they are only as good as the data that support them. In the case of the Recorder, there are limitations to the data (for further discussion, see http://hdl.handle.net/1805/4263). However, visualizations allow us to identify patterns and trends across longer time scales, in larger amounts of data, and at higher levels of complexity, an approach Matthew Jockers refers to as macroanalysis (http://www.matthewjockers.net/2011/07/01/on-distant-reading-and-macroanalysis/).
Textual analysis combined with visualization will not by itself answer a research question, but as one tool in a researcher’s ever-expanding digital toolkit it provides useful insights. Visualization can present us with patterns that confirm what we already know, offering a new methodological approach to validating existing research. Or, used in an exploratory way, it can help us identify new patterns that we can investigate further using other methodologies.
Van Eck, Nees Jan, & Waltman, Ludo. (2011). Text mining and visualization using VOSviewer. ISSI Newsletter, 7(3), 50-54. http://arxiv.org/ftp/arxiv/papers/1109/1109.2058.pdf