On Tuesday, September 9th, I will be teaching a workshop on data visualization for the IUPUI Arts & Humanities Institute, “Introduction to Data Visualization I: Visualization with Gephi.” For the uninitiated, Gephi is an open-source network visualization program. The tool is suited to networks of any size. It offers a vast array of network analysis and visualization options, including geospatial layouts for data, statistical measures for social network analysis, and dynamic network visualization. Gephi handles a variety of data formats and allows datasets to be constructed within the tool itself, which is perfect for those working with smaller amounts of data. Gephi runs on Windows, Mac, and Linux operating systems.
Aside from preparing for the onslaught of instruction that will be fall semester, my time lately has been spent exploring topic modeling (I realize that I am somewhat late to the game on this, but it has been on my ‘to do’ list for a while now). After installing MALLET, a Java-based natural language processing package that facilitates topic modeling among other things, reading this helpful tutorial, and seeing evidence of topic modeling’s utility for analyzing large volumes of text, I am intrigued but also somewhat overwhelmed. The further I move away from introductory explanations of topic modeling, like David M.
Do a Google image search for data visualization and undoubtedly you will see many examples of networks, otherwise known as graphs. The identification and study of these networks is useful in a variety of fields, from social network analysis in sociology and social informatics to the study of predation networks in ecology. If you can identify connections between groups of entities, then you can study them using some aspect of network theory. However, the visual representations of these networks as graphs are often difficult to interpret. This post intends to shed some light on the topic of network visualizations.
Essentially, networks are data structures that represent relationships between entities. For example, Author A writes an article with Author B. In this case the authors are the entities, and they are connected through their co-authoring relationship. Graphs consist of nodes (entities) and edges (relationships that connect the entities). We might visually represent the previous example as:
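In code, the same co-authorship example can be sketched as a small node-and-edge structure. This is only a minimal illustration in plain Python, assuming an undirected relationship; the names `nodes`, `edges`, and `neighbors` are hypothetical and do not come from Gephi or any other tool mentioned here.

```python
# Nodes are the entities in the network.
nodes = {"Author A", "Author B"}

# Edges are the relationships that connect the entities.
# Each edge is a frozenset because co-authorship has no direction:
# (A, B) and (B, A) describe the same relationship.
edges = {frozenset({"Author A", "Author B"})}


def neighbors(node, edges):
    """Return every node that shares at least one edge with `node`."""
    return {
        other
        for edge in edges
        if node in edge
        for other in edge
        if other != node
    }


# Look up who Author A has co-authored with.
print(neighbors("Author A", edges))
```

Adding a third author is a matter of adding one more name to `nodes` and one more `frozenset` pair to `edges`; the lookup function stays the same.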
I recently attended the Federal Depository Library Conference in Washington, D.C. Among the many interesting topics discussed, one in particular caught my attention and got me thinking about the way my duties as a documents librarian and as a member of our Digital Scholarship Team overlap: promoting access to and preserving born-digital government information.
Over the past decade, the amount of government information online has far outpaced the number of documents printed by the Government Printing Office (GPO) for distribution through the Federal Depository Library Program (FDLP) (Jacobs, 2014). The sheer volume of this information makes both providing access (at least through bibliographic control) and ensuring preservation extremely difficult. What’s worse, much of this information is transitory and is lost when administrations change or Congressional committees disband.
On April 11th, members of the Digital Scholarship Team presented initial work in analyzing the text of the Indianapolis Recorder at IUPUI Research Day. The Indianapolis Recorder is one of the nation’s oldest and most prominent African American newspapers, and the University Library has a digitized collection covering the vast majority of issues published between 1899 and 2005. The full text of more than 96,000 pages is currently available for export from the ContentDM platform as a tab-delimited or XML text file (1.3 GB).