Go Digital, They Said. It Will Be Fun, They Said. Teaching DH Methods for Historical Research

Session 5A

Author

Affiliation

Ina Serif

University of Basel, Switzerland

Published

September 13, 2024

Modified

November 15, 2024

Doi

10.5281/zenodo.14171331

Abstract

The digitization of historical materials and the application of computational techniques significantly expand the spectrum of sources and questions for historical research. However, the practical use of computer-assisted methods often involves resolving technical problems unique to a specific project. When teaching such methods to history students, this is the major challenge: there isn’t a simple set of commands that covers all the potential issues in a research project. Moreover, the goal is not to train humanities students to be computer scientists, but to equip them with the skills to tackle specific problems. I will discuss how, based on problems faced in my own research, I combine the teaching of computer-assisted methods with student projects to help the students understand the limitations of out-of-the-box solutions while letting them experience the possibilities of digital analyses. Through their own project, students learn how to break down research questions into separate, manageable technical tasks and identify which types of problems can and which can’t be resolved using digital history methods.

Keywords

teaching computer-assisted methods, digital history, digital literacy

For this paper, slides are available on Zenodo (PDF).

Introduction

As historians today, we profit from an unmatched availability of historical sources online, with most of the information contained in these sources digitally accessible. This greatly facilitates the use of computer-assisted methods to support or augment historical analyses. How and when to use which methods in a research endeavor are questions that cannot easily be answered, as the application of appropriate techniques more often than not is something to be clarified or revised during a project. Therefore, we need to find a way to not only teach computer-assisted methods to history students, but also how to enable them to conceptualize a historical research project and how to solve technical problems along the way, empowering them to develop and apply different methods in a practical and inspiring way. In the following, I will discuss an approach that proposes designing semester-long courses with a thematic focus, where students progressively learn how to use computational tools through continuous engagement with a historical source.

Motivation and Course Design

In a text-based field like history, techniques such as text recognition, text/data mining or natural language processing are very valuable for historical analyses (see for example Jockers and Underwood 2016). However, university courses for history students should go beyond merely teaching a specific technique. They should also equip digital novices with the skills to navigate the digital realm, whether that involves (basic) computer skills, effective collaboration on projects or questions related to data management (for two recent handbooks on how to teach digital history see Battershill and Ross 2022; Guiliano 2022).

Over the years, I have experimented with various course designs, with introductions to specific software as well as to programming languages. Approaching the topic from the perspective of a course based on programming in order to analyse historical sources, however, has consistently produced the best results in both project outcomes and course evaluations.¹ Now, with the rise of large language models, it has been argued that AI can easily generate any script, prompting some to question the necessity of teaching programming. For effective use of this technology, though, learning basic programming skills is essential. Relying on AI generated output without understanding its mechanics will result in mistakes, unnoticed misinterpretations, and, eventually, useless research. By learning the basics of scripting, students not only acquire the ability to perform their own analyses on a data set, but also learn how to use generative AI productively, enabling them to critically assess, correct and refine the output.

The current curriculum at the Department of History at the University of Basel does not include foundational, semester-long courses that cover digital literacy or computational skills on a broad basis for all students. Since 2022, however, a self-paced introductory course to digital history has become a mandatory part of the first semester (Serif 2022). This course provides an initial overview of digital methods and their use for historical research, along with a practical component where students learn to apply different methods to a corpus. They are gently introduced to the command line,² learning about APIs, regular expressions, string extraction, automation and other relevant techniques, as well as ways to visualize first results. By encountering a computer-based approach to historical sources early in their studies, students become more aware of the subject when planning their courses for the following semesters.

In the absence of a comprehensive introductory course, the digital history courses I offer still begin at a basic level. By showing students the command line as a way to use the computer, I aim to dispel any unfounded fears and encourage a different way of thinking: A task that initially seems overwhelming can be broken down into several small steps, leading to its completion. I let students work on a small multi-step task that increases their motivation and demonstrates the potential relevance of these methods for historical research. The courses are student-project based, and while we also discuss some examples of digital history projects and reflect on the methods used (reading assignments include Romein et al. 2020; Graham et al. 2022; and Lemercier and Zalc 2019), the focus lies on learning by doing, this is by working with their own material, towards the completion of their project.

When developing such a course, I take inspiration from the problems I face. In ongoing research, for example, I am examining book advertisements placed in an early modern newspaper.³ One part of this project requires a matching of the advertised titles with an existing database of printed books⁴ in order to enrich the dataset with additional information such as format, number of pages, edition, or genre. To achieve this, one has to overcome a series of obstacles, such as extracting book titles from an advertisement, creating database queries, transforming the received format, handling missing or incorrect metadata, and adding the information to the original dataset. Neither a simple list of commands nor an out-of-the-box solution exists to solve all these problems, encountered while working on the source, at once. However, when tackled individually, each step becomes more understandable and manageable for a programming novice. Using this as a classroom example, programming becomes directly linked to a specific historical source to answer very concrete questions, for example as simple as determining the number of book titles advertised in the newspaper during a particular year.

Through small programming exercises, students learn principles of automation, standardization, scalability, etc., while also understanding the importance of metadata and data formats. Examples are drawn from the same historical corpora that will be used in their projects later in the semester, allowing the students to become familiar with both the methods and the sources. This approach helps students gradually understand the potential of computer-assisted analysis and how to apply it to historical research. After being equipped with the technical basics, students form small groups to develop research questions that they would like to explore using digital methods. At the end of the semester, they present their findings, including any challenges faced, discuss how their analyses addressed their initial questions, and reflect on further analyses that could be performed and new questions that arise in light of their results. This structure ensures that students remain connected to the historical material, learn programming not merely for its own sake and identify usefulness and understand limitations of computer-assisted methods in answering historical research questions.

Conclusion and Outlook

The described courses provide an opportunity for every student to learn how to use computer-assisted methods for historical research. From the course evaluations we know that the courses have been largely appreciated, and that there is a strong demand for more classes of this kind. Furthermore, a significant portion of the participants come from other humanities disciplines, as their own curricula lack equivalent courses. Admittedly, the learning curve is quite steep, and the pace at the beginning is fast. In the current setting, this is unavoidable, but those who persevere often enjoy experimenting with their new skills and achieve unexpected results.

So far, only few students choose to focus on computational analyses for their bachelor’s or master’s thesis,⁵ mostly because they do not feel fully confident with their new skill set (and also because potential supervisors often lack sufficient expertise to support them). Consequently, changes in the humanities curriculum seem necessary if we aim to educate more students in digital methods for historical research. With the increasing prominence of large language models, it seems all the more crucial to ensure that future historians can produce verifiable and reproducible results, leveraging computer-assisted methods both effectively and meaningfully.

References

Battershill, Claire, and Shawna Ross. 2022. Using Digital Humanities in the Classroom: A Practical Introduction for Teachers, Lecturers, and Students. Second edition. London New York Oxford New Delhi Sydney: Bloomsbury Academic.

Blaney, Jonathan, Jane Winters, Sarah Milligan, and Martin Steer. 2021. Doing Digital History: A Beginner’s Guide to Working with Text as Data. IHR Research Guides. Manchester: Manchester University Press.

Dickmann, Lars. 2022. “Topographien Des Verlorenen. Zur Praxis Des Verlierens Und Findens Im Basler ‘Avisblatt,’ 1729-1844.” Thesis, Universität Basel. https://edoc.unibas.ch/91836/.

Graham, Shawn, Ian Milligan, Scott B Weingart, and Kim Martin. 2022. Exploring Big Historical Data: The Historian’s Macroscope. 2nd ed. WORLD SCIENTIFIC. https://doi.org/10.1142/12435.

Guiliano, Jennifer. 2022. A Primer for Teaching Digital History: Ten Design Principles. Design Principles for Teaching History. Durham: Duke University Press.

Jockers, Matthew L., and Ted Underwood. 2016. “Text-Mining the Humanities.” In A New Companion to the Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 291–306.

Lemercier, Claire, and Claire Zalc. 2019. Quantitative Methods in the Humanities. An Introduction. Charlottesville: University of Virginia Press.

Romein, C. Annemieke, Max Kemman, Julie M. Birkholz, James Baker, Michel De Gruijter, Albert Meroño‐Peñuela, Thorsten Ries, Ruben Ros, and Stefania Scagliola. 2020. “State of the Field: Digital History.” History 105 (365): 291–312. https://doi.org/10.1111/1468-229X.12969.

Serif, Ina. 2022. “Introduction to Digital History.” https://wissen-ist-acht.github.io/digitalhistory.intro/.

Footnotes

Either R or Python is taught, as both offer a wide range of packages and libraries for humanities data as well as abundant tutorials for different methods.↩︎
Probably most discussed if necessary or not – I found confirmation for introducing the command line among others in (Blaney et al. 2021).↩︎
The subset of book ads had been created in the context of a SNF project, see https://avisblatt.philhist.unibas.ch/.↩︎
Verzeichnis Deutscher Drucke des 18. Jahrhunderts VD18, https://vd18.k10plus.de.↩︎
Some of the underlying ideas for the analytic part in the master thesis of Dickmann (2022) was developed by him in a course of mine in fall 2020, see https://github.com/LarsDIK/avis-analysis.↩︎

Reuse

CC BY-SA 4.0

Citation

BibTeX citation:

@misc{serif2024,
  author = {Serif, Ina},
  editor = {Baudry, Jérôme and Burkart, Lucas and Joyeux-Prunel,
    Béatrice and Kurmann, Eliane and Mähr, Moritz and Natale, Enrico and
    Sibille, Christiane and Twente, Moritz},
  title = {Go {Digital,} {They} {Said.} {It} {Will} {Be} {Fun,} {They}
    {Said.} {Teaching} {DH} {Methods} for {Historical} {Research}},
  date = {2024-09-13},
  url = {https://digihistch24.github.io/submissions/687/},
  doi = {10.5281/zenodo.14171331},
  langid = {en},
  abstract = {The digitization of historical materials and the
    application of computational techniques significantly expand the
    spectrum of sources and questions for historical research. However,
    the practical use of computer-assisted methods often involves
    resolving technical problems unique to a specific project. When
    teaching such methods to history students, this is the major
    challenge: there isn’t a simple set of commands that covers all the
    potential issues in a research project. Moreover, the goal is not to
    train humanities students to be computer scientists, but to equip
    them with the skills to tackle specific problems. I will discuss
    how, based on problems faced in my own research, I combine the
    teaching of computer-assisted methods with student projects to help
    the students understand the limitations of out-of-the-box solutions
    while letting them experience the possibilities of digital analyses.
    Through their own project, students learn how to break down research
    questions into separate, manageable technical tasks and identify
    which types of problems can and which can’t be resolved using
    digital history methods.}
}

For attribution, please cite this work as:

Serif, Ina. 2024. “Go Digital, They Said. It Will Be Fun, They Said. Teaching DH Methods for Historical Research.” Edited by Jérôme Baudry, Lucas Burkart, Béatrice Joyeux-Prunel, Eliane Kurmann, Moritz Mähr, Enrico Natale, Christiane Sibille, and Moritz Twente. Digital History Switzerland 2024: Book of Abstracts. https://doi.org/10.5281/zenodo.14171331.

--- submission_id: 687 categories: 'Session 5A' title: Go Digital, They Said. It Will Be Fun, They Said. Teaching DH Methods for Historical Research author: - name: Ina Serif orcid: 0000-0003-2419-4252 email: ina.serif@unibas.ch affiliation: University of Basel, Switzerland keywords: - teaching computer-assisted methods - digital history - digital literacy abstract: | The digitization of historical materials and the application of computational techniques significantly expand the spectrum of sources and questions for historical research. However, the practical use of computer-assisted methods often involves resolving technical problems unique to a specific project. When teaching such methods to history students, this is the major challenge: there isn't a simple set of commands that covers all the potential issues in a research project. Moreover, the goal is not to train humanities students to be computer scientists, but to equip them with the skills to tackle specific problems. I will discuss how, based on problems faced in my own research, I combine the teaching of computer-assisted methods with student projects to help the students understand the limitations of out-of-the-box solutions while letting them experience the possibilities of digital analyses. Through their own project, students learn how to break down research questions into separate, manageable technical tasks and identify which types of problems can and which can’t be resolved using digital history methods. date: 09-13-2024 date-modified: 11-15-2024 doi: 10.5281/zenodo.14171331 other-links: - text: Presentation Slides (PDF) href: https://doi.org/10.5281/zenodo.14171331 bibliography: references.bib --- ::: {.callout-note appearance="simple" icon=false} For this paper, slides are available [on Zenodo (PDF)](https://zenodo.org/records/14171331/files/687_DigiHistCH24_GoDigital_Slides.pdf). ::: ## Introduction As historians today, we profit from an unmatched availability of historical sources online, with most of the information contained in these sources digitally accessible. This greatly facilitates the use of computer-assisted methods to support or augment historical analyses. How and when to use which methods in a research endeavor are questions that cannot easily be answered, as the application of appropriate techniques more often than not is something to be clarified or revised during a project. Therefore, we need to find a way to not only teach computer-assisted methods to history students, but also how to enable them to conceptualize a historical research project and how to solve technical problems along the way, empowering them to develop and apply different methods in a practical and inspiring way. In the following, I will discuss an approach that proposes designing semester-long courses with a thematic focus, where students progressively learn how to use computational tools through continuous engagement with a historical source. ## Motivation and Course Design In a text-based field like history, techniques such as text recognition, text/data mining or natural language processing are very valuable for historical analyses [see for example @jockers_text-mining_2016]. However, university courses for history students should go beyond merely teaching a specific technique. They should also equip digital novices with the skills to navigate the digital realm, whether that involves (basic) computer skills, effective collaboration on projects or questions related to data management [for two recent handbooks on how to teach digital history see @battershill_using_2022; @guiliano_primer_2022]. Over the years, I have experimented with various course designs, with introductions to specific software as well as to programming languages. Approaching the topic from the perspective of a course based on programming in order to analyse historical sources, however, has consistently produced the best results in both project outcomes and course evaluations.[^4] Now, with the rise of large language models, it has been argued that AI can easily generate any script, prompting some to question the necessity of teaching programming. For effective use of this technology, though, learning basic programming skills is essential. Relying on AI generated output without understanding its mechanics will result in mistakes, unnoticed misinterpretations, and, eventually, useless research. By learning the basics of scripting, students not only acquire the ability to perform their own analyses on a data set, but also learn how to use generative AI productively, enabling them to critically assess, correct and refine the output. The current curriculum at the Department of History at the University of Basel does not include foundational, semester-long courses that cover digital literacy or computational skills on a broad basis for all students. Since 2022, however, a self-paced introductory course to digital history has become a mandatory part of the first semester [@serif_introduction_2022]. This course provides an initial overview of digital methods and their use for historical research, along with a practical component where students learn to apply different methods to a corpus. They are gently introduced to the command line,[^3] learning about APIs, regular expressions, string extraction, automation and other relevant techniques, as well as ways to visualize first results. By encountering a computer-based approach to historical sources early in their studies, students become more aware of the subject when planning their courses for the following semesters. In the absence of a comprehensive introductory course, the digital history courses I offer still begin at a basic level. By showing students the command line as a way to use the computer, I aim to dispel any unfounded fears and encourage a different way of thinking: A task that initially seems overwhelming can be broken down into several small steps, leading to its completion. I let students work on a small multi-step task that increases their motivation and demonstrates the potential relevance of these methods for historical research. The courses are student-project based, and while we also discuss some examples of digital history projects and reflect on the methods used [reading assignments include @romein_state_2020; @graham_exploring_2022; and @lemercier_quantitative_2019], the focus lies on learning by doing, this is by working with their own material, towards the completion of their project. When developing such a course, I take inspiration from the problems I face. In ongoing research, for example, I am examining book advertisements placed in an early modern newspaper.[^1] One part of this project requires a matching of the advertised titles with an existing database of printed books[^2] in order to enrich the dataset with additional information such as format, number of pages, edition, or genre. To achieve this, one has to overcome a series of obstacles, such as extracting book titles from an advertisement, creating database queries, transforming the received format, handling missing or incorrect metadata, and adding the information to the original dataset. Neither a simple list of commands nor an out-of-the-box solution exists to solve all these problems, encountered while working on the source, at once. However, when tackled individually, each step becomes more understandable and manageable for a programming novice. Using this as a classroom example, programming becomes directly linked to a specific historical source to answer very concrete questions, for example as simple as determining the number of book titles advertised in the newspaper during a particular year. Through small programming exercises, students learn principles of automation, standardization, scalability, etc.\, while also understanding the importance of metadata and data formats. Examples are drawn from the same historical corpora that will be used in their projects later in the semester, allowing the students to become familiar with both the methods and the sources. This approach helps students gradually understand the potential of computer-assisted analysis and how to apply it to historical research. After being equipped with the technical basics, students form small groups to develop research questions that they would like to explore using digital methods. At the end of the semester, they present their findings, including any challenges faced, discuss how their analyses addressed their initial questions, and reflect on further analyses that could be performed and new questions that arise in light of their results. This structure ensures that students remain connected to the historical material, learn programming not merely for its own sake and identify usefulness and understand limitations of computer-assisted methods in answering historical research questions. [^1]: The subset of book ads had been created in the context of a SNF project, see <https://avisblatt.philhist.unibas.ch/>. [^2]: Verzeichnis Deutscher Drucke des 18. Jahrhunderts VD18, <https://vd18.k10plus.de>. [^3]: Probably most discussed if necessary or not -- I found confirmation for introducing the command line among others in [@blaney_doing_2021]. [^4]: Either R or Python is taught, as both offer a wide range of packages and libraries for humanities data as well as abundant tutorials for different methods. ## Conclusion and Outlook The described courses provide an opportunity for every student to learn how to use computer-assisted methods for historical research. From the course evaluations we know that the courses have been largely appreciated, and that there is a strong demand for more classes of this kind. Furthermore, a significant portion of the participants come from other humanities disciplines, as their own curricula lack equivalent courses. Admittedly, the learning curve is quite steep, and the pace at the beginning is fast. In the current setting, this is unavoidable, but those who persevere often enjoy experimenting with their new skills and achieve unexpected results. So far, only few students choose to focus on computational analyses for their bachelor's or master's thesis,[^5] mostly because they do not feel fully confident with their new skill set (and also because potential supervisors often lack sufficient expertise to support them). Consequently, changes in the humanities curriculum seem necessary if we aim to educate more students in digital methods for historical research. With the increasing prominence of large language models, it seems all the more crucial to ensure that future historians can produce verifiable and reproducible results, leveraging computer-assisted methods both effectively and meaningfully. [^5]: Some of the underlying ideas for the analytic part in the master thesis of @dickmann_topographien_2022 was developed by him in a course of mine in fall 2020, see <https://github.com/LarsDIK/avis-analysis>. ## References ::: {#refs} :::