DigiHistCH24

Swiss Google Books for Research

Poster Session
Authors
Martin Reisacher (University of Basel, University Library)
Eric Dubey (University of Basel, University Library)
Matteo Lorenzini (University of Basel, University Library)

Published
September 12, 2024

DOI
10.5281/zenodo.13908139

A PDF version of the poster is available on Zenodo (PDF).

Poster Abstract

UB Bern, ZHB Lucerne, ZB Zurich and UB Basel are digitizing a large part of their 18th- and 19th-century holdings in collaboration with Google Books. The resulting digital collection is accessible in full text and is intended to open up new possibilities for digital and data-driven research and teaching, for example in text and data mining and distant reading.

At 90 million pages, the collection offers many opportunities but also presents libraries and researchers with new challenges. Google’s algorithms are responsible for image processing, book composition and full-text recognition. Whenever these algorithms change, they deliver new versions of the data, so continuous changes to the data must be expected. This steadily improves quality, but the process remains a black box, which makes it difficult to give a transparent account of how the data were produced.
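One way to make such version changes visible on the receiving side is to fingerprint each delivered page and compare fingerprints between deliveries. The following is only a minimal sketch under stated assumptions, not the project's method: the page identifiers, the sample texts, and the helper names `fingerprint` and `diff_versions` are all hypothetical, and each delivery is assumed to be available as a mapping from page ID to full text.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the SHA-256 hex digest of a page's full text (UTF-8 encoded)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_versions(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two deliveries, each mapping a page ID to its full text.

    Reports which page IDs were added, removed, or changed between them.
    """
    old_fp = {pid: fingerprint(text) for pid, text in old.items()}
    new_fp = {pid: fingerprint(text) for pid, text in new.items()}
    shared = old_fp.keys() & new_fp.keys()
    return {
        "added": sorted(new_fp.keys() - old_fp.keys()),
        "removed": sorted(old_fp.keys() - new_fp.keys()),
        "changed": sorted(pid for pid in shared if old_fp[pid] != new_fp[pid]),
    }


# Hypothetical sample data: the second delivery corrects one page and adds another.
v1 = {"book1_p001": "Die Stadt Bafel", "book1_p002": "im Jahre 1798"}
v2 = {
    "book1_p001": "Die Stadt Basel",
    "book1_p002": "im Jahre 1798",
    "book1_p003": "Ende",
}

report = diff_versions(v1, v2)
print(report)
```

A report like this does not open the black box, but it records which pages differ between deliveries, so that analyses can state the data version they were run on.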

The four partner libraries are currently working on a project (“Google Books for Research”) that addresses the following topics:

  • Research and teaching requirements for large digital historical text collections
  • State-of-the-art solutions for research-oriented accessibility of large historical text collections
  • Data quality and enrichment
  • Infrastructure solutions

The central question is how libraries, as cultural and memory institutions, can offer a relatively generic infrastructure in the digital space and keep it stable, while still allowing it to be used flexibly enough for very specific research questions and methods.

As part of the poster session, we will present the results of the preliminary project and would like to explore these further with the audience.


Reuse

CC BY-SA 4.0

Citation

BibTeX citation:
@misc{reisacher2024,
  author = {Reisacher, Martin and Dubey, Eric and Lorenzini, Matteo},
  editor = {Baudry, Jérôme and Burkart, Lucas and Joyeux-Prunel,
    Béatrice and Kurmann, Eliane and Mähr, Moritz and Natale, Enrico and
    Sibille, Christiane and Twente, Moritz},
  title = {Swiss {Google} {Books} for {Research}},
  date = {2024-09-12},
  url = {https://digihistch24.github.io/submissions/poster/484/},
  doi = {10.5281/zenodo.13908139},
  langid = {en}
}
For attribution, please cite this work as:
Reisacher, Martin, Eric Dubey, and Matteo Lorenzini. 2024. “Swiss Google Books for Research.” Edited by Jérôme Baudry, Lucas Burkart, Béatrice Joyeux-Prunel, Eliane Kurmann, Moritz Mähr, Enrico Natale, Christiane Sibille, and Moritz Twente. Digital History Switzerland 2024: Book of Abstracts. https://doi.org/10.5281/zenodo.13908139.
Source Code
---
submission_id: 484
categories: 'Poster Session'
title: Swiss Google Books for Research
author:
  - name: Martin Reisacher
    orcid: 0009-0008-4529-5291
    email: martin.reisacher@unibas.ch
    affiliations:
      - University of Basel, University Library
  - name: Eric Dubey
    orcid: 0000-0002-9300-9762
    email: eric.dubey@unibas.ch
    affiliations:
      - University of Basel, University Library
  - name: Matteo Lorenzini
    orcid: 0009-0009-4159-5614
    email: matteo.lorenzini@unibas.ch
    affiliations:
      - University of Basel, University Library
date: 09-12-2024
doi: 10.5281/zenodo.13908139
other-links:
  - text: Poster (PDF)
    href: https://doi.org/10.5281/zenodo.13908139
---

::: {.callout-note appearance="simple" icon=false}

A PDF version of the poster is available [on Zenodo (PDF)](https://zenodo.org/records/13908139/files/484_DigiHistCH24_SwissGoogleBooks_Poster.pdf).

:::
