The Digital Humanities (DH) are growing rapidly, but the necessary infrastructure
is being built up only gradually. For smaller DH projects, e.g. for
testing methods, as preliminary work for submitting applications or for use in
teaching, a corpus often has to be digitised. These small-scale projects make an
important contribution to safeguarding cultural heritage and making it available, as
they make machine-readable those resources that are of little or no interest
to large projects because they are too specialised or too limited in scope. They
close the gap between the large scanning projects run by archives and libraries or in connection
with research projects, and projects that move beyond the canonised paths.
Yet these small projects can fail at this first step of digitisation, because it is
often a hurdle for (Digital) Humanists at universities to get the desired texts digitised:
either the digitisation infrastructure in libraries and archives is not
(yet) available, or it is a paid service. In addition, researchers are often not digitisation experts,
and a suitable infrastructure at their university is missing.
In order to promote small DH projects for teaching purposes, a digitising infrastructure
was set up at the University of Stuttgart as part of a teaching project. It
is intended to enable teachers to digitise smaller corpora autonomously.
This article presents a study that was carried out as part of this teaching project.
It suggests how to implement best practices and which aspects of the digitisation
workflow need special attention.
The target group of this article is (Digital) Humanists who want to digitise a
smaller corpus. Even with no expertise in scanning and OCR and no possibility
of outsourcing the digitisation of the project, they would still like to obtain the best
possible machine-readable files.