W-PIV-1: Infrastructure and Support
Wednesday, 07/Mar/2018:
4:00pm - 5:30pm

Session Chair: Tanja Säily
Location: PIV

4:00pm - 4:30pm
Long Paper (20+10min) [publication ready]

Towards an Open Science Infrastructure for the Digital Humanities: The Case of CLARIN

Koenraad De Smedt1, Franciska De Jong2, Bente Maegaard3, Darja Fišer4, Dieter Van Uytvanck2

1University of Bergen, Norway; 2CLARIN ERIC, The Netherlands; 3University of Copenhagen, Denmark; 4University of Ljubljana and Jožef Stefan Institute, Slovenia

CLARIN is the European research infrastructure for language resources. It is a sustainable home for digital research data in the humanities and it also offers tools and services for annotation, analysis and modeling. The scope and structure of CLARIN enable a wide range of studies and approaches, including comparative studies across regions, periods, languages and cultures. CLARIN does not see itself as a stand-alone facility, but rather as a player in making the vision that underlies the emerging European policies towards Open Science a reality, interconnecting researchers across national and disciplinary borders by offering seamless access to data and services in line with the FAIR data principles. CLARIN also aims to contribute to responsible data science through the design as well as the governance of its infrastructure and to achieve an appropriate and transparent division of responsibilities between data providers, technical centres, and end users. CLARIN offers training towards digital scholarship for humanities scholars and aims at increased uptake from this audience.


4:30pm - 4:45pm
Short Paper (10+5min) [abstract]

The big challenge of data! Managing digital resources and infrastructures for digital humanities researchers

Isto Huvila

Uppsala University

Digital humanities research is dependent on the development and adoption of appropriate digital methods and technologies, collection and digitisation of data, and development of relevant and practicable research questions. In the long run, the potential of the field to sustain itself as a significant social intellectual movement (or in Kuhnian terms, paradigm) is, however, conditional on the sustainability of the scholarly practices in the field. Digital humanities research has already moved from early methodological experiments to the systematic development of research infrastructures. These efforts are based both on the explicit need to develop new resources for digital humanities research and on the strategic initiatives of the keepers of relevant existing collections and datasets to open up their holdings for users. Harmonisation and interoperability of the evolving infrastructures are at different stages of development both nationally and internationally, but in spite of the large number of practical difficulties, the various national, European (e.g. DARIAH, CLARIN and ARIADNE) and international initiatives are making progress in this respect. The sustainability of digital infrastructures is another issue that has been scrutinised and addressed both in theory and practice under the auspices of national data archives, specialist organisations like the British Digital Curation Centre and international discussions, for instance, within the iPRES conference community. However, an aspect of the management of the infrastructures that has received relatively little attention so far is management for use. We lack a comprehensive understanding of how the emerging digital data and infrastructures are used and could be used and, consequently, of how the resulting resources should be managed to be useful for digital humanities research, not only in the context within which they were developed but also for other researchers and, in many cases, users outside academia.

This paper discusses the processes and competences for the management of digital humanities resources and infrastructures for (theoretically) maximising their current and future usefulness for the purposes of research. On the basis of empirical work on archaeological research data in the context of the Swedish Archaeological Information in the Digital Society (ARKDIS) research project (Huvila, 2014) and a comparative study with selected digital infrastructures in other branches of humanities research, a model of use-oriented management of research data with central processes and competences is presented. The suggested approach complements existing digital curation and management models by opening up the user-side processes of digital humanities data resources and their implications for the functioning, development and management of appropriate research infrastructures. Theoretically, the approach draws from the records continuum theory (as formulated by Upward and colleagues (e.g. Upward, 1996, 1997, 2000; McKemmish, 2001)) and Pickering’s notion of the mangle of practice (Pickering, 1995) developed in the context of the social studies of science. The model demonstrates the significance of being sensitive not only to the explicit wants and needs of the researchers (users) but also to the implicit, often tacit requirements that emerge from their practical research work. Simultaneously, the findings emphasise the need for a meta-competence to manage the data and provide appropriate services for its users.


Huvila, I. (Ed.) (2014). Perspectives to Archaeological Information in the Digital Society. Uppsala: Department of ALM, Uppsala University.


McKemmish, S. (2001). Placing Records Continuum Theory and Practice. Archival Science, 1(4), 333–359.


Pickering, A. (1995). The Mangle of Practice: Time, Agency, and Science. Chicago: University of Chicago Press.

Upward, F. (1996). Structuring the Records Continuum, Part One: Postcustodial Principles and Properties. Archives and Manuscripts, 24(2), 268–285.

Upward, F. (1997). Structuring the Records Continuum, Part Two: Structuration Theory and Recordkeeping. Archives and Manuscripts, 25(1), 10–35.

Upward, F. (2000). Modelling the continuum as paradigm shift in recordkeeping and archiving processes, and beyond–a personal reflection. Records Management Journal, 10(3), 115–139.


4:45pm - 5:00pm
Short Paper (10+5min) [abstract]

Research in Nordic literary collections: What is possible and what is relevant?

Mads Rosendahl Thomsen1, Kristoffer Laigaard Nielbo2, Mats Malm3

1Aarhus University; 2University of Southern Denmark; 3University of Gothenburg

There is a growing number of digital literary collections in the Nordic countries that make the literary heritage accessible and have great potential for research that takes advantage of machine-readable texts. These collections range from very large collections such as the Norwegian Bokhylla, through medium-sized collections such as the Swedish Litteraturbanken and the Danish Arkiv for Dansk Litteratur, to one-author collections, e.g. the collected works of N.F.S. Grundtvig. In this presentation we will discuss some of the obstacles to more widespread use of these collections by literary scholars and present outcomes of a series of seminars – UCLA 2015, Aarhus 2016, UCLA 2017 – sponsored by the Fondation Maison des sciences de l’homme courtesy of a grant from the Andrew Carnegie Mellon Foundation.

We find that there are two important thresholds in the use of collections:

1) The technical obstacles to collecting the right corpora and applying the appropriate tools for analysis are too high for the majority of researchers working in literary studies. While much has been done to advance access to works, differences in formats and metadata make it difficult to work across the collections. Our project has addressed this issue by creating a Nordic GitHub repository for literary texts, CLEAR, which provides cleaned versions of Nordic literary works, as well as a suite of tools in Python.

2) The capacity to combine traditional hermeneutical approaches to literary studies with computational approaches is still in its infancy despite numerous good studies from the past years, e.g. by Stanford Literary Lab, Leonard and Tangherlini, and Ted Underwood. In our series of seminars, we have worked to bring together scholars with great technical prowess and more traditionally trained literary scholars to generate projects that are technically feasible and scholarly relevant. The process of expanding the methodological vocabulary of literary studies is complicated and requires significant domain expertise to verify the outcome of computational analyses, and conversely, openness to working with results that cannot be verified by close readings. In this presentation we will show how thematic variation and readability can provide new perspectives on Swedish and Danish modernist literature, and discuss how this relates to more general visions of literary studies in an age of computation (Heise, Thomsen).


Algee-Hewitt, Mark et al. 2016. “Canon/Archive. Large-scale Dynamics in the Literary Field.” Stanford Literary Lab Pamphlet 11.

Heise, Ursula. 2017. “Comparative literature and computational criticism: A conversation with Franco Moretti.” Futures of Comparative Literature: ACLA State of the Discipline Report. London: Routledge.

Leonard, Peter and Timothy R. Tangherlini. 2013. “Trawling in the Sea of the Great Unread: Sub-Corpus Topic Modeling and Humanities Research.” Poetics 41(6): 725–749.

Thomsen, Mads Rosendahl et al. 2015. “No Future without Humanities.” Humanities 1.

Underwood, Ted. 2013. Why Literary Periods Mattered. Stanford: Stanford University Press.


5:00pm - 5:30pm
Long Paper (20+10min) [publication ready]

Reassembling the Republic of Letters - A Linked Data Approach

Jouni Tuominen1,2, Eetu Mäkelä1,2, Eero Hyvönen1,2, Arno Bosse3, Miranda Lewis3, Howard Hotson3

1Aalto University, Semantic Computing Research Group (SeCo); 2University of Helsinki, HELDIG – Helsinki Centre for Digital Humanities; 3University of Oxford, Faculty of History

Between 1500 and 1800, a revolution in postal communication allowed ordinary men and women to scatter letters across and beyond Europe. This exchange helped knit together what contemporaries called the respublica litteraria, Republic of Letters, a knowledge-based civil society, crucial to that era’s intellectual breakthroughs, and formative of many modern European values and institutions. To enable effective Digital Humanities research on the epistolary data distributed in different countries and collections, metadata about the letters have been aggregated, harmonised, and provided for the research community through the Early Modern Letters Online (EMLO) service. This paper discusses the idea and benefits of using Linked Data as a basis for the next digital framework of EMLO, and presents experiences of a first demonstrational implementation of such a system.


