Conference Agenda

Session Overview
Location: P674
 
Date: Thursday, 08/Mar/2018
11:00am - 12:30pm
T-P674-1: Place
Session Chair: Christian-Emil Smith Ore
 
11:00am - 11:30am
Long Paper (20+10min) [publication ready]

SDHK meets NER: Linking place names with medieval charters and historical maps

Olof Karsvall2, Lars Borin1

1University of Gothenburg; 2Swedish National Archives

Mass digitization of historical text sources opens new avenues for research in the humanities and social sciences, but also presents a host of new methodological challenges. Historical text collections become more accessible, but new research tools must also be put in place in order to fully exploit the new research possibilities emerging from having access to vast document collections in digital format. This paper highlights some of the conditions to consider when place names in an older source material, in this case medieval charters, are to be matched to geographical data. The Swedish National Archives make some 43,000 medieval letters available in digital form through an online search facility. The volume of the material is such that manual markup of names will not be feasible. In this paper, we present the material, discuss the promises for research of linking, e.g., place names to other digital databases, and report on an experiment where an off-the-shelf named-entity recognition system for modern Swedish is applied to this material.
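As a rough illustration of the linking task described above (not the authors' actual NER pipeline), a gazetteer-based baseline could match place-name candidates in charter text against geographic records. The gazetteer entries, place names, and coordinates below are invented for illustration:

```python
# Toy gazetteer: normalized place name -> (latitude, longitude).
# A real system would use an NER model plus an authority file.
GAZETTEER = {
    "uppsala": (59.8586, 17.6389),
    "lodose": (58.0360, 12.1530),
}

def normalize(token: str) -> str:
    """Lowercase, strip punctuation, and crudely flatten spelling variation."""
    return token.lower().strip(".,;:").replace("ö", "o").replace("ø", "o")

def link_places(tokens):
    """Return (token, coordinates) pairs for tokens found in the gazetteer."""
    hits = []
    for tok in tokens:
        key = normalize(tok)
        if key in GAZETTEER:
            hits.append((tok, GAZETTEER[key]))
    return hits

charter = "datum Lödöse anno domini".split()
print(link_places(charter))  # the medieval spelling still resolves
```

Such a baseline is useful mainly as a point of comparison: the paper's point is precisely that off-the-shelf NER for modern Swedish must cope with spelling variation far beyond what simple normalization captures.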


11:30am - 11:45am
Distinguished Short Paper (10+5min) [publication ready]

On Modelling a Typology of Geographic Places for the Collaborative Open Data Platform histHub

Manuela Weibel, Tobias Roth

Schweizerisches Idiotikon

HistHub will be a platform for Historical Sciences providing authority records for interlinking and referencing basic entities such as persons, organisations, concepts and geographic places within an ontological framework. For the case of geographic places, a draft of a place typology is presented here. Such a typology will be needed for semantic modelling in an ontology. We propose a hierarchical two-step model of geographic place types: a more generic type remaining stable over time that will ultimately be incorporated into the ontology as the essence of the identity of a place, and a more specific type closer to the nature of the place the way it is actually perceived by humans.

Our second approach to a place typology is decidedly bottom-up. We try to standardise the place types in our database of heterogeneous toponymic data, using the place types already present as well as textual descriptions and name matches with typed external data sources. The types used in this standardisation process are basic conceptual units that are most likely to play a role in any place typology yet to be established. Standardisation at this early stage leads to comprehensive and deep knowledge of our data, which helps us develop a good place typology.


11:45am - 12:00pm
Distinguished Short Paper (10+5min) [publication ready]

Geocoding, Publishing, and Using Historical Places and Old Maps in Linked Data Applications

Esko Ikkala1, Eero Hyvönen1,2, Jouni Tuominen1,2

1Aalto University, Semantic Computing Research Group (SeCo); 2University of Helsinki, HELDIG – Helsinki Centre for Digital Humanities

This paper presents Hipla.fi, a Linked Open Data brokering service prototype for using and maintaining historical place gazetteers and maps based on distributed SPARQL endpoints. The service introduces several novelties. First, the service facilitates collaborative maintenance of geo-ontologies and maps in real time as a side effect of annotating contents in legacy cataloging systems. The idea is to support a collaborative ecosystem of curators that creates and maintains data about historical places and maps in a sustainable way. Second, in order to foster understanding of historical places, the places can be provided on both modern and historical maps, and with additional contextual Linked Data attached. Third, since data about historical places is typically maintained by different authorities and in different countries, the service can be used and extended in a federated fashion, by including new distributed SPARQL endpoints (or other web services with a suitable API) into the system.
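The federated design can be illustrated by assembling a single SPARQL query that asks several distributed endpoints for the same place label. A minimal sketch, assuming hypothetical endpoint URLs and a generic vocabulary (rdfs:label plus the W3C Basic Geo terms), not Hipla.fi's actual data model:

```python
def federated_place_query(label, endpoints):
    """Build a SELECT query with one SERVICE block per remote endpoint,
    combined with UNION so a match in any one endpoint is returned."""
    blocks = " UNION\n".join(
        f'  {{ SERVICE <{url}> {{ ?place rdfs:label "{label}" ; '
        f"geo:lat ?lat ; geo:long ?long . }} }}"
        for url in endpoints
    )
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n"
        "SELECT ?place ?lat ?long WHERE {\n"
        f"{blocks}\n"
        "}"
    )

# Placeholder endpoints standing in for independent national gazetteers
query = federated_place_query("Viipuri", [
    "https://example.org/gazetteer-a/sparql",
    "https://example.org/gazetteer-b/sparql",
])
print(query)
```

Extending such a system to a new data provider then amounts to appending one more endpoint URL, which is the sustainability argument the abstract makes.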


12:00pm - 12:15pm
Short Paper (10+5min) [abstract]

Using ArcGIS Online and Story Maps to visualise spatial history: The case of Vyborg

Antti Härkönen

University of Eastern Finland

Historical GIS (HGIS) or spatially oriented history is a field that uses geoinformatics to look at historical phenomena from a spatial perspective. GIS tools are used to visualize, manage and analyze geographical data. However, the use of GIS tools requires some technical expertise and ready-made historical spatial data is almost non-existent, which significantly reduces the reach of HGIS. New tools should make spatially oriented history more accessible.

Esri’s ArcGIS Online (AGOL) allows publishing internet visualizations of maps and map layers created with Esri’s more traditional desktop GIS program, ArcMap. In addition, the Story Map tool allows the creation of more visually pleasing presentations using maps, text and multimedia resources. I will demonstrate the use of Story Maps to represent spatial change in the case of the city of Vyborg.

The city of Vyborg lies in Russia near the Finnish border. A small town grew near the castle founded by Swedes in 1293. Vyborg was granted town privileges in 1403 and, later in the 15th century, became one of the very few walled towns in the Kingdom of Sweden. The town was located on a hilly peninsula near the castle. Until the 17th century the town space was ‘medieval’, i.e. irregular. The town was regulated to conform to a rectangular street layout in the 1640s. I show the similarities between old and new town plans by superimposing them on a map.

The Swedish period ended when the Russians conquered Vyborg in 1710. Vyborg became a provincial garrison town and administrative center. Later, when Russia conquered the rest of Finland in 1809, the province of Vyborg (aka ‘Old Finland’) was added to the Autonomous Grand Duchy of Finland, a part of the Russian empire. During the 19th century Vyborg became an increasingly important trade and industrial center, and the population grew rapidly. I map the expanding urban areas using old town plans and population statistics.

Another perspective to the changing town space is the growth of fortifications around Vyborg. As the range of artillery grew, the fortifications were pushed further and further outside the original town. I use story maps to show the position of fortifications of different eras by placing them in the context of terrain. I also employ viewshed analyses to show how the fortifications dominate the terrain around them.


12:15pm - 12:30pm
Short Paper (10+5min) [abstract]

Exploring Country Images in School Books: A Comparative Computational Analysis of German School Books in the 20th and the 21st Century

Kimmo Elo1, Virpi Kivioja2

1University of Helsinki; 2University of Turku

This paper is based on an ongoing PhD project entitled “An international triangle drama?”, which studies the depictions of West Germany and East Germany in Finnish, and depictions of Finland in West German and East German geography textbooks in the Cold War era. The primary source material consists of Finnish, West German, and East German geography textbooks that were published between 1946 and 1999.

In contrast to the traditional methods of close reading thus far applied in school book analysis, this paper presents an exploratory approach based on computational analysis of a large book corpus. The corpus consists of geography school books used in the Federal Republic of Germany between 1946 and 1999, and in the German Democratic Republic between 1946 and 1990. The corpus has been created by digitising all books, applying OCR technologies to the scanned page images. It has also been post-processed by correcting OCR errors and adding metadata.

The main aim of the paper is to extract and analyse conceptual geocollocations. Such an analysis focuses, on the one hand, on how concepts are embedded geospatially and, on the other, on how geographical entities (cities, regions, etc.) are conceptually embedded. Regarding the former, the main aim is to examine and explain the geospatial distribution of terms and concepts. Regarding the latter, the main focus is on the analysis of concept collocations surrounding geographical entities.

The analysis presented in the paper consists of four steps. First, standard methods of text mining are used in order to identify geographical concepts (names of different regions, cities etc.). Second, concepts and terms in the close neighborhood of geographical terms are tagged with geocodes. Third, network analysis is applied to create concept networks around geographical entities. And fourth, both the geotagged and network data are enriched by adding bibliographical metadata allowing comparisons over time and between countries.
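The first and third steps can be sketched with a toy collocation builder: find known geographical terms in a token stream and count the concepts appearing within a fixed window of them, yielding a weighted network. The place list and window size below are illustrative assumptions, not the project's actual configuration:

```python
from collections import Counter, defaultdict

# Step 1 stand-in: a list of known geographic terms (toy example)
PLACES = {"finnland", "berlin", "helsinki"}
WINDOW = 2  # tokens considered on each side of a place name

def collocation_network(tokens):
    """Map each place name to a Counter of terms seen within WINDOW of it
    (step 3 stand-in: the Counters are weighted edges of a concept network)."""
    network = defaultdict(Counter)
    lowered = [t.lower() for t in tokens]
    for i, tok in enumerate(lowered):
        if tok in PLACES:
            lo, hi = max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)
            for j in range(lo, hi):
                if j != i and lowered[j] not in PLACES:
                    network[tok][lowered[j]] += 1
    return network

text = "das kalte Finnland liegt im Norden und Finnland ist kalt".split()
net = collocation_network(text)
print(dict(net["finnland"]))
```

In the real workflow the place list would come from text mining and the Counters would feed a network-analysis tool, with geocodes and bibliographic metadata (steps 2 and 4) attached for diachronic and cross-country comparison.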

The paper adopts several methods to visualise analytical results. Geospatial plots are used to visualise the geographical distribution of concepts and its changes over time. Network graphs are used to visualise collocation structures and their dynamics. An important function of the graphs, however, is to exemplify how graphical visualisations can be used to represent historical knowledge and how they can help us tackle change and continuity from a comparative perspective.

Concerning historical research from a more general perspective, one of the main objectives of this paper is to exemplify and discuss how computational methods could be applied to tackle research questions typical of the social sciences and historical research. The paper is motivated by the big challenge of moving away from computational history guided and limited by the tools and methods of the computational sciences, toward an understanding that computational history requires computational tools developed to find answers to questions typical of and crucial for historical research. All tools, data and methods developed during this research project will later be made available to scholars interested in similar topics, thus helping them to take advantage of this project.

 
2:00pm - 3:30pm
T-P674-2: Crowdsourcing and Collaboration
Session Chair: Hannu Salmi
 
2:00pm - 2:30pm
Long Paper (20+10min) [abstract]

From crowdsourcing cultural heritage to citizen science: how the Danish National Archives' 25-year-old transcription project is meeting digital historians

Barbara Revuelta-Eugercios1,2, Nanna Floor Clausen1, Katrine Tovgaard-Olsen1

1Rigsarkivet (Danish National Archives); 2Saxo Institute, University of Copenhagen

The Danish National Archives have the oldest crowdsourcing project in Denmark, with more than 25 million records transcribed that illuminate the lives and deaths of Danes since the early 18th century. Until now, the main groups interested in creating and using these resources have been amateur historians and genealogists. However, it has become clear that the material also holds immense value for historians armed with new digital methods. The rise of citizen science projects likewise shows an alternative way, with clear research purposes, of using crowdsourced cultural heritage material. How can the traditional crowd-centered approach of the existing projects, to the extent that we can talk about co-creation, be reconciled with the narrowly defined research questions and methodological decisions that researchers require? How can the use of these materials by digital historians be increased without losing the projects’ core users?

This article articulates how the Danish National Archives are answering these questions. In the first section, we discuss the tensions and problems of combining crowdsourcing digital heritage and citizen science; in the second, the implications of the crowd-centered nature of the project in the incorporation of research interests; and in the third one, we present the obstacles and solutions put in place to successfully attract digital historians to work on this material.

Crowdsourcing cultural heritage: for the public and for the humanists

In the last decades, GLAMs (galleries, libraries, archives and museums) have embarked on digitization projects to broaden the access, dissemination and appeal of their collections, as well as to enrich them in different ways (tagging, transcribing, etc.), as part of their institutional missions. Many of these efforts have included audience or community participation, which can be loosely defined as either crowdsourcing or activities that predate or conform to the standard definition of crowdsourcing, taking Howe’s (2006) business-related definition as “the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call” (Ridge 2014). However, the key feature that differentiates these crowdsourcing cultural heritage projects is that the work the crowd performs has never been undertaken by employees. Instead, the crowd co-creates new ways for the collections to be made available, disseminated, interpreted, enriched and enjoyed that could never have been paid for within institutional budgets.

These projects often feature “the crowd” at both ends of the process: volunteers contribute to improving access to and availability of the collections, which in turn benefits the general public from which the volunteers are drawn. In the process, access to digital cultural heritage material is democratized and facilitated: records, letters and menus are transcribed, images tagged, new material digitized, etc. As a knock-on effect, the research community can also benefit, as the new materials open up possibilities for researchers in the digital humanities. Generally financially limited humanities projects could never achieve the transcription of millions of records on their own.

At the same time, there has been a strand of academic applications of crowdsourcing in humanities projects (Dunn and Hedges 2014). These initiatives fall within the so-called citizen science projects, which are driven by researchers and narrowly defined to answer a research question, so the tasks performed by the volunteers are lined up with a research purpose. Citizen science, or public participation in scientific research, which emerged out of natural science projects in the mid-1990s (Bonney et al 2009), has branched out to meet the humanities, building on a similar utilization of the crowd, i.e. institutional digitization projects of cultural heritage material. In particular, archival material has been a rich source for such endeavours: weather observations from ship logs in Old Weather (Blaser 2014), Bentham’s works in Transcribe Bentham (Causer & Terras 2014) or restaurant menus in What’s on the Menu (2014). While some of these have been carried out in cooperation with the GLAMs responsible for the collections, the new opportunities opened up for the digital humanities allow such projects to be carried out by researchers independently of the institutions that host the collections, missing a great opportunity to combine interests and avoid duplicating work.

Successfully bringing a given project to contribute both to crowdsourcing cultural heritage material and to citizen science faces many challenges. First, a collaboration needs to be established across at least two institutional settings, a GLAM and a research institution, that have very different institutional aims, funding, cultures and legal frameworks. GLAMs’ foundational missions often relate to serving the general public first, the research community being only a tiny percentage of their users. Any institutional research they undertake on the collections is restricted to particular areas or aspects of the collections and institutional interest, which, on the other hand, is less dependent on external funding. Academia, by contrast, has a freer approach to formulating research questions but is often staffed with short-term positions and projects, subject to time constraints, the need for immediate publication, and the ever-present demand to prove originality and innovation.

Additionally, when moving from cultural heritage dissemination to research applications, a wide set of issues also comes into view in these crowdsourcing efforts that can determine their development and success: the boundaries between professional and lay expertise, the balance of power in the collaboration between the public, institutions and researchers, ethical concerns in relation to data quality and data ownership, etc. (Riesch 2014, Shirk et al 2012).

The Danish National Archives' crowd-centered, 25-year-old crowdsourcing approach

In this context, the Danish National Archives are dealing with the challenge of how to incorporate a more citizen-science-oriented approach and attract historians (and digital humanists) to work with the existing digitized sources while maintaining their commitment to the volunteers. This challenge is particularly difficult in this case because not only do the interests of the archives and researchers need to align, but also those of the “crowd” itself, as volunteers have played a major role in co-creating crowdsourcing for 25 years.

The original project, now the Danish Demographic Database, DDD (www.ddd.dda.dk), is the oldest “crowdsourcing project” in the country. It started in 1992 thanks to the interest of the genealogical communities in coordinating the transcription of historical censuses and church books (Clausen & Jørgensen 2000). From its beginning, the volunteers were actively involved in the decision-making about what was to be done and how, while the Danish National Archives (Rigsarkivet) were in charge of coordination, management and dissemination. Thus, there has been a dual governance of the project and a continuous negotiation of priorities in the form of a coordination committee, which combines members of the public and genealogical societies as well as Rigsarkivet personnel.

This tradition of co-creation has shaped the current state of the project and its relationship to research. The subsequent Crowdsourcing portal, CS (https://cs.sa.dk/), which started in 2014 with an online interface, broadened the sources under transcription and the engagement with volunteers (in photographing, counselling, etc.), and maintains a strong philosophy of serving the volunteers’ wishes and interests, rather than imposing particular lines. Crowdsourcing is seen as more than a framework for creating content: it is also a form of engagement with the collections that benefits both audiences and the archive. However, the portal has also introduced some citizen-science projects, in which the transcriptions are intended to be used for research (e.g. the Criminality History project).

Digital history from the crowdsourced material: present and future

In spite of the largely crowd-oriented nature of this crowdsourcing project, there were also broad research interests (if not a clearly defined research project) behind the birth of DDD, so the decisions taken in its setup ensured that the data was suitable for research. Dozens of projects and publications have made use of it, applying new digital history methods, and the data has been included in international efforts such as the North Atlantic Population Project (NAPP.org).

However, while widely known in genealogist and amateur-historian circles, the Danish National Archives' large crowdsourcing projects remain either unknown to, or little used by, historians and students in the country. Some of the reasons are related to field-specific developments, but one of the key constraints on wider use is, undoubtedly, the lack of adequate training. There is no systematic training in dealing with historical data or digital methods in history degrees, even as the digital humanities are clearly on the rise.

In this context, the Danish National Archives are trying to put their material into the hands of more digital historians, building bridges to the Danish universities by multiple means: collaboration with universities in seeking joint research projects and applications (the SHIP and Link Lives projects); active dissemination of the material for educational purposes across disciplines (the Supercomputer Challenge at the University of Southern Denmark); addressing students’ and researchers’ lack of training and familiarity with the material through targeted workshops and courses, including training in digital history methods (Rigsarkivet's Digital History Labs); and promotion of an open dialogue with researchers to identify more sources that could combine the aims of access democratization and citizen science.

References

Blaser, L., 2014. “Old Weather: approaching collections from a different angle”, in Ridge (ed) Crowdsourcing our Cultural Heritage, Ashgate, 45-56.

Bonney, R. et al., 2009. Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education. Center for Advancement of Informal Science Education (CAISE), Washington, DC.

Causer, T. and Terras, M., 2014. “‘Many hands make light work. Many hands together make merry work’: Transcribe Bentham and crowdsourcing manuscript collections”, in Ridge (ed) Crowdsourcing our Cultural Heritage, Ashgate, 57-88.

Clausen, N.C. and Marker, H.J., 2000. “The Danish Data Archive”, in Hall, McCall, Thorvaldsen (eds) International Historical Microdata for Population Research, Minnesota Population Center, Minneapolis, Minnesota, 79-92.

Dunn, S. and Hedges, M., 2014. “How the crowd can surprise us: Humanities crowdsourcing and the creation of knowledge”, in Ridge (ed) Crowdsourcing our Cultural Heritage, Ashgate, 231-246.

Howe, J., 2006. “The rise of crowdsourcing”, Wired, June.

Ridge, M., 2014. “Crowdsourcing our cultural heritage: Introduction”, in Ridge (ed) Crowdsourcing our Cultural Heritage, Ashgate, 1-16.

Riesch, H. and Potter, C., 2014. “Citizen science as seen by scientists: methodological, epistemological and ethical dimensions”, Public Understanding of Science 23 (1), 107-120.

Shirk, J.L. et al., 2012. “Public participation in scientific research: a framework for deliberate design”, Ecology and Society 17 (2).


2:30pm - 2:45pm
Short Paper (10+5min) [abstract]

CAWI for DH

Jānis Daugavietis, Rita Treija

Institute of Literature, Folklore and Art - University of Latvia

The survey method, which uses a questionnaire to acquire different kinds of information from a population, is an old and classic way to collect data. Examples of such surveys, like censuses or standardised agricultural data recordings, can be traced back to ancient civilizations. The main instrument of this method is the question (closed-ended or open-ended), which should be asked in exactly the same way of all the representatives of the surveyed population. Over the last 20-25 years the internet survey method (also called web, electronic, online, or CAWI [computer-assisted web interview]) has been well developed and is more and more frequently employed in the social sciences and in marketing research, among others. Usually CAWI is designed for acquiring quantitative data, but as in the other most-used survey modes (face-to-face paper-assisted, telephone or mail interviews) it can be used to collect qualitative data, such as un- or semi-structured text/speech, pictures, sounds, etc.

In recent years DH (digital humanities) has started to use CAWI-like methodology more often. At the same time, the knowledge of humanities scholars in this field is somewhat limited (because of a lack of previous experience and, in many cases, education: humanities curricula usually do not include quantitative methods). The paper seeks to analyze the specificity of CAWI designed for the needs of DH, when the goal of interaction with respondents is to acquire primary data (e.g. questioning/interviewing them on a certain topic in order to create a new data set/collection).

Questionnaires as an approach for collecting data on traditional culture date back to an early stage of the disciplinary history of Latvian folkloristics, namely, to the end of the 19th century and the beginning of the 20th century (published by Dāvis Ozoliņš, Eduard Wolter, Pēteris Šmits, Pēteris Birkerts). The Archives of Latvian Folklore was established in 1924. Its founder and first Head, folklorist and schoolteacher Anna Bērzkalne, regularly addressed questionnaires (jautājumu lapas) on various topics of Latvian folklore to the Archives’ collaborators. She both created original sets of questions herself and translated into Latvian and adapted those by Estonian and Finnish folklore scholars (instructions for collecting children’s songs by Walter Anderson; questionnaires on folk beliefs by O. A. F. Mustonen, alias Oskar Anders Ferdinand Lönnbohm, and Viljo Johannes Mansikka). The localised equivalents were published in the press and distributed to Latvian collectors. Printed questionnaires, such as “House and Household”, “Fishing and Fish”, “Relations between Relatives and Neighbors” and others, presented sets of questions formulated in a suggestive way so that anyone with some interest could easily engage in the work. The hand-written responses were sent to the Archives of Latvian Folklore from all regions of the country; the collection of folk beliefs in the late 1920s greatly supplemented the range of materials at the Archives.

However, the life of the survey as a method of collecting folklore in Latvia did not last long. Soon after World War II it was supplanted by the dominance of collective fieldwork and, at the end of the 20th century, by individual field research, implying mainly face-to-face qualitative interviews with informants.

Only in 2017, the Archives of Latvian Folklore revitalized the approach of remote data collecting via the online questionnaires. Within the project “Empowering knowledge society: interdisciplinary perspectives on public involvement in the production of digital cultural heritage” (funded by the European Regional Development Fund), a virtual inquiry module has been developed. The working group of virtual ethnography launched a series of online surveys aimed to study the calendric practices of individuals in the 21st century. Along with working out the iterative inquiry, data accumulation and analysis tools, the researchers have tried to find solutions to the technical and ethical challenges of our day.

Mathematics, sociology and other sciences have developed a coherent theoretical methodology and have accumulated experience-based knowledge of online survey tools. This raises several questions, such as:

- How much of this knowledge is known by DH?

- How useful is this knowledge for DH? How different is DH CAWI?

- What would be the most important aspects for DH CAWI?

To answer these questions, we will make a schematic comparison of the ‘traditional’ or most common CAWI of the social sciences with that of DH, drawing on our previous work in the fields and institutions of sociology, statistics and the humanities.


2:45pm - 3:00pm
Short Paper (10+5min) [abstract]

Wikidocumentaries

Susanna Ånäs

Aalto University

Background

Wikidocumentaries is a concept for a collaborative online space for gathering, researching and remediating cultural heritage items from memory institutions, open platforms and the participants. The setup brings together communities of interest and of expertise to work together on shared topics with online tools. For the memory organization, Wikidocumentaries offers a platform for crowdsourcing; for amateur and expert researchers it provides peers and audiences; and from the point of view of the open environments, it acts as a site of curation.

Current environments fall short in serving this purpose. Content aggregators focus on gathering, harmonizing and serving the content. Commercial services fail to take into account the open and connected environment in the search for profit. Research environments do not prioritize public access and broad participation. Many participatory projects live short lives from enthusiastic engagement to oblivion due to lack of planning for the sustainability of the results. Wikidocumentaries tries to battle these challenges.

This short paper is a first attempt at creating an inventory of the research topics that this environment surfaces.

The topics

Technologically the main focus of the project is investigating the use of linked open data, and especially proposing the use of Wikidata for establishing meaningful connections across collections and sustainability of the collected data.

Co-creation is an important topic in many senses. What are the design issues of the environment to encourage collaborative creative work? How can the collaboration reach out from the online environment into communities of interest in everyday life? What are the characteristics of the collaborative creations or what kind of creative entrepreneurship can such open environment promote? How to foster and expand a community of technical contributors for the open environments?

The legislative environment sets the boundaries for working. How will privacy and openness be balanced? Which copyright licensing schemes can encourage widest participation? Can novel technologies of personal information management be applied to allow wider participation?

The paper will draw together recent observations from a selection of disciplines for practices in creating participatory knowledge environments.


3:00pm - 3:15pm
Short Paper (10+5min) [abstract]

Heritage Here, K-Lab and intra-agency collaboration in Norway

Vemund Olstad, Anders Olsson

Directorate for Cultural Heritage


Introduction

This paper aims to give an overview of an ongoing collaboration between four Norwegian government agencies, by outlining its history, its goals and achievements and its current status. In doing so, we will, hopefully, be able to arrive at some conclusions about the usefulness of the collaboration itself – and whether or not anything we have learned during the collaboration can be used as a model for – or an inspiration to – other projects within the cultural heritage sector or the broader humanities environment.

First phase – “Heritage Here” 2012 – 2015

Heritage Here (or “Kultur- og naturreise” as it is known in its native Norwegian) was a national project which ran between 2012 and 2015 (http://knreise.org/index.php/english/). The project had two main objectives:

1. To help increase access to and use of public information and local knowledge about culture and nature

2. To promote the use of better quality open data.

The aim is that anyone with a smartphone can gain instant access to relevant facts and stories about their local area, wherever they might be in the country.

The project was the result of cross-agency cooperation between five agencies from three different ministries. Project partners included:

• the Norwegian Mapping Authority (Ministry of Local Government and Modernization).

• the Arts Council Norway and the National Archives (Ministry of Culture).

• the Directorate of Cultural Heritage and (until December 2014) the Norwegian Environment Agency (the Ministry of Climate and Environment).

Together, these partners made their own data digitally accessible, to be enriched, geo-tagged and disseminated in new ways. Content included information about animal and plant life, cultural heritage and historical events, and varied from factual data to personal stories. The content was collected into Norway’s national digital infrastructure ‘Norvegiana’ (http://www.norvegiana.no/), and from there it can be used and developed by others through open and documented APIs to create new services for business, tourism, or education. Parts of this content were also exported to the European aggregation service Europeana.eu (http://www.europeana.eu).

In 2012 and 2013 the main focus of the project was to facilitate further development of technical infrastructures - to help extract data from partner databases and other databases for mobile dissemination. However, the project also worked with local partners in three pilot areas:

• Bø and Sauherad rural municipalities in Telemark county

• The area surrounding Akerselva in Oslo

• The mountainous area of Dovre in Oppland county.

These pilots were crucial to the project, both as an arena to test the content from the various national datasets and as a testing ground for user community participation on a local and regional level. They were also an opportunity to see Heritage Here’s work in a larger context. The Telemark pilot was, for example, used to test the cloud-based mapping tools developed in the Best Practice Network “LoCloud” (http://www.locloud.eu/), which was coordinated by the National Archives of Norway.

In addition to the previously mentioned activities, Heritage Here worked towards being a competence builder – organizing over 20 workshops on digital storytelling and geo-tagging of data, and numerous open seminars with topics ranging from open data and LOD to IPR and copyright-related issues. The project also organized Norway’s first heritage hackathon “#hack4no” in early 2014 (http://knreise.org/index.php/2014/02/27/hack4no-a-heritage-here-hackathon/). This first hackathon has since become an annual event – organized by one of the participating agencies (the Mapping Authority) – and a great success story, with 50+ participants coming together to create new and innovative services using open public data.

Drawing on the experience it had gathered, the project focused its final year on developing various web-based prototypes that use a map as the user’s starting point. These demonstrate a number of approaches for visualizing and accessing different types of cultural heritage information from various open data sets – such as content related to a particular area, route or subject. These prototypes are free and openly accessible as web tools for anyone to use (http://knreise.no/demonstratorer/). The code for the prototypes has been made openly available so it can be used by others – either as it is, or as a starting point for something new.

Second phase – “K-Lab” 2016 onwards

At the end of 2015 Heritage Here ended as a project, but the four remaining project partners decided to continue their digital cross-agency cooperation. So, in January 2016 a new joint initiative with the same core governmental partners was set up: Heritage Here went from being a project to being a formalized collaboration between four government agencies. This new partnership focuses on a number of key issues seen as crucial for the further development of the results of the Heritage Here project. Among these are:

• In cooperation, develop, document and maintain robust, common and sustainable APIs for the partnership’s data and content.

• Address and discuss the need for, and potential use of, different aggregation services for this field.

• Develop and maintain plans and services for a free and open flow of open and reusable data between and from the four partner organizations.

• In cooperation with other governmental bodies, organize another heritage hackathon in October 2016 with an explicit focus on open data, sharing, reuse and new services for both the public and the cultural heritage management sector.

• As a partnership, develop skillsets, networks, arenas and competence for the employees of the four partner organizations (and beyond) within this field of expertise.

• Continue developing and strengthening partnerships on a local, national and international level through the use of open workshops, training, conferences and seminars.

• Continue to work towards improving data quality and promoting the use of open data.

One key challenge at the end of the Heritage Here project was making the transition from a project group to a more permanent organizational entity – without losing key competence and experience. This was resolved by having each agency employ one person from the project and assign this person to the K-Lab collaboration in a 50% position; the remaining time was to be spent on other tasks for the agency. This helped ensure the following:

• Continuity. The same project group could continue working, albeit organized in a slightly different manner.

• Transfer of knowledge. Competence built during Heritage Here was transferred into the line organizations of the agencies involved.

• Information exchange. By having one employee from each agency meet on a regular basis, information, ideas for common projects and solutions to common problems could easily be exchanged between the collaboration partners.

In addition to the allocation of human resources, each agency chipped in roughly EUR 20,000 as ‘free funds’. The main reasoning behind this approach was to allow the new entity a certain operational freedom and room for creativity – while at the same time tying it closer to the day-to-day running of the agencies.

Based on an evaluation of the results achieved in Heritage Here, the start of 2016 was spent planning the direction forward for K-Lab, and a plan was formulated – outlining activities covering several thematic areas:

Improving data quality and accessibility. Making data available to the public was one of the primary goals of the Heritage Here project, and one of its most important outcomes was the realisation that, in all the agencies involved, there is huge room for improvement in the quality of the data we make available and in how we make it accessible. One of K-Lab’s tasks will be to cooperate on making quality data available through well-documented APIs and on ensuring that as much data as possible has open licenses that allow unlimited re-use.

Piloting services. The work done in the last year of Heritage Here with the map service mentioned above demonstrated to all parties involved the importance of actually building services that make use of our own open data. K-Lab will, as part of its scope, function as a ‘sandbox’ both for coming up with new ideas for services and – to the extent that budget and resources allow – for trying out new technologies and services. One such pilot service is the work done by K-Lab, in collaboration with the Estonian photographic heritage society, in setting up a crowdsourcing platform for improving metadata on historic photos (https://fotodugnad.ra.no/).

For 2018, K-Lab will start looking into building a service that makes use of linked open data from our organizations. All of our agencies are data owners responsible for authority data in some form or another – ranging from geographical names to cultural heritage data and person data. Some work has already been done to bring our technical departments closer together in this field, and we plan to do ‘something’ on a practical level next year.

Building competence. In order to facilitate the exchange of knowledge between the collaboration partners, K-Lab will arrange seminars, workshops and conferences as arenas for discussing common challenges, learning from each other and building networks. This is done primarily to strengthen the relationship between the agencies involved – but many activities will have a broader scope. One such example is the intention to arrange workshops – roughly every two months – on topics that are relevant to our agencies but open to anyone interested. To give a rough overview of the range of topics, the following workshops were arranged in 2017:

• A practical introduction to Cidoc-CRM (May)

• Workshop on Europeana 1914-1918 challenge – co-host: Wikimedia Norway (June)

• An introduction to KulturNAV – co-host: Vestfoldmuseene (September)

• Getting ready for #hack4no (October)

• Transkribus – Text recognition and transcription of handwritten text - co-host: The Munch museum (November)

Third phase – 2018 and beyond

K-Lab is very much a work in progress, and the direction it takes in the future depends on many factors. However, a joint workshop was held in September 2017 to evaluate the work done so far and to map out a direction for the future. Employees from all levels of the organisations were present, along with invited guests from other institutions in the cultural sector – such as the National Library and Digisam from Sweden – to evaluate, discuss and suggest ideas.

No definite conclusions were drawn, but there was overall agreement that the focus on the three areas described above is of great importance, and that the work done so far by the agencies together has been, for the most part, successful. Setting up arenas for discussing common problems, sharing success stories and interacting with colleagues across agency boundaries has been a key element in the relative success of K-Lab so far. This work will continue into 2018 with a focus on thematic groups on linked open data and photo archives, and a new series of workshops is being planned. The experimentation with technology will continue, and hopefully new ideas will be brought forward and realised over the course of the next year(s).


3:15pm - 3:30pm
Short Paper (10+5min) [abstract]

Semantic Annotation of Cultural Heritage Content

Uldis Bojārs1,2, Anita Rašmane1

1National Library of Latvia; 2Faculty of Computing, University of Latvia

This talk focuses on the semantic annotation of textual content and on annotation requirements that emerge from the needs of cultural heritage annotation projects. The information presented here is based on two text annotation case studies at the National Library of Latvia and was generalised to be applicable to a wider range of annotation projects.

The two case studies examined in this work are (1) correspondence (letters) from the late 19th century between two of the most famous Latvian poets, Aspazija and Rainis, and (2) a corpus of parliamentary transcripts that document the first four parliament terms in Latvian history (1922-1934).

The first half of the talk focuses on the annotation requirements collected and on how they may be implemented in practical applications. We propose a model for representing annotation data and implementing annotation systems. The model includes support for three core types of annotations: simple annotations that may link to named entities, structural annotations that mark up portions of the document that have a special meaning within the context of the document, and composite annotations for more complex use cases. The model also introduces a separate Entity database for maintaining information about the entities referenced from annotations.
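By way of illustration, the three annotation types and the separate Entity database could be sketched as plain data structures. This is a minimal sketch only; all class and field names here are our assumptions, not the actual schema used at the National Library of Latvia.

```python
# Illustrative sketch of the annotation model: three annotation types
# plus a separate Entity database record. Names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Entity:
    """A record in the separate Entity database; may point to LOD resources."""
    entity_id: str
    label: str
    lod_uri: Optional[str] = None  # e.g. a Linked Open Data URI, if known

@dataclass
class SimpleAnnotation:
    """Marks a text span and may link it to a named entity."""
    start: int
    end: int
    entity_id: Optional[str] = None

@dataclass
class StructuralAnnotation:
    """Marks a portion of the document with special meaning in context."""
    start: int
    end: int
    role: str  # e.g. "salutation", "agenda item"

@dataclass
class CompositeAnnotation:
    """Groups other annotations for more complex use cases."""
    parts: List[object] = field(default_factory=list)

# Usage: link a span in a letter to an entity record
rainis = Entity("e1", "Rainis")
ann = SimpleAnnotation(start=0, end=6, entity_id=rainis.entity_id)
```

Keeping entity records outside the annotations themselves is what lets several annotations reference the same entity and lets the Entity database be enriched or published as Linked Data independently of the annotated documents.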

In the second half of the talk we will present a web-based semantic annotation tool that was developed based on this annotation model and requirements. It allows users to import textual documents (various document formats such as HTML and .docx are supported), create annotations and reference the named entities mentioned in these documents. Information about the entities referenced from annotations is maintained in a dedicated Entity database that supports links between entities and can point to additional information about them, including Linked Open Data resources. Information about these entities is published as Linked Data. Annotated documents may be exported (along with annotation and entity information) in a number of representations, including a standalone web view.

 
4:00pm - 5:30pmT-P674-3: Database Design
Session Chair: Jouni Tuominen
P674 
 
4:00pm - 4:30pm
Long Paper (20+10min) [publication ready]

Open Science for English Historical Corpus Linguistics: Introducing the Language Change Database

Joonas Kesäniemi1, Turo Vartiainen2, Tanja Säily2, Terttu Nevalainen2

1University of Helsinki, Helsinki University Library; 2University of Helsinki, Department of Modern Languages

This paper discusses the development of an open-access resource that can be used as a baseline for new corpus-linguistic research into the history of English: the Language Change Database (LCD). The LCD draws together information extracted from hundreds of corpus-based articles that investigate the ways in which English has changed in the course of history. The database includes annotated summaries of the articles, as well as numerical data extracted from the articles and transformed into machine-readable form, thus providing scholars of English with the opportunity to study fundamental questions about the nature, rate and direction of language change. It will also make the work done in the field more cumulative by ensuring that the research community will have continuous access to existing results and research data.

We will also introduce a tool that takes advantage of this new source of structured research data. The LCD Aggregated Data Analysis workbench (LADA) makes use of annotated versions of the numerical data available from the LCD and provides a workflow for performing meta-analytical experimentations with an aggregated set of data tables from multiple publications. Combined with the LCD as the source of collaborative, trusted and curated linked research data, the LADA meta-analysis tool demonstrates how open data can be used in innovative ways to support new research through data-driven aggregation of empirical findings in the context of historical linguistics.


4:30pm - 4:45pm
Short Paper (10+5min) [abstract]

“Database Thinking and Deep Description: Designing a Digital Archive of the National Synchrotron Light Source (NSLS)”

Elyse Graham

Stony Brook University

Our project involves developing a new kind of digital resource to capture the history of research at scientific facilities in the era of the “New Big Science.” The phrase “New Big Science” refers to the post-Cold War era at US national laboratories, when large-scale materials science accelerators rather than high-energy physics accelerators became marquee projects at most major basic research laboratories. The extent, scope, and diversity of research at such facilities make its history difficult to compile using traditional historical methods and linear narratives; there are too many overlapping and bifurcating threads. The sheer number of experiments that took place at the NSLS, and the vast amount of data it produced across many disciplines, make it nearly impossible to gain a comprehensive global view of the knowledge production that took place at this facility.

We are therefore collaborating to develop a new kind of digital resource to capture the full history of this research. This project will construct a digital archive, along with an associated website, to obtain a comprehensive history of the National Synchrotron Light Source at Brookhaven National Laboratory. The project specifically will address the history of “the New Big Science” from the perspectives of data visualization and the digital humanities, in order to demonstrate that new kinds of digital tools can archive and present complex patterns of research and configurations of scientific infrastructure. In this talk, we briefly discuss methods of data collection, curation, and visualization for a specific case project, the NSLS Digital Archive.


4:45pm - 5:00pm
Distinguished Short Paper (10+5min) [publication ready]

Integrating Prisoners of War Dataset into the WarSampo Linked Data Infrastructure

Mikko Koho1, Erkki Heino1, Esko Ikkala1, Eero Hyvönen1,2, Reijo Nikkilä3, Tiia Moilanen3, Katri Miettinen3, Pertti Suominen3

1Semantic Computing Research Group (SeCo), Aalto University, Finland; 2HELDIG - Helsinki Centre for Digital Humanities, University of Helsinki, Finland; 3The National Prisoners of War Project

One of the great promises of Linked Data and the Semantic Web standards is to provide a shared data infrastructure into which more and more data can be imported and aligned, forming a sustainable, ever-growing knowledge graph or linked data cloud, the Web of Data. This paper studies and evaluates this idea in the context of the WarSampo Linked Data cloud, which provides an infrastructure for data related to the Second World War in Finland. As a case study, a new database of prisoners of war with related contents is considered, and lessons learned are discussed in relation to traditional data publishing approaches.


5:00pm - 5:15pm
Short Paper (10+5min) [abstract]

"Everlasting Runes": A Research Platform and Linked Data Service for Runic Research

Magnus Källström1, Marco Bianchi2, Marcus Smith1

1Swedish National Heritage Board; 2Uppsala University

"Everlasting Runes" (Swedish: "Evighetsrunor") is a three-year collaboration between the Swedish National Heritage Board and Uppsala University, with funding provided by the Bank of Sweden Tercentenary Foundation (Riksbankens jubileumsfond) and the Royal Swedish Academy of Letters (Kungliga Vitterhetsakademien). The project combines philology, archaeology, linguistics, and information systems, and comprises several research, digitisation, and digital development components. Chief among these is the development of a web-based research platform for runic researchers, built on linked open data services, with the aim of drawing together disparate structured digital runic resources into a single convenient interface. As part of the platform's development, the corpus of Scandinavian runic inscriptions in Uppsala University's Runic Text Database will be restructured and marked up for use on the web, and linked against their entries in the previously digitised standard corpus work (Sveriges runinskrifter). In addition, photographic archives of runic inscriptions from the 19th and 20th centuries from both the Swedish National Heritage Board archives and Uppsala University library will be digitised, alongside other hitherto inaccessible archive material.

As a collaboration between a university and a state heritage agency with a small research community as its primary target audience, the project must bridge the gap between the different needs and abilities of these stakeholders, as well as resolve issues of long-term maintenance and stability which have previously proved problematic for some of the source datasets in question. It is hoped that the resulting research and data platforms will combine the strengths of both the National Heritage Board and Uppsala University to produce a rich, actively-maintained scholarly resource.

This paper will present the background and aims of the project within the context of runic research, as well as the various datasets that will be linked together in the research platform (via its corresponding linked data service) with particular focus on the data structures in question, the philological markup of the corpus of inscriptions, and requirements gathering.


5:15pm - 5:30pm
Distinguished Short Paper (10+5min) [abstract]

Designing a Generic Platform for Digital Edition Publishing

Niklas Liljestrand

Svenska litteratursällskapet i Finland r.f.

This presentation describes the technical design for streamlining work with publishing Digital Editions on the web. The goal of the project is to provide a platform for scholars working with Digital Editions to independently create, edit, and publish their work. The platform is to be generic, but with set rules of conduct and processes, providing rich documentation of use.

The work on the platform started during 2016, with a rebuild of the website for Zacharias Topelius Skrifter for the mobile web (presented during DHN 2017, http://dhn2017.eu/abstracts/#_Toc475332550). The work continues with making the responsive site easily customizable to suit the needs of the different editions.

The platform will consist of several independent tools, such as tools for publishing, version comparison, editing, and tagging XML TEI formatted documents. Many of the tools are already available today, but they are heavily dependent on customization for each new edition and run on MS Windows only. For these existing tools, the project aims to combine and simplify them and make them platform independent.

The project will be completed during 2018, and the aim is to publish all tools and documentation as open source.

 

 
Date: Friday, 09/Mar/2018
11:00am - 12:00pmF-P674-1: Teaching and Learning the Digital
Session Chair: Maija Paavolainen
P674 
 
11:00am - 11:15am
Short Paper (10+5min) [publication ready]

Creative Coding at the arts and crafts school Robotti (Käsityökoulu Robotti)

Tomi Dufva

Aalto University, School of Arts, Design and Architecture

The increasing use of digital technologies presents a new set of challenges that, in addition to key economic and societal viewpoints, also reflects similar use in both education and culture. On the other hand, instead of a challenge, digitalization of our environment can also be seen as new material and a new medium for art and art education. This article suggests that both a better understanding of digital structures, and the ability for greater self-expression through digital technology is possible using creative coding as a teaching method.

This article focuses on Käsityökoulu Robotti (www.kasityokoulurobotti.fi), a type of hacker space for children that offers teaching about art and technology. Käsityökoulu Robotti is situated within the contexts of art education, the maker movement, critical technology education, and media art. Art education is essential to Käsityökoulu Robotti in a bilateral sense, i.e., to discover in what ways art can be used to create a clearer understanding of technology and at the same time teach children how to use new technological tools as a way to greater self-expression. These questions are indeed intertwined, as digital technology, like code, can be a substantial way to express oneself in ways that otherwise could not be expressed. Further, using artistic approaches, such as creative coding, can generate more tangible knowledge of digital technology. A deeper understanding of digital technology is also critical when dealing with the ever-increasing digitalization of our society, as it helps society to understand the digital structures that underlie our continually expanding digital world.

This article examines how creative coding works as a teaching method in Käsityökoulu Robotti to promote both artistic expression and a critical understanding of technology. Further, creative coding is a tool for bridging the gap between the maker movement, critical thinking and art practices, bringing each into sharper focus. This discussion is the outcome of an ethnographic research project at Käsityökoulu Robotti.


11:15am - 11:30am
Distinguished Short Paper (10+5min) [abstract]

A long way? Introducing digitized historic newspapers in school, a case study from Finland

Inés Matres

University of Helsinki

During 2016/17 two Finnish newspapers, from their first issue to their last, were made available to schools in eastern Finland through the digital collections of the National Library of Finland (http://digi.kansalliskirjasto.fi). This paper presents the case study of one upper-secondary class making use of these materials. Before having access to these newspapers, the teachers in the school in question had little awareness of what this digital library contained. The initial research questions of this paper are whether digitised historic newspapers can be used by school communities, and what practices they enable. Subsequently, the paper explores how these practices relate to teachers’ habits and to the wider concept of literacy, that is, the knowledge and skills students can acquire using these materials. To examine the significance of historic newspapers in the context of their use today, I rely on the concept of ‘practice’ defined by cultural theorist Andreas Reckwitz as the “use of things that ‘mould’ activities, understandings and knowledge”.

To correctly assess practice, I approached this research through ethnographic methods, constructing the inquiry with participants in the research: teachers, students and the people involved in facilitating the materials. During 2016, I conducted eight in-depth interviews with teachers about their habits, organized a focus group with further 15 teachers to brainstorm activities using historic newspapers, and collaborated closely with one language and literature teacher, who implemented the materials in her class right away. Observing her students work and hearing their presentations, motivations, and opinions about the materials showed how students explored the historical background of their existing personal, school-related and even professional interests. In addition to the students’ projects, I also collected their newspaper clippings and logs of their searches in the digital library. These digital research assets revealed how the digital library that contains the historic newspapers influenced the students’ freedom to choose a topic to investigate and their capacity to ‘go deep’ in their research.

The findings of this case study build upon, and extend, previous research about how digitized historical sources contribute in upper-secondary education. The way students used historical newspapers revealed similarities with activities involving contemporary newspapers, as described by the teachers who participated in this study. Additionally, both the historicity and the form of presentation of newspapers in a digital library confer unique attributes upon these materials: they allow students to explore the historical background of their research interests, discover change across time, verbalize their research ideas in a concrete manner, and train their skills in distant and close reading to manage large amounts of digital content. In addition to these positive attributes that connect with learning goals set by teachers, students also tested the limits of these materials. The lack of metadata in articles or images, the absence of colour in materials that originally have it, or the need for students to be mindful of how language has changed since the publication of the newspapers are constraints that distinguish digital libraries from resources, such as web browsers and news sites, that are more familiar to students. Being aware of these positive and negative affordances, common to digital libraries containing historic newspapers and other historical sources, can support teachers in providing their students with effective guidelines when using these kinds of materials.

This use case demonstrates that digitized historical sources in education can do more than simply enabling students to “follow the steps of contemporary historians”, as research has previously established. These materials could also occupy a place between history and media education. Media education in school – regardless of the technological underpinnings of a single medium, which change rapidly in this digital age – aims at enabling students to reflect on the processes of media consumption and production. The contribution of digitized historical newspapers to this subject is acquainting students with processes of media preservation and heritage. However, it could still be a long way until teachers adopt these aspects in their plans. It is necessary to acknowledge the trajectory and agents involved, since the 1960s, in the work of introducing newspapers in education. This task not only consisted of facilitating access to newspapers, but also of developing teaching plans and advocating for a common understanding and presence of media education in schools.

In addition to uncovering an aspect of digital cultural heritage that is relevant for the school community today, another aim of this paper is to raise awareness among the cultural heritage community, especially national libraries, about the diversity in the uses and users of their collections, especially in a time when the large-scale digitization of special collections is generalizing access to materials traditionally considered for academic research.

Selected bibliography:

Buckingham, D. (2003). Media education: literacy, learning, and contemporary culture. Polity Press.

Gooding, P. (2016). Historic Newspapers in the Digital Age: ‘Search All About It!’ Routledge.

Lévesque, S. (2006). Discovering the Past: Engaging Canadian Students in Digital History. Canadian Social Studies, 40(1).

Martens, H. (2010). Evaluating Media Literacy Education: Concepts, Theories and Future Directions. Journal of Media Literacy Education, 2(1).

Nygren, T. (2015). Students Writing History Using Traditional and Digital Archives. Human IT, 12(3), 78–116.

Reckwitz, A. (2002). Toward a Theory of Social Practices: A Development in Culturalist Theorizing. European Journal of Social Theory, 5(2), 243–263.


11:30am - 11:45am
Short Paper (10+5min) [abstract]

“See me! Not my gender, race, or social class”: Combating Stereotyping and prejudice mixing digitally manipulated experience with classroom debriefing.

Anders Steinvall1, Mats Deutschmann2, Mattias Lindvall-Östling2, Jon Svensson3, Roger Mähler3

1Department of Language Studies, Umeå University, Sweden; 2School of Humanities, Education and Social Sciences, Örebro University, Sweden; 3Humlab, Umeå University, Sweden

INTRODUCTION

Not only does stereotyping, based on various social categories such as age, social class, ethnicity, sexuality, regional affiliation, and gender, serve to simplify how we perceive and process information about individuals (Talbot et al. 2003: 468), it also builds up expectations about how we act. If we recognise social identity as an ongoing construct, and something that is renegotiated during every meeting between humans (Crawford 1995), it is reasonable to speculate that stereotypic expectations will affect the choices we make when interacting with another individual. Thus, stereotyping may form the basis for the negotiation of social identity on the micro level. For example, research has shown that white American respondents react with hostile facial expressions or tone of voice when confronted with African American faces, which is likely to elicit the same behaviour in response, but, as Bargh et al. point out (1996: 242), “because one is not aware of one's own role in provoking it, one may attribute it to the stereotyped group member (and, hence, the group)”. Language is a key element in this process. An awareness of such phenomena, and of how we may unknowingly be affected by them, is, we would argue, essential for all professions where human interaction is in focus (psychologists, teachers, social workers, health workers etc.).

RAVE (Raising Awareness through Virtual Experiencing), funded by the Swedish Research Council, aims to explore and develop innovative pedagogical methods for raising subjects’ awareness of their own linguistic stereotyping, biases and prejudices, and to systematically explore ways of testing the efficiency of these methods. The main approach is the use of digital matched-guise testing techniques, with the ultimate goal of creating an online, packaged and battle-tested method available for public use.

We are confident that there is a place for this, in our view, timely product. There can be little doubt that the zeitgeist of the 21st century’s first two decades has swung the pendulum in a direction where it has become apparent that the role of the Humanities should be central. In times when unscrupulous politicians take every chance to draw on prejudice and stereotypical assumptions about Others, be they related to gender, ethnicity or sexuality, it is the role of the Humanities to hold up a mirror and let us see ourselves for what we are. This is precisely the aim of the RAVE project.

In line with this thinking, open access to our materials and methods is of primary importance. Here our ambition is not only to provide tested sample cases for open access use, but also to provide clear directives on how these have been produced so that new cases, based on our methods, can be created. This includes clear guidelines as to what important criteria need to be taken into account when so doing, so that our methodology is disseminated openly and in such a fashion that it becomes adaptable to new contexts.

METHOD

The RAVE method at its core relies on a treatment session where two groups of test subjects (e.g. students) are each exposed to one of two versions of the same scripted dialogue. The two versions differ only with respect to the perceived gender of the characters, whereas scripted properties remain constant. In one version, for example, one participant, “Terry”, may sound like a man, while in the other recording this character has been manipulated for pitch and timbre to sound like a woman. After the exposure, the subjects are presented with a survey asking them to respond to questions about the linguistic behaviour and character traits of one of the interlocutors. The responses of the two sub-groups are then compared and followed up in a debriefing session, where issues such as stereotypical effects are discussed.
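The group comparison at the heart of the method can be sketched in a few lines of Python. The ratings, scale and feature below are invented for illustration; the project's actual surveys and statistical analysis are not specified here.

```python
# Minimal sketch of the matched-guise comparison step (hypothetical data):
# each group rated the same scripted speaker ("Terry") on a 1-7 scale for a
# conversational feature, e.g. "how often did the speaker interrupt?".
from statistics import mean

male_guise_ratings = [5, 6, 5, 4, 6, 5, 7, 5]    # group A heard "Terry" as a man
female_guise_ratings = [3, 4, 3, 4, 2, 3, 4, 3]  # group B heard "Terry" as a woman

def guise_gap(group_a, group_b):
    """Mean rating difference between the two guise groups.

    Because the scripted dialogue is identical in both versions, a non-zero
    gap cannot come from the dialogue itself; it indexes the listeners'
    stereotypic expectations.
    """
    return mean(group_a) - mean(group_b)

gap = guise_gap(male_guise_ratings, female_guise_ratings)
print(f"male-guise mean:   {mean(male_guise_ratings):.2f}")
print(f"female-guise mean: {mean(female_guise_ratings):.2f}")
print(f"guise gap:         {gap:.2f}")
```

In a real trial the gap would of course be tested for statistical significance against the group sizes; the sketch only shows where the comparison sits in the pipeline.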

The two property-bent versions are based on a single recording, and the switch of the property (for instance, gender) is done using the digital methods described below. The reason for this procedure is to minimize the number of uncontrolled variables that could affect the outcome of the experiment. It is very difficult, if not impossible, to transform the identity-related aspects of a voice recording, such as gender or accent, while maintaining a “perfect” and natural voice: a voice that is opposite in the specific aspect but equivalent in all other aspects, without changing other properties in the process or introducing artifacts.

Accordingly, the RAVE method doesn’t strive for perfection, but focuses on achieving a perceived credibility of the scripted dialogue. However, the base recording is produced at high quality to provide the best possible conditions for the digital manipulation. For instance, the two speakers’ parts are recorded on separate tracks so as to keep the voices isolated.

The digital manipulation is done with the Praat software (Boersma & Weenink, 2013). Formants, pitch range and pitch median are manipulated for gender switching using standard offsets, and are then adapted to the individual characteristics of the voices. Several versions of the manipulated dialogues are produced and evaluated by a test group via an online survey. Based on the survey results, the version with the highest quality is selected. This manipulated dialogue needs further framing to reach a sufficient level of credibility.
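Praat's actual manipulation operates on formant and pitch tiers of the recording. As a rough illustration of the pitch-median and pitch-range remapping idea only (not Praat's algorithm, and with invented values), one can think of the transformation as moving an F0 contour's median to a target and scaling excursions around it:

```python
# Simplified sketch (not Praat's implementation) of remapping an F0 contour:
# shift the pitch median to a target value and scale the range around it,
# as in a male-to-female guise switch. All numbers are illustrative.
from statistics import median

def remap_pitch(f0_contour_hz, new_median_hz, range_factor):
    """Move the contour's median to new_median_hz and scale excursions
    around the median by range_factor, keeping the contour's shape."""
    old_median = median(f0_contour_hz)
    return [new_median_hz + (f0 - old_median) * range_factor
            for f0 in f0_contour_hz]

male_f0 = [110, 120, 115, 130, 105, 125]        # hypothetical male contour (Hz)
female_like = remap_pitch(male_f0, 210.0, 1.1)  # raise median, widen range slightly
print([round(f, 1) for f in female_like])
```

A convincing guise switch additionally requires shifting the formants (vocal-tract resonances), which this pitch-only sketch leaves out.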

The way the dialogue is framed for the specific target context, and how it is packaged and introduced, is of critical importance. Various techniques, for instance audiovisual cues, are used to distract the test subject from the “artificial feeling”, as well as to reinforce the desired target property. We add various kinds of distractions, both auditory and visual, which lessen the listeners’ focus on the current speaker, such as background voices suggesting the dialogue is taking place in a cafe, traffic noise, or scrambling techniques simulating, for instance, a low-quality phone or Skype call.

On this account, the RAVE method includes a procedure to evaluate the overall (perceived) quality and credibility of a specific case setup. This evaluation is implemented by exposing a number of pre-test subjects to the packaged dialogue (in a set-up comparable to the target context). After the exposure, the pre-test subjects respond to a survey designed to measure the combined impression of aspects such as the scripted dialogue, the selected narrators, the voices, the overall set-up and the contextual framing.

The produced dialogues and accompanying response surveys are turned into a single online package using the Storyline software. The single entry point to the package makes the process of collecting anonymous participant responses more fail-safe and easier to carry out.

The whole package is produced for a “bring your own device” set-up, where the participants use their own smartphones, tablets or laptops to take part in the experiment. These choices of an online single point of entry adapted to various kinds of devices were made to facilitate experiment participation and the recording of results. The results from the experiment are then collected by the teacher and discussed with the students at an ensuing debriefing seminar.

FINDINGS

At this stage, we have conducted experiments using the RAVE method with different groups of respondents, including teacher trainees, psychology students, sociology students, active teachers and the public at large, in Sweden and elsewhere. Since the experiments have also been carried out in other cultural contexts (in the Seychelles, in particular), we have obtained results that enable cross-cultural comparisons.

All trials addressing gender stereotyping have supported our hypothesis that linguistic stereotyping acts as a filter. In trials with teacher trainees in Sweden (n = 61), we could show that respondents who listened to the male guise overestimated stereotypically masculine conversational features such as how often the speaker interrupted, how much floor space ‘he’ occupied, and how often ‘he’ contradicted his counterpart. Conversely, features such as signalling interest and being sympathetic were overestimated by respondents listening to the female guise.

Results from the Seychelles have strengthened our hypothesis. Surveys investigating linguistic features associated with gender showed that the respondents’ (n = 46) linguistic gender stereotyping was quite different from that of the Swedish respondents. For example, the results from the Seychelles trials showed that floor space and the number of interruptions were overestimated by respondents listening to the female guise, quite unlike the Swedish respondents, but still in line with our hypothesis.

Trials with psychology students (n = 101) yielded similar results. In experiments where students were asked to rate a case character’s (‘Kim’) personality traits and social behaviour, our findings show that the male version of Kim was deemed more unfriendly and somewhat careless compared to the female version of Kim, who was regarded as more friendly and careful. Again, this shows that respondents overestimate aspects that confirm their stereotypic preconceptions.

PEDAGOGY

The underlying pedagogical idea of the set-up is to confront students and other participants with their own stereotypical assumptions. In our experience, discussing stereotypes with psychology and teacher training students does not give rise to the degree of self-reflection we would like. This is what we wanted to remedy. With the method described here, where the dialogues are identical except for the manipulation of pitch and timbre, perceived differences in personality and social behaviour can only be explained as residing in the beholder.

A debriefing seminar after the exposure gave the students an opportunity to reflect on the results of the experiment. They were divided into mixed groups in which half the students had listened and responded to the male guise, and the other half to the female guise. Since any difference between the groups was the result of the participants’ own ratings, their own reactions to the conversations, there was something very concrete and urgent to discuss. Thus, the experiment affected engagement positively. Clearly, the concrete and experiential nature of this method made the students analyse the topic, their own answers, the reasons for these and, ultimately, themselves in greater detail and depth in order to understand the results of the experiment and to relate them to earlier research findings. Judging from these impressions, the method is clearly very effective.

Answers from a survey with psychology students (n = 101) after the debriefing corroborate this impression. In response to the question “What was your general experience of the experiment that you have just partaken in? Did you learn anything new?”, a clear majority of the students (76 %) responded positively. Moreover, close to half of these answers explicitly expressed self-reflective learning. Of the remaining comments, 15 % were neutral and 9 % expressed critical feedback.

Examples of responses expressing self-reflection include: “… It gave me food for thought. Even though I believed myself to be relatively free of prejudice I can't help but wonder if I make assumptions about personalities merely from the tone of someone's voice.” And: “I learned some of my own preconceptions and prejudices that I didn't know I had.” An example of a positive comment with no self-reflective element is: “Female and male stereotypes were stronger than I expected, even if only influenced by the voice”.

The number of negative comments was small. The negative comments generally took the position that the results were expected so there was nothing to discuss, or that the student had figured out the set-up from the beginning. A few negative comments revealed that the political dimension of the subject of gender could influence responses. These students would probably react in the same way to a traditional seminar. We haven’t been able to reach everyone ... yet.


11:45am - 12:00pm
Short Paper (10+5min) [abstract]

Digital archives and the learning processes of performance art

Tero Nauha

University of Helsinki

In this presentation, the process of learning performance art is articulated in the context of the change that digital archives have caused since the early 1990s. It is part of my postdoctoral research, artistic research on the conjunctions between divergent gestures of thought and performance, carried out in the research project How to Do Things with Performance?, funded by the Academy of Finland.

Since performance art is a form of ‘live art’, it would be easy to assume that its learning processes are also mostly based on physical practice and repetition. However, in my view, performance art is a significant line of flight from the conceptual art of the 1960s and 70s, alongside video art. The pedagogy of performance art has therefore been tightly connected with the development of media, from the collective use of Portapak video cameras to the recent development of VR-attributed performances, choreographic archive methods by figures such as William Forsythe, and digital journals of artistic research such as Ruukku or the Journal for Artistic Research (JAR).

This presentation will speculate on the transformation of performance art practices now that a vast amount of historical archive material has become accessible to artists, regardless of the physical location of the student or artist. At the same time, social media affects the peer groups of artists. My point of view is based not on statistics but on observations I have gathered from teaching performance art and from supervising MA- and PhD-level research projects.

The argument is that learning in performative practices is not based on talent; rather, it is general and generic, with access to networks and digital archives serving as a tool for social forms of organization and for speculation on what performance art is. In this sense, my final argument is that digital virtuality does not conflate with the concept of the virtual. Here my argument leans on the philosophical thought on actualization and the virtual of Henri Bergson, Gilles Deleuze and Alexander R. Galloway. Access to digital archives in learning processes rests rather on the premise that artistic practices are already explicit actualizations of the virtual. Digitalization is one modality of this process.

The learning process of performance art does not happen through resemblance, but through doing with someone or something else, and it develops in heterogeneity with digital virtualities.

 
4:00pm - 5:30pmF-P674-2: Between the Manual and the Automatic
Session Chair: Eero Hyvönen
P674 
 
4:00pm - 4:15pm
Short Paper (10+5min) [publication ready]

In search of Soviet wartime interpreters: triangulating manual and digital archive work

Svetlana Probirskaja

University of Helsinki

This paper demonstrates the methodological stages of searching for Soviet wartime interpreters of Finnish in the digital archival resource of the Russian Ministry of Defence called Pamyat Naroda (Memory of the People) 1941–1945. Since wartime interpreters do not have their own search category in the archive, other means are needed to detect them. The main argument of this paper is that conventional manual work must be done and some preliminary information obtained before entering the digital archive, especially when dealing with a marginal subject such as wartime interpreters.


4:15pm - 4:30pm
Distinguished Short Paper (10+5min) [abstract]

Digital Humanities Meets Literary Studies: the Challenges for Estonian Scholarship

Piret Viires1, Marin Laak2

1Tallinn University; 2Estonian Literary Museum

In recent years, the application of DH as a method of computerised analysis and the extensive digitisation of literary texts, making them accessible as open data and organising them into large text corpora, have made the relations between literature and information technology a hot topic.

New directions in literary history link together literary analysis, computer technology and computational linguistics, offering new possibilities for studying the authors’ style and language, analysing texts and visualising results.

Alongside such mainstream uses, DH still contains several other important directions for literary studies. The aim of this paper is to probe the limits and possibilities of DH as a concept and to determine its suitability for literary research in the digital age. Our discussion is based, first, on twenty years of experience in digitally representing Estonian literary and cultural heritage and, second, on the synchronous study of digitally born literary forms; we shall also offer representative examples.

We shall also discuss the concept of DH from the viewpoint of literary studies; for example, we examine the ways of positioning digitally created literature (both “electronic literature” and the literature born in social media) under this renewed concept. This problem was topical in the early 2000s, but in the following decade it was replaced by the broader ideas of intermedia and transmedia, which treated literary texts as only one medium among many others. What are the specific features of digital literature, what are its accompanying effects, and how has the role of the reader as recipient changed in the digital environment? These theoretical questions are also indirectly relevant for making the literature created in the era of printed books accessible as e-books or open data.

Digitising older literature is the responsibility of memory institutions (libraries, archives, museums). Extensive digitising of texts at memory institutions seems to have been done to make reading more convenient: books can be read even on smartphones. Digitising works of fiction as part of cultural heritage digitisation projects has been carried out for more than twenty years. What is the relation of these virtual bookshelves to the digital humanities? We need to discover whether and how both digitally born literature and the digitised literature born in the era of printing affect literary theory. Our paper will also focus on mapping different directions, practices and applications of DH in present-day literary theory. The topical question is how to bridge the gap between the research possibilities offered by present-day DH and the ever-increasing resources of texts produced by memory institutions. We encounter several problems. Literary scholars are used to working with texts, analysing them as undivided works of poetry, prose or drama. Using DH methods requires treating literary works or texts as data, which can be analysed and processed with computer programmes (data mining, visualisation tools, etc.). These activities require posing new and totally different research questions in literary studies. Susan Schreibman, Ray Siemens and John Unsworth, the editors of A New Companion to Digital Humanities (2016), discuss the problems of DH and point out in their Foreword that it is still questioned whether DH should be considered a separate discipline or, rather, a set of different interlinked methods. In our paper we emphasise the diversity of DH as an academic field of research and discuss other possibilities it can offer for literary research in addition to computational analyses of texts.

In Estonia, research on electronic new media and the application of digital technology in the field of literary studies can be traced back to the second half of the 1990s. The analysis of social, cultural and creative effects (see Schreibman, Siemens, Unsworth 2016: xvii-xviii), as well as constant cooperation with the social sciences in research on Internet usage, have played an important role in Estonian literary studies.


4:30pm - 4:45pm
Short Paper (10+5min) [abstract]

Digital humanities and environmental reporting in television during the Cold War: Methodological issues of exploring materials of the Estonian, Finnish, Swedish, Danish, and British broadcasting companies

Simo Laakkonen

University of Turku, Degree Programme on Cultural Production and Landscape Studies

Environmental history studies have so far relied on traditional historical archival and other related source materials. Despite the increasing availability of new digitized materials, studies in this field have not reacted to these emerging opportunities in any particular way. The aim of the proposed paper is to discuss the possibilities and limitations embodied in the new digitized source materials in different European countries. The proposed paper is an outcome of a research project that explores the early days of television prior to Earth Day in 1970 and frames this exploration from an environmental perspective. The focus of the project is the reporting of environmental pollution and protection during the Cold War. In order to realize this study, the quantity and quality of related digitized and non-digitized source materials provided by the national broadcasting companies of Estonia (ETV), Finland (YLE), Sweden (SVT), Denmark (DR), and the United Kingdom (BBC) were examined. The main outcome of this international comparative study is that the quantity and quality of available materials varies greatly, even surprisingly, between the examined countries, which belonged to different political spheres (Warsaw Pact, neutral, NATO) during the Cold War.


4:45pm - 5:00pm
Short Paper (10+5min) [abstract]

Prosodic clashes between music and language – challenges of corpus-use and openness in the study of song texts

Heini Arjava

University of Helsinki

In my talk I will discuss the relationship between linguistic and musical rhythm, and the connections to digital humanities and open science that arise in their study. My ongoing corpus research examines the relationship between linguistic and musical segment length in songs, focusing on instances where the language has to adapt prosodically to the rhythmic frame provided by pre-existing music. More precisely, the study addresses the question of how syllable length and note length interact in music. To what extent can non-conformity between linguistic and musical segment length, clashes, be acceptable in song lyrics, and what other prosodic features, such as stress, may influence the occurrence of clashes in segment length?

Addressing these questions with a corpus-based approach leads to questions of information retrieval from complicated corpora which combine two media (music and language), and of the openness and accessibility of music sources. In this abstract I will first describe my research questions and the song corpus used in my study in section 1, and then discuss their relationship with the use, analysis and availability of corpora, and with issues of open science, in section 2.

1. Research setting and corpus

My study aims to approach the comparison of musical and linguistic rhythm by both qualitative and statistical methods. It is based on a self-collected song corpus in Finnish, a language in which syllable length has a versatile relationship with stress (cf. Hakulinen et al. 2004). Primary stress in Finnish is weight-insensitive and always falls on the first syllable of a word, and syllables of any length, long or short, can be stressed or unstressed. Finnish sound segment length is also phonemic, that is, it creates distinctions of meaning. Syllable length in Finnish is therefore of particular interest in a study of musical segment length, because length deviations play an evident role in language perception.

Music and text can be combined into a composition in a number of ways, but my study focuses on the situations in which language is most dependent on music. There are usually three alternative orders in which music and language can be combined into songs: First, text and music may be written simultaneously, influencing the writer’s musical and linguistic choices at the same time (Language <–> Music). Secondly, the text can precede the music, as when composers set existing poetry (Language –> Music). And finally, the melody may exist first, as when new versions of songs are created by translating or otherwise rewriting them to familiar tunes (Music –> Language).

My research is concerned with this third relationship, because it poses the strongest constraints on the language user. The language (text) must conform to the music’s already existing rhythmic frame that is in many respects inflexible, and in such cases, it is difficult to vary the rhythmic elements of the text, because the musical space restricts the rhythmic tools available for the language user. This in turn may lead to non-neutral linguistic output. Thus the crucial question arises: How does language adapt its rhythm to music?

My corpus contains songs that clearly and transparently represent the relationship of music being created first and providing the rhythmic frame, and language having to adjust to that frame. The pilot corpus consists of 15 songs and approximately 1500 prosodically annotated syllables of song texts in Finnish, translated or otherwise adapted from different languages, or written to instrumental or traditional music. The genres include chansons, drinking songs, Christmas songs and hymns, which originate from different eras and languages, namely English, French, German, Swedish, and Italian.

One data point in the table format of the corpus is a Finnish syllable, whose prosodic properties I compare with the rhythm of the corresponding notes (musical length and stress). The most basic instance of a clash between segment lengths is a short syllable ((C)V in Finnish) falling on a long note (i.e. a note longer than a basic half-beat). Both theoretical and empirical evidence will be used to determine which length values create the clearest cases of prosodic clashes.
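The basic clash criterion can be illustrated with a small sketch. The syllables, note lengths and threshold below are invented for the example; the study's actual annotation scheme is considerably richer.

```python
# Hypothetical illustration of the basic clash criterion: a short syllable
# ((C)V in Finnish) aligned with a long note (longer than a basic half-beat).
# Each data point pairs a syllable with its note's length in beats.

def is_clash(syllable_is_short, note_beats, half_beat=0.5):
    """A length clash occurs when a short syllable falls on a note
    longer than the basic half-beat."""
    return syllable_is_short and note_beats > half_beat

# (syllable, short syllable?, note length in beats) - invented example data
data = [("jou", False, 1.0), ("lu", True, 1.0), ("puu", False, 0.5)]
clashes = [syl for syl, short, beats in data if is_clash(short, beats)]
print(clashes)  # → ['lu']
```

In practice the threshold itself is one of the empirical questions: which note-length values produce the clearest perceived clashes.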

A crucial presupposition when problematising the relationship between a musical form and the text written to it is the notion that a song is not poetry per se (I will return to this conception in section 2). The conventions of Western art music allow for a far greater range of length distinctions than language: the syllable lengths usually fall into binary or ternary categories (e.g. short and long syllables), whereas in music notes can be elongated infinitely. A translated song in which all rhythmic restrictions come from the music may follow the lines of poetic traditions, but must deviate from them if the limits of space within music do not allow for full flexibility. It is therefore an intermediate form of verbal art.

2. Challenges for digital humanities and open science

The corpus-based approach to language and music poses problematic questions regarding digital humanities. The first of these is, of course, whether useful music-linguistic corpora can be found at all at present. Existing written and spoken corpora of the major European languages contain millions of words, often annotated in great linguistic detail (cf. Korp of Kielipankki for Finnish (korp.csc.fi), which offers detailed contextual, morphological and syntactic analysis). For music as well, digital music scores can be found “in a huge number” (Ponce de León et al. 2008: 560). Corpora of song texts with both linguistic and musical information seem to be more difficult to find.

One problem of music-linguistic studies is that its sources are less open and shareable than those of written or spoken language. Copyright questions in the arts are in general more sensitive than, for instance, those of newspaper articles or internet conversations, and the reluctance of the owners of song texts and melodies may have made it difficult to create open corpora of contemporary music.

But even with ownership problems aside (as with older or traditional music), building a music-linguistic corpus remains a difficult task to accomplish. A truly useful corpus of music for linguistic purposes would include metadata on both media, language and music. Thus even an automatically analysed metric corpus of poetry, like Anatoli Starostin’s Treeton for the metrical analysis of Russian poems (Pilshchikov & Starostin 2011) or the rhythmic Metricalizer for determining meter by stress patterns in German poems (Bobenhausen 2011), does not answer questions about the rhythm of a song text, which exists in an altogether extra-linguistic medium, music. Vocal music is metrical, but it is not metrical in the strict sense of poetic conventions, with which it shares the isochronic base. Automated analysis of a song text without its music notation does not tell anything about its real metrical structure.

On a technical level, one set of tools necessary for researchers of music comprises tools for the quick visualization of music passages (notation tools, sound recognition). Such software can be found and used freely on the internet and is useful for depiction purposes. Mining information from music requires more effort, but has been done in various projects, for instance for melody information retrieval (Ponce de León et al. 2008) or the metrical detection of notes (Temperley 2001). But again, these tools rarely seem to combine linguistic and musical meter simultaneously.

By raising these questions I hope to bring attention to the challenges of studying texts in the musical domain, that is, not simply music or poetry separately. The crux of the issue is that for the linguistic analysis of song texts we need actual textual data in which the musical domain appears as annotated metadata. Means exist to analyse text automatically, and to analyse musical patterns with sound recognition or otherwise, but combining the two raises the analysis to a more complicated level.

Literature

Blumenfeld, Lev. 2016. End-weight effects in verse and language. In: Studia Metrica Poet. Vol. 3.1 pp. 7–32.

Bobenhausen, Klemens. 2011. The Metricalizer – Automated Metrical Markup of German Poetry. In: Küper, C. (ed.), Current trends in metrical analysis, pp. 119-131. Frankfurt am Main; New York: Peter Lang.

Hayes, Bruce. 1995. Metrical Stress Theory: principles and case studies. Chicago: The University of Chicago Press.

Hakulinen, et al. (eds.). 2004. Iso suomen kielioppi, pp.44–48. Helsinki: Suomalaisen Kirjallisuuden Seura.

Jeannin, M. 2008. Organizational Structures in Language and Music. In: The World of Music, 50(1), pp. 5–16.

Kiparsky, Paul. 2006. A modular metrics for folk verse. In: B. Elan Dresher & Nila Friedberg (eds.), Formal approaches to poetry: recent developments in metrics, pp.7–52. Berlin: Mouton de Gruyter.

Lerdahl, Fred & Jackendoff, Ray. 1983. A generative theory of tonal music. Cambridge (MA): MIT.

Lotz, John. 1960. Metric typology. In: Thomas Sebeok (ed.), Style in language. Massachusetts: The M.I.T. Press.

Palmer, Caroline & Kelly, Michael H. 1992. Linguistic Prosody and Musical Meter in Song. In: Journal of Memory and Language 31, pp. 525–542.

Pilshchikov, Igor & Starostin, Anatoli. 2011. Automated Analysis of Poetic Texts and the Problem of Verse Meter. In: Küper, C. (ed.), Current trends in metrical analysis, pp. 133–140. Frankfurt am Main; New York: Peter Lang.

Ponce de León, Pedro J., Iñesta, José M. & Rizo, David. 2008. Mining Digital Music Score Collections: Melody Extraction and Genre Recognition. In: Peng-Yeng Yin (ed.), Pattern Recognition Techniques, Technology and Applications, pp. 626–. Vienna: I-Tech.

Temperley, D. 2001. The Cognition Of Basic Musical Structures. Cambridge, Mass: MIT Press.


5:00pm - 5:15pm
Distinguished Short Paper (10+5min) [abstract]

Finnish aesthetics in scientific databases

Darius Pacauskas, Ossi Naukkarinen

Aalto University School of Arts, Design and Architecture

The major academic databases, such as Web of Science and Scopus, are dominated by publications written in English, often by scholars affiliated with American and British universities. As such databases are repeatedly used as a basis for assessing and analyzing the activities and impact of universities and even individual scholars, there is a risk that everything published in other, especially minor, languages will be sidetracked. Standard data-mining procedures do not notice them. Yet, especially in the humanities, other languages and cultures have an important role and scholars publish in various languages.

The aim of this research project is to critically look into how Finnish aesthetics is represented in scientific databases. What kind of picture of Finnish aesthetics can we draw if we rely on the metadata from commonly used databases?

We will address this general issue through one example. We will compare metadata from two different databases, in two different languages, English and Finnish, and form a picture of two different interpretations of an academic field, aesthetics (estetiikka in Finnish). To achieve this, we will employ citation analysis as well as text summarization techniques in order to understand the differences between the largest world scientific database, Scopus, and the largest Finnish one, Elektra. Moreover, we will identify the most influential Finnish aestheticians and analyze their publication records in order to understand to what extent the scientific databases can represent Finnish aesthetics. Through this, we will present 1) two different maps containing actors and works recognized in the field, and 2) an overview of the main topics from the two databases.

For these goals, we will collect metadata from both the Scopus and Elektra databases and references from each relevant article. Relevant articles will be located by using the keyword “aesthetics” or the Finnish equivalent “estetiikka”, as well as by identifying scientific journals focusing on aesthetics. We will perform citation analysis to explore which publications are cited in which countries, based on Scopus data. This comparison will allow us to understand which works are most prominent in different countries, as well as to find the countries in which those works were developed, e.g., works that are acknowledged by Finnish aestheticians according to an international database. In addition, the comparison will allow us to understand how Finnish aesthetics differs from that of other countries.

Later, we will perform citation analysis on the data gathered from the Finnish scientific database Elektra. The results will indicate the distribution between cited Anglo-American texts and those written in Finland or in Finnish, showing which language-family sources Finnish aestheticians rely on in their work. Further, we will apply text summarization techniques to see the differences in the topics the two databases discuss. Furthermore, we will collect a list of the names of the most influential Finnish aestheticians and their works (as provided by the databases). We will perform searches within the two databases to understand how much of their work is covered.

As an additional contribution, we will develop an interactive web-based tool to present the results of this research. Such a tool will give aesthetics researchers an opportunity to explore the field of Finnish aesthetics through our established lenses and to comment on possible gaps in the pictures offered by the databases. It is possible that the databases give only a very partial picture of the field, in which case new tools should be developed in co-operation with researchers. A similar situation may also hold in other sub-fields of the humanities where non-English activities are usual.