Conference Agenda

Digital Humanities in the Nordic Countries 3rd Conference

Location: PII

Date: Wednesday, 07/Mar/2018

4:00pm - 5:30pm

W-PII-1: Historical Texts
Session Chair: Asko Nivala

PII

4:00pm - 4:30pm
Long Paper (20+10min) [abstract]

Diplomatarium Fennicum and the digital research infrastructures for medieval studies

Seppo Eskola, Lauri Leinonen

National Archives of Finland

Digital infrastructures for medieval studies have advanced in strides in Finland over the last few years. Most literary sources concerning medieval Finland − the Diocese of Åbo − are now available online in one form or another: Diplomatarium Fennicum encompasses nearly 7 000 documentary sources, the Codices Fennici project recently digitized over 200 mostly well-preserved pre-17th century codices and placed them online, and Fragmenta Membranea contains digital images of 9 300 manuscript leaves belonging to over 1 500 fragmentary manuscripts. In terms of availability of sources, the preconditions for research have never been better. So, what’s next?

This presentation discusses the current state of digital infrastructures for medieval studies and their future possibilities. For the past two and a half years the presenters have been working on the Diplomatarium Fennicum webservice, published in November 2017, and the topic is approached from this background. Digital infrastructures are being developed on many fronts in Finland: several memory institutions are actively engaged (the three above-mentioned webservices are developed and hosted by the National Archives, The Finnish Literature Society, and the National Library respectively) and many universities have active medieval studies programs with an interest in digital humanities. Furthermore, interest in Finnish digital infrastructures is not restricted to Finland as Finnish sources are closely linked to those of other Nordic countries and the Baltic Sea region in general. In our presentation, we will compare the different Finnish projects, highlight opportunities for international co-operation, and discuss choices (e.g. selecting metadata models) that could best support collaboration between different services and projects.

4:30pm - 4:45pm
Short Paper (10+5min) [publication ready]

The HistCorp Collection of Historical Corpora and Resources

Eva Pettersson, Beáta Megyesi

Uppsala University

We present the HistCorp collection, a freely available open platform aiming at the distribution of a wide range of historical corpora and other useful resources and tools for researchers and scholars interested in the study of historical texts. The platform contains a monitoring corpus of historical texts from various time periods and genres for 14 European languages. The collection is taken from well-documented historical corpora, and distributed in a uniform, standardised format. The texts are downloadable as plaintext, and in a tokenised format. Furthermore, some texts are normalised with regard to spelling, and some are annotated with part-of-speech and syntactic structure. In addition, preconfigured language models and spelling normalisation tools are provided to allow the study of historical languages.

4:45pm - 5:00pm
Short Paper (10+5min) [publication ready]

Semantic National Biography of Finland

Eero Hyvönen¹,², Petri Leskinen¹, Minna Tamper¹,², Jouni Tuominen²,¹, Kirsi Keravuori³

¹Aalto University; ²University of Helsinki (HELDIG); ³Finnish Literature Society (SKS)

This paper presents the idea and project of transforming and using the textual biographies of the National Biography of Finland, published by the Finnish Literature Society, as Linked (Open) Data. The idea is to publish the lives as semantic, i.e., machine “understandable” metadata in a SPARQL endpoint in the Linked Data Finland (LDF.fi) service, on top of which various Digital Humanities applications are built. The applications include searching and studying individual personal histories as well as historical research of groups of persons using methods of prosopography. The basic biographical data is enriched by extracting events from unstructured texts and by linking entities internally and to external data sources. A faceted semantic search engine is provided for filtering groups of people from the data for research in Digital Humanities. An extension of the event-based CIDOC CRM ontology is used as the underlying data model, where lives are seen as chains of interlinked events populated from the data of the biographies and additional sources, such as museum collections, library databases, and archives.
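The event-based view described above, with lives as chains of events and faceted filtering of groups of people, can be illustrated with a minimal sketch. It uses plain Python data classes as a stand-in; it is not CIDOC CRM, SPARQL, or the project's own implementation. The names `Event`, `Person` and `facet_filter`, and the sample people, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One event in a life (hypothetical, simplified fields)."""
    kind: str   # e.g. "birth", "education", "career"
    place: str
    year: int


@dataclass
class Person:
    """A life seen as a chain of events."""
    name: str
    events: list = field(default_factory=list)


def facet_filter(people, **facets):
    """Return people with at least one event matching every given facet."""
    def matches(event):
        return all(getattr(event, key) == value for key, value in facets.items())
    return [p for p in people if any(matches(e) for e in p.events)]


# Made-up sample data, not taken from the National Biography.
people = [
    Person("Person A", [Event("birth", "Turku", 1800), Event("education", "Helsinki", 1820)]),
    Person("Person B", [Event("birth", "Helsinki", 1810)]),
    Person("Person C", [Event("birth", "Turku", 1805), Event("career", "Turku", 1830)]),
]

born_in_turku = facet_filter(people, kind="birth", place="Turku")
print([p.name for p in born_in_turku])
```

Each keyword argument acts as one facet, and all facets must hold for the same event. A prosopographical query is then a filter over events rather than over free text.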

5:00pm - 5:15pm
Short Paper (10+5min) [abstract]

Creating a corpus of communal court minute books: a challenge for digital humanities

Maarja-Liisa Pilvik¹, Gerth Jaanimäe¹, Liina Lindström¹, Kadri Muischnek¹, Kersti Lust²

¹University of Tartu, Estonia; ²The National Archives of Estonia, Estonia

This paper presents the work of a digital humanities project concerned with the digitization of Estonian communal court minute books. The local communal courts in Estonia came into being through the peasant laws of the early 19th century and were first-instance, class-specific courts that tried peasants. Rather than being merely judicial institutions, the communal courts were at first institutions for the self-government of peasants, since they also dealt with police and administrative matters. After the municipal reform of 1866, however, the communal courts were emancipated from noble tutelage and became strictly judicial institutions that tried peasants for their minor offences and resolved their civil disputes, claims and family matters. The communal courts in their earlier form ceased to exist in 1918, when Estonia became independent from Russian rule.

The National Archives of Estonia holds almost 400 archives of communal courts from the pre-independence period. They have been preserved very unevenly and not all of them include minute books. The minute books themselves are also written in an inconsistent manner: the earlier minute books are often written in German, and the writing is strongly dependent on the skills and will of the parish clerk. However, the materials from the period starting with the year 1866, when the creation of the minute books became more systematic, are a massive and rich source shedding light on the everyday lives of the peasantry. Still, at the moment, the users of the minute books face serious difficulties in finding relevant information, since there are no indexes and one has to go through all the materials manually. The minute books are also a fascinating resource for linguists, both dialectologists and computational linguists: the books contain regional varieties tied to a specific genre and an early time period (making it possible to detect linguistic expressions that are rare in atlases, for example, and also in the dialect corpus, which represents language from about 100 years later) while also being a written resource, reflecting the writing traditions of the old spelling system. This is also what makes these texts complex and challenging for automatic analysis methods, which are otherwise quite well-established in contemporary corpus linguistics.

In our talk we present a project dealing with the digitization and analysis of the minute books from the period between 1866 and 1890. The texts were first digitized in the 2000s and stored on a server in HTML format, which is good for viewing, but not as good for automatic processing. After the server crashed, the texts were rescued via web archives, and the structure of the minute books was used to convert the documents automatically into a more functional format using XML markup, separating the body text with tags referring to information about the titles, dates, indexes, participants, content and topical keywords, which indicate the purview of the communal courts in that period.
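The conversion step described above can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not the project's actual pipeline: the input layout (an `<h1>` title followed by `<p>` paragraphs), the output element names (`record`, `title`, `body`) and the sample text and date are all assumptions made for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class MinuteBookParser(HTMLParser):
    """Collect the text of <h1> (title) and <p> (body paragraph) elements."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "h1":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def html_to_xml(html_text, date):
    """Convert one HTML page into a small XML record (hypothetical schema)."""
    parser = MinuteBookParser()
    parser.feed(html_text)
    parser.close()
    record = ET.Element("record", {"date": date})
    ET.SubElement(record, "title").text = parser.title.strip()
    body = ET.SubElement(record, "body")
    for text in parser.paragraphs:
        ET.SubElement(body, "p").text = text.strip()
    return ET.tostring(record, encoding="unicode")


# Made-up sample page; real minute books would need richer handling.
sample = "<html><body><h1>Protocol No. 1</h1><p>First entry.</p><p>Second entry.</p></body></html>"
xml_text = html_to_xml(sample, "1870-01-15")
print(xml_text)
```

Separating metadata (here only a date attribute and a title) from the body text is what makes later queries and annotation layers easier to attach than in presentation-oriented HTML.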

We discuss the workflow of creating a digital resource in a standardized and maximally functional format as well as challenges, such as automatic text processing for cleaning and annotating the corpus in order to distinguish the relevant layers of information. In order to enable queries with different degrees of specificity in the corpus, the texts also need to be linguistically analyzed. For both named entity recognition (NER), which enables network analysis and links the events described in the materials to geospatial locations, and morphological annotation, which makes it possible to perform queries based on lemmas or grammatical information, we have applied the Estnltk library in Python, which is developed for contemporary written standard Estonian. For NER, its performance was satisfactory, i.e. it found recognized names well, even though it systematically overrecognized organization names. The most complicated issue so far is the morphological analysis and disambiguation of word forms. Tools developed for Estonian morphological analysis, such as Estnltk or Vabamorf, are trained on contemporary written standard Estonian. Communal court minute books, however, include language variants, which are a mixture of dialectal language, inconsistent spelling and the old spelling system. In the presentation, we introduce the results of our first attempts to apply Estnltk tools to the materials of communal court minute books, the problems that we’ve run into, and provide solutions for overcoming these problems.

The final aim of the project is to create a multifunctional source that could be of interest to researchers in different fields within the humanities. As the National Archives have a considerable number of communal court minute books that are thus far available only in scanned form, the collection of digitized minute books is planned to be expanded through crowdsourcing opportunities.

References:

Estnltk. Open source tools for Estonian natural language processing; https://estnltk.github.io/estnltk/1.2/#.

Vabamorf. Eesti keele morfanalüsaator [‘The morphological analyzer of Estonian’]; https://github.com/Filosoft/vabamorf.

5:15pm - 5:30pm
Distinguished Short Paper (10+5min) [publication ready]

FSvReader – Exploring Old Swedish Cultural Heritage Texts

Yvonne Adesam, Malin Ahlberg, Gerlof Bouma

University of Gothenburg

This paper describes FSvReader, a tool for easier access to Old Swedish (13th–16th century) texts. Through automatic fuzzy linking of words in a text to a dictionary describing the language of the time, the reader has direct access to dictionary pop-up definitions, in spite of the large amount of graphical and morphological variation. The linked dictionary entries can also be used for simple searches in the text, highlighting possible further instances of the same entry.
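As a rough illustration of what fuzzy linking of variant spellings to dictionary headwords can look like, the sketch below uses string similarity from Python's standard `difflib` module. This is a stand-in technique chosen for the example, not the method FSvReader itself uses; the headword list, the normalisation rules and the function names `normalise` and `link` are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical mini-dictionary of headwords, for illustration only.
HEADWORDS = ["konunger", "kirkia", "bok", "land"]


def normalise(form):
    """Lower-case a form and collapse a few spelling variants (illustrative rules)."""
    form = form.lower()
    for old, new in (("ch", "k"), ("c", "k"), ("w", "v"), ("y", "i")):
        form = form.replace(old, new)
    return form


def link(form, headwords=HEADWORDS, cutoff=0.6):
    """Return candidate headwords for a text form, best match first."""
    norm_to_head = {normalise(h): h for h in headwords}
    matches = get_close_matches(normalise(form), list(norm_to_head), n=3, cutoff=cutoff)
    return [norm_to_head[m] for m in matches]


print(link("Konungher"))
print(link("kyrkio"))
```

Returning a ranked list of candidates rather than a single entry leaves the final choice to the reader, which suits texts with heavy graphical and morphological variation. The `cutoff` value trades recall against spurious links.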

Date: Thursday, 08/Mar/2018

9:00am - 10:30am

Plenary 2: Kathryn Eccles
Session Chair: Eero Hyvönen

Finding the Human in Data: What can Digital Humanities learn from digital transformations in cultural heritage?

PII

11:00am - 12:30pm

T-PII-1: Our Digital World
Session Chair: Leo Lahti

PII

11:00am - 11:15am
Short Paper (10+5min) [publication ready]

The unchallenged persuasions of mobile media technology: The pre-domestication of Google Glass in the Finnish press

Minna Saariketo

Aalto University

In recent years, networked devices have taken an ever tighter hold of people’s everyday lives. The tech companies are frantically competing to grab people’s attention and secure a place in their daily routines. In this short paper, I elaborate further on a key finding from an analysis of Finnish press coverage of Google Glass between 2012 and 2015. The concept of pre-domestication is used to discuss the ways in which we are invited and persuaded by the media discourse to integrate ourselves in the carefully orchestrated digital environment. It is shown how the news coverage deprives potential new users of digital technology of a chance to evaluate the underpinnings of the device, the attachments to data harvesting, and the practices of hooking attention. In the paper, the implications of contemporary computational imaginaries as (re)produced and circulated in the mainstream media are reflected on, thereby shedding light on and opening possibilities to criticize the politics of mediated pre-domestication.

11:15am - 11:30am
Distinguished Short Paper (10+5min) [publication ready]

Research of Reading Practices and ‘the Digital’

Anna Kaisa Kajander

University of Helsinki

Books and reading habits belong to one of the areas of our everyday lives that have been strongly affected by digitalisation. The subject has been raised repeatedly in public discussion in Finnish mainstream media, and the typical discourse is focused on e-books and printed books, sometimes still in a manner which juxtaposes the formats. Another aspect of reading that has gained publicity recently concerns the decreasing interest in books in general. The acceptance of e-books and the status of printed books in contemporary reading have raised questions, but it has also been realised that the recent changes are connected with digitalisation in a wider cultural context. It has enabled new forms of reading and related habits, which benefit readers and book culture, but it has also affected free time activities that do not support interest in books.

In this paper, my aim is to discuss the research of books and reading as a socio-cultural practice, and ask if this field could benefit from co-operation with digital humanities scholars. The idea of combining digital humanities with book research is not new; collaboration has been welcomed especially in research that focuses on new technologies of books and the use of digitised historical records, such as bibliographies. However, I would like to call for discussion on how digital humanities could benefit the research of (new) reading practices and the ordinary reader. Defining ‘the digital’ would be essential, as well as knowledge of relevant methodologies, tools and data. I will first introduce my ongoing PhD-project and present some questions that I have had during the process. Then, based on the questions, I’d like to discuss what kind of co-operation between digital humanities and reading research could be useful to help gain knowledge of the change in book reading, related habits and contemporary readership.

PhD-project Life as a Reader

In my ongoing dissertation project, I am focusing on attitudes and expectations towards printed and e-books and new reading practices. The research material I am using consists of approximately 540 writings that were sent to the Finnish Literature Society in a public collection called “Life as a reader” in 2014. This collection was organised by the Society’s Literary- and Traditional archives in co-operation with the Finnish Bookhistorical Society, and the aim was to focus on reading as a socio-cultural practice. The organisers wanted people to write in their own words about reading memories. They also included questions in the collection call, which covered, for example, topics such as childhood and learning to read, reading as a private or shared practice, places and situations of reading, and experiences of recent changes, such as opinions about e-books and virtual book groups. Book historical interests were visible in the project, as all of the questions mentioned above have also appeared in other book history research: interest in ordinary readers and their everyday lives, the ways readers find and consume texts, and readership in the digital age.

In the dissertation I will focus on the writings and especially on those writers who liked to read books for pleasure and as a free time activity. The point is to emphasise the readers’ point of view on the recent changes. I argue that if we want to understand attitudes towards reading or the possible futures of reading habits, we need to understand the different practices which the readers themselves attach to their readership. The main focus is on attitudes and expectations towards books as objects, but it is equally important to scrutinise other digitally related changes that have affected their reading practices. I am analysing these writings focusing especially on the different roles books as objects play in readers’ lives and on the attitudes towards digitalisation as a cultural change. The ideas behind the research questions are based on my background as an ethnologist interested in material culture studies. I believe the concept of materiality and research of reading as a sensory experience are important in understanding attitudes towards different book formats, readers’ choices and wishes regarding the development of books.

Aspects of readership

The research material turned out to be rich in different viewpoints towards reading. As I expected, people wrote about their feelings about the different book formats and their reasons for choosing them. However, during the process of analysis, it became clear that to find answers to questions about the meanings of the materialities of books, knowledge was also needed about the different aspects of reading habits that the writers themselves connected to their identities as readers. This meant focusing on writings about practices that reached further than book formats and reading moments. I am now in the phase of analysing the ways in which people, for example, searched for and found books, collected them and discussed literature with other readers. These activities were often connected to social media, digital book stores and libraries, values of (e-)books as objects and the effects of different formats on the practices. What also became clear was that other free time activities and use of media affected the amount of time used for reading, even for those writers who were interested in books and liked to read.

As the material was collected at a time when smartphones and tablets, which are generally considered to have had a major impact on reading habits, had only quite recently become popular and well-known objects, the writings were often focused on the change and on the uncertain futures of books. The mentioned practices were connected to concepts such as ownership, visibility and representation. As digital texts had changed the ways these aspects were understood, they also seemed to have caused negative associations towards the digitalisation of books, especially among readers who saw the different aspects of print culture as positive parts of their readership. However, there were also friends of printed books who saw digital services as something very positive, as things that supported their reading habits. Writings about, for example, finding books to read or discussing literature with other readers online, writing and publishing book reviews in blogs or being active on the GoodReads or Book Crossing websites were all seen as welcome aspects of “new” readership. A small minority of the writers also wrote about fanfiction and electronic literature.

To compare the time of material collection with the present day, digital book services, such as e-book and audiobook services, have been gaining popularity, but the situation is not radically different from 2014. E-books have perhaps become better known since then, but they still are marginal in comparison with printed books. This means that they have not gained popularity as was expected in previous years. To my knowledge, the other aspects of new reading practices, such as the meanings of social media or the interests towards electronic literature have not yet been studied much in the Finnish context. These observations lead to the questions of the possible benefits of digital humanities for book and reading research.

Collaborating with digital humanists?

The changes in books and reading cause worries but also hopes and interest in future reading habits. To gain and produce knowledge about the change, we need to define what ‘the digital’ and ‘digitalisation’ are, as they are so often referred to in different contexts without any specific definitions. The problem is that they can mean and include various things that are attached to both the technological and the cultural sides of the phenomenon. For those interested in reading research, it would be important to theorise digitalisation from the perspectives of readers and to view the changes in reading from the socio-cultural side: as concrete changes in the material environment and as new possibilities to act as a reader. This field of research would benefit from collaboration with digital humanists who have knowledge about ‘the digital’ and the possibilities of reading-related software and devices.

Secondly, we could benefit from discussions about the possibilities to collect, use and save data that readers now leave behind as they practice readership in digital environments. Digital book stores, library services and social media sites would be useful sources, but more knowledge is still needed about the nature of these kinds of data: which aspects affect the data, how to get the data, which tools to use, etc. Questions about collecting and saving data also include important questions related to research ethics, which should also be further discussed in book research: which data should be open and free to use, who owns the data, which permissions would be required to study certain websites? Changes in free time activities in general have also raised questions about data that could be used for comparing the time spent on different activities with the time spent on reading.

Thirdly, collaboration is needed when reading-related databases are being developed. Some steps have already been taken, for example in the project Finnish Reading Experience Database, but these kinds of projects could also be further developed. Again, what is needed is not only collecting digital data but also opening them up and using them for different kinds of research questions. At its best, multidisciplinary collaboration could help build new perspectives and research questions about contemporary readership, and therefore all discussion and ideas that could benefit the field of books and reading would be welcome.

11:30am - 11:45am
Short Paper (10+5min) [publication ready]

Exploring Library Loan Data for Modelling the Reading Culture: project LibDat

Mats Neovius¹, Kati Launis², Olli Nurmi³

¹Åbo Akademi University; ²University of Eastern Finland; ³VTT research center

Reading is evidently a part of the cultural heritage. With respect to nourishing this, Finland is exceptional in that it has a unique library system, used regularly by 80% of the population. The Finnish library system is publicly funded and free of charge. Against this background, the consortium “LibDat: Towards a More Advanced Loaning and Reading Culture and its Information Service” (2017-2021, Academy of Finland) set out to explore the loaning and reading culture and its information service, so that the project’s results would help officials further develop Finnish public library services. The project is part of the constantly growing field of Digital Humanities and wishes to show how large born-digital material, new computational methods and literary-sociological research questions can be integrated into the study of contemporary literary culture. The project’s collaborator, Vantaa City Library, collects the daily loan data. This loan data is objective, crisp, and big. In this position paper, the main contribution is a discussion of the limitations the data poses and the literary questions on which computational means may shed light. For this, we describe the data structure of a loan event and outline the dimensions along which to interpret the data. Finally, we outline the milestones of the project.

11:45am - 12:00pm
Short Paper (10+5min) [publication ready]

Virtual Museums and Cultural Heritage: Challenges and Solutions

Nadezhda Povroznik

Perm State National Research University, Center for Digital Humanities

This paper aims to demonstrate the significance of studying virtual museums; to define the term “virtual museum” and its content more precisely; to show the problems of existing virtual museums and the complexities they pose for the study of cultural heritage; to show the problems of using virtual museum content in traditional research, which stem from the specificity of virtual museums as information resources; and to present possible solutions, reviewing the most effective ways of using cultural heritage (CH) in humanities research. The study addresses the main problems related to the preservation, documentation, representation and use of CH in connection with virtual museums. It provides a basis for solving these problems through the subsequent development of an information system for the study of virtual museums and their wider use.

12:00pm - 12:15pm
Short Paper (10+5min) [abstract]

The Future of Narrative Theory in the Digital Age?

Hanna-Riikka Roine

University of Helsinki

As it has often been noted, digital humanities are to be understood in plural. It seems, however, that quite as often they are understood as the practice of introducing digital methods to humanities, or a way to analyse “the digital” within the humanist framework. This presentation takes a slightly different approach, as its aim is to challenge some of the traditional theoretical concepts within a humanist field, narrative theory, through the properties of today’s computational environment.

Narrative theory originated in literary criticism and has based its concepts and understanding of narrative in media on printed works. While a few trends with a more broadly defined base are emerging (e.g. the project of “transmedial narratology”), the analysis of verbal narrative structures and strategies from the perspective of literary theory remains the primary concern of the field (see Kuhn & Thon 2017). Furthermore, the focus of current research is mostly medium-specific, while various phenomena studied by narratology (e.g. narrativity, worldbuilding) are agreed to be medium-independent.

My presentation starts from the fact that the ancient technology of storytelling has become enmeshed in a software-driven environment which not only has the potential to simulate or “transmediate” all artistic media, but also differs fundamentally from verbal language in its structure and strategies. This development or “digital turn” has so far mostly escaped the attention of narratologists, although it has had profound effects on the affordances and environments of storytelling.

In my presentation, I take up the properties of computational media that challenge the print-based bias of current narrative theory. As a starting point, I suggest that the scope of narrative theory should be extended to the machines of digital media instead of looking at their surface (cf. Wardrip-Fruin 2009). As software-driven, conditional, and process-based, storytelling in computational environments is not so much about disseminating a single story, but rather about multiplication of narrative, centering upon the underlying patterns on which varied instantiations can be based. Furthermore, they challenge the previous theoretical emphasis on fixed media content and author-controlled model of transmission. (See e.g. Murray 1997 and 2011; Bogost 2007, Hayles 2012, Manovich 2013, Jenkins et al. 2013.)

Because computational environments represent “a new norm” compared to the prototypical narrative developed in the study of literary fiction, Brian McHale has recently predicted that narrative theory “might become divergent and various, multiple narratologies instead of one – a separate narratology for each medium and intermedium” (2016, original emphasis). In my view, such a future fragmentation of the field would only diminish the potential of narrative theory. Instead, the various theories could converge or hybridize in a similar way that contemporary media has done – especially in the study of today’s transmedia which is hybridizing both in the sense of content being spread across media and in the sense of media being incorporated by computer and thus, acquiring the properties of computational environments.

The consequences of the recognition of media convergence or hybridization in narrative theory are not only (meta)theoretical. The primary emphasis on media content is still clearly visible in the division of modern academic study of culture and its disciplines – literary studies focus on literature, for example. While the least that narrative theory can do is expand “potential areas of cross-pollination” (Kuhn & Thon 2017) with media studies, for example, and challenge the print-based assumptions behind concepts such as narrativity or storyworld, there may also be a need to effect some changes in the working methods of narratologists. Creating multidisciplinary research groups focusing on narrative and storytelling in current computational media is one solution (still somewhat unusual in the “traditional” humanities focused on single-authored articles and monographs), while the other is critically reviewing the academic curricula. N. Katherine Hayles, for example, has proposed a “Comparative Media Studies approach” (2012) to describe the transformed disciplinary coherence that literary studies might embrace.

In my view, narrative theory can truly be “transmedial” and contribute to the study of storytelling practices and strategies in contemporary computational media, but various print- and content-based biases underlying its toolkit must be genuinely addressed first. The need for this is urgent not only because “narratives are everywhere”, but also because the traditional online/offline distinction has begun to disappear.

References

Bogost, Ian. 2007. Persuasive Games: The Expressive Power of Videogames. Cambridge, Ma: The MIT Press.

Hayles, N. Katherine. 2012. How We Think: Digital Media and Contemporary Technogenesis. Chicago: Univ. of Chicago Press.

Jenkins, Henry, Sam Ford, and Joshua Green. 2013. Spreadable Media: Creating Value and Meaning in a Networked Culture. New York: New York Univ. Press.

Kuhn, Markus, and Jan-Noël Thon. 2017. “Guest Editors’ Column. Transmedial Narratology: Current Approaches.” Narrative 25 (3): 253–255.

Manovich, Lev. 2013. Software Takes Command: Extending the Language of New Media. New York and London: Bloomsbury.

McHale, Brian. 2016. “Afterword: A New Normal?” In Narrative Theory, Literature, and New Media: Narrative Minds and Virtual Worlds, edited by Mari Hatavara, Matti Hyvärinen, Maria Mäkelä, and Frans Mäyrä, 295–304. London: Routledge.

Murray, Janet. 1997. Hamlet on the Holodeck: The Future of Narrative in Cyberspace. New York: The Free Press.

―――. 2011. Inventing the Medium. Principles of Interaction Design as a Cultural Practice. Cambridge, Ma: The MIT Press.

Wardrip-Fruin, Noah. 2009. Expressive Processing: Digital Fictions, Computer Games, and Software Studies. Cambridge, Ma. and London: The MIT Press.

12:15pm - 12:30pm
Short Paper (10+5min) [abstract]

Broken data and repair work

Minna Ruckenstein

Consumer Society Research Centre, University of Helsinki, Finland

Recent research introduces a concept-metaphor of “broken data”, suggesting that digital data might be broken and fail to perform, or be in need of repair (Pink et al, forthcoming). Concept-metaphors, anthropologist Henrietta Moore (1999, 16; see also Moore 2004) argues, are domain terms that “open up spaces in which their meanings – in daily practice, in local discourses and in academic theorizing – can be interrogated”. By doing so, concept-metaphors become defined in practice and in context; they are not meant to be foundational concepts, but they work as partial and perspectival framing devices. The aim of a concept-metaphor is to arrange and provoke ideas and act as a domain within which facts, connections and relationships are presented and imagined.

In this paper, the concept-metaphor of broken data is discussed in relation to the open data initiative Citizen Mindscapes, an interdisciplinary project that contextualizes and explores a Finnish-language social media data set (‘Suomi24’, or Finland24 in English), consisting of tens of millions of messages and covering social media over a time span of 15 years (see Lagus et al 2016). The role of the broken data metaphor in this discussion is to examine the implications of breakages and consequent repair work in data-driven initiatives that take advantage of secondary data. Moreover, the concept-metaphor can sensitize us to consider the less secure and ambivalent aspects of data worlds. By focusing on how data might be broken, we can highlight misalignments between people, devices and data infrastructures, or bring to the fore the failures to align data sources or uses with the everyday.

As Pink et al (forthcoming) suggest, the metaphorical understanding of digital data, aiming to underline aspects of data brokenness, brings together various strands of scholarly work, highlighting important continuities with earlier research. Studies of material culture explore practices of breakage and repair in relation to the materiality of objects, for instance by focusing on art restoration (Domínguez Rubio 2016) or car repair (Dant 2010). Drawing attention to the fragility of objects and temporal decay, these studies underline that objects break and have to be mended and restored. When these insights are brought into the field of data studies, the materiality of platforms and software and subsequent data arrangements, including material restrictions and breakages, become a concern (Dourish 2016; Tanweer et al 2016), emphasizing aspects of brokenness and subsequent repair work in relation to digital data (Pink et al, forthcoming).

In science and technology studies (STS), on the other hand, ‘breakages’ have been studied in relation to infrastructures, demonstrating that it is through instances of breakdown that structures and objects, which have become invisible to us in the everyday, gain a new kind of visibility. The STS scholar Steven Jackson extends the notion of brokenness to more everyday situations and asks ‘what happens when we take erosion, breakdown, and decay, rather than novelty, growth, and progress, as our starting points in thinking through the nature, use, and effects of information technology and new media?’ (2014: 174). Instances of data breakages can be seen in light of mundane data arrangements, as a recurring feature of data work rather than an exceptional event (Pink et al, forthcoming; Tanweer et al 2016).

In order to further concretize the usefulness of the concept-metaphor of broken data, I will detail instances of breakage and repair in the data work of the Citizen Mindscapes initiative, emphasizing the efforts needed to overcome various challenges in working with large digital data. This kind of approach presents the obstacles and barriers that slow or derail the data science process as an important resource for knowledge production and innovation (Tanweer et al 2016). In the collaborative Citizen Mindscapes initiative, discussing the gaps or possible anomalies in the data led to conversations concerning the production of data, deepening our understanding of the human and material factors at play in processes of data generation.

Identifying data breakages

The Suomi24 data was generated by a media company, Aller. The data set grew on the company servers for over a decade, gaining a new life and purpose when the company decided to open the proprietary data for research purposes. A new infrastructure was needed for hosting and distributing the data. One such data infrastructure was already in place, the Language Bank of Finland, maintained by CSC (IT Centre for Science), developed for acquiring, storing, offering and maintaining linguistic resources, tools and data sets for academic researchers. The Language Bank gave a material structure to the Suomi24 data: it was repurposed as research data for linguistics.

The Korp tool, developed for the analysis of data sets stored in the Language Bank, allowed word searches in relation to individual sentences, retaining the Suomi24 data as a resource for linguistic research. Yet the material arrangements constrained other possible uses of the data that were of interest to the Citizen Mindscapes research collective, which aimed to work the data to accommodate the social science focus on topical patterns and on the emotional waves and rhythms characteristic of social media. In the past two years, the research collective, particularly those members experienced in working with large data sets, has been repairing and cleaning the data in order to make it ready for additional computational approaches. The goal is to build a methodological toolbox that can benefit researchers who do not possess computational skills but are interested in using digital methods in social scientific inquiry. This entails, for instance, developing user interfaces that narrow down the huge data set and allow access to the data from topic-led perspectives.

The ongoing work has alerted us to breakages of data, raising more general questions about the origins and nature of data. Social media data, such as the Suomi24 data, is never an accurate or complete representation of society. From the societal perspective, the data is broken, offering discontinuous, partial and interrupted views of individual, social and societal aims. The preparation of data for research that takes societal brokenness seriously underlines the importance of understanding the limitations and biases in the production of the data, including insights into how the data might be broken. The first step towards this aim was a research report (Lagus et al 2016) that evaluated and contextualized the Suomi24 data in a wide variety of ways. We paid attention to the writers of the social media community as producers of the data; the moderation practices of the company were described to demonstrate how they shape the data set by filtering certain swearwords and racist terms, or certain kinds of messages, for instance advertisements or messages containing personal information.

The yearly volume and daily rhythms of the data were calculated based on timestamps, and the topical hierarchies of the data were uncovered by attention to the conversational structure of the social media forum. When our work identified gaps, errors and anomalies in the data, it revealed that data might be broken and discontinuous due to human or technological forces: infrastructure failures, trolling, or automated spam bots. With information about the gaps in the data, we opened a conversation with the social media company’s employees and learned that nobody could tell us about the 2004-2005 gap in the data. A crack in the organizational memory was revealed, reminding us of the links between the temporality of data and human memory. In contrast, the anomaly in the data volume in July 2009, which we first suspected was a day when something dramatic had happened that created turmoil in the social media, turned out to be a spam bot, remembered very well in the company.
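The kind of volume check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the Citizen Mindscapes project: the function names, the two-day gap threshold, the median-based spike rule and the toy timestamps are all hypothetical.

```python
from collections import Counter
from datetime import date, datetime, timedelta
from statistics import median


def daily_counts(timestamps) -> Counter:
    """Count messages per calendar day from an iterable of datetime objects."""
    return Counter(ts.date() for ts in timestamps)


def find_gaps(counts: Counter, min_days: int = 2) -> list[tuple[date, date]]:
    """Return (first, last) missing days for every run of at least min_days empty days."""
    days = sorted(counts)
    gaps = []
    for prev, nxt in zip(days, days[1:]):
        missing = (nxt - prev).days - 1  # empty days strictly between two active days
        if missing >= min_days:
            gaps.append((prev + timedelta(days=1), nxt - timedelta(days=1)))
    return gaps


def find_spikes(counts: Counter, factor: float = 5.0) -> list[date]:
    """Return days whose volume exceeds factor times the median daily volume."""
    typical = median(counts.values())
    return sorted(day for day, n in counts.items() if n > factor * typical)


# Toy data (invented): one message per day, a silent stretch, then a burst.
toy = [datetime(2009, 7, d, 12) for d in (1, 2, 3, 8, 9)]
toy += [datetime(2009, 7, 9, 13)] * 20
counts = daily_counts(toy)
print(find_gaps(counts))
print(find_spikes(counts))
```

On the toy data the sketch flags 4 to 7 July as a gap and 9 July as a spike. Such flags only locate candidate breakages; as the Suomi24 case shows, explaining them still requires talking to the people who produced and maintained the data.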

In the field of statistics, for instance, research might require intimate knowledge of all possible anomalies of the data. What appears as incomplete, inconsistent and broken to some practitioners might be irrelevant for others, or a research opportunity. The role of the concept-metaphor of broken data is to open a space for discussion about these differences, maintaining them rather than resolving them. One option is to highlight how data is seen as broken in different contexts, compare the breakages, and then follow what happens after them, focusing on the repair and cleaning work.

Concluding remarks

The purpose of this paper has been to introduce the broken data metaphor that calls for paying more attention to the incomplete and fractured character of digital data. Acknowledging the incomplete nature of data is in itself, of course, nothing new; researchers are well aware of their data lacking perfection. With growing uses of secondary data, however, the ways in which data is broken might not be known beforehand, underlining the need to pay more careful attention to brokenness and the consequent work of repair. In the case of the Suomi24 data, the data breakages suggest that we need to actively question data production and the diverse ways in which data are adapted for different ends by practitioners. As described above, the repurposed data requires an infrastructure, servers and cloud storage; the software and analytics tools enable certain perspectives and operations and disable others. Data is always inferred and interpreted in infrastructure and database design and by professionals, who see the data, and its possibilities, differently depending on their training. As Genevieve Bell (2015: 16) argues, the work of coding data and writing algorithms determines ‘what kind of relationships there should be between data sets’ and by doing so, data work promotes judgments about what data should speak to what other data. As our Citizen Mindscapes collaboration suggests, making ‘data talk’ to other data sets, or to interpreters of data, is permeated by moments of breakdown and repair that call for a richer understanding of everyday data practices. The intent of this paper has been to suggest that a focus on data breakages is an opportunity to learn about everyday data worlds, and to account for how data breakages challenge the linear, solutionist, and triumphant stories of big data.

References:

Bell, G. (2015). ‘The secret life of big data’. In Data, Now Bigger and Better! Eds. T. Boellstorff and B. Maurer. Prickly Paradigm Press, 7-26.

Dant, T., 2010. The work of repair: Gesture, emotion and sensual knowledge. Sociological Research Online, 15(3), p.7.

Domínguez Rubio, F. (2016) ‘On the discrepancy between objects and things: An ecological approach’ Journal of Material Culture. 21(1): 59–86

Jackson, S.J. (2014) ‘Rethinking repair’ in T. Gillespie, P. Boczkowski, and K. Foot, eds. Media Technologies: Essays on Communication, Materiality and Society. MIT Press: Cambridge MA

Lagus, K., M. Pantzar, M. Ruckenstein, and M. Ylisiurua. (2016) Suomi24: Muodonantoa aineistolle. The Consumer Society Research Centre. Helsinki: Faculty of Social Sciences, University of Helsinki.

Moore, H. (1999) ‘Anthropological theory at the turn of the century’ in H. Moore (ed.) Anthropological theory today. Cambridge: Polity Press, pp. 1-23.

Moore, H. L. (2004). Global anxieties: concept-metaphors and pre-theoretical commitments in anthropology. Anthropological theory, 4(1), 71-88.

Pink et al (forthcoming). Broken data: data metaphors for an emerging world. Big Data & Society.

Tanweer, A., Fiore-Gartland, B., & Aragon, C. (2016). Impediment to insight to innovation: understanding data assemblages through the breakdown–repair process. Information, Communication & Society, 19(6), 736-752.

2:00pm - 3:30pm

T-PII-2: Cultural Heritage and Art
Session Chair: Bente Maegaard

PII

2:00pm - 2:30pm
Long Paper (20+10min) [publication ready]

Cultural Heritage ‘In-The-Wild’: Considering Digital Access to Cultural Heritage in Everyday Life

David McGookin, Koray Tahiroglu, Tuomas Vaittinen, Mikko Kyto, Beatrice Monastero, Juan Carlos Vasquez

Aalto University,

As digital cultural heritage applications begin to be deployed outwith ‘traditional’ heritage sites, such as museums, there is an increased need to consider their use amongst individuals who are open to learning about the heritage of a site, but where that is a clearly secondary purpose for their visit. Parks, recreational areas and the everyday built environment represent places that, although rich in heritage, are often not visited primarily for that heritage. We present the results of a study of a mobile application to support accessing heritage on a Finnish recreational island. Evaluation with 45 participants, who were not there primarily to access the heritage, provided insight into how digital heritage applications can be developed for this user group. Our results showed how low immersion and lightweight interaction support individuals to integrate cultural heritage around their primary visit purpose, and although participants were willing to include heritage as part of their visit, they were not willing to be directed by it.

2:30pm - 2:45pm
Short Paper (10+5min) [publication ready]

Negative to That of Others, But Negligent of One’s Own? On Patterns in National Statistics on Cultural Heritage in Sweden

Daniel Brodén

Gothenburg University, Sweden,

In 2015–2016 the Centre for Critical Heritage Studies conducted an interdisciplinary pilot project in collaboration with the SOM-institute at the University of Gothenburg. A key ambition was to demonstrate the usefulness of combining an analysis rooted in the field of critical heritage studies and a statistical perspective. The study was based on a critical discussion of the concept of cultural heritage and collected data from the nationwide SOM-surveys.

The abstract will highlight some significant patterns in the SOM data from 2015 when it comes to differences between people regarding activities that are traditionally associated with national cultural heritage and cultural heritage institutions: 1) women are more active than men when it comes to activities related to national cultural heritage; 2) class and education are also significant factors in this context. Since these patterns have been shown in prior research, perhaps the most interesting finding is that 3) people who are negative to immigration from ‘other’ cultures participate to a lesser extent in activities that are associated with their ‘own’ cultural heritage.

2:45pm - 3:00pm
Distinguished Short Paper (10+5min) [publication ready]

Engaging Collections and Communities: Technology and Interactivity in Museums

Paul Arthur

Edith Cowan University,

Museum computing is a field with a long history that has made a substantial impact on humanities computing, now called ‘digital humanities,’ that dates from at least the 1950s. Community access, public engagement, and participation are central to the charter of most museums, and interactive displays are one strategy used to help fulfil that goal. Over the past two decades interactive elements have been developed to offer more immersive, realistic and engaging possibilities through incorporating motion-sensing spaces, speech recognition, networked installations, eye tracking and multitouch tables and surfaces. As museums began to experiment with digital technologies there was an accompanying change of emphasis and policy. Museums aimed to more consciously connect themselves with popular culture by experimenting with the presentation of their collections in ways that would result in increased public appreciation and accessibility. In this paper these shifts are investigated in relation to interactive exhibits, virtual museums, the profound influence of the database, and in terms of a wider breaking down of institutional barriers and hierarchies, resulting in trends towards increasing collaboration.

3:00pm - 3:15pm
Short Paper (10+5min) [abstract]

Art of the Digital Natives and Predecessors of Post-Internet Art

Raivo Kelomees

Estonian Academy of Arts, Estonia,

The new normal, or the digital environment surrounding us, has in recent years surprised us, at least in the fine arts, with the internet's content returning to physical space. Whether this is due to pressure from the galleries or something else, it is clearer than ever that the audience is not separable from the habitual space; there is a huge and primal demand for physical or material art.

Christiane Paul in her article "Digital Art Now: The Evolution of the Post-Digital Age" in "ARS17: Hello World!" exhibition catalogue, is critical of the exhibition. Her main message is that all this has been done before. In itself the statement lacks originality, but in the context of the postinternet apologists declaring the birth of a new mentality, the arrival of a new "after experiencing the internet" and "post-digital" generation, it becomes clear that indeed it is rather like shooting fish in a barrel, because art that is critical of the digital and interactive has existed since the 1990s, as have works concerned with the physicalisation of the digital experience.

The background to the exhibition is the discussion over "digitally created" art and the generation related to it. The notion of "digital natives" is related to the post-digital and post-internet generation and the notion of "post-contemporary" (i.e. art is not concerned with the contemporary but with the universal human condition). Apparently for the digital natives, the internet is not a way out of the world anymore, but an original experience in which the majority of their time is spent. At the same time, however, the internet is a natural information environment for people of all ages whose work involves data collection and intellectual work. Communication, thinking, information gathering and creation – all of these realms are related to the digital environment. These new digital nomads travel from place to place and work in a "post-studio" environment.

While digital or new media was created, stored and shared via digital means, post-digital art addresses the digital without being stored using these same means. In other words, this kind of art exists more in the physical space.

Frequent reference is also made to James Bridle's "New Aesthetic" concept from 2012. In short, this refers to the convergence and conjoinment of the virtual and physical world. It manifests itself clearly even in the "pixelated" design of consumer goods or in the oeuvre of sculptors and painters whose work has emerged from something digital. For example, the art objects by Shawn Smith and Douglas Coupland are made using pixel-blocks (the sculpture by the latter is indeed reminiscent of a low resolution digital image). Analogous works induce confusion, not to say a surprising experience, in the minds of the audience, for they bring the virtual quality of the computerised environment into physical surroundings. This makes the artworks appear odd and surreal, like some sort of mistake or error, images and objects out of place.

The so-called postinternet generation artists are certainly not the only ones making this kind of art. As an example of this, there is a reference to the abstract stained glass collage of 11,500 pixels by Gerhard Richter in the Cologne Cathedral. It is supposed to be a reference to his 1974 painting "4096 Farben" (4096 colours), which indeed is quite similar. It is said that Richter did not accept a fee; however, the material costs were covered by donations. And yet the cardinal did not come to the opening of the glasswork, preferring depictions of Christian martyrs over abstract windows, which instead reminded him of mosques.

One could name other such examples inspired by the digital world or schisms of the digital and physical world: Helmut Smits' "Dead Pixel in Google Earth" (2008); Aram Bartholl's "Map" (2006); the projects by Eva and Franco Mattes, especially the printouts of Second Life avatars from 2006; Achim Mohné's and Uta Kopp's project "Remotewords" (2007–2011), computer-based instructions printed on rooftops to be seen from Google Maps, satellites or planes. There are countless examples where it is hard to discern whether the artist is deliberately and critically minded towards digital art or rather a representative of the post-digital generation who is not aware of, and does not wish to be part of, the history of digital art.

From the point of view of researchers of digital culture, the so-called media-archaeological direction could be added to this as an inspirational source for artists today. Media archaeology or the examination of previous art and cultural experience signifies, in relation to contemporary media machines and practices, the exploration of previous non-digital cultural devices, equipment, means of communication, and so on, that could be regarded as the pre-history of today's digital culture and digital devices. With this point of view, the "media-archaeological" artworks of Toshio Iwai or Bernie Lubell coalesce. They have taken an earlier "media machine" or a scientific or technical device and created a modern creation on the basis of it.

Then there was the "Ars Electronica" festival (2006) that focused on the umbrella topic "Simplicity", which in a way turned its back on the "complexity" of digital art and returned to the physical space.

Therefore, in the context of digital media based art trends, the last couple of decades have seen many expressions – works, events and exhibitions – of "turning away" from the digital environment that would outwardly qualify as post-digital and postinternet art.

3:15pm - 3:30pm
Short Paper (10+5min) [abstract]

The Stanley Rhetoric: A Procedural Analysis of VR Interactions in 3D Spatial Environments of Stanley Park, BC

Raluca Fratiloiu

Okanagan College

In a seminal text on the language of new media, Manovich (2002) argued:

Traditionally, texts encoded human knowledge and memory, instructed, inspired, convinced, and seduced their readers to adopt new ideas, new ways of interpreting the world, new ideologies. In short, the printed word was linked to the art of rhetoric. While it is probably possible to invent a new rhetoric of hypermedia […] the sheer existence and popularity of hyperlinking exemplifies the continuing decline of the field of rhetoric. (Manovich, 2002).

Depending on the context of each “rhetorical situation” (Bitzer, 1968), it may be both good and bad news to think that interactivity and rhetoric might not always go hand in hand. However, despite the anticipated decline of rhetoric as announced by Manovich (2002), in this paper we propose a closer examination of what constitutes a rhetorically effective discourse in new media in general, and in virtual reality (VR) in particular. The reason we need to examine it more closely is that VR, especially when it has an educational goal, needs to be rhetorically effective to be successful with audiences. A consideration of the rhetorical impact of VR’s affordances may enhance the potential of meaningful interactions with students and users.

In addition to having a very long disciplinary history, rhetoric has been investigated in relation to new media mainly through Bogost’s (2007) concept of “procedural rhetoric”. He argued that despite the fact that “rhetoric was understood as the art of oratory”, “videogames open a new domain for persuasion, thanks to their core representational mode, procedurality” (Bogost, 2007). This has implications, according to Bogost (2007), in three areas: politics, advertising and learning. Several of these implications have already been investigated. Procedural rhetorical analysis in videogames has since become a core methodological approach. Also, particular attention has been paid to how new media open new possibilities through play and how in turn this creates a renewed interest in digital rhetoric (Daniel-Wariya, 2016). At the same time, procedural rhetoric has also been investigated at length in connection to learning through games (Gee, 2007). Learning has also been central in a few studies on VR in education (Dalgarno, 2010). However, specific assessments of procedural rhetoric outcomes of particular VR educational projects are non-existent.

In this paper, we will focus on analysing procedural interactions in a VR project developed by University of British Columbia’s Emerging Media Lab. This project, funded via an open education grant, led to the creation of a 3D spatial environment of Stanley Park located in Vancouver, British Columbia (BCCampus, 2017). This project focused on Stanley Park, one of the most iconic Canadian destinations, as an experiential field trip, specifically using educational content for, and 3D spatial environment models of, Prospect Point, Beaver Lake, Lumberman’s Arch, and the Hollow Tree. Students will have opportunities to visit these locations in the park virtually and interact with the environment and remotely with other learners. In addition, VR provides opportunities to explore the complex history of this impressive location that was once home to Burrard, Musqueam and Squamish First Nations people (City of Vancouver, 2017).

This case analysis may open up new possibilities for investigating how students/users derive meaning from interacting in these environments and continue a dialogue between several connected areas of education and VR, games and pedagogy, games and procedural rhetoric. Also, we hope to contribute this feedback to this emerging project as it continues to evolve and share its results with the wider open education community.

References:

BCCampus. (2017, May 10). Virtual reality and augmented reality field trips funded by OER grants. Retrieved from BCCampus: https://bccampus.ca/2017/05/10/virtual-reality-and-augmented-reality-field-trips-funded-by-oer-grants/

Bitzer, L. (1968). The rhetorical situation. Philosophy and Rhetoric, 1, pp. 1-14.

Bogost, I. (2007). Persuasive Games: The Expressive Power of Videogames. Cambridge, MA: MIT Press.

City of Vancouver. (2017). The History of Stanley Park. Retrieved from City of Vancouver: http://vancouver.ca/parks-recreation-culture/stanley-park-history.aspx

Dalgarno, L. (2010). What are the learning affordances of 3-D virtual environments? British Journal of Educational Technology, 41(1), pp. 10-32.

Daniel-Wariya, J. (2016). A Language of Play: New Media’s Possibility Spaces. Computers and Composition, 40, pp 32-47.

Gee, J. P. (2007). What Video Games Have to Teach Us About Learning and Literacy. Second Edition: Revised and Updated Edition. New York: St. Martin's Griffin.

Manovich, L. (2002). The language of new media. Cambridge, MA: MIT Press.

4:00pm - 5:30pm

T-PII-3: Augmented Reality
Session Chair: Sanita Reinsone

PII

4:00pm - 4:30pm
Long Paper (20+10min) [abstract]

Extending museum exhibits by embedded media content for an embodied interaction experience

Jan Torpus

University of Applied Sciences and Arts Northwestern Switzerland

Investigation topic

Nowadays, museums not only collect, categorize, preserve and present; they must also educate and entertain, all the while following market principles to attract visitors. To satisfy this mission, they started to introduce interactive technologies in the 1990s, such as multimedia terminals and audio guides, which have since become standard for delivering contextual information. More recently there has been a shift towards the creation of personalized sensorial experiences by applying user tracking and adaptive user modeling based on location-sensitive and context-aware sensor systems with mobile information retrieval devices. However, the technological gadgets and complex graphical user interfaces (GUIs) generate a separate information layer and detach visitors from the physical exhibits. The attention is drawn to the screen and the interactive technology becomes a competing element with the environment and the exhibited collection [Stille 2003, Goulding 2000, Wakkary 2007]. Furthermore, the vast majority of visitors come in groups and the social setting gets interrupted by the digital information extension [Petrelli 2016].

The first studies of museum visitor behavior were carried out at the end of the 19th and during the 20th century [Robinson 1928, Melton 1972]. More recently, a significant body of ethnographic research on the visitor experience of single persons and groups has contributed studies of technologically extended and interactive installations. Publications about visitor motivation, circulation and orientation, engagement, learning processes, as well as cognitive and affective relationships to the exhibits are of interest for our research approach [Bitgood 2006, Vom Lehn 2007, Dudley 2010, Falk 2011]. Most relevant are studies by the Human-Computer Interaction (HCI) research community in the fields of Ubiquitous Computing, Tangible User Interfaces and Augmented Reality, investigating hybrid exhibition spaces and the bridging of the material and physical with the technologically mediated and virtual [Hornecker 2006, Wakkary 2007, Benford 2009, Petrelli 2016].

Approach

At the Institute of Experimental Design and Media Cultures (IXDM) we have conducted several design research projects applying AR for cultural applications but became increasingly frustrated with disturbing GUIs and physical interfaces such as mobile phones and Head Mounted Displays. We therefore started to experiment with Ubiquitous Computing, the Internet of Things and physical computing technologies, which have become increasingly accessible to the design community during the last twelve years because of the shrinking size and price of sensors, actuators and controllers. In the presented research project, we examine the extension of museum exhibits by physically embedded media technologies for an embodied interaction experience. We intend to overcome problems of distraction, isolation and stifled learning processes with artificial GUIs by interweaving mediated information directly into the context of the exhibits and by triggering events according to visitor behavior.

Our research approach was interdisciplinary and praxis-based including the observation of concept, content and design development and technological implementation processes before the final evaluations. The team was composed of two research partners, three commercial/engineering partners and three museums, closely working together on three tracks: technology, design and museology. The engineering partners developed and implemented a scalable distributed hardware node system and a Linux-based content management system. It is able to detect user behavior and accordingly process and display contextual information. The content design team worked on three case studies following a scenario-driven prototyping approach. They first elaborated criteria catalogues, suitable content and scenarios to define the requirement profiles for the distributed technological environment. Subsequently, they carried out usability studies in the Critical Media Lab of the IXDM and finally set up and evaluated three case studies with test persons. The three museums involved, the Swiss Open-Air Museum Ballenberg, the Roman City of Augusta Raurica and the Museum der Kulturen Basel, all have in common that they exhibit objects or rooms that function as staged knowledge containers and can therefore be extended by means of ubiComp technologies. The three case studies were thematically distinct and offered specific exhibition situations:

• Case study 1: Roman City of Augusta Raurica: “The Roman trade center Schmidmatt”. The primary imparting concept was “oral history”, and documentary film served as a related model: an archaeologist present during the excavations acted as a virtual guide, giving visitors information about the excavation and research methods, findings, hypotheses and reconstructions.

• Case study 2: Open-Air Museum Ballenberg: “Farmhouse from Uesslingen”. The main design investigation was “narratives” about the former inhabitants and the main theme “alcohol”: its use for cooking, medical application, religious rituals and abuse.

• Case study 3: Museum der Kulturen Basel: “Meditation box”. The main design investigation was “visitor participation” with biofeedback technologies.

Technological development

This project entailed the development of a prototype for a commercial hardware and software toolkit for exhibition designers and museums. Our technology partners elaborated a distributed system that can be composed and scaled according to the specific requirements of an exhibition. The system consists of two main parts:

• A centralized database with an online content management system (CMS) to set up and control the main software, node scripts, media content and hardware configuration. After the technical installation it also allows the museums to edit, update, monitor and maintain their exhibitions.

• Different types of hardware nodes that can be extended by specific types of sensors and actuators. Each node, sensor and actuator has its own separate ID; they are all networked together and are therefore individually accessible via the CMS. A node can run on, for example, a Raspberry Pi, a Cyclone V-based FPGA or any desktop computer and can thus be adapted to the required performance.

The modular architecture allows for technological adaptation or extension according to specific needs. The first modules were developed for the project and then implemented according to the case study scenarios.
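The addressing scheme described above, in which every node, sensor and actuator carries its own ID and can be reached individually, can be illustrated with a minimal sketch. It is not the project's actual CMS: the class names (`Component`, `Node`, `Registry`), the ID strings and the lookup interface are all hypothetical and serve only to show the idea of a flat, ID-based registry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Component:
    """A sensor or actuator attached to a node (hypothetical model)."""
    component_id: str
    kind: str  # "sensor" or "actuator"


@dataclass
class Node:
    """A hardware node, e.g. a Raspberry Pi, with its attached components."""
    node_id: str
    platform: str
    components: List[Component] = field(default_factory=list)


class Registry:
    """Flat lookup table so that every node and component is addressable by ID."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Union[Node, Component]] = {}

    def _add(self, key: str, obj: Union[Node, Component]) -> None:
        # IDs must be unique across nodes, sensors and actuators.
        if key in self._by_id:
            raise ValueError(f"duplicate ID: {key}")
        self._by_id[key] = obj

    def register(self, node: Node) -> None:
        self._add(node.node_id, node)
        for component in node.components:
            self._add(component.component_id, component)

    def lookup(self, key: str) -> Union[Node, Component]:
        return self._by_id[key]


if __name__ == "__main__":
    registry = Registry()
    registry.register(
        Node(
            node_id="node-1",  # illustrative IDs, not from the project
            platform="Raspberry Pi",
            components=[
                Component("sensor-1", "sensor"),
                Component("actuator-1", "actuator"),
            ],
        )
    )
    print(registry.lookup("sensor-1"))
```

Keeping one shared ID space is a design assumption of this sketch; it makes each sensor or actuator reachable without first knowing which node hosts it.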

Evaluation methods

Through a participatory design process, we developed a scenario for each case study, suitable for walkthrough with several test persons. Comparable and complementary case study scenarios allowed us to identify risks and opportunities for exhibition design and knowledge transfer and define the tasks and challenges for technical implementation. For the visitor evaluation, we selected end-users, experts and in-house museum personnel. The test persons were of various genders and ages (including families with children), had varying levels of technical understanding and little or no knowledge about the project. For each case study we asked about 12 persons or groups of persons to explore the setting as long as they wanted (normally 10–15 minutes). They agreed to be observed and video recorded during the walkthrough and to participate in a semi-structured interview afterwards. We also asked the supervisory staff about their observations and mingled with regular visitors to gain insight into their primary reactions, comments and general behavior. The evaluation was followed by a heuristic qualitative content analysis of the recorded audio and video files and the notes we took during the interviews. Shortly after each evaluation we presented and discussed the results in team workshops.

Findings and Conclusions

The field work led to many detailed insights about interweaving interactive mediated information directly into the context of physical exhibits. The findings are relevant for museums, design researchers and practitioners, the HCI community and technology developers. We organized the results along five main investigation topics:

1. Discovery-based information retrieval

Unexpected ambient events generate surprise and strong experiences but also contain the risk of information loss if visitors do not trigger or understand the media aids. The concept of unfolding the big picture by gathering distributed, hidden information fragments requires visitor attentiveness. Teasing, timing and the choice of location are therefore crucial to generate flowing trajectories.

2. Embodied interaction

The ambient events are surprising, but visitors are not always aware of their interactions. The unconscious mode of interaction lacks obvious interaction feedback. But introducing indicated hotspots or modes of interaction destroys the essence of the project’s approach. The fact that visitors do not have to interact with technical devices or learn how to operate graphical user interfaces means that no user groups are excluded from the experience and information retrieval.

3. Non-linear contextual information accumulation

When deploying this project’s approach as a central exhibition concept, information needs to be structured hierarchically. Text boards or info screens are still a good solution for introducing visitors to the ways they can navigate the exhibition. The better the basic topics and situations are initially introduced, the more freedom emerges for selective and memorable knowledge staged in close context to the exhibits.

4. Contextually extended physical exhibits

A crucial investigation topic was the correlation between the exhibit and the media extension. We therefore declined concepts that would overshadow the exhibition and would use it merely as a stage for storytelling with well-established characters or as an extensive media show. The museums requested that media content fade in only briefly when someone approached a hotspot and that there be no technical interfaces or screens for projections that challenged the authenticity of the exhibits. We also discussed to what extent the physical exhibit should be staged to bridge the gap to the media extension.

5. Invisibly embedded technology

The problem of integrating sensors, actuators and controllers into cultural heritage collections was a further investigation topic. We used no visible displays to leave the exhibition space as pure as possible and investigated the applicability of different types of media technologies.

Final conclusion

Our museum partners agreed that the approach should not be implemented as a central concept and dense setting for an exhibition. As often propagated by exhibition makers, first comes the well-researched and elaborated content and carefully constructed story line, and only then the selection of the appropriate design approach, medium and form of implementation. This rule also seems to apply to ubiComp concepts and technologies for knowledge transfer. The approach should be applied as a discreet additional information layer or just as a tool to be used when it makes sense to explain something contextually or involve visitors emotionally.

References

Steve Benford et al. 2009. From Interaction to Trajectories: Designing Coherent Journeys Through User Experiences. Proc. CHI ’09, ACM Press. 709–718.

Stephen Bitgood. 2006. An Analysis of Visitor Circulation: Movement Patterns and the General Value Principle. Curator: The Museum Journal, Volume 49, Issue 4, 463–475.

John Falk. 2011. Contextualizing Falk’s Identity-Related Visitor Motivational Model. Visitor Studies. 14, 2, 141–157.

Sandra Dudley. 2010. Museum materialities: Objects, sense and feeling. In Dudley, S. (ed.) Museum Materialities: Objects, Engagements, Interpretations. Routledge, UK, 1–18.

Christina Goulding. 2000. The museum environment and the visitor experience. European Journal of Marketing 34, no. 3/4, pp. 261–278.

Eva Hornecker and Jacob Buur. 2006. Getting a Grip on Tangible Interaction: A Framework on Physical Space and Social Interaction. CHI, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 437-446.

Dirk vom Lehn, Jon Hindmarsh, Paul Luff, Christian Heath. 2007. Engaging Constable: Revealing art with new technology. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '07), 1485-1494.

Arthur W. Melton. 1972. Visitor behavior in museums: Some early research in environmental design. In Human Factors. 14(5): 393-403.

Edward S. Robinson. 1928. The behavior of the museum visitor. Publications of the American Association of Museums, New Series, Nr. 5. Washington D.C.

Daniela Petrelli, Nick Dulake, Mark T. Marshall, Anna Pisetti, Elena Not. 2016. Voices from the War: Design as a Means of Understanding the Experience of Visiting Heritage. Proceedings Human-Computer Interaction, San Jose, CA, USA.

Alexander Stille. 2003. The future of the past. Macmillan. Pan Books Limited.

Ron Wakkary and Marek Hatala. 2007. Situated play in a tangible interface and adaptive audio museum guide. Published online: 4 November 2006. Springer-Verlag London Limited.

4:30pm - 5:00pm
Long Paper (20+10min) [abstract]

Towards an Approach to Building Mobile Digital Experiences For University Campus Heritage & Archaeology

Ethan Watrall

Michigan State University,

The spaces we inhabit and interact with on a daily basis are made up of layers of cultural activity that are, quite literally, built up over time. While museum exhibits, archaeological narratives, and public programs communicate this heritage, they often don’t allow for the public to experience interactive, place-based, and individually driven exploration of content and spaces. Further, designers of public heritage and archaeology programs rarely explore the binary nature of both the presented content and the scholarly process by which the understanding of that content was reached. In short, the scholarly narrative of material culture, heritage, and archaeology is often hidden from public exploration, engagement, and understanding. Additionally, many traditional public heritage and archaeology programs often find it challenging to negotiate the balance between the voice and goals of the institution and those of communities and groups. In recent years, the maturation of mobile and augmented reality technology has provided heritage institutions, sites of memory and memorialization, cultural landscapes, and archaeological projects with interesting new avenues to present research and engage the public. We are also beginning to see exemplar projects that suggest fruitful models for moving the domain of mobile heritage forward considerably.

University campuses provide a particularly interesting venue for leveraging mobile technology in the pursuit of engaging, place-based heritage and archaeology experiences. University campuses are usually already well-traveled public spaces, and therefore don’t elicit the same level of concern that you might find in other contexts for publicly providing the location of archaeological and heritage sites and resources. They have a built-in audience of alumni and students eager to better understand the history and heritage of their home campus. Finally, many university campuses are starting to seriously think of themselves as places of heritage and memory, and are developing strategies for researching, preserving, and presenting their own cultural heritage and archaeology.

It is within this context that this paper will explore a deeply collaborative effort at Michigan State University that leverages mobile technology to build an interactive and place-based interpretive layer for campus heritage and archaeology. Driven by the work of the Michigan State University Campus Archaeology Program, an internationally recognized initiative that is unique in its approach to campus heritage, these efforts have unfolded across a number of years and evolved to meet the ever-changing need to present the rich and well-studied heritage and archaeology of Michigan State University's historic campus.

Ultimately, the goal of this paper is not only to present and discuss the efforts at Michigan State University, but to provide a potential model for other university campuses interested in leveraging mobile technology to produce engaging digital heritage and archaeology experiences.

5:00pm - 5:30pm
Long Paper (20+10min) [publication ready]

Zelige Door on Golborne Road: Exploring the Design of a Multisensory Interface for Arts, Migration and Critical Heritage Studies

Alda Terracciano

University College London,

In this paper I discuss the multisensory digital interface and art installation Zelige Door on Golborne Road as part of the wider research project ‘Mapping Memory Routes: Eliciting Culturally Diverse Collective Memories for Digital Archives’. The interface is conceived as a tool for capturing and displaying the living heritage of members of Moroccan migrant communities, shared through an artwork composed of a digital interactive sensorial map of Golborne Road (also known as Little Morocco), which includes physical objects related to various aspects of Moroccan culture, each requiring a different sense to be experienced (smell, taste, sight, hearing, touch). Augmented Reality (AR) and olfactory technologies have been used in the interface to superimpose pre-recorded video material and smells onto the objects. As a result, the neighbourhood is represented as a living museum of cultural memories expressed in the form of artefacts, sensory stimulation and narratives of citizens living in, working in or visiting the area. Based on a model I developed for the multisensory installation ‘Streets of...7 cities in 7 minutes’, the interface was designed with Dr Mariza Dima (HCI designer), and Prof. Monica Bordegoni and Dr Marina Carulli (olfactory technology designers) to explore new methods able to elicit cultural Collective Memories through the use of multi-sensory technologies. The tool is also aimed at stimulating collective curatorial practices and democratising decision-making processes in urban planning and cultural heritage.

Date: Friday, 09/Mar/2018

11:00am - 12:00pm

F-PII-1: Creating and Evaluating Data
Session Chair: Koenraad De Smedt

PII

11:00am - 11:15am
Short Paper (10+5min) [publication ready]

Digitisation and Digital Library Presentation System – A Resource-Conscientious Approach

Tuula Pääkkönen, Jukka Kervinen, Kimmo Kettunen

National Library of Finland, Finland

The National Library of Finland (NLF) has done long-term work to digitise and make available our unique collections. The digitisation policy defines what is to be digitised, and it aims not only to target rare and unique materials but also to create a large corpus of certain material types. However, as digitisation resources are scarce, digitisation is planned and prioritised annually. This involves the library balancing individual researchers' needs with its own legal preservation and availability goals. The digital presentation system at digi.nationallibrary.fi plays a key role, since it enables fast operation by being next to the digitisation process, and it enables a streamlined flow of material via a digital chain from production to the end users.

In this paper, we will describe our digitisation process and its cost-effective improvements, which have been recently applied at the NLF. In addition, we evaluate how we could improve and enrich our digital presentation system and its existing material by utilising results and experience from existing research efforts. We will also briefly examine the positive examples of other national libraries and identify universal features and local differences.

11:15am - 11:30am
Short Paper (10+5min) [publication ready]

Digitization of the collections at Ømålsordbogen – the Dictionary of Danish Insular Dialects: challenges and opportunities

Henrik Hovmark, Asgerd Gudiksen

University of Copenhagen,

Ømålsordbogen (the Dictionary of Danish Insular Dialects, henceforth DID) is an historical dictionary giving thorough descriptions of the dialects, i.e. the spoken vernacular of peasants and fishermen, on the Danish isles Seeland, Funen and surrounding islands. It covers the period from 1750 to 1950, the core period being 1850 to 1920. Publishing began in 1992 and the latest volume (11, kurv-lindorm) appeared in 2013, but the project was initiated in 1909 and data collection dates back to the 1920s and 1930s. The project is currently undergoing an extensive process of digitization: old, outdated editing tools have been replaced with modern ones (database, XML, Unicode), and the old, printed volumes have been extracted to XML as well and are now searchable as a single XML file. Furthermore, the underlying physical data collections are being digitized.

In the following we give a brief account of the digitization process, and we discuss a number of questions and dilemmas that this process gives rise to. The collections underlying the DID project comprise a variety of subcollections characterized by a large heterogeneity in terms of form as well as content. The information on the paper slips is usually densified, often idiosyncratic, and normally complicated to decode, even for other specialists. The digitization process naturally points towards web publication of the collections, either alone or in combination with the edited data, but it also gives rise to a number of questions. As the current digitization process is very basic, adding only very few metadata (one to three), we point to the obvious fact that web publication of the collections presupposes the addition of further, carefully selected metadata, taking different user needs and qualifications into account. We also discuss the relationship between edited and non-edited data from a publication perspective. Some of the paper slips are very difficult to decipher due to handwriting or idiosyncratic densification, and we point out that web publication in a raw, i.e. non-edited or non-annotated, form might be more misleading than helpful for a number of users.

11:30am - 11:45am
Short Paper (10+5min) [abstract]

Cultural heritage collections as research data

Toby Burrows¹,²

¹University of Oxford; ²University of Western Australia

This presentation will focus on the re-use of data relating to collections in libraries, museums and archives to address research questions in the humanities. Cultural heritage materials held in institutional collections are crucial sources of evidence for many disciplines, ranging from history and literature to anthropology and art. They are also the subjects of research in their own right – encompassing their form, their history, and their content, as well as their places in broader assemblages like collections and ownership networks. They can be studied for their unique and individual qualities, as Neil MacGregor demonstrated in his A History of the World in 100 Objects, but also as components within a much larger quantitative framework.

Large-scale research into the history and characteristics of cultural heritage materials is heavily dependent on the availability of collections data in appropriate formats and sufficient quantities. Unfortunately, this kind of research has been seriously limited, for the most part, by lack of access to suitable curatorial data. In some instances this is simply because collection databases have not been made fully available on the Web – particularly the case with art galleries and some museums. Even where databases are available, however, they often cannot be downloaded in their entirety or through bulk selections of relevant content. Data downloads are frequently limited to small selections of specific records.

Collections data are often available only in formats which are difficult to re-use for research purposes. In the case of libraries, the only export formats tend to be proprietary bibliographic schemas such as EndNote or RefCite. Even where APIs are made available, they may be difficult to use or limited in their functionality. CSV or XML downloads are relatively rare. Data licensing regimes may also discourage re-use, either by explicit limitations or by lack of clarity about terms and conditions.

Even where researchers are able to download usable data, it is very rare for them to be able to feed back any cleaning or enhancing they may have done. The cultural heritage institutions supplying the data may be unable or unwilling to accept corrections or improvements to their records. They may also be suspicious of researchers developing new digital services which appear to compete with the original database.

As a result, there has been a significant disconnect between curatorial databases and researchers, who have struggled to make effective use of what is potentially a very rich source of computationally usable evidence. One important consequence is that re-use of curatorial data by researchers often focuses on the data which are the easiest to obtain. The results are neither particularly representative nor exhaustive, and may weaken the validity of the conclusions drawn from the research.

Some recent “collections as data” initiatives (such as collectionsasdata.github.io) have started to explore approaches to best practice for “computationally amenable collections”, with the aim of “encouraging cultural heritage organizations to develop collections and systems that are more amenable to emerging computational methods and tools”. In this presentation, I will suggest some elements of best practice for curatorial institutions in this area.

My observations will be based on three projects which are addressing these issues. The first project is “Collecting the West”, in which Western Australian researchers are working with the British Museum to deploy and evaluate the ResearchSpace software, which is designed to integrate heterogeneous collection data into a cultural heritage knowledge graph. The second project is HuNI – the Humanities Networked Infrastructure – which has been building a “virtual laboratory” for the humanities by reshaping collections data into semantic information networks. The third project – “Reconstructing the Phillipps Collection”, funded by the European Union under its Marie Curie Fellowships scheme – involved combining collections data from a range of digital and physical sources to reconstruct the histories of manuscripts in the largest private collection ever assembled.

Curatorial institutions should recognize that there is a growing group of researchers who do not simply want to search or browse a collections database. There is an increasing demand for access to collections data for downloading and re-use, in suitable formats and on non-restrictive licensing terms. In return, researchers will be able to offer enhanced and improved ways of analyzing and visualizing data, as well as correcting and amplifying collection database records on the basis of research results. There are significant potential benefits for both sides of this partnership.

4:00pm - 5:30pm

F-PII-2: Computational Linguistics 2
Session Chair: Risto Vilkko

PII

4:00pm - 4:30pm
Long Paper (20+10min) [publication ready]

Verifying the Consistency of the Digitized Indo-European Sound Law System Generating the Data of the 120 Most Archaic Languages from Proto-Indo-European

Jouna Pyysalo¹, Mans Hulden², Aleksi Sahala¹

¹University of Helsinki; ²University of Colorado Boulder

Using state-of-the-art finite-state technology (FST) we automatically generate data for the roughly 120 most archaic Indo-European (IE) languages from reconstructed Proto-Indo-European (PIE) by means of digitized sound laws. The accuracy rate of the automatic generation of the data exceeds 99%, which also applies to the generation of new data that were not observed when the rules representing the sound laws were originally compiled. After testing and verifying the consistency of the sound law system with regard to the IE data and the PIE reconstruction, we report the following results:

a) The consistency of the digitized sound law system generating the data of the 120 most archaic Indo-European languages from Proto-Indo-European is verifiable.

b) The primary objective of Indo-European linguistics, a reconstruction theory of PIE in essence equivalent to the IE data (except for a limited set of open research problems), has been provably achieved.

The results are fully explicit, repeatable, and verifiable.
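The general idea of generating attested forms from reconstructed ones through an ordered cascade of sound laws, and then measuring how often the output matches the data, can be sketched as follows. This is only an illustration under strong assumptions: it uses regular-expression rewrites as a stand-in for compiled finite-state transducers, and the three rules, the proto-forms and the "attested" forms are invented toy data, not the authors' sound laws or any real Indo-European material.

```python
import re

# Ordered toy "sound laws" (hypothetical): each is a regex and its replacement.
# A real system would compile such rules into finite-state transducers.
SOUND_LAWS = [
    (r"^p", "f"),      # word-initial p becomes f
    (r"a(?=m)", "o"),  # a becomes o before m
    (r"e$", ""),       # word-final e is lost
]

# Invented proto-forms paired with invented attested forms.
# The last pair is deliberately inconsistent with the rules.
TEST_DATA = {
    "pate": "fat",
    "tame": "tom",
    "pama": "foma",
    "kape": "kap",
    "sape": "zap",
}


def derive(proto_form: str) -> str:
    """Apply every sound law once, in order, to a proto-form."""
    form = proto_form
    for pattern, replacement in SOUND_LAWS:
        form = re.sub(pattern, replacement, form)
    return form


def evaluate(data: dict) -> tuple:
    """Return (accuracy, mismatches) of derived forms against attested forms."""
    mismatches = [
        (proto, derive(proto), attested)
        for proto, attested in data.items()
        if derive(proto) != attested
    ]
    accuracy = 1 - len(mismatches) / len(data)
    return accuracy, mismatches


if __name__ == "__main__":
    accuracy, mismatches = evaluate(TEST_DATA)
    print(f"accuracy: {accuracy:.2f}")
    for proto, derived, attested in mismatches:
        print(f"*{proto}: derived {derived!r}, attested {attested!r}")
```

Listing the mismatches alongside the accuracy figure is what makes such a check repeatable: every failing form points either to a rule that needs revision or to an open research problem.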

4:30pm - 4:45pm
Short Paper (10+5min) [publication ready]

Towards Topic Modeling Swedish Housing Policies: Using Linguistically Informed Topic Modeling to Explore Public Discourse

Anna Lindahl¹, Love Börjeson²

¹University of Gothenburg; ²Graduate School of Education, Stanford University

This study examines how topic modeling can be applied to explore the public discourse on Swedish housing policies, as represented by documents from the Swedish parliament and Swedish news texts. This area is relevant to study because of the current housing crisis in Sweden.

Topic modeling is an unsupervised method for finding topics in large collections of data, which makes it suitable for examining public discourse. However, most studies that employ topic modeling make little use of linguistic information when preprocessing the data. Therefore, this work also investigates what effect linguistically informed preprocessing has on topic modeling. Through human evaluation, filtering the data based on part of speech is found to have the largest effect on topic quality. Non-lemmatized topics are found to be rated higher than lemmatized topics. Topics from the filters based on dependency relations are found to have low ratings.
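What linguistically informed preprocessing can look like in practice is sketched below: tokens are filtered by part of speech and optionally replaced by their lemmas before a bag-of-words representation is handed to a topic model. The sketch assumes the text has already been tagged and lemmatized; the example sentence, its annotations, and the function name `preprocess` are illustrative assumptions, not the study's pipeline or data.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# One token = (surface form, lemma, part-of-speech tag).
Token = Tuple[str, str, str]

# Hand-annotated toy sentence: "Regeringen bygger nya bostäder"
# ("The government builds new housing"). Annotations are illustrative.
TAGGED_DOC: List[Token] = [
    ("Regeringen", "regering", "NOUN"),
    ("bygger", "bygga", "VERB"),
    ("nya", "ny", "ADJ"),
    ("bostäder", "bostad", "NOUN"),
]


def preprocess(tokens: Iterable[Token], keep_pos: Set[str], use_lemma: bool) -> List[str]:
    """Keep only tokens with a wanted POS tag; emit lemma or lowercased surface form."""
    return [
        (lemma if use_lemma else surface).lower()
        for surface, lemma, pos in tokens
        if pos in keep_pos
    ]


if __name__ == "__main__":
    nouns_surface = preprocess(TAGGED_DOC, keep_pos={"NOUN"}, use_lemma=False)
    nouns_lemma = preprocess(TAGGED_DOC, keep_pos={"NOUN"}, use_lemma=True)
    # Bag-of-words counts are the usual input to a topic model.
    print(Counter(nouns_surface))
    print(Counter(nouns_lemma))
```

Keeping the POS filter and the lemmatization switch as separate parameters mirrors the comparison reported above, where the two choices are evaluated independently.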

4:45pm - 5:00pm
Short Paper (10+5min) [abstract]

Embedded words in the historiography of technology and industry, 1931–2016

Johan Jarlbrink, Roger Mähler

University of Umeå, Sweden

From 1931 to 2016 the Swedish National Museum of Science and Technology published a yearbook, Dædalus. The 86 volumes display a great diversity of industrial heritage and cultures of technology. The first volumes were centered on heavy industry, such as mining and paper plants located in North and Mid-Sweden. The last volumes were dedicated to technologies and products in people’s everyday lives – lipsticks, microwave ovens, and skateboards. Over the years Dædalus has covered topics ranging from individual inventors to world fairs, media technologies from print to computers, and agricultural developments from ancient farming tools to modern DNA analysis. The yearbook presents the history of industry, technology and science, but can also be read as a historiographical source reflecting shifting approaches to history over a period of more than 80 years. Dædalus was recently digitized and can now be analyzed with the help of digital methods.

The aim of this paper is twofold: To explore the possibilities of word embedding models within a humanities framework, and to examine the Dædalus yearbook as a historiographical source with such a model. What we will present is work in progress with no definitive findings to show at the time of writing. Yet, we have a general idea of what we would like to accomplish. Analyzing the yearbook as a historiographical source means that we are interested in what kinds of histories it represents, its focus and bias. We follow Ben Schmidt’s (admittedly simplified) suggestion that word embedding models for textual analysis can be viewed and used as supervised topic model tools (Schmidt, 2015). If words are defined by the distribution of the vocabulary of their contexts we can calculate relations between words and explore fields of related words as well as binary relations in order to analyze their meaning. Simple – and yet fundamental – questions can be asked: What is “technology” in the context of the yearbook? What is “industry”? Of special interest in the case of industrial and technological history are binaries such as rural/urban, man/woman, industry/handicraft, production/consumption, and nature/culture. Which words are close to “man”, and which are close to “woman”? Which aspects of the history of technology and industry are related to “production” and which are related to “consumption”?

Word embedding is a comparatively new set of tools and techniques within data science (NLP). What they have in common is that the words in the vocabulary of a corpus (or several corpora) are assigned numerical representations through some computation, of which there is a wide variety. In most cases, this comes down to not only mapping the words to numerical vectors, but doing so in such a way that the numerical values in the vectors reflect the contextual similarities between words. The computations are based on the distributional hypothesis stemming from Harris (1954), implying that “words which are similar in meaning occur in similar contexts” (Rubenstein & Goodenough, 1965). The words are embedded (positioned) in a high-dimensional space, each word represented by a vector in the space, i.e. a simple representational model based on linear algebra. The dimension of the space is defined by the size of the vectors, and the similarity between words then becomes a matter of computing the difference between vectors in this space, for instance the (Euclidean) distance or the difference in direction between the vectors (cosine similarity). Within vector space models the latter is the most popular, under the assumption that related words tend to have similar directions. Arguably the most prominent and popular of these algorithms, and the one that we have used, is the skip-gram model Word2Vec (Mikolov et al., 2013). In short, this model uses a neural network to compute the word vectors as a by-product of training the network to predict the probability of each word in the vocabulary occurring near (as defined by a window size) a certain word in focus.
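The difference between the two similarity measures mentioned above can be made concrete with a small sketch. The vectors are two-dimensional toy values chosen by hand for illustration; real word vectors typically have hundreds of dimensions and are learned from a corpus.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance between two vectors (smaller = closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy vectors: b points in the same direction as a but is twice as long;
    # c is as long as a but points in a different direction.
    a = [1.0, 2.0]
    b = [2.0, 4.0]
    c = [2.0, 1.0]

    print("distance a-b:", euclidean_distance(a, b))
    print("distance a-c:", euclidean_distance(a, c))
    print("cosine   a-b:", cosine_similarity(a, b))
    print("cosine   a-c:", cosine_similarity(a, c))
```

In this toy case the two measures disagree: by Euclidean distance `c` is closer to `a`, while by cosine similarity `b` is the better match because it has the same direction. Cosine similarity ignores vector length, which is one reason it is usually preferred for comparing word vectors.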

An early evaluation shows that the model works fine. Standard calculations often used to evaluate performance and accuracy indicate that we have implemented the model correctly – we can indeed get the correct answers to equations such as “Paris - France + Italy = Rome” (Mikolov et al., 2013). In our case we were looking for “most_similar(positive=['sverige','oslo'], negative=['stockholm'])”, and the “most similar” was “norge”. We have also explored simple word similarity in order to evaluate the model and get a better understanding of our corpus. What remains to be done is to identify relevant words (or groups of words) that can be used when we are examining “topics” and binary dimensions in the corpus. We are also experimenting with different ways to cluster and visualize the data. Although some work remains to be done, we will definitely have results to present at the time of the conference.

Harris, Zellig (1954). Distributional structure. Word, 10(2–3):146–162.
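The arithmetic behind such an analogy query can be sketched with hand-made vectors. The sketch adds the "positive" vectors, subtracts the "negative" ones, and ranks the remaining vocabulary by cosine similarity to the result. The four-dimensional toy vectors and the small vocabulary are invented for illustration, and the function below only mimics the shape of the query quoted above; library implementations may differ in details such as vector normalization.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Invented toy vectors. Dimensions: [is-country, Sweden, Norway, Denmark].
VECTORS: Dict[str, List[float]] = {
    "sverige":   [1.0, 1.0, 0.0, 0.0],
    "stockholm": [0.0, 1.0, 0.0, 0.0],
    "norge":     [1.0, 0.0, 1.0, 0.0],
    "oslo":      [0.0, 0.0, 1.0, 0.0],
    "danmark":   [1.0, 0.0, 0.0, 1.0],
    "köpenhamn": [0.0, 0.0, 0.0, 1.0],
}


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(positive: List[str], negative: List[str]) -> List[Tuple[str, float]]:
    """Rank vocabulary words by cosine similarity to sum(positive) - sum(negative)."""
    dims = len(next(iter(VECTORS.values())))
    query = [0.0] * dims
    for word in positive:
        query = [q + x for q, x in zip(query, VECTORS[word])]
    for word in negative:
        query = [q - x for q, x in zip(query, VECTORS[word])]
    # The query words themselves are excluded from the candidates.
    excluded = set(positive) | set(negative)
    ranked = [
        (word, cosine(query, vec))
        for word, vec in VECTORS.items()
        if word not in excluded
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in most_similar(positive=["sverige", "oslo"], negative=["stockholm"]):
        print(f"{word}: {score:.3f}")
```

With these toy vectors the query vector equals the vector for "norge", so that word ranks first; in a trained model the match is only approximate, which is why the result is reported as the most similar word and not as an exact equality.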

Mikolov, Tomas, Chen, Kai, Corrado, Greg & Dean, Jeffrey (2013). Efficient estimation of word representations in vector space. CoRR, abs/1301.3781

Rubenstein, Herbert & Goodenough, John (1965). Contextual Correlates of Synonymy. Communications of the ACM, 8(10): 627-633.

Schmidt, Ben (2015). Word Embeddings for the digital humanities. Blog post at http://bookworm.benschmidt.org/posts/2015-10-25-Word-Embeddings.html.

5:00pm - 5:15pm
Short Paper (10+5min) [abstract]

Revisiting the authorship of Henry VIII’s Assertio septem sacramentorum through computational authorship attribution

Marjo Kaartinen, Aleksi Vesanto, Anni Hella

University of Turku

Undoubtedly, one of the great unsolved mysteries of Tudor history through the centuries has been the authorship of Henry VIII’s famous treatise Assertio septem sacramentorum adversus Martinum Lutherum (1521). The question of its authorship already intrigued contemporaries in the 1520s. With Assertio, Henry VIII gained from the Pope the title Defender of the Faith, which the British monarchs still use. Because of the exceptional importance of the text, the question of its authorship is not irrelevant in the study of history.

Each for their own reasons and motives, many doubted the king’s authorship. The discussion has continued to the present day. A number of possible authors have been named, Thomas More and John Fisher foremost among them. There is no clear consensus about the authorship in general – nor is there a clear agreement upon the extent of the King’s role in the writing process in the cases where joint authorship is suggested. The most commonly shared conclusion indeed is that the King was more or less helped in the writing process and that the authorship of the work was thus shared at least to some degree: that is, even if Henry VIII was active in the writing of Assertio, he was not the sole author but was helped by someone or by a group of theological scholars.

In the case of Assertio, the Academy of Finland-funded consortium Profiling Premodern Authors (PROPREAU) has tackled the difficult Latin source situation and put an effort into developing more efficient machine learning methods for authorship attribution in a case where large training corpora are not available. This paper will present the latest discoveries in the development of such tools and will report on the results. These will give historians tools for opening a myriad of questions we have been hitherto unable to answer. It is of great significance for the whole discipline of history to be able to name the authors of texts that are anonymous or of disputed origin.

Select Bibliography:

Betteridge, Thomas: Writing Faith and Telling Tales: Literature, Politics, and Religion in the Work of Thomas More. University of Notre Dame Press 2013.

Brown, J. Mainwaring: Henry VIII.’s Book, “Assertio Septem Sacramentorum,” and the Royal Title of “Defender of the Faith”. Transactions of the Royal Historical Society 1880, 243–261.

Nitti, Silvana: Auctoritas: l’Assertio di Enrico VIII contro Lutero. Studi e testi del Rinascimento europeo. Edizioni di storia e letteratura 2005.