Conference Agenda

Session Overview
Session
F-P674-2: Between the Manual and the Automatic
Time:
Friday, 09/Mar/2018:
4:00pm - 5:30pm

Session Chair: Eero Hyvönen
Location: P674

Presentations
4:00pm - 4:15pm
Short Paper (10+5min) [publication ready]

In search of Soviet wartime interpreters: triangulating manual and digital archive work

Svetlana Probirskaja

University of Helsinki

This paper demonstrates the methodological stages of searching for Soviet wartime interpreters of Finnish in the digital archival resource of the Russian Ministry of Defence called Pamyat Naroda (Memory of the People) 1941–1945. Since wartime interpreters do not have their own search category in the archive, other means are needed to detect them. The main argument of this paper is that conventional manual work must be done and some preliminary information obtained before entering the digital archive, especially when dealing with a marginal subject such as wartime interpreters.


4:15pm - 4:30pm
Distinguished Short Paper (10+5min) [abstract]

Digital Humanities Meets Literary Studies: the Challenges for Estonian Scholarship

Piret Viires1, Marin Laak2

1Tallinn University; 2Estonian Literary Museum

In recent years, the application of DH as a method of computerised analysis and the extensive digitisation of literary texts, making them accessible as open data and organising them into large text corpora, have made the relations between literature and information technology a hot topic.

New directions in literary history link together literary analysis, computer technology and computational linguistics, offering new possibilities for studying the authors’ style and language, analysing texts and visualising results.

Alongside such mainstream uses, DH still contains several other important directions for literary studies. The aim of this paper is to examine the limits and possibilities of DH as a concept and to determine their suitability for literary research in the digital age. Our discussion is based, first, on twenty years of experience in digitally representing Estonian literary and cultural heritage and, second, on the synchronous study of digitally born literary forms; we shall also offer more representative examples.

We shall also discuss the concept of DH from the viewpoint of literary studies; for example, we examine ways of positioning digitally created literature (both “electronic literature” and the literature born in social media) under this renewed concept. This problem was topical in the early 2000s, but in the following decade it was replaced by the broader ideas of intermedia and transmedia, which treated literary texts as only one medium among many others. What are the specific features of digital literature, what are its accompanying effects, and how has the role of the reader as the recipient changed in the digital environment? These theoretical questions are also indirectly relevant for making the literature created in the era of printed books accessible as e-books or open data.

The digitising of older literature is the responsibility of memory institutions (libraries, archives, museums). Extensive digitising of texts at memory institutions seems to have been done to make reading more convenient – books can be read even on smartphones. Works of fiction have been digitised as part of projects for digitising cultural heritage for more than twenty years. What is the relation of these virtual bookshelves to the digital humanities? We need to discover whether and how both digitally born literature and digitised literature that was born in the era of printing affect literary theory.

Our paper will also focus on mapping different directions, practices and applications of DH in present-day literary theory. The topical question is how to bridge the gap between the research possibilities offered by present-day DH and the ever-increasing resources of texts produced by memory institutions. We encounter several problems. Literary scholars are used to working with texts, analysing them as undivided works of poetry, prose or drama. Using DH methods requires treating literary works or texts as data, which can be analysed and processed with computer programmes (data mining, visualisation tools, etc.). These activities require the posing of new and totally different research questions in literary studies. Susan Schreibman, Ray Siemens and John Unsworth, the editors of the book A New Companion to Digital Humanities (2016), discuss the problems of DH and point out in their Foreword that it is still questioned whether DH should be considered a separate discipline or, rather, a set of different interlinked methods. In our paper we emphasise the diversity of DH as an academic field of research and talk about other possibilities it can offer for literary research in addition to computational analyses of texts.

In Estonia, research on the electronic new media and the application of digital technology in the field of literary studies can be traced back to the second half of the 1990s. The analysis of social, cultural and creative effect (see Schreibman, Siemens, Unsworth 2016: xvii-xviii), as well as constant cooperation with the social sciences in research on Internet usage, have played an important role in Estonian literary studies.


4:30pm - 4:45pm
Short Paper (10+5min) [abstract]

Digital humanities and environmental reporting in television during the Cold War: Methodological issues of exploring materials of the Estonian, Finnish, Swedish, Danish, and British broadcasting companies

Simo Laakkonen

University of Turku, Degree Programme on Cultural Production and Landscape Studies

Environmental history studies have so far relied on traditional historical archives and other related source materials. Despite the increasing availability of new digitized materials, studies in this field have not reacted to these emerging opportunities in any particular way. The aim of the proposed paper is to discuss the possibilities and limitations embodied in the new digitized source materials in different European countries. The proposed paper is an outcome of a research project that explores the early days of television prior to Earth Day in 1970 and frames this exploration from an environmental perspective. The focus of the project is the reporting of environmental pollution and protection during the Cold War. In order to realize this study, the quantity and quality of related digitized and non-digitized source materials provided by the national broadcasting companies of Estonia (ETV), Finland (YLE), Sweden (SVT), Denmark (DR), and the United Kingdom (BBC) were examined. The main outcome of this international comparative study is that the quantity and quality of available materials vary greatly, and in surprising ways, between the examined countries, which belonged to different political spheres (Warsaw Pact, neutral, NATO) during the Cold War.


4:45pm - 5:00pm
Short Paper (10+5min) [abstract]

Prosodic clashes between music and language – challenges of corpus-use and openness in the study of song texts

Heini Arjava

University of Helsinki

In my talk I will discuss the relationship between linguistic and musical rhythm, and the connections to digital humanities and open science that arise in their study. My ongoing corpus research discusses the relationship between linguistic and musical segment length in songs, focusing on instances where the language has to adapt prosodically to the rhythmic frame provided by pre-existing music. More precisely, the study addresses the question of how syllable length and note length interact in music. To what extent can non-conformity between linguistic and musical segment length, that is, clashes, be acceptable in song lyrics, and what other prosodic features, such as stress, may influence the occurrence of clashes in segment length?

Addressing these questions with a corpus-based approach leads to questions of information retrieval from complicated corpora that combine two media (music and language), and of the openness and accessibility of music sources. In this abstract I will first describe my research questions and the song corpus used in my study in section 1, and discuss their relationship with the use, analysis and availability of corpora, and issues of open science in section 2.

1. Research setting and corpus

My study aims to approach the comparison of musical and linguistic rhythm by both qualitative and statistical methods. It is based on a self-collected song corpus in Finnish, a language where syllable length has a versatile relationship with stress (cf. Hakulinen et al. 2004). Primary stress in Finnish is weight-insensitive and always falls on the first syllable of a word, and syllables of any length, long or short, can be stressed or unstressed. Finnish sound segment length is also phonemic, that is, it creates distinctions of meaning. Syllable length in Finnish is therefore of particular interest in a study of musical segment length, because length deviations play an evident role in language perception.

Music and text can be combined into a composition in a number of ways, but my study focuses on the situations in which language is the most dependent on music. Usually there are three alternative orders in which music and language can be combined into songs: First, text and music may be written simultaneously and influence the musical and linguistic choices of the writer at the same time (Language <–> Music). Secondly, text can precede the music, as when composers compose a piece to existing poetry (Language –> Music). And finally, the melody may exist first, as when new versions of songs are created by translating or otherwise rewriting them to familiar tunes (Music –> Language).

My research is concerned with this third relationship, because it poses the strongest constraints on the language user. The language (text) must conform to the music’s already existing rhythmic frame that is in many respects inflexible, and in such cases, it is difficult to vary the rhythmic elements of the text, because the musical space restricts the rhythmic tools available for the language user. This in turn may lead to non-neutral linguistic output. Thus the crucial question arises: How does language adapt its rhythm to music?

My corpus contains songs that clearly and transparently represent the relationship of music being created first and providing the rhythmic frame, and language having to adjust to that frame. The pilot corpus consists of 15 songs and approximately 1500 prosodically annotated syllables of song texts in Finnish, translated or otherwise adapted from different languages, or written to instrumental or traditional music. The genres include chansons, drinking songs, Christmas songs and hymns, which originate from different eras and languages, namely English, French, German, Swedish, and Italian.

One data point in the table format of the corpus is a Finnish syllable, the prosodic properties of which I compare with the rhythm of the respective notes (musical length and stress). The most basic instance of a clash between segment lengths is the instance where a short syllable ((C)V in Finnish) falls on a long note (i.e. a longer note than a basic half-beat). Both theoretical and empirical evidence will be used to determine which length values create the clearest cases of prosodic clashes.
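The basic clash condition described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the annotation scheme actually used in the corpus: the record layout (`SyllableNote`), the use of a CV skeleton string to encode syllable structure, the measurement of note length in beats, and the example rows are all assumptions made for the illustration.

```python
from dataclasses import dataclass

# Assumption: note length is measured in beats, so a "basic half-beat" is 0.5.
HALF_BEAT = 0.5


@dataclass
class SyllableNote:
    """One hypothetical corpus row: a syllable and the note it is sung on."""
    syllable: str      # orthographic form of the syllable
    skeleton: str      # CV skeleton, e.g. "V", "CV", "CVV", "CVC" (assumed encoding)
    stressed: bool     # linguistic stress of the syllable
    note_beats: float  # length of the note in beats


def is_short(skeleton: str) -> bool:
    """A short syllable is (C)V: an optional consonant plus one short vowel."""
    return skeleton in ("V", "CV")


def is_length_clash(row: SyllableNote) -> bool:
    """Basic clash: a short syllable falls on a note longer than a half-beat."""
    return is_short(row.skeleton) and row.note_beats > HALF_BEAT


def find_clashes(rows):
    """Return the rows in which syllable length and note length clash."""
    return [row for row in rows if is_length_clash(row)]


# Made-up example rows, not taken from the actual corpus.
rows = [
    SyllableNote("ta", "CV", True, 1.0),    # short syllable, long note: clash
    SyllableNote("lo", "CV", False, 0.5),   # short syllable, half-beat: no clash
    SyllableNote("maa", "CVV", True, 2.0),  # long syllable, long note: no clash
]

print([row.syllable for row in find_clashes(rows)])
```

Keeping stress as a separate field would make it possible to cross-tabulate length clashes with stress, although which length values count as the clearest clashes remains an empirical question in the study.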

A crucial presupposition when problematising the relationship between a musical form and the text written to it is the notion that a song is not poetry per se (I will return to this conception in section 2). The conventions of Western art music allow for a far greater range of length distinctions than language: the syllable lengths usually fall into binary or ternary categories (e.g. short and long syllables), whereas in music notes can be elongated infinitely. A translated song in which all rhythmic restrictions come from the music may follow the lines of poetic traditions, but must deviate from them if the limits of space within music do not allow for full flexibility. It is therefore an intermediate form of verbal art.

2. Challenges for digital humanities and open science

The corpus-based approach to language and music poses problematic questions regarding digital humanities. The first of these is, of course, whether useful music-linguistic corpora can be found at all at present. Existing written and spoken corpora of the major European languages contain millions of words, often annotated in great linguistic detail (cf. Korp of Kielipankki for Finnish (korp.csc.fi), which offers detailed contextual, morphological and syntactic analysis). For music as well, digital music scores can be found “in a huge number” (Ponce de León et al. 2008:560). Corpora of song texts with both linguistic and musical information seem to be more difficult to find.

One problem of music linguistic studies is related to the more restricted openness and shareability of sources than that of written or spoken language. The copyright questions of art are in general a more sensitive issue than for instance those of newspaper articles or internet conversations, and the reluctance of the owners of song texts and melodies may have made it difficult to create open corpora of contemporary music.

But even with ownership problems aside (such as with older or traditional music), building a music-linguistic corpus remains a difficult task to accomplish. A truly useful corpus of music for linguistic purposes would include metadata of both media, language and music. Thus even an automatically analysed metric corpus of poetry, like Anatoli Starostin’s Treeton for metrical analysis of Russian poems (Pilshchikov & Starostin 2011) or the rhythmic Metricalizer for determining meter by stress patterns in German poems (Bobenhausen 2011), does not answer the questions of rhythm of a song text, which exists in an altogether extra-linguistic medium, music. Vocal music is metrical, but it is not metrical in the strict sense of poetic conventions, with which it shares the isochronic base. Automated analysis of a song text without its music notation does not tell anything about its real metrical structure.

On a technical level, researchers of music need tools for quick visualization of music passages (notation tools, sound recognition). Such software can be found and used freely on the internet and is useful for depiction purposes. Mining information from music requires more effort, but has been done in various projects, for instance for melody information retrieval (Ponce de León et al. 2008) or metrical detection of notes (Temperley 2001). But again, these tools seem to rarely combine linguistic and musical meter simultaneously.

By raising these questions I hope to bring attention to the challenges of studying texts in the musical domain, that is, not simply music or poetry separately. The crux of the issue is that for the linguistic analysis of song texts we need actual textual data where the musical domain appears as annotated metadata. Means exist to analyse text automatically, and to analyse musical patterns with sound recognition or otherwise, but to combine the two raises the analysis to a more complicated level.

Literature

Blumenfeld, Lev. 2016. End-weight effects in verse and language. In: Studia Metrica Poet. Vol. 3.1 pp. 7–32.

Bobenhausen, Klemens. 2011. The Metricalizer – Automated Metrical Markup of German Poetry. In: Küper, C. (ed.), Current trends in metrical analysis, pp. 119–131. Frankfurt am Main; New York: Peter Lang.

Hayes, Bruce. 1995. Metrical Stress Theory: Principles and Case Studies. Chicago: The University of Chicago Press.

Hakulinen, et al. (eds.). 2004. Iso suomen kielioppi, pp. 44–48. Helsinki: Suomalaisen Kirjallisuuden Seura.

Jeannin, M. 2008. Organizational Structures in Language and Music. In: The World of Music, 50(1), pp. 5–16.

Kiparsky, Paul. 2006. A modular metrics for folk verse. In: B. Elan Dresher & Nila Friedberg (eds.), Formal approaches to poetry: recent developments in metrics, pp. 7–52. Berlin: Mouton de Gruyter.

Lerdahl, Fred & Jackendoff, Ray. 1983. A generative theory of tonal music. Cambridge (MA): MIT.

Lotz, John. 1960. Metric typology. In: Thomas Sebeok (ed.), Style in language. Massachusetts: The M.I.T. Press.

Palmer, Caroline & Kelly, Michael H. 1992. Linguistic Prosody and Musical Meter in Song. Journal of memory and language 31, pp. 525–542.

Pilshchikov, Igor & Starostin, Anatoli. 2011. Automated Analysis of Poetic Texts and the Problem of Verse Meter. In: Küper, C. (ed.), Current trends in metrical analysis, pp. 133–140. Frankfurt am Main; New York: Peter Lang.

Ponce de León, Pedro J., Iñesta, José M. & Rizo, David. 2008. Mining Digital Music Score Collections: Melody Extraction and Genre Recognition. In: Peng-Yeng Yin (ed.), Pattern Recognition Techniques, Technology and Applications, pp. 626–. Vienna: I-Tech.

Temperley, D. 2001. The Cognition Of Basic Musical Structures. Cambridge, Mass: MIT Press.


5:00pm - 5:15pm
Distinguished Short Paper (10+5min) [abstract]

Finnish aesthetics in scientific databases

Darius Pacauskas, Ossi Naukkarinen

Aalto University School of Arts, Design and Architecture

The major academic databases such as Web of Science and Scopus are dominated by publications written in English, often by scholars affiliated to American and British universities. As such databases are repeatedly used as a basis for assessing and analyzing the activities and impact of universities and even individual scholars, there is a risk that everything published in other, especially minor, languages will be sidetracked. Standard data-mining procedures do not notice them. Yet, especially in the humanities, other languages and cultures have an important role and scholars publish in various languages.

The aim of this research project is to critically look into how Finnish aesthetics is represented in scientific databases. What kind of picture of Finnish aesthetics can we draw if we rely on the metadata from commonly used databases?

We will address this general issue through one example. We will compare metadata from two different databases, in two different languages, English and Finnish, and form a picture of two different interpretations of an academic field, aesthetics - or estetiikka in Finnish. To achieve this target we will employ citation analysis, as well as text summarization techniques, in order to understand the differences between the world's largest scientific database, Scopus, and the largest Finnish one, Elektra. Moreover, we will identify the most influential Finnish aestheticians and analyze their publication records in order to understand to what extent the scientific databases can represent Finnish aesthetics. Through this, we will present 1) two different maps containing actors and works recognized in the field, and 2) an overview of the main topics from the two databases.

For these goals, we will collect metadata from both the Scopus and Elektra databases, together with the references from each relevant article. Relevant articles will be located by using the keyword “aesthetics” or its Finnish equivalent “estetiikka”, as well as by identifying scientific journals focusing on aesthetics. We will perform citation analysis to explore which publications are cited in which countries, based on Scopus data. This comparison will allow us to understand which works are the most prominent in different countries, as well as to find the countries in which those works were developed, e.g., works that are acknowledged by Finnish aestheticians according to the international database. In addition, the comparison will allow us to understand how Finnish aesthetics differs from aesthetics in other countries.
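The country-level citation count described above could, for illustration, be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the input is assumed to be already reduced to pairs of (country of the citing article, cited work), the function names are hypothetical, and the records are placeholders rather than real Scopus or Elektra data.

```python
from collections import Counter, defaultdict


def citations_by_country(records):
    """Count how often each work is cited, grouped by the citing country.

    `records` is an iterable of (citing_country, cited_work) pairs; this
    flat layout is an assumption made for the sketch.
    """
    counts = defaultdict(Counter)
    for citing_country, cited_work in records:
        counts[citing_country][cited_work] += 1
    return counts


def top_cited(counts, country, n=1):
    """Return the n most cited works for one citing country."""
    return counts[country].most_common(n)


# Placeholder records, not real database content.
records = [
    ("FI", "Work A"),
    ("FI", "Work B"),
    ("FI", "Work A"),
    ("US", "Work B"),
    ("US", "Work C"),
    ("US", "Work B"),
]

counts = citations_by_country(records)
print(top_cited(counts, "FI"))  # most cited work among Finnish citing articles
print(top_cited(counts, "US"))  # most cited work among US citing articles
```

Comparing the per-country rankings side by side is one simple way to see which works are prominent in one country but not in another.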

Later, we will perform citation analysis with the data gathered from the Finnish scientific database Elektra. The results will indicate the distribution between cited Anglo-American texts and those written in Finland or in the Finnish language. Thus we could understand which language-family sources Finnish aestheticians rely on in their works. Further, we will apply text summarization techniques to see the differences in the topics covered by the two databases. Furthermore, we will collect a list of names of the most influential Finnish aestheticians and their works (as provided by the databases). We will perform searches within the two databases to understand how much of their work is covered.

As an additional contribution, we will develop an interactive web-based tool to represent the results of this research. Such a tool will give aesthetics researchers an opportunity to explore the field of Finnish aesthetics through our established lenses and also to comment on possible gaps in the pictures offered by the databases. It is possible that the databases give only a very partial picture of the field, in which case new tools should be developed in co-operation with researchers. A similar situation might also hold in other sub-fields of the humanities where non-English activities are usual.