400 Language
Highlights
• Gender cues are defined differently across languages.
• We propose a new refined and standardized definition of gender transparency.
• Gender transparency is quantifiable with values that match theoretical expectations.
• We present the first quantitative method to measure the gender transparency of languages.
Abstract
Languages can express grammatical gender through different ortho-phonological regularities present in nouns (e.g., the cues “-o” and “-a” for the masculine and the feminine respectively in Italian, Portuguese, or Spanish). The term “gender transparency” was coined to describe these regularities (Bates et al., 1995). In gendered languages, we can hence distinguish between transparent nouns, i.e., those displaying form regularities; opaque nouns, i.e., those with ambiguous endings; and irregular nouns, i.e., those that display the typical form regularities but are associated with the opposite gender. Following a descriptive analysis of such regularities, languages have been recently classified according to their degree of gender transparency, which seems relevant in regard to gender acquisition and processing. Yet, there are certain inconsistencies in determining which languages are overall transparent and which are opaque. In particular, it is not clear whether some other complex regularities such as derivational suffixes are also “transparent” cues for gender, what really constitutes an “opaque” noun, or which role orthography and morphology have in transparency. Given the existing inconsistencies in classifying languages as transparent or opaque, this work introduces a proposal to assess gender transparency systematically. Our methodology adapts the standardized factors proposed by Audring (2019) to analyse the relative complexity of gender systems. Such factors are adapted to gender transparency on the basis of the literature on gender acquisition and processing. To support the feasibility of such a proposal, the concepts have been instantiated in a quantitative model to obtain for the first time an objective measure of gender transparency using European Portuguese and Dutch as instances of target languages. 
Our results coincide with the theoretically expected outcome: European Portuguese obtains a high value of gender transparency while Dutch obtains a moderately low one. Future adaptations of this model to the gender systems of other languages could allow the continuum of gender transparency to sustain robust predictions in studies on gender processing and acquisition.
Pitch peaks tend to be higher at the beginning of longer sentences than of shorter ones (e.g., ‘A farmer is pulling donkeys’ vs ‘A farmer is pulling a donkey and goat’), whereas pitch valleys at the ends of sentences are rather constant for a given speaker. These data seem to imply that speakers avoid dropping their voice pitch too low by planning the height of sentence-initial pitch peaks prior to speaking. However, the length effect on sentence-initial pitch peaks appears to vary across different types of sentences, speakers and languages. Therefore, the notion that speakers plan sentence intonation in advance due to the limitations in low voice pitch leaves part of the data unexplained. Consequently, this study suggests a complementary cognitive account of length-dependent pitch scaling. In particular, it proposes that the sentence-initial pitch raise in long sentences is related to high demands on mental resources during the early stages of sentence planning. To tap into the cognitive underpinnings of planning sentence intonation, this study adopts the methodology of recording eye movements during a picture description task, as eye movements are an established approximation of real-time planning processes. Measures of voice pitch (Fundamental Frequency) and incrementality (eye movements) are used to examine the relationship between (verbal) working memory (WM), incrementality of sentence planning and the height of sentence-initial pitch peaks.
The standard view of the form-meaning interfaces, as embraced by the great majority of contemporary grammatical frameworks, consists in the assumption that meaning can be associated with grammatical form in a one-to-one correspondence. Under this view, composition is quite straightforward, involving concatenation of form, paired with functional application in meaning. In this book, we discuss linguistic phenomena across several grammatical sub-modules (morphology, syntax, semantics) that apparently pose a problem for the standard view, mapping out the potential for deviation from the ideal of one-to-one correspondences, and develop formal accounts of the range of phenomena. We argue that a constraint-based perspective is particularly apt to accommodate deviations from one-to-one correspondences, as it allows us to impose constraints on full structures (such as a complete word or the interpretation of a full sentence) instead of deriving such structures step by step.
Most of the papers in this volume are formulated in a particular constraint-based grammar framework, Head-driven Phrase Structure Grammar. The contributions investigate how the lexical and constructional aspects of this theory can be combined to provide an answer to this question across different linguistic sub-theories.
This dissertation is about case competition in headless relatives. Case competition is a situation in which two cases are assigned but only one of them surfaces. One of the constructions in which case competition takes place is the headless relative, i.e. a relative clause that lacks a head. This dissertation has two goals: (i) to give an overview of the data, and (ii) to provide an account of the observed data.
The grammaticality of a headless relative is determined by two aspects. The first aspect concerns which case wins the case competition. In all languages with case competition that I am aware of, this is determined by the case scale NOM < ACC < DAT. A case further to the right on the scale wins over a case further to the left. This scale is not specific to case competition in headless relatives; it can also be observed in syncretism patterns and morphological case containment. I show that the case scale can be derived by assuming the cumulative case decomposition (cf. Caha 2009). A case wins over another case when it contains all features that the other case contains.
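The containment idea in this paragraph can be illustrated with a minimal sketch. It assumes, purely for illustration, that each case is a cumulative set of features; the labels `F1`, `F2` and `F3` and the function name `winner` are hypothetical and do not come from the dissertation.

```python
# Cumulative case decomposition, sketched with placeholder feature labels.
# F1, F2, F3 are hypothetical names; the abstract does not name the features.
CASE_FEATURES = {
    "NOM": frozenset({"F1"}),
    "ACC": frozenset({"F1", "F2"}),
    "DAT": frozenset({"F1", "F2", "F3"}),
}


def winner(case_a, case_b):
    """Return the case that contains all features of the other case."""
    feats_a = CASE_FEATURES[case_a]
    feats_b = CASE_FEATURES[case_b]
    if feats_a >= feats_b:  # superset test: a contains every feature of b
        return case_a
    if feats_b >= feats_a:
        return case_b
    # Cannot happen with a strictly cumulative decomposition.
    raise ValueError("neither case contains the other")


print(winner("NOM", "ACC"))
print(winner("DAT", "ACC"))
```

Because every case on the scale adds a feature to the one before it, the superset test reproduces the ordering NOM < ACC < DAT without listing the scale separately.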
The second aspect of case competition in headless relatives concerns whether the winner of the case competition is allowed to surface. The winning case can be either the internal case required by the predicate in the relative clause, or the external case required by the predicate in the main clause. Languages differ in whether they allow the internal and the external case to surface.
All language types I discuss allow for a headless relative when the internal and the external case match. The unrestricted type of language allows both the internal case and the external case to surface when either of them wins the case competition. Examples of this language type are Old High German, Gothic and Ancient Greek. The internal-only type of language allows only the internal case to surface when it wins the case competition, and it does not allow the external case to do so. An example of this language type is Modern German. The external-only type of language allows only the external case to surface when it wins the case competition, and it does not allow the internal case to do so. To my knowledge, there is no language that behaves like this. The matching type of language allows neither the internal nor the external case to surface when either of them wins the case competition. An example of this language type is Polish.
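The four language types described above can be summarised as a small lookup, sketched here under the assumption that grammaticality depends only on the language type and on which case wins. The dictionary layout and the function name `is_grammatical` are hypothetical and are not taken from the dissertation.

```python
# Case scale from the text: a case further to the right wins the competition.
CASE_SCALE = ["NOM", "ACC", "DAT"]

# For each language type: may the internal / external case surface when it wins?
LANGUAGE_TYPES = {
    "unrestricted": {"internal": True, "external": True},    # e.g. Old High German
    "internal-only": {"internal": True, "external": False},  # e.g. Modern German
    "external-only": {"internal": False, "external": True},  # unattested per the text
    "matching": {"internal": False, "external": False},      # e.g. Polish
}


def is_grammatical(language_type, internal_case, external_case):
    """Predict whether a headless relative is allowed in a given language type."""
    if internal_case == external_case:
        return True  # matching cases are allowed in every language type
    internal_rank = CASE_SCALE.index(internal_case)
    external_rank = CASE_SCALE.index(external_case)
    winning_role = "internal" if internal_rank > external_rank else "external"
    return LANGUAGE_TYPES[language_type][winning_role]


print(is_grammatical("internal-only", "DAT", "ACC"))
print(is_grammatical("matching", "NOM", "ACC"))
```

The sketch only restates the descriptive typology; it does not model the morphological account that derives the types.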
To account for the data, I set up a proposal that generates the attested patterns and excludes the non-attested ones. I let the variation between languages follow from properties of languages that can be independently observed. By investigating the morphology of the languages, I suggest differences between the lexical entries in the different languages. These different lexical entries ultimately lead languages to be of different types. In my proposal, I assume that headless relatives are derived from light-headed relatives. Light-headed relatives contain a light head and a relative pronoun. In a headless relative either the light head or the relative pronoun is deleted. The necessary requirement for deletion is that the deleted element (either the light head or relative pronoun) is structurally or formally contained in the other element.
I motivate the analysis for the internal-only type of language for Modern German, for the matching type of language for Polish and for the unrestricted type of language for Old High German. I first identify the morphemes that the light heads and relative pronouns in the languages consist of, and then I show to which features each of the morphemes corresponds. The crucial difference between the internal-only type of language Modern German and the matching type of language Polish is how the phi and case features are spelled out. In Modern German they are spelled out by a phi and case feature portmanteau, and, in Polish, the same features are spelled out by a phi feature morpheme and a case feature morpheme. Old High German differs from the other two languages in that it has light heads and relative pronouns that are syncretic. I show how these differences in the morphology of the languages ultimately lead to different grammaticality patterns in headless relatives.
Comparing my account to others shows that all proposals account for the case facts using some kind of case hierarchy. The proposals differ in how they model the variation: in their technical details and, more importantly, in their empirical scope and the predictions they make.
Large language models have become widely available to the general public, especially due to ChatGPT's release. Consequently, the AI community has invested much effort into recreating language models of the same caliber as ChatGPT, since the latter is still a technical black box. This thesis aims to contribute to that cause by proposing R.O.B.E.R.T., a Robotic Operating Buddy for Efficiency, Research and Teaching. In doing so, it presents a first implementation of a lightweight environment which produces tailor-made, instruction-following language models with a heavy focus on conversational capabilities that instruct themselves into a given domain-context. Within this environment, the generation of datasets, the fine-tuning process and finally the inference of a unique R.O.B.E.R.T. instance are all carried out as part of an automated pipeline.
This introductory paper provides an overview of the main phenomena investigated in this Special Issue, such as the relation between the encoding of indefinites and the presence of genitive and definite markers, the relation between partitivity and indefiniteness and the distribution of these phenomena in minority, or “micro”, varieties – such as Italian dialects, Galloromance varieties, North and South Saami – compared to the distribution of the same phenomena in majority, or “macro”, varieties – such as French, Italian, Spanish, Brazilian Portuguese, Estonian, Finnish, Czech and Serbian. The second part of the paper, then, provides an overview of the content of each original paper collected in the special issue.
This thesis investigates the structure of research articles in the field of Computational Linguistics with the goal of establishing that a set of distinctive linguistic features is associated with each section type. The empirical results of the study are derived from the quantitative and qualitative evaluation of research articles from the ACL Anthology Corpus. More than 20,000 articles were analyzed for the purpose of retrieving the target section types and extracting the predefined set of linguistic features from them. Approximately 1,100 articles were found to contain all of the following five section types: abstract, introduction, related work, discussion, and conclusion. These were chosen for the purpose of comparing the frequency of occurrence of the linguistic features across the section types. Making use of frameworks for Natural Language Processing, the Stanford CoreNLP Module, and the Python library SpaCy, as well as scripts created by the author, the frequency scores of the features were retrieved and analyzed with state-of-the-art statistical techniques.
The results show that each section type possesses an individual profile of linguistic features which are associated with it more or less strongly. These section-feature associations are shown to be derivable from the hypothesized purpose of each section type.
Overall, the findings reported in this thesis provide insights into the writing strategies that authors employ so that the overall goal of the research paper is achieved.
The results of the thesis can be implemented in new state-of-the-art applications that assist academic writing and its evaluation by providing the user with more sophisticated, empirically based feedback on the relationship between linguistic mechanisms and text type. In addition, the identification of text-type specific linguistic characteristics (a text-feature mapping) can contribute to the development of more robust language-based models for disinformation detection.
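The frequency comparison described in this abstract can be sketched as follows. This is only an illustration: the thesis reports using Stanford CoreNLP and SpaCy, whereas the sketch below substitutes plain regular expressions, and the two features (`first_person`, `hedge`), the sample texts and the function name `feature_rates` are hypothetical.

```python
import re

# Hypothetical surface features; the thesis's actual feature set is not listed here.
FEATURES = {
    "first_person": re.compile(r"\b(?:we|our|us|i|my)\b", re.IGNORECASE),
    "hedge": re.compile(r"\b(?:may|might|could|possibly|perhaps)\b", re.IGNORECASE),
}

TOKEN = re.compile(r"\w+")


def feature_rates(text, per=1000):
    """Return each feature's frequency per `per` tokens of `text`."""
    n_tokens = len(TOKEN.findall(text))
    if n_tokens == 0:
        return {name: 0.0 for name in FEATURES}
    return {
        name: per * len(pattern.findall(text)) / n_tokens
        for name, pattern in FEATURES.items()
    }


# Toy section texts, invented for illustration only.
sections = {
    "abstract": "We propose a model. Our results may generalize.",
    "conclusion": "The model works. Future work could extend it.",
}

profiles = {name: feature_rates(text) for name, text in sections.items()}
for name, profile in profiles.items():
    print(name, profile)
```

Normalising by token count keeps sections of different lengths comparable, which is the minimum needed before any statistical comparison of section profiles.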
In a similar way to dramatic performances and plays, song lyrics establish a complex discourse structure whereby listeners are placed in a position to overhear ‘the pretence of a conversation constructed to convey the performer’s meaning’ (Nahajec 2019: 25; see also Short 1996: 169). In Swift’s songwriting, listeners are not only positioned to eavesdrop on the narratives presented but are also invited to conceptualise and enact particular roles and scenarios in the discourse. This paper offers a stylistic analysis of songwriting and narrative structure across Swift’s oeuvre to identify how disnarration strategies are used to build stories in her two sister albums written and produced during the Covid-19 pandemic, folklore (2020) and evermore (2020). Specifically, this study examines how disnarration characterises the albums’ narrators, establishes narrator-narratee relationships and invites listeners to adopt a participatory role in the meaning-making process. Through close analysis of four songs across the two albums, this paper builds on developing studies of the stylistics of songwriting (see West 2019), and argues that disnarration strategies foreground particular themes within the discourse, such as nostalgia, wistfulness and regret, and contribute to the fictionalisation and self-aware storytelling characteristic of these albums’ storyworlds.
The aim of this article is to show how linguistic and literary studies can benefit from the joint analysis of linguistic structures in poetry. Firstly, the analysis of poetry has an important impact on linguistic theory as it draws our attention to specific structures and meanings that so far have not been considered. Secondly, a close linguistic analysis can reveal hitherto overlooked facets of meaning which have a great significance for the overall interpretation of a poem. We focus on Bare Root Infinitives (BRIs) in German. As they lack the features for tense, mood, person and number, they are more flexible in meaning than finite forms. When looking at poetry, besides the well-known deontic and bouletic meanings (cf. Reis 1995, 2003; Gärtner 2014) a third meaning that we call reactive meaning stands out. Remarkably, this reactive meaning can also be found in everyday language. Its specific semantic properties show that a semantic analysis of BRIs in the style of Kaufmann (2012) is adequate: modality, but not non-referentiality, is a grammatically given semantic property of BRIs. The specific case study of the poem ‘muster fixieren’ (‘fixing patterns’) by Nico Bleutge reveals how the restricted context of the poem interacts with the different interpretations of BRIs, resulting in a complex interpretation of the text.