Linguistics
This thesis set out to gain new insights into the historical development and typology of interrogative sentences. The analysis is based on material from several Indo-European languages (Greek, Armenian, Gothic, Old Church Slavonic, Old Russian) and one non-Indo-European Caucasian language (Old Georgian). Primarily, biblical texts from the Old and New Testaments were examined using facsimile editions and electronic text corpora. Drawing on more than 540 examples, the thesis showed which criteria (graphic or grammatical devices, the position of the interrogative word, or constituent order) helped to identify interrogative sentences in the transmitted texts. For each language considered, a classification of the main interrogative sentence types was presented that is as detailed as possible. The thesis also examined the answer expectation implied by the interrogative sentences. For the further analysis of this phenomenon, the interrogative sentences were grouped according to their formal markers of answer expectation and meaning. The cross-linguistic comparison made it possible to identify the specific interrogative characteristics that were typical of each language in its respective attested written period. Where relevant, data from later language stages were also taken into account. The concluding comparison of Indo-European and Caucasian languages was needed to identify not only genetically conditioned, language-specific characteristics of interrogative sentences but also general cross-linguistic features.
Manual development of deep linguistic resources is time-consuming and costly and is therefore often described as a bottleneck for traditional rule-based NLP. In my PhD thesis I present a treebank-based method for the automatic acquisition of LFG resources for German. The method automatically creates deep and rich linguistic representations from labelled data (treebanks) and can be applied to large data sets. My research is based on and substantially extends previous work on automatically acquiring wide-coverage, deep, constraint-based grammatical resources from the English Penn-II treebank (Cahill et al., 2002; Burke et al., 2004; Cahill, 2004). Best results for English show a dependency f-score of 82.73% (Cahill et al., 2008) against the PARC 700 dependency bank, outperforming the best hand-crafted grammar of Kaplan et al. (2004). Preliminary work has been carried out to test the approach on languages other than English, providing proof of concept for the applicability of the method (Cahill et al., 2003; Cahill, 2004; Cahill et al., 2005). While first results have been promising, a number of important research questions have been raised. The original approach, first presented in Cahill et al. (2002), is strongly tailored to English and the data structures provided by the Penn-II treebank (Marcus et al., 1993). English is configurational and rather poor in inflectional forms. German, by contrast, features semi-free word order and a much richer morphology. Furthermore, treebanks for German differ considerably from the Penn-II treebank as regards the data structures and encoding schemes underlying the grammar acquisition task. In my thesis I examine the impact of language-specific properties of German as well as linguistically motivated treebank design decisions on PCFG parsing and LFG grammar acquisition.
I present experiments investigating the influence of treebank design on PCFG parsing and show which types of representation are useful for the PCFG and LFG grammar acquisition tasks. Furthermore, I present a novel approach to cross-treebank comparison, measuring the effect of controlled error insertion on treebank trees and parser output from different treebanks. I complement the cross-treebank comparison by providing a human evaluation using TePaCoC, a new testsuite for testing parser performance on complex grammatical constructions. Manual evaluation on TePaCoC data provides new insights into the impact of flat vs. hierarchical annotation schemes on data-driven parsing. I present treebank-based LFG acquisition methodologies for two German treebanks. An extensive evaluation along different dimensions complements the investigation and provides valuable insights for the future development of treebanks.
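The dependency f-score mentioned in this abstract is, in the usual setup, the harmonic mean of precision and recall over dependency triples. The following is a minimal sketch of that computation, assuming dependencies are represented as (relation, head, dependent) tuples. The function name and the toy triples are illustrative assumptions and are not taken from the thesis or from the PARC 700 evaluation software.

```python
def dependency_f_score(gold, predicted):
    """Return (precision, recall, f-score) over two collections of
    (relation, head, dependent) triples. Illustrative helper only."""
    gold, predicted = set(gold), set(predicted)
    matched = len(gold & predicted)  # triples the parser got exactly right
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example (invented data): the parser confuses subject and object once.
gold = {("subj", "sehen", "Frau"), ("obj", "sehen", "Mann"), ("det", "Mann", "der")}
predicted = {("subj", "sehen", "Mann"), ("obj", "sehen", "Mann"), ("det", "Mann", "der")}

precision, recall, f_score = dependency_f_score(gold, predicted)
```

Because the score is computed over sets of triples rather than over tree brackets, it can compare output from treebanks with different annotation schemes once their dependencies are mapped to a common label set.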
If we want to develop a semantic analysis for explicit performatives such as I promise you to free Willy, we are faced with the following puzzle. In order to account for the speech act expressed by the performative verb, one can assume that the so-called performative clause is purely performative and provides the illocutionary force of the speech act whose content is given by the semantic object denoted by the complement clause. Yet under this perspective, the performative clause (that is, the performative verb together with the indexicals I and you, which refer to the speaker and the addressee of the utterance context) is semantically invisible and does not contribute its meaning compositionally to the meaning of the entire explicit performative sentence. Conversely, if we account for the truth-conditional contribution of the performative clause and deny that the meaning of the performative verb is purely performative, then we have to find a way to account for the speech act expressed by the performative verb. Of course, there is already the widely accepted and very appealing indirectness account of explicit performative utterances developed by Bach & Harnish (1979). Roughly, Bach and Harnish solve this puzzle by deriving performativity through a pragmatic inference process. According to them, the important speech act performed by uttering the explicit performative sentence is a kind of conventionalized indirect speech act. However, the boundary between semantics and pragmatics can be drawn in many ways. Therefore, I think there could be other perspectives on the interface between the truth-functional treatment of declarative explicit performative sentences and the speech acts that are performed with their utterances and expressed by the performative verbs.
Hence, this thesis is an experiment: it develops a further analysis and examines its consequences for the semantics and pragmatics of explicit performative utterances and for the new interface that emerges. Briefly, the experiment runs as follows. First, I develop an analysis for explicit performative sentences framed by parenthetical structures such as in (1)(a). In a second step, this parenthetical analysis is applied to the proper Austinian explicit performative sentences in (1)(b). (1) a. Tomorrow, I promise you this, I will teach them Tyrolean songs. b. I promise you that I will teach them Tyrolean songs. Analyzing explicit performatives framed by parenthetical structures first has the advantage that we are dealing with two utterances of two main clauses. In (1)(a) there is the utterance of the host sentence Tomorrow I will teach them Tyrolean songs, and the utterance of the explicit parenthetical I promise you this, where the demonstrative this refers to the utterance of Tomorrow I will teach them Tyrolean songs. Since speakers perform speech acts with utterances of main clauses, I assume that the meaning of the explicit parenthetical I promise you this specifies that the actual illocutionary force of the utterance of Tomorrow I will teach them Tyrolean songs is the illocutionary force of a promise. Hence, instead of deriving an indirect illocutionary force by means of a pragmatic inference schema, we can deal with an ordinary direct speech act that is performed with the utterance of the host sentence. This kind of analysis stresses the particular discourse function of explicit performative utterances. Performative verbs are used whenever the contextual information is not sufficient to determine the illocutionary force of the corresponding implicit speech act. The consequences of the parenthetical analysis are interesting since they cast a different light on performative verbs.
Surprisingly, the performative verbs are not performative at all. They do not constitute the execution of a speech act but merely support it. Instead of constituting the particular illocutionary force, they merely specify the illocutionary force of the utterance of the host sentence. For instance, the speaker utters the explicit parenthetical I promise you this to specify what he is simultaneously doing. Hence the speaker does not succeed in performing the promise simply because he is uttering I promise you this. Rather, the information conveyed by the utterance of I promise you this disambiguates the potential illocutionary forces of the utterance of the host sentence. Thus, it is not the case that explicit parentheticals are trivially true when uttered. Their function is more complex. Their self-verifying property (‘saying so makes it so’) is explained by means of disambiguation. Furthermore, according to the parenthetical analysis, instead of being purely performative, the performative verbs contribute their meanings compositionally to the truth conditions of the entire explicit performative sentence. Together with its consequences, this analysis is applied to the proper Austinian performatives, which display subordination. I assume that, regardless of their structure, explicit performatives always behave semantically and pragmatically as the parenthetical analysis predicts.
Statistical machine translation (SMT) should benefit from linguistic information to improve performance, but current state-of-the-art systems rely purely on data-driven models. There are several reasons why prior efforts to build linguistically annotated models have failed or not even been attempted. Firstly, the practical implementation often requires too much work to be cost-effective. Secondly, where ad-hoc implementations have been created, they impose constraints that are too strict to be of general use. Lastly, many linguistically motivated approaches are language-dependent, tackling peculiarities of certain languages that do not apply to other languages. This thesis successfully integrates linguistic information about part-of-speech tags, lemmas and phrase structure to improve MT quality. The major contributions of this thesis are: 1. We enhance the phrase-based model to incorporate linguistic information as additional factors in the word representation. The factored phrase-based model allows us to make use of different types of linguistic information in a systematic way within the predefined framework. We show how this model improves translation by as much as 0.9 BLEU for small German-English training corpora, and 0.2 BLEU for larger corpora. 2. We extend the factored model to the factored template model to focus on improving reordering. We show that by generalising translation with part-of-speech tags, we can improve performance by as much as 1.1 BLEU on a small French-English system. 3. Finally, we switch from the phrase-based model to a syntax-based model with the mixed syntax model. This allows us to transition from the word-level approaches using factors to multiword linguistic information such as syntactic labels and shallow tags. The mixed syntax model uses source language syntactic information to inform translation.
We show that the model is able to explain translation better, leading to a 0.8 BLEU improvement over the baseline hierarchical phrase-based model for a small German-English task. Moreover, the model requires only labels on continuous source spans and is not dependent on a tree structure, so other types of syntactic information can be integrated into the model. We experimented with a shallow parser and saw a gain of 0.5 BLEU for the same dataset. Training with more data, we improve translation by 0.6 BLEU (1.3 BLEU out-of-domain) over the hierarchical baseline. During the development of these three models, we discovered that attempting to rigidly model translation as a linguistic transfer process results in degraded performance. However, by combining the advantages of standard SMT models with linguistically motivated models, we are able to achieve better translation performance. Our work shows the importance of balancing the specificity of linguistic information with the robustness of simpler models.
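The idea of treating linguistic information as additional factors in the word representation can be pictured with a small data structure. The sketch below is only an illustration under stated assumptions: the class name `FactoredWord`, the helper `project`, the choice of three factors and the tag labels are all hypothetical and do not reproduce the thesis's implementation or any particular toolkit's format.

```python
from typing import List, NamedTuple


class FactoredWord(NamedTuple):
    """One token carrying several factors (hypothetical layout)."""
    surface: str
    lemma: str
    pos: str


def project(sentence: List[FactoredWord], factor: str) -> List[str]:
    """Read one factor off every token, e.g. the part-of-speech sequence."""
    return [getattr(word, factor) for word in sentence]


# Toy German sentence with invented annotations.
sentence = [
    FactoredWord("die", "der", "ART"),
    FactoredWord("Häuser", "Haus", "NN"),
    FactoredWord("brennen", "brennen", "VVFIN"),
]

pos_sequence = project(sentence, "pos")      # generalised, tag-level view
lemma_sequence = project(sentence, "lemma")  # morphology-neutral view
```

A tag-level view such as `pos_sequence` is the kind of generalisation that lets a model share reordering statistics across different surface words, while the surface factor remains available for lexical choice.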
This dissertation investigated the development of the complementiser that from the demonstrative pronoun in the Germanic languages; each chapter dealt with a different aspect. In the introduction, the terms ‘reanalysis’ and ‘analogy’ and their relevance for grammaticalisation were explained, and the issues of the chapters were presented. The second chapter introduced the Germanic language family and the languages relevant to this investigation, namely Gothic, Old English, Old Icelandic, Old Saxon and Old High German. Previous assumptions about the diachrony of that were presented and discussed. One of these proposals, which mainly draws on evidence from West Germanic, involves the idea that the source construction contained two independent main clauses with a demonstrative pronoun (that) at the end of the first clause (cf. e.g. Paul 1962, § 248). In contrast to this, the Gothic evidence showed that the source construction of the reanalysis of þatei was not a proper paratactic construction (at least in Gothic) but already a complex construction which contained a complementiser (ei) in the appositional subordinate clause (cf. also e.g. Longobardi 1994 for the diachrony of þatei). This contradiction raised the question of whether the analysis of the Gothic that-complementiser also applies to the diachrony of that in West Germanic. This issue was taken up in the third chapter, which presented an overview of subordination and complementisers in Northwest Germanic. The aim was to show that the Northwest Germanic languages also have a subordinating particle which functions like the Gothic ei, namely þe (OE), er/es (OI), the (OHG, OS). As a result, the subordinating particle could be observed in relative and adverbial clauses in all Northwest Germanic languages. In complement clauses, which are most crucial for the argumentation, the subordinating particle is found in Old English and Old Icelandic but not in Old Saxon.
In Old High German, there are only combinations of the with a following pronoun, theih and theiz, in Otfrid's ‘Evangelienbuch’ (see Wunder 1965). Consequently, the presence of a subordinating particle is confirmed in North and West Germanic. The fact that the patterns of subordination are quite similar in all Germanic languages suggested that a unitary analysis of the development of that in Germanic was appropriate. In chapter four, the similarities and differences between the Germanic languages with respect to the development of that were explained. It was argued that the preconditions of the reanalysis were the same, whereas the consequences of the reanalysis are realised differently in each language. The most important precondition was that the appositional source construction (explained in more detail below) was generally available in Germanic. Since the demonstrative pronoun at the end of the matrix clause and the subordinating particle of the subordinate clause were adjacent, phonological combination might have been crucial for the subsequent reanalysis to take place. After reanalysis, however, different changes can be observed in the different languages. For instance, it appears that during the Old English period the final syllable of the form þætte was deleted (see chapter 4 for references), whereas the final -ei is still present in the Gothic þatei, and completely absent in Old High German and Old Saxon. The source structure of the reanalysis was discussed in detail in a separate subsection. The appositional source construction, which was already assumed for the reanalysis of Gothic þatei, was compared with analyses of clitic left dislocation which propose that two constituents with the same theta-role derive from a Big DP (see e.g. Grewendorf 2009, Belletti 2005).
Based on the Big DP analysis of Grewendorf (2009), it was claimed that the appositional clause, introduced by the subordinating particle, is generated in the Spec of a DP, and adjoined to this DP on the surface. It was argued that this whole complement DP-node occurred in an extraposed position in OV-languages so that the verb, when it stays in-situ, does not appear between the demonstrative pronoun and the subordinating particle. The structure in (1) illustrates the syntactic source structure which is assumed to apply to the development of the complementiser that in Germanic. ...
This thesis describes the phonology, morphology and syntax of Nyam, a West Chadic minority language of north-eastern Nigeria. It is a first description, carried out as part of a project funded by the DFG entitled "Das Nyam – Dokumentation einer westtschadischen Minoritätensprache" (Nyam: documentation of a West Chadic minority language).
The aim of this thesis is to provide a grammatical description of Nyam, a hitherto unknown language. With only about 5,000 speakers, its existence is acutely threatened, both because of this small number and, above all, because of the regional dominance of Hausa, a lingua franca genetically related to it. In addition, the language is geographically exposed, i.e. it is largely surrounded by Benue-Congo languages. Against this background, documenting Nyam can, on the one hand, help Nyam speakers themselves to preserve their cultural identity and the traditions associated with it. On the other hand, this contribution helps to fill the gap left by the grammars still missing within the Chadic language family, and the Bole-Tangale group in particular, and can serve as a foundation for future comparative research with the neighbouring Benue-Congo languages.
This dissertation provides an analysis of Finnish prosody, with a focus on the sentence or phrase level. The thesis analyses Finnish as a phrase language. Thus, it accounts for prosodic variation through prosodic phrasing and explains intonational differences in terms of phrase tones.
Finnish intonation has traditionally been described in terms of accents associated with stressed syllables, i.e. in the same way as prototypical intonation languages like English or German. However, accents are usually described as uniform instead of forming an inventory of contrasting accent types. The present thesis confirms the uniformity of Finnish tonal contours and explains it as based on realisations of tones associated with prosodic phrases instead of accents. Two levels of phrasing are discussed: prosodic phrases (p-phrases) and intonational phrases (i-phrases). Most prominently, the p-phrase is marked by a high tone associated with its beginning and a low tone associated with its end; realisations of these tones form the rise-fall contours traditionally analysed as accents. The i-phrase is associated with a final tone that is either low or high and is additionally marked by voice quality and final lengthening. While the tonal specifications of these phrases are thus predominantly invariant, variation arises from different distributions of phrases.
This analysis is based on three studies, two production experiments and one perception study. The first production study investigated systematic variation in information structure, first syllable vowel quantity and the target word's position in the sentence, while the second production experiment induced variation in information structure, first and second syllable type and number of syllables. In addition to fundamental frequency, the materials were analysed regarding duration, the occurrence of pauses and voice quality. The perception study investigated the interpretation of compound/noun phrase minimal pairs with manipulated fundamental frequency contours using a two-alternative forced-choice picture selection task. Additionally, a pilot perception study on variation in peak height and timing supported the assumption of uniform tonal contours.
The empirical study of focus particles in Georgian presented here is distinguished, among other things, by the language-inherent fact that focus structures in Georgian can be marked with explicit particles. The groups of focus particles examined in this thesis (ġa, c and c+ḳi) are organised according to the semantic implications of restriction, addition and scalarity.
Despite certain differences in detail, the following common model emerged for the positional restrictions relative to the predicate verb:
• Words focused by focus particles generally stand immediately before the predicate verb in Georgian.
• The scopes of the focus particles (when the focus-marked words are the grammatical heads of the NPs) generally stand before the predicate verb in Georgian.
• The next-best interpretation position for focus-marked words is generally the position immediately after the verb.
• The next-best interpretation position for the focus group is generally the position immediately after the verb.
On the basis of the positional restrictions identified, I propose the pragmatic model of information-structuring verb-finality as the basic order of the Georgian sentence.
This dissertation explores the linguistic identity changes of Chinese international students in Germany, and the relationship between their identity reconstruction and their multilingual competence. With the social turn (Block, 2003) of applied linguistics, research on study abroad has shown that student sojourners abroad encounter challenges not only to their language abilities, but also to their identities, which explains the vast individual differences in the measurable outcomes of student sojourns abroad. However, the realm of learners’ linguistic identity development in the English as a lingua franca (ELF) and multilingual contexts remains to be further explored, since most existing studies examined learners in the target language community. Guided by poststructuralist views and sociocultural theories, this study is designed with a view towards investigating the lived experience of Chinese international students at German universities.
Employing a qualitative approach, my research tracked seventeen Chinese students’ experiences of language learning and use in both their social lives and academic settings over one year. The empirical work combined semi-structured, in-depth interviews and emails. Three rounds of one-to-one interviews were conducted at six-month intervals, and the rounds focused on students’ past, present and future respectively. The grounded theory approach (Corbin & Strauss, 2015) was used in this study to analyse the data, aiming at generating theoretical explanations for phenomena through constant comparison.
The results of the category-based analysis offer a new lens on the intricate linguistic identity development of Chinese students in the study abroad context. The construction of their new identity facets is related to various contextual elements in their experiences of language learning and use. More importantly, learners’ identity changes related to the use of ELF are conceived of within a framework of multilingualism (Jenkins, 2015). In any given social interaction, learners’ linguistic identities are influenced by a combination of factors: perceived linguistic proficiency gap, power distribution, preferred communication styles, sensitivity to second/third language self-images and openness to new cultures. It is these factors, instead of the lingua franca context or target language context per se, that come into play in the reformation of learners’ linguistic identities. Learners’ linguistic identity changes, together with their priority setting in studying abroad, are in turn interconnected with their multilingual competence development.
The findings of my study suggest theories for understanding learners’ linguistic identity development and the outcomes of their language learning in the study abroad context in the face of the complexity of individual experiences. My study also demonstrates the importance of fostering learners’ “self-presentational competence” (Pellegrino Aveni, 2005: 145-146) so that they can successfully negotiate new subject positions when crossing borders.
This thesis investigates the acquisition of compositional and lexical semantic properties of adjectives in German-speaking children between the ages of two and five.
According to formal semantic approaches, there are intersective and non-intersective adjectives, subsective and non-subsective adjectives as well as gradable and non-gradable adjectives. These properties concern the compositional mechanisms involved in nominal modification, i.e., the combination of adjectives and nouns. In addition, adjectives differ regarding lexical semantic properties that contribute to the adjectives' meaning. Differences in the adjectives' scale structure have led to the theoretical assumption that gradable adjectives should be distinguished into relative and absolute gradable adjectives. In addition, meaning components such as multidimensionality or subjectivity have led to the distinction between dimensional and evaluative gradable adjectives. These properties have been mostly investigated independently of each other in both theory and acquisition research. I suggest a classification system for adjectives that combines different semantic properties. This system results in six adjective classes constituting a Semantic Complexity Hierarchy. Assuming that these adjective classes differ in semantic complexity, I propose an operationalization of semantic complexity that takes into account the adjectives' length of description, their type complexity, and lexical properties that contribute to the adjectives' meaning.
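The contrast between intersective and subsective modification mentioned above can be made concrete with a toy set model. The following sketch follows the standard textbook treatment, in which an intersective adjective-noun combination denotes the intersection of two sets, while a subsective adjective picks out a subset of the noun denotation relative to that noun. It is not the thesis's own formalisation, and all individuals, scores and the threshold are invented for illustration.

```python
# Toy model: denotations are sets of individuals (all data invented).
balls = {"b1", "b2", "b3"}
red_things = {"b1", "b3", "car1"}


def intersective(adjective_set, noun_set):
    """Intersective modification: [[A N]] = [[A]] ∩ [[N]]."""
    return adjective_set & noun_set


# A subsective adjective cannot be modelled as a fixed set: who counts as
# 'skilful' depends on the noun. Skill is therefore stored per role.
skill = {
    ("ann", "surgeon"): 9,
    ("bob", "surgeon"): 3,
    ("bob", "violinist"): 8,
    ("cem", "violinist"): 2,
}
surgeons = {"ann", "bob"}
violinists = {"bob", "cem"}


def skilful(noun_set, role, threshold=5):
    """Subsective modification: the result is always a subset of noun_set."""
    return {x for x in noun_set if skill.get((x, role), 0) > threshold}


red_balls = intersective(red_things, balls)
skilful_surgeons = skilful(surgeons, "surgeon")
skilful_violinists = skilful(violinists, "violinist")
```

In this toy model bob is a skilful violinist and a surgeon, yet not a skilful surgeon, an inference pattern that would wrongly go through if 'skilful' were treated as intersective.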
Regarding the question of how monolingual German-speaking children acquire the semantics of adjectives, I hypothesize that the order of acquisition of adjectives is determined by their semantic complexity. This hypothesis is tested in a spontaneous speech study and a comprehension experiment.
The spontaneous speech study is a longitudinal investigation of the production of adjectives from 2;00 to 2;11 years based on transcripts from a dense data corpus. The results provide evidence that the mean age of acquisition for the adjective classes in the Semantic Complexity Hierarchy follows the order predicted by semantic complexity. The same order was observed for the age at which the number of types for each class increased most. A preliminary analysis of the input indicates that the frequency of parental adjective use is related to the order of acquisition, but it is unlikely that frequency determines the order completely.
The comprehension experiment focuses on two specific adjective classes. I examine children's and adults' interpretation of relative (big, small) and absolute (clean, dirty) gradable dimensional adjectives with a picture-choice task. These two classes are of the same semantic complexity because they are both gradable, but they have different scale structures. As a result, they must be interpreted differently due to lexical semantic properties. I investigate whether children calculate different standards of comparison for relative and absolute gradable adjectives and whether they distinguish between relative and absolute gradable adjectives regarding the relevance of the explicit comparison class. The results indicate that as of age 3, children distinguish between relative and absolute gradable adjectives with regard to the standard of comparison. However, with respect to the relevance of the comparison class, for 3-year-old children, unlike for 4- and 5-year-olds, changes in the noun, i.e., in the explicit comparison class, led to non-adult-like responses regarding both relative and absolute gradable adjectives.
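The difference in how the standard of comparison is computed for the two adjective classes can be sketched as follows. This is a deliberately simplified illustration, not the experimental design or the analysis of the thesis: it assumes that the standard for a relative adjective such as big is the mean size of the comparison class, and that the standard for an absolute adjective such as clean is the endpoint of a dirt scale. The function names and all numbers are hypothetical.

```python
from statistics import mean


def is_big(size, comparison_class_sizes):
    """Relative adjective: the standard depends on the comparison class
    (here, simplistically, the class mean)."""
    return size > mean(comparison_class_sizes)


def is_clean(dirt_amount):
    """Absolute adjective: the standard is the scale endpoint (no dirt),
    regardless of the comparison class."""
    return dirt_amount == 0


# Invented sizes in centimetres.
mouse_sizes = [8, 9, 10, 11]
elephant_sizes = [280, 300, 320]

big_for_a_mouse = is_big(11, mouse_sizes)
big_for_an_elephant = is_big(11, elephant_sizes)
```

Changing the noun changes the comparison class and therefore the truth value for big, whereas the verdict for clean stays the same, which is the kind of contrast the picture-choice task probes.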
On the basis of the empirical findings, I propose an acquisition path stating that children enter the acquisition process with inherent linguistic knowledge, the Semantic Complexity Hierarchy, and cognitive abilities to categorize their environment. I suggest that initially, children apply the least complex interpretation available in the Semantic Complexity Hierarchy to all adjectives: all adjectives are interpreted as properties of individuals that are not gradable. To access other levels of the Semantic Complexity Hierarchy and to establish more complex adjective classes, positive evidence from the input and conceptual properties of adjectives, e.g., COLOR, MENTAL STATE, PHYSICAL PROPERTY etc., can operate as triggers.