Linguistics
While anglicisms in German youth language and standard language have already been well studied, the influence of English on multiethnolectal varieties of German is still uncharted territory. With this contribution we would like to provide an impetus for future research in this area and, at the same time, take some first steps.
Multicomponent Tree Adjoining Grammars (MCTAGs) are a formalism that has been shown to be useful for many natural language applications. The definition of non-local MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. Looking only at the result of a derivation (i.e., the derived tree and the derivation tree), this simultaneity is no longer visible and therefore cannot be checked. In other words, this way of characterizing MCTAG does not allow us to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) that the MCTAG licenses. We provide similar characterizations for various types of MCTAG. These characterizations give a better understanding of the formalisms, allow a more systematic comparison of different types of MCTAG, and, furthermore, can be exploited for parsing.
This paper investigates the class of Tree-Tuple MCTAG with Shared Nodes, TT-MCTAG for short, an extension of Tree Adjoining Grammars that has been proposed for natural language processing, in particular for dealing with discontinuities and word order variation in languages such as German. It has been shown that the universal recognition problem for this formalism is NP-hard, but so far it was not known whether the class of languages generated by TT-MCTAG is included in PTIME. We provide a positive answer to this question, using a new characterization of TT-MCTAG.
Traditionally, parsers are evaluated against gold standard test data. This can cause problems if there is a mismatch between the data structures and representations used by the parser and those of the gold standard. A particular case in point is German, for which two treebanks (TiGer and TüBa-D/Z) with very different annotation schemes are available for the acquisition of, e.g., PCFG parsers. The differences between the TiGer and TüBa-D/Z annotation schemes make fair and unbiased parser evaluation difficult [7, 9, 12]. The resource (TEPACOC) presented in this paper takes a different approach to parser evaluation: instead of providing evaluation data in a single annotation scheme, TEPACOC uses comparable sentences and their annotations for 5 selected key grammatical phenomena (with 20 sentences per phenomenon) from both the TiGer and TüBa-D/Z resources. This provides a comparable test suite of 2 × 100 sentences, which allows us to evaluate TiGer-trained parsers against the TiGer part of TEPACOC and TüBa-D/Z-trained parsers against the TüBa-D/Z part of TEPACOC for key phenomena, instead of comparing them against a single (and potentially biased) gold standard. To overcome the problem of inconsistency in human evaluation and to bridge the gap between the two annotation schemes, we provide an extensive error classification, which enables us to compare parser output across the two treebanks. In the remaining part of the paper we present the test suite and describe the grammatical phenomena covered in the data. We discuss the different annotation strategies used in the two treebanks to encode these phenomena and present our classification of potential parser errors.
The paper examines accusative complements of about ten intransitive verbs in Croatian and considers whether they belong to the category of internal objects. The syntactic and semantic properties of such complements are compared: the number of nouns that occur with the intransitive verb, the obligatoriness of modification of the noun, the parallelism of instrumental and accusative phrases, the possibility of paraphrase by an instrumental phrase, and the possibility of pronominalization and passivization. The authors conclude that the complements of the investigated intransitive verbs do not all belong to the same type of complement and that they need to be distinguished grammatically and terminologically. They also hypothesize that internal objects in Croatian can have both an argument and an adjunct reading, which is in line with some recent claims made for other languages.
Preface. These grammatical notes are the result of the work of the Sociedade Internacional de Linguística (SIL) in Mozambique. The purpose of the series Monografias Linguísticas Moçambicanas is to encourage the use of the local language, in this case Echuwabo, through structured description, and to give the general public better access to one more aspect of the rich Mozambican culture. The notes on Ecuwabu were produced during the workshop “Descubra a Sua Língua” (“Discover Your Language”), held at the SIL training centre in the city of Nampula from 4 to 20 June 2006. The participants received training in the structure of Bantu languages in general and then investigated their own mother tongues. This booklet is not meant to be “the last word” on the Ecuwabu language; rather, it is meant to stimulate more interest in the use and study of Ecuwabu, among speakers and non-speakers of the language alike. It should be stressed that the audience we had in mind is the citizen without academic training; scholars are advised to read the linguistic books and articles listed in the bibliographic appendix. I would like to thank Mr. Romão Marçal, who typed this document on the computer, and our colleagues Ms. Susan Seiler and Ms. Marijane Beutler, who did the formatting and printing of this book. Dr. Oliver Kröger, Nampula, June 2003
We present a CYK and an Earley-style algorithm for parsing Range Concatenation Grammar (RCG), using the deductive parsing framework. The characteristic property of the Earley parser is that we use a technique of range boundary constraint propagation to compute the yields of non-terminals as late as possible. Experiments show that, compared to previous approaches, the constraint propagation helps to considerably decrease the number of items in the chart.
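For readers unfamiliar with the chart-based style of parsing the abstract refers to, the following is a minimal sketch of the classic CYK recognizer for a context-free grammar in Chomsky normal form. It is not the RCG algorithm of the paper, which works with ranges and constraint propagation; the function name, the grammar encoding, and the toy grammar are assumptions made purely for illustration.

```python
from itertools import product


def cyk_recognize(words, lexical, binary, start="S"):
    """Return True if `words` can be derived from `start`.

    `lexical` maps a word to the set of nonterminals that rewrite to it;
    `binary` maps a pair (B, C) to the set of nonterminals A with A -> B C.
    Both encodings are illustrative assumptions, not taken from the paper.
    """
    n = len(words)
    if n == 0:
        return False
    # chart[(i, j)] holds every nonterminal that spans words[i:j].
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexical.get(word, ()))
    # Build longer spans from pairs of adjacent shorter spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                for b, c in product(chart[(i, k)], chart[(k, j)]):
                    cell |= binary.get((b, c), set())
            chart[(i, j)] = cell
    return start in chart[(0, n)]


# Hypothetical toy grammar: S -> NP VP, VP -> V NP.
LEXICAL = {"she": {"NP"}, "fish": {"NP", "V"}, "eats": {"V"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}

accepted = cyk_recognize(["she", "eats", "fish"], LEXICAL, BINARY)
rejected = cyk_recognize(["eats", "she"], LEXICAL, BINARY)
```

Each chart cell plays the role of a set of deduced items; the number of such items is the quantity that the abstract reports as being reduced by constraint propagation in the RCG setting.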
Distributional approximations to lexical semantics are very useful not only in helping the creation of lexical semantic resources (Kilgarriff et al., 2004; Snow et al., 2006), but also when directly applied in tasks that can benefit from large-coverage semantic knowledge such as coreference resolution (Poesio et al., 1998; Gasperin and Vieira, 2004; Versley, 2007), word sense disambiguation (McCarthy et al., 2004) or semantic role labeling (Gordon and Swanson, 2007). We present a model that is built from Web-based corpora using both shallow patterns for grammatical and semantic relations and a window-based approach, using singular value decomposition to decorrelate the feature space, which is otherwise too heavily influenced by the skewed topic distribution of Web corpora.
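As a rough illustration of the window-based part of such a model, the sketch below counts co-occurrences within a fixed window and compares two words by cosine similarity. The pattern-based features and the singular value decomposition step mentioned in the abstract are deliberately left out; the function names, the window size, and the toy sentence are assumptions, not details from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def window_vectors(tokens, window=2):
    """Map each token to a Counter of the tokens seen within `window` positions."""
    vectors = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus; a real model would be built from Web-scale text.
tokens = "the cat sat on the mat".split()
vectors = window_vectors(tokens, window=1)
similarity = cosine(vectors["cat"], vectors["mat"])
```

In a fuller pipeline, the sparse counts would be arranged as a word-by-context matrix, and a truncated SVD of that matrix would supply the decorrelated, lower-dimensional features.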
With the possibility of using digital telephone line records to capture family names with great precision in terms of inventory, number of bearers, and geographic distribution, a new era of anthroponomastics has begun. The treasure of 850,661 distinct family names that were registered in 28,205,713 private landline connections in 2005 is immense, and the questions that can guide its investigation are inexhaustible in their orientation and number. In this situation, two tasks were most urgent. First, in view of population mobility growing from year to year, the effects of recent naming legislation, and the rapidly increasing replacement of localized landline connections by mobile phones, the name inventory had to be secured and archived now at the latest, on the basis of the most reliable source and in a legitimately usable way. The historically grown name landscapes are only just still preserved, and with astonishing stability. After the data protection issues had been clarified, Deutsche Telekom made the data available to the Deutscher Familiennamenatlas as of June 2005, and their use for onomastic research was regulated by a contract dated 28 June 2005.