Linguistics
The paper gives an overview of the problems connected with the normative status of the particle/conjunction groups da li and je li and of the particle/conjunction li. It shows that there are several errors in the interpretation of the normative status and distribution of these groups and of this particle, and it tests the normative rule according to which the group da li should be replaced in the standard language by the particle li. This rule is often, quite wrongly, described as the replacement of da li by je li, although the group je li, with the exception of the group je li da, which functions as a tag question (dopunsko pitanje), does not exist in the standard language as a particle/conjunction group, because its first member is always the third person present of the verb biti. The normative status of the group je li is determined: it is shown that in Croatian it is either obsolete or belongs to the colloquial style. The paper also tests the rules according to which the normative status of the group da li in a direct question differs from its status in an indirect question, and according to which the group da li also occurs in the standard language when expressing affirmation and in alternative questions. Finally, the conditions under which the group da li can be replaced by the particle/conjunction li are given, that is, the syntactic contexts in which this replacement is unnecessary or impossible are singled out.
Multicomponent Tree Adjoining Grammar (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic, since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. This way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) that the MCTAG licenses. This definition gives a better understanding of the formalism, allows a more systematic comparison of different types of MCTAG, and, furthermore, can be exploited for parsing.
We investigate methods to improve the recall in coreference resolution by also trying to resolve those definite descriptions where no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among mentions which have the same lexical head, has been tackled with several rather different approaches, and we attempt to provide a meaningful classification along with a quantitative comparison. Based on the different merits of the methods, we discuss possibilities to improve them and show how they can be effectively combined.
Prepositional phrase (PP) attachment is one of the major sources of errors in traditional statistical parsers. The reason lies in the type of information necessary for resolving structural ambiguities. For parsing, it is assumed that distributional information about parts of speech and phrases is sufficient for disambiguation. For PP attachment, in contrast, lexical information is needed. The problem of PP attachment has sparked much interest ever since Hindle and Rooth (1993) formulated it in a way that can be easily handled by machine learning approaches: in their approach, PP attachment is reduced to the decision between noun and verb attachment, and the relevant information is reduced to the two possible attachment sites (the noun and the verb) and the preposition of the PP. Brill and Resnik (1994) extended the feature set to the now standard 4-tuple, which also contains the noun inside the PP. Among the many publications on the problem of PP attachment, Volk (2001; 2002) describes the only system for German. He uses a combination of supervised and unsupervised methods. The supervised method is based on the back-off model by Collins and Brooks (1995); the unsupervised part consists of heuristics such as “If there is a support verb construction present, choose verb attachment”. Volk trains his back-off model on the Negra treebank (Skut et al., 1998) and extracts frequencies for the heuristics from the “Computerzeitung”. The latter also serves as the test data set. Consequently, it is difficult to compare Volk’s results to other results for German, including the results presented here, since he not only uses a combination of supervised and unsupervised learning but also performs domain adaptation. Most researchers working on PP attachment seem to be satisfied with a standalone PP attachment system; we have found hardly any work on integrating the results of such approaches into actual parsers. The only exceptions are Mehl et al. (1998) and Foth and Menzel (2006), both working with German data. Mehl et al. report a slight improvement in PP attachment, from 475 correct PPs out of 681 for the original parser to 481 correct PPs. Foth and Menzel report an improvement in overall accuracy from 90.7% to 92.2%. Both integrate statistical attachment preferences into a parser. First, we will investigate whether dependency parsing, which generally uses lexical information, shows the same performance on PP attachment as an independent PP attachment classifier does. Then we will investigate an approach that allows the integration of PP attachment information into the output of a parser without having to modify the parser: the results of an independent PP attachment classifier are integrated into the parse of a dependency parser for German in a postprocessing step.
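To put the counts reported for Mehl et al. on the same scale as the percentages reported for Foth and Menzel, the arithmetic can be sketched as follows. This is only an illustrative calculation from the figures quoted in the abstract; the function name `accuracy` and the variable names are assumptions introduced here, not part of any of the cited systems.

```python
def accuracy(correct: int, total: int) -> float:
    """Return the share of correctly attached PPs as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Figures quoted in the abstract for Mehl et al. (1998).
TOTAL_PPS = 681
baseline = accuracy(475, TOTAL_PPS)   # original parser
improved = accuracy(481, TOTAL_PPS)   # with attachment preferences
gain = improved - baseline            # difference in percentage points

# Overall accuracy figures quoted for Foth and Menzel (2006).
foth_menzel_gain = 92.2 - 90.7

print(f"Mehl et al.: {baseline:.2f}% -> {improved:.2f}% (+{gain:.2f} points)")
print(f"Foth and Menzel: +{foth_menzel_gain:.1f} points overall accuracy")
```

On these numbers, the gain for Mehl et al. is under one percentage point of PP attachment accuracy. The two gains are not directly comparable, since one is measured on PP attachment only and the other on overall parsing accuracy.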
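Deklinacija brojeva dva, oba, tri i četiri u kajkavskim pravnim tekstovima od 16. do 18. stoljeća [The declension of the numerals dva, oba, tri and četiri in Kajkavian legal texts from the 16th to the 18th century]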
(2007)
In this article the authors deal with the declension of the numerals dva ('two'), oba ('both'), tri ('three') and četiri ('four') in Kajkavian texts of legal regulation from the 16th to the 18th century. The corpus for the linguistic analysis comprises 23 texts from the 16th century, 40 texts from the 17th century and 19 texts from the 18th century. The analysis pays particular attention to the comparison between dual and plural forms in the declension of dva and oba, and to the development of plural forms in the declension of tri and četiri. The authors list all attested forms of the four numerals, compare their occurrence across the different periods and, on the basis of the results of the analysis, propose a declension type for these numerals. The declension of the numerals in the oblique cases is examined with regard to whether the numerals are part of prepositional or non-prepositional expressions, and the frequency of indeclinable forms is treated as a separate question.
Family names are the only area of the European languages whose pronounced regional diversity has so far been recorded very inadequately. The historically grown name landscapes are still preserved with astonishing stability. For the Federal Republic of Germany, they are being documented by the 'Deutscher Familiennamenatlas' (DFA), a project begun in 2005 in cooperation between the universities of Freiburg and Mainz and funded by the DFG, on the basis of telephone connections (as of 2005). This article presents and justifies the preliminary work, the goals and the overall design of the project, the systematics and representativeness of the selection of topics in the two main parts (grammatical and lexical), and the criteria and methods for the conceptual design and formal layout of the maps and commentaries. The preliminary work also already reveals perspectives for the future evaluation of the materials archived in the databases and of the structures of the name landscapes documented in the atlas by way of example.
The first part begins by presenting and discussing the small body of research literature on descriptivity itself and on closely related topics. From this discussion, a definition of the term is to emerge that is broad enough to cover its usual usage by authors who employ it without treating it theoretically, but that nevertheless stays within clearly defined and comprehensible limits. It should also become clear that descriptivity is a phenomenon found in principle in all languages, but that the frequency of descriptive expressions can differ greatly from language to language. To this end, I draw on data from selected languages and present a quantitative analysis of the extent to which different languages make use of descriptive formations. The second main part of the study addresses the following question: if every language makes use of descriptive designations to some degree, which mechanisms of language change can shift a language's position on this scale in one direction or the other?
The target-like use of the article as a grammaticalized means of NP determination in German poses a major difficulty in second language acquisition, especially for learners of German whose native language has no articles. This study investigates NP determination on the basis of a corpus of spontaneous speech that provides acquisition data from an eight-year-old Russian learner of German in an early and a late phase of acquisition. The aim of the study is to gain insights into the course of development, transfer phenomena and, in particular, the reference-semantic and phonological determinants of article choice.
Speech acts are most easily recognized and delimited in dialogue, so dramatic texts are very well suited to analysing and questioning speech act theory. Krleža's drama U agoniji can be approached as a corpus for exemplifying the constative and the performative conceptions of language. In this drama the conflict is indeed built on opposing conceptions of language, and this is also made verbally explicit, so the drama unfolds on a kind of metalinguistic level on which the main characters “quarrel” because they speak different languages. The speech acts in the drama, especially compliments, are also analysed from the perspective of feminist linguistics.