Linguistik
This paper publishes for the first time the only version found so far of a Glagolitic Passion poem, which we have named Ja, Marija, glasom zovu, recorded in Berčić's codex no. 5 from the end of the 15th century. It provides a Latin-script transcription of the text and outlines its basic literary-historical, graphic-orthographic, and linguistic-stylistic features.
The paper analyses the syntactic function of participles in the Croatian language of the 15th and 16th centuries, because around that time major changes were taking place in the syntactic structure of (Old) Croatian as a consequence of the "departicipialization" of participles, i.e. the transformation of inherited participial forms into verbal adverbs.
The paper analyses the chapter Sprichwörter – Prirečja from Kristijanović's Anhang, a supplementary dictionary appended to his Grammatik der kroatischen Mundart. Prirečja contains Kajkavian paremiological material with German equivalents, which is analysed with regard to its sources and lexicographic treatment. It is shown that, besides proverbs, the dictionary also contains idioms and colloquial expressions. Particular attention is paid to the semantic analysis of the proverbs, indicating which aspect of human life their moral and message relate to.
The paper discusses the prefix ne- in the Kajkavian literary language. It examines the word-formation processes in which the prefix takes part and the formation types in which it occurs, and investigates its frequency in the formation of individual word classes. It also explores the word-formational and semantic links between derivatives with the prefix ne- and the words that motivated them.
The paper analyses genuine derivational gender pairs (masculine and feminine counterparts formed by motion) in the Kajkavian literary language. It identifies the suffixes that are productive in gender-marking derivation in literary Kajkavian, their frequency, and the correlative pairs in which they occur. The results are compared with the characteristics of gender-marking derivation in standard Croatian.
The paper analyses the role of deictic markers in the generic structuring of discourse. It first recalls that, in existing typologies of discourse genres, the presence of deictic markers and other traces of subjectivity is an important criterion for delimiting genres that coexist in a given socio-historical framework within a given type of discourse, and for describing the conditions of their diversification. Then, using the example of media information discourse, whose generic structuring is influenced by various objectivization strategies, it seeks to show that this criterion acquires its full meaning only in combination with criteria concerning the textual and situational features relevant to the generic structuring of discourse. These are, on the one hand, the communicative goals of the participants in the interaction and the specific discursive activities that shape the relational profile of textual structures and, on the other hand, the complexity of the interactional framework and the degree of heterogeneity of deictic and polyphonic structures.
The paper analyses the role of one type of referential expression, anaphoric expressions, in the discursive shaping of a selected media-scientific event (the "resurrection" of the bacterium Deinococcus radiodurans). It proposes a transversal analysis of anaphoric expressions based on a modular approach to the complexity of discourse organization and on a dynamic conception of anaphoric reference, understood as a segment of a broader process of conceptually structuring the discourse world and aligning the mental representations of the participants in the interaction.
The subject of this paper is Kajkavisms in the Tkonski zbornik, a Glagolitic manuscript written at the beginning of the 16th century on the Frankopan estates. Kajkavisms were found in the manuscript at all levels: phonological, morphological, lexical, and syntactic. They are most numerous at the lexical level and can be divided into two groups: 1. a common Čakavian-Kajkavian layer, e.g. betegь, gdo, nigdar, hiniti, hud, kaštigati, lotar, etc.; 2. a Kajkavian layer, e.g. fajtati, gorup, nekoteri, pokrivača, škoda, špotati, tanac, etc. Lexemes of the first category are interpolated in almost all parts of CTk, while those of the second are most frequent in Cvet od kreposti and Muka. The Tkonski zbornik preserves a great lexical wealth, and a comparison of individual lexemes with those in Croatian Glagolitic missals and breviaries showed that some of them are also attested earlier, e.g. betegь, kaštigati, praviti, gorup, tanac, etc. This confirms the continuity of Croatian Glagolitic literature. The interpolation of Kajkavisms is not uniform across the miscellany: Kajkavian interventions are most frequent in Cvet od kreposti (f. 67–85) and in Muka Spasitelja našega (f. 109–161). On the basis of this research it can be concluded that the Tkonski zbornik is a manuscript composed of different parts that originated neither in the same period nor in the same place. Since Kajkavisms occur at all levels in certain parts (Cvet od kreposti and Muka), it can be assumed that these parts originated in the northern area, i.e. closer to the Kajkavian area.
After giving an overview of how Business German has been implemented in the curricula of German departments outside Germany and of the place it holds within these departments today, this article focuses on the teaching goals and contents, as well as the competences students are expected to achieve, in the German Department at Istanbul University, in order to explain which opportunities this field of study opens up to students of German language and literature.
Speakers of Russian from the former Soviet Union and speakers of Turkish form the two biggest groups of immigrants in Germany. There are a number of surveys that focus on early second language acquisition by kindergarten and primary school children in these ethnic groups. In this article, I discuss differences and similarities in the second language acquisition process that Russian-speaking and Turkish-speaking children go through. I compare not only the interlingual development (pronunciation, lexicon, syntax and morphology) but also the sociocultural context. For this purpose, the data from my case studies are contrasted with other research results.
The paper describes and analyses the formation of demonyms (etnici) and ktetics (adjectives derived from place names) in the Kajkavian dialect. The analysis is based on data collected in field research over almost the past fifty years in the questionnaires for the Hrvatski jezični atlas (HJA), which is being compiled at the Institut za hrvatski jezik i jezikoslovlje, and on data from dialect dictionaries.
The article presents the development of the surname system of Miljevci from the first surnames recorded in parish registers at the end of the 17th century to surnames that first appear only at the end of the 19th century. It establishes which surnames have died out in the meantime, i.e. which disappeared because a family line became extinct or because the surname was replaced by a new one, most often the former family nickname. The motivational and structural properties of present-day Miljevci surnames and their linguistic origin are analysed.
The notation of palatal consonants caused the most problems among the Slavic peoples who sought to adapt the basic Latin alphabet to the phonemes of their languages. This paper examines how palatals were written by authors of the Zadar-Šibenik circle from the 14th to the 17th century. The first Croatian texts written in Latin script originated in this region. The aim of the paper is to establish how individual authors rendered the problematic phonemes of the Croatian (Čakavian) language in Latin script, what differences and similarities their graphic solutions show, and which tendencies are reflected in the centuries-long use of Latin script in this area.
Football, also called "soccer", is the best-known sport in the world. Its history goes back to ancient times. This article examines the football terms used in Germany and Turkey together with the historical development of football. Various differences and similarities between these terms, and their features, are also demonstrated.
Globalism, driven by digital connections, is felt in all areas of our lives. Globalism changes local conditions. However, local realities persist, and people live under local conditions. As a result, according to R. Robertsson, "globalocalisation" emerges. How is a language influenced by this "globalocalisation" process? This study uses examples to investigate the changes in language that result from globalocal interactions.
Pokazatelji brojivosti
(2007)
Standardization is the most important approach to improving quality and reducing costs in technical documentation. There are a number of standardization approaches: modularization, information structures, terminology, and language structures. However, these levels are mostly described separately from one another. We investigate how standardizations in the information model, in terminology, and in linguistic structures are linked and interact with one another.
Intimität und Geschlecht : zur Syntax und Pragmatik der Anrede im Liebesbrief des 20. Jahrhunderts
(2000)
Dividing the lifeworld into a private sphere and a public sphere would make it easier to locate intimacy. It seems, however, that intimacy cannot be assigned to a clearly delimited domain and must instead be understood as a relational category. Historical comparison in particular (cf. CORBIN 1992) permits neither uniform spatial or bodily criteria nor aesthetic criteria for delimiting intimacy. ...
In recent decades, the resource "knowledge" has increasingly moved to the centre of interest as a source of scientific innovation. This focus led to a self-reflection of science and of the scientific disciplines: the main topics are the way in which knowledge is gained and the related question of how scientificity is constructed, which at the same time draws attention to the increasingly dissolving boundaries between the disciplines, or between the three main cultures of scholarship: the natural sciences, the humanities and cultural studies, and the social sciences. Inside and outside the universities, "trading zones" (Gallison 1997) that cannot always be clearly located have formed and continue to form, in which new forms and techniques of knowledge production and knowledge transfer are tested, practised, and in part also institutionalized. ...
In linguistics and the philosophy of language, the mass/count distinction has traditionally been regarded as a bi-partition on the nominal domain, where typical instances are nouns like "beef" (mass) vs. "cow" (count). In the present paper, we argue that this partition reveals a system that is based on both syntactic features and conceptual features, and present experimental evidence suggesting that the discrimination of the two kinds of features has a psychological reality.
Zur Versprachlichung des Raums in Bildergeschichten deutschsprachiger Vor- und Grundschulkinder
(2002)
The subject of the present study is the verbalization of space in picture stories told by German-speaking preschool and primary school children. Methodologically, the study belongs to work on the development of children's narrative competence based on picture stories, as carried out more recently […] (see especially Berman & Slobin 1994). With regard to the general linguistic analysis of the verbalization of space, the study is indebted above all to the typological studies of L. Talmy (1985, 1991). […] The aim of the study is to examine the role of the verbalization of spatial relations from two perspectives: with regard to the construction of coherent narrative discourse, and with regard to the linguistic means by which children refer to static and dynamic spatial relations. In line with the tripartite I-now-here origo, spatial relations are, alongside reference to persons and temporal relations, one of the three domains in which textual coherence manifests itself. The verbalization of space concerns, on the one hand, the introduction, maintenance, and shift of narrative locations and, on the other hand, static spatial configurations as opposed to dynamic spatial events. […] The study is divided into a theoretical and an empirical main part.
In this paper I will present some empirical studies concerning a linguistic construction called binomials, e.g. auf und ab (‚up and down‘). Binomials consist of two coordinated elements in a fixed order ‚A and B‘, whereas empirically the reversed order ‚B and A‘ is rarely found and, asked for acceptability judgements, native speakers tend to reject it. In two corpus studies hypotheses on phonological principles responsible for the ordering of the constituents are tested. Furthermore I present a pseudoword experiment with German native speakers and Russian and Turkish learners of German as a second language. Results are discussed in the framework of optimality theory.
This paper describes the creation and preparation of TUSNELDA, a collection of corpus data built for linguistic research. This collection contains a number of linguistically annotated corpora which differ in various aspects such as language, text types / data types, encoded annotation levels, and the linguistic theories underlying the annotation. The paper focuses on this variation on the one hand and on the way in which these heterogeneous data are integrated into one resource on the other.
The paper presents the results of field research on names for the braided Easter bread (uskršnja pletenica), carried out at almost two hundred localities along the Croatian coast and in the interior of Istria and supplemented with data from published literature. For comparison, the research also covered some settlements in Gorski kotar and in the Dalmatian hinterland. Analysis of the collected corpus points to the strong presence of the content image 'bird' in Croatian Adriatic names for the braided Easter bread, whether the names are of Slavic or Romance (Dalmatian, Istro-Romance, or Venetian) origin. On the basis of these observations, some previously proposed etymologies (Skok, Vinja) are reconsidered and new etymological proposals are offered.
The paper analyses the position and fate of Jadertine, the autochthonous Romance idiom of Zadar, a member of the Dalmatian, or Illyro-Romance, group of Romance languages. The entire 14th century is the period in which the linguistic side of Zadar's autochthonous Romance culture is best attested, but also the period of its inexorable decline: present in private and public documents more clearly and plainly than ever before or after, Jadertine already appears in them deeply Venetianized, coinciding more or less with the linguistic model that recent Italian literature calls volgare venezianeggiante. Analysing the language of 14th-century Zadar inventories, wills, letters, and notes written entirely in the local Romance idiom, the author views the disappearance of autochthonous Jadertine linguistic features in the context of linguistic convergence, a slow and steady approximation to Venetian linguistic models.
In her paper on the substandard prepositional infinitive (1982), J. Melvinger does not mention the possibility of infinitival condensation of consecutive (result) structures, with either the prepositional or the prepositionless infinitive, although she gives examples involving a prepositional infinitive construction that is an adverbial of result rather than an adverbial of manner, as she claims: Kožnata jakna smiješna, a šal oko vrata škaklja za poludjeti. Nor does she mention this possibility in her dissertation (although she cites examples that we understand as consecutive constructions), and M. Ivić does not mention it either.
Red surečenica
(2009)
The question of clause order in consecutive sentences, i.e. whether the clauses can be transposed (inverted), is one of those not uncommon questions in Croatian linguistics that is considered settled even though no one has dealt with it properly and systematically. It is unanimously and without exception assumed that the clause order in consecutive sentences (and in some other complex sentences) is main clause – subordinate clause, fixed and irreversible. However, the claim that complex sentences, unlike compound ones, can transpose their clauses, and that this fails to hold only for consecutive and some other sentences, is not accurate. As this paper shows, in some types of consecutive and other sentences the subordinate clause can precede the main clause, i.e. their constituents can swap places.
The paper describes the syntactic functions of the present active participle and the past active participle I in Katančić's translation of the Bible (1831). Special attention is given to the participial construction known as the nominative absolute (participle in the nominative + noun in the nominative), and its syntactic functions are established.
The paper analyses the second complete published translation of the Bible into Croatian, Škarić's Sveto pismo Staroga i Novoga uvita (Vienna, 1858–1861); it describes its linguistic features and establishes its place in the long Croatian tradition of Bible translation as well as its influence on the standardization of the Croatian language.
Based on data obtained through targeted field research on the local speech of Novalja on the island of Pag, the article presents the accentual types of nouns in that speech with regard to the position and type of accent (along with subtypes based on the presence of pretonic lengths and variants based on the degree of innovation): a) type a, with fixed accent on a stem vowel; b) type b, with fixed accent on a vowel of the ending; c) type c, with accent alternating between a stem vowel and a vowel of the ending.
Weak function word shift
(2004)
The fact that object shift only affects weak pronouns in mainland Scandinavian is seen as an instance of a more general observation that can be made in all Germanic languages: weak function words tend to avoid the edges of larger prosodic domains. This generalisation has been formulated within Optimality Theory in terms of alignment constraints on prosodic structure by Selkirk (1996) in explaining the distribution of prosodically strong and weak forms of English function words, especially modal verbs, prepositions and pronouns. But a purely phonological account fails to integrate the syntactic licensing conditions for object shift in an appropriate way. The standard semantico-syntactic accounts of object shift, on the other hand, fail to explain why it is only weak pronouns that undergo object shift. This paper develops an Optimality-theoretic model of the syntax-phonology interface which is based on the interaction of syntactic and prosodic factors. The account can successfully be applied to further related phenomena in English and German.
This paper argues for a particular architecture of OT syntax. This architecture has three core features: i) it is bidirectional: the usual production-oriented optimisation (called ‘first optimisation’ here) is accompanied by a second step that checks the recoverability of an underlying form; ii) this underlying form already contains a full-fledged syntactic specification; iii) the procedure checking for recoverability in particular makes crucial use of semantic and pragmatic factors. The first section motivates the basic architecture. The second section shows with two examples how contextual factors are integrated. The third section examines its implications for learning theory, and the fourth section concludes with a broader discussion of the advantages and disadvantages of the proposed model.
This paper is part of a research project on OT Syntax and the typology of the free relative (FR) construction. It concentrates on the details of an OT analysis and some of its consequences for OT syntax. I will not present a general discussion of the phenomenon and the many controversial issues it is famous for in generative syntax.
The aim of this paper is the exploration of an optimality theoretic architecture for syntax that is guided by the concept of "correspondence": syntax is understood as the mechanism of "translating" underlying representations into a surface form. In minimalism, this surface form is called "Phonological Form" (PF). Both semantic and abstract syntactic information are reflected by the surface form. The empirical domain where this architecture is tested are minimal link effects, especially in the case of "wh"-movement. The OT constraints require the surface form to reflect the underlying semantic and syntactic representations as maximally as possible. The means by which underlying relations and properties are encoded are precedence, adjacency, surface morphology and prosodic structure. Information that is not encoded in one of these ways remains unexpressed, and gets lost unless it is recoverable via the context. Different kinds of information are often expressed by the same means. The resulting conflicts are resolved by the relative ranking of the relevant correspondence constraints.
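The idea that conflicts between constraints are resolved by their relative ranking can be illustrated with a minimal sketch of standard OT evaluation. The constraint names, candidate forms and violation counts below are hypothetical and are not taken from the paper; the sketch only shows how re-ranking two constraints changes the winning candidate.

```python
# Minimal sketch of Optimality-Theoretic evaluation (illustrative only).
# Constraint names, candidates and violation counts are hypothetical.

def evaluate(candidates, ranking):
    """Return the form(s) whose violation profile is lexicographically
    smallest when constraints are read from highest to lowest rank."""
    def profile(cand):
        # One violation count per constraint, ordered by rank.
        return tuple(cand["violations"].get(c, 0) for c in ranking)

    best = min(profile(c) for c in candidates)
    return [c["form"] for c in candidates if profile(c) == best]


candidates = [
    {"form": "A B", "violations": {"PRECEDENCE": 0, "ADJACENCY": 1}},
    {"form": "B A", "violations": {"PRECEDENCE": 1, "ADJACENCY": 0}},
]

# Ranking PRECEDENCE above ADJACENCY selects one order; the reverse
# ranking selects the other.
print(evaluate(candidates, ["PRECEDENCE", "ADJACENCY"]))
print(evaluate(candidates, ["ADJACENCY", "PRECEDENCE"]))
```

Comparing violation profiles as tuples captures strict domination: a single violation of a higher-ranked constraint outweighs any number of violations of lower-ranked ones.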
The argument that I tried to elaborate on in this paper is that the conceptual problem behind the traditional competence/performance distinction does not go away, even if we abandon its original Chomskyan formulation. It returns as the question about the relation between the model of the grammar and the results of empirical investigations – the question of empirical verification. The theoretical concept of markedness is argued to be an ideal correlate of gradience. Optimality Theory, being based on markedness, is a promising framework for the task of bridging the gap between model and empirical world. However, this task not only requires a model of grammar, but also a theory of the methods that are chosen in empirical investigations and how their results are interpreted, and a theory of how to derive predictions for these particular empirical investigations from the model. Stochastic Optimality Theory is one possible formulation of a proposal that derives empirical predictions from an OT model. However, I hope to have shown that it is not enough to take frequency distributions and relative acceptabilities at face value, and simply construe some Stochastic OT model that fits the facts. These facts first of all need to be interpreted, and those factors that the grammar has to account for must be sorted out from those about which grammar should have nothing to say. This task, to my mind, is more complicated than the picture that a simplistic application of (not only) Stochastic OT might draw.
This paper deals with 232 family nicknames in Pučišća on the island of Brač. Family nicknames, an additional means of identification that developed as early as the pre-surname period and later became more widespread because of the large number of bearers of individual surnames, are a distinctive feature of the Croatian islands that has not yet been sufficiently studied. In Pučišća, family nicknames are recorded from the end of the 16th century, and on the basis of their motivation it is possible to partially reconstruct the stock of personal names (the relationship between Croatian folk names and Croatian and more recent Romance adaptations of Christian names), physical appearance (especially bodily defects), character traits (mostly unconventional ones), and the origin and everyday life of the inhabitants of Pučišća. The stock of family nicknames is considerably more open to foreign-language systems (chiefly Romance ones) and reflects a kind of thousand-year Croatian-Romance symbiosis on the eastern coast of the Adriatic Sea.
This paper seeks to survey the numerous and varied reflexes of the saint's name Ivan in the Croatian anthroponymic stock, with particular emphasis on southern Dalmatia (including Boka kotorska) and Lower Herzegovina. The introductory part presents reflexes of the Hebrew male personal name Jehochánán in various Indo-European and non-Indo-European languages; it then explains the origin of the Croatian saint's name Ivan and its reflexes in the Croatian anthroponymic stock, with special emphasis on similarities to and differences from the anthroponymic stocks of closely related South Slavic languages.
This paper attempts to survey the numerous and varied reflexes of the saint's name Juraj in the Croatian anthroponymic system, with particular emphasis on Zažablje (the area between the small river Mislina, east of Metković, and the western borders of the former Republic of Dubrovnik, today the municipality of Dubrovačko primorje, and from Hrasno in the north to Neum in the south) and Popovo (south-western Herzegovina). Drawing on selected literature and the author's own field research, it also seeks to set out some extralinguistic (chiefly historical and sociolinguistic) facts that account for this situation.
Based on field research, this paper deals with the toponymy of the now almost completely abandoned village of Dubljani in Popovo in eastern Herzegovina. The local toponymy is dominated by toponyms of anthroponymic origin, which shed light on the past and present property and legal organization of medieval Hum; the toponym Satùlija (‘Sanctus Elias’) recalls ancient Romance-Croatian contacts, and the toponym Sačìvišće illustrates the very complex dialect picture of eastern Herzegovina.
In the past, a divide could be seen between 'deep' parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More recently, approaches have emerged that combine the robustness of data-driven (statistical) models with more detailed linguistic interpretation such that the output could be used for deeper semantic analysis. Cahill et al. (2002) use a PCFG-based parsing model in combination with a set of principles and heuristics to derive functional (f-)structures of Lexical-Functional Grammar (LFG). They show that the derived functional structures have a better quality than those generated by a parser based on a state-of-the-art hand-crafted LFG grammar. Advocates of Dependency Grammar usually point out that dependencies already are a semantically meaningful representation (cf. Menzel, 2003). However, parsers based on dependency grammar normally create underspecified representations with respect to certain phenomena such as coordination, apposition and control structures. In these areas they are too "shallow" to be directly used for semantic interpretation. In this paper, we adopt a similar approach to Cahill et al. (2002) using a dependency-based analysis to derive functional structure, and demonstrate the feasibility of this approach using German data. A major focus of our discussion is on the treatment of coordination and other potentially underspecified structures of the dependency data input. F-structure is one of the two core levels of syntactic representation in LFG (Bresnan, 2001).
Independently of surface order, it encodes abstract syntactic functions that constitute predicate argument structure and other dependency relations such as subject, predicate, adjunct, but also further semantic information such as the semantic type of an adjunct (e.g. directional). Normally f-structure is captured as a recursive attribute value matrix, which is isomorphic to a directed graph representation. Figure 5 depicts an example target f-structure. As mentioned earlier, these deeper-level dependency relations can be used to construct logical forms as in the approaches of van Genabith and Crouch (1996), who construct underspecified discourse representations (UDRSs), and Spreyer and Frank (2005), who have robust minimal recursion semantics (RMRS) as their target representation. We therefore think that f-structures are a suitable target representation for automatic syntactic analysis in a larger pipeline of mapping text to interpretation. In this paper, we report on the conversion from dependency structures to f-structure. Firstly, we evaluate the f-structure conversion in isolation, starting from hand-corrected dependencies based on the TüBa-D/Z treebank and Versley (2005)'s conversion. Secondly, we start from tokenized text to evaluate the combined process of automatic parsing (using Foth and Menzel (2006)'s parser) and f-structure conversion. As a test set, we randomly selected 100 sentences from TüBa-D/Z which we annotated using a scheme very close to that of the TiGer Dependency Bank (Forst et al., 2004). In the next section, we sketch dependency analysis, the underlying theory of our input representations, and introduce four different representations of coordination. We also describe Weighted Constraint Dependency Grammar (WCDG), the dependency parsing formalism that we use in our experiments. Section 3 characterises the conversion of dependencies to f-structures.
Our evaluation is presented in section 4, and finally, section 5 summarises our results and gives an overview of problems remaining to be solved.
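The description of f-structure as a recursive attribute value matrix can be made concrete with a small sketch that turns labelled dependency edges into a nested dictionary. This is an illustrative toy under simplifying assumptions, not the conversion procedure evaluated in the paper: the function name, the edge format and the label inventory (SUBJ, ADJ) are hypothetical, and coordination and control are not handled.

```python
# Toy conversion from labelled dependencies to a nested attribute-value
# matrix. Illustrative only; labels and data format are assumptions.

def to_fstructure(tokens, deps, head):
    """Recursively build a nested dict rooted at token index `head`.

    tokens: list of word forms
    deps:   list of (head_index, label, dependent_index) triples
    """
    fs = {"PRED": tokens[head]}
    for h, label, d in deps:
        if h != head:
            continue
        sub = to_fstructure(tokens, deps, d)
        if label == "ADJ":
            # Adjuncts are collected in a set-like list, since a head
            # may have any number of them.
            fs.setdefault("ADJ", []).append(sub)
        else:
            fs[label] = sub
    return fs


tokens = ["Peter", "sleeps", "today"]
deps = [(1, "SUBJ", 0), (1, "ADJ", 2)]
fs = to_fstructure(tokens, deps, head=1)
print(fs)
```

Because each attribute value is itself a dictionary, the result can equally be read as a directed graph whose edges are labelled with grammatical functions.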
In this paper, we investigate the usefulness of a wide range of features in the resolution of nominal coreference, both as hard constraints (i.e. completely removing elements from the list of possible candidates) and as soft constraints (where a cumulation of violations of soft constraints will make it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints and weights estimated with a maximum entropy model, using lexical information to resolve cases of coreferent bridging.
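The difference between hard and soft constraints can be sketched as a two-stage selection: hard constraints filter the candidate list, and soft constraints add weighted penalties to the survivors. The feature names and weights below are hypothetical and hand-set for illustration; in the system described above the weights are estimated with a maximum entropy model, which this sketch does not attempt to reproduce.

```python
# Sketch of antecedent selection with hard and soft constraints.
# Features and weights are hypothetical and hand-set.

def choose_antecedent(candidates, hard, soft):
    """Filter candidates by hard constraints, then pick the survivor
    with the lowest summed penalty from violated soft constraints."""
    survivors = [c for c in candidates if all(ok(c) for ok in hard)]
    if not survivors:
        return None

    def penalty(c):
        return sum(weight for violated, weight in soft if violated(c))

    return min(survivors, key=penalty)


candidates = [
    {"id": "m1", "number_match": True, "same_head": False, "distance": 1},
    {"id": "m2", "number_match": False, "same_head": False, "distance": 2},
    {"id": "m3", "number_match": True, "same_head": True, "distance": 4},
]

# Hard constraint: number must agree, otherwise the candidate is removed.
hard = [lambda c: c["number_match"]]

# Soft constraints: (violation test, penalty weight).
soft = [
    (lambda c: not c["same_head"], 2.0),  # different lexical head
    (lambda c: c["distance"] > 3, 0.5),   # far from the anaphor
]

print(choose_antecedent(candidates, hard, soft)["id"])
```

Here m2 is excluded outright, while m1 and m3 compete on accumulated penalties, so a distant candidate can still win if the nearer one violates a more heavily weighted constraint.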
When a statistical parser is trained on one treebank, one usually tests it on another portion of the same treebank, partly due to the fact that a comparable annotation format is needed for testing. But the user of a parser may not be interested in parsing sentences from the same newspaper all over, or even wants syntactic annotations for a slightly different text type. Gildea (2001) for instance found that a parser trained on the WSJ portion of the Penn Treebank performs less well on the Brown corpus (the subset that is available in the PTB bracketing format) than a parser that has been trained only on the Brown corpus, although the latter one has only half as many sentences as the former. Additionally, a parser trained on both the WSJ and Brown corpora performs less well on the Brown corpus than on the WSJ one. This leads us to the following questions that we would like to address in this paper: - Is there a difference in usefulness of techniques that are used to improve parser performance between the same-corpus and the different-corpus case? - Are different types of parsers (rule-based and statistical) equally sensitive to corpus variation? To achieve this, we compared the quality of the parses of a hand-crafted constraint-based parser and a statistical PCFG-based parser that was trained on a treebank of German newspaper text.
We investigate methods to improve the recall in coreference resolution by also trying to resolve those definite descriptions where no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among mentions which have the same lexical head, has been tackled with several rather different approaches, and we attempt to provide a meaningful classification along with a quantitative comparison. Based on the different merits of the methods, we discuss possibilities to improve them and show how they can be effectively combined.
Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, considering a real-world example from newspaper text.
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail. We then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement, and argue that a deeper understanding of the phenomena involved allows us to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.
Distributional approximations to lexical semantics are very useful not only in helping the creation of lexical semantic resources (Kilgariff et al., 2004; Snow et al., 2006), but also when directly applied in tasks that can benefit from large-coverage semantic knowledge such as coreference resolution (Poesio et al., 1998; Gasperin and Vieira, 2004; Versley, 2007), word sense disambiguation (McCarthy et al., 2004) or semantic role labeling (Gordon and Swanson, 2007). We present a model that is built from Web-based corpora using both shallow patterns for grammatical and semantic relations and a window-based approach, using singular value decomposition to decorrelate the feature space, which is otherwise too heavily influenced by the skewed topic distribution of Web corpora.
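The role of singular value decomposition in such a model can be illustrated with a minimal sketch. It is not the authors' implementation: the toy count matrix, the log damping of counts, the choice of k and the helper names `reduce_with_svd` and `cosine` are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def reduce_with_svd(counts, k):
    """Project the rows of a word-by-feature count matrix onto the
    top-k singular directions (uncorrelated axes, strongest first)."""
    # Dampen raw counts; log scaling is an assumption of this sketch.
    weighted = np.log1p(counts)
    # Thin SVD: weighted == U @ diag(S) @ Vt
    U, S, Vt = np.linalg.svd(weighted, full_matrices=False)
    # Keep the k strongest directions, scaled by their singular values.
    return U[:, :k] * S[:k]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy data (invented): rows are words, columns are context features.
words = ["car", "automobile", "banana"]
counts = np.array([
    [8.0, 6.0, 0.0, 1.0],
    [7.0, 5.0, 0.0, 0.0],
    [0.0, 1.0, 9.0, 6.0],
])

vectors = reduce_with_svd(counts, k=2)
sim_car_auto = cosine(vectors[0], vectors[1])
sim_car_banana = cosine(vectors[0], vectors[2])
```

In the reduced space the two near-synonyms stay close while the unrelated word stays distant; on real Web data the number of retained dimensions would have to be tuned.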
We adopt Markert and Nissim (2005)’s approach of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to get information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.
This paper aims to determine and classify, by syntactic criteria, the functions of reflexivity (the reflexive pronoun kendi) in Turkish in contrast to German.
Reflexivity in Turkish can be expressed by synthetic elements such as affixes, but also by an analytic element, the reflexive pronoun kendi. In German it is formed with the reflexive pronoun sich, which is used in both anaphoric and lexical functions; these can be distinguished from each other by certain criteria.
Hybrid robust deep and shallow semantic processing for creativity support in document production
(2004)
The research performed in the DeepThought project (http://www.project-deepthought.net) aims at demonstrating the potential of deep linguistic processing when added to existing shallow methods that ensure robustness. Classical information retrieval is extended by high-precision concept indexing and relation detection. We use this approach to demonstrate the feasibility of three ambitious applications, one of which is a tool for creativity support in document production and collective brainstorming. This application is described in detail in this paper. Common to all three applications, and the basis for their development, is a platform for integrated linguistic processing. This platform is based on a generic software architecture that combines multiple NLP components and on robust minimal recursive semantics (RMRS) as a uniform representation language.
The first part presents and discusses the sparse research literature on descriptivity itself and on closely related topics. From this, a definition of the term is then derived that is broad enough to capture its usual use by authors who employ the term without treating it theoretically, yet stays within clearly defined and comprehensible limits. It should also become clear that descriptivity is a phenomenon found in principle in all languages, but that the frequency of descriptive expressions can differ greatly from language to language. To this end, I draw on data from selected languages and present a quantitative analysis of the extent to which different languages make use of descriptive formations. The second main part of the work addresses the following question: if every language makes use of descriptive designations to some degree, which mechanisms of language change can shift a language's position on this scale in one direction or the other?
When, as in the case of the Institut für Angewandte Linguistik und Translatologie at the University of Leipzig, a Germanistische Institutspartnerschaft (GIP) of more than ten years with two Russian partners at once, the translation faculties of the Linguistic Universities of Moscow and Pyatigorsk, comes to an end, it is natural to ask what the long-term GIP cooperation has brought both sides in measurable scholarly, methodological and curricular results, and in "gains" in terms of support for young researchers and the exchange of lecturers and students. The balance, which we set out in anniversary volume 52 of the Dokumente & Materialien of the Deutscher Akademischer Austausch Dienst, is thoroughly respectable and justifies not only the funds spent but also the continuous work, the sustained commitment and the many initiatives of the numerous participants on both sides.
Transforming constituent-based annotation into dependency-based annotation has been shown to work for different treebanks and annotation schemes (e.g. Lin (1995) has transformed the Penn treebank, and Kübler and Telljohann (2002) the Tübinger Baumbank des Deutschen (TüBa-D/Z)). These ventures are usually triggered by the conflict between theory-neutral annotation, which targets most needs of a wider audience, and theory-specific annotation, which provides more fine-grained information for a smaller audience. As a compromise, it has been pointed out that treebanks can be designed to support more than one theory from the start (Nivre, 2003). We argue that information can also be added to an existing annotation scheme so that it supports additional theory-specific annotations. We also argue that such a transformation is useful for improving and extending the original annotation scheme with respect to both ambiguous annotation and annotation errors. We show this by analysing problems that arise when generating dependency information from the constituent-based TüBa-D/Z.
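The general idea of deriving dependencies from constituents can be sketched with head rules. The sketch below is only an illustration under stated assumptions: the `Node` class, the `HEAD_RULES` table and the category labels are hypothetical and are not the TüBa-D/Z label set or the conversion procedure used in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                       # phrase category or POS tag
    word: Optional[str] = None       # set only on terminal nodes
    index: int = -1                  # token position, terminals only
    children: List["Node"] = field(default_factory=list)

# Hypothetical head table: for each phrase label, the child labels
# that may supply the head, in priority order.
HEAD_RULES = {"S": ["VP", "V"], "VP": ["V"], "NP": ["N"]}

def lexical_head(node):
    """Return the terminal node that heads this subtree."""
    if node.word is not None:
        return node
    for wanted in HEAD_RULES.get(node.label, []):
        for child in node.children:
            if child.label == wanted:
                return lexical_head(child)
    # No rule matched: fall back to the rightmost child (an arbitrary
    # choice; such cases are where ambiguous annotation would surface).
    return lexical_head(node.children[-1])

def to_dependencies(node, deps=None):
    """Map each token index to the index of its head (-1 for the root)."""
    if deps is None:
        deps = {lexical_head(node).index: -1}
    if node.word is None:
        head = lexical_head(node)
        for child in node.children:
            child_head = lexical_head(child)
            if child_head is not head:
                deps[child_head.index] = head.index
            to_dependencies(child, deps)
    return deps

# Toy tree for "the dog chased a cat" (invented example).
tree = Node("S", children=[
    Node("NP", children=[Node("D", "the", 0), Node("N", "dog", 1)]),
    Node("VP", children=[
        Node("V", "chased", 2),
        Node("NP", children=[Node("D", "a", 3), Node("N", "cat", 4)]),
    ]),
])
deps = to_dependencies(tree)
```

Here `deps` maps each token position to the position of its head, with the finite verb as root. Constituents for which no head rule applies are exactly the places where a conversion of this kind tends to expose underspecified or inconsistent annotation.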
Deutsch im Kreis Schanfigg
(2012)
In this work, following Kessler, Schanfigg is understood as "Schanfigg in the wider sense", i.e. the villages of the political district (Kreis) of Schanfigg [...]. Since dialects, unlike standard languages, are non-standardized varieties, the local grammars each show an enormous diversity of forms in phonology and morphology. Showing this was one of the aims of the study: with the help of Prague School phonology and the morphology based on it, the study set out to show how wide the allophonic and allomorphic range is that speakers unconsciously draw on in conversation. This can be shown particularly well in the verbal morphology of the irregular verbs (short verbs, Kurzverben). A further aim was to describe the position of the local dialects of the Schanfigg and of their totality, i.e. the Schanfigg diasystem, among the dialects neighbouring the Schanfigg. Ideally, the Prättigau, the Churwalden valley and the dialects of Chur and the Chur Rhine valley would have had to be considered. Since, unfortunately, no studies exist on the Prättigau and the Churwalden valley, the Schanfigg data were compared with those of the city of Chur (cf. Eckhardt 1991) and of German in the district of Imboden (cf. Toth and Ebneter 1996).
Neugriechische Wortbildung
(1988)
The aim of this work is to give an overview of the Modern Greek word-formation system and, at the same time, to treat as far as possible the most important problems connected with delimiting the various word-formation processes from one another in Modern Greek. The work is divided into three main parts. The first part (chapters 2 and 3) is devoted to general problems concerning the delimitation of word formation from inflection and the most important aspects of word structure in Modern Greek. The other two parts (chapters 4 and 5) discuss the word-formation processes of derivation and compounding in the nominal and verbal domains. A detailed account of prefixation in Modern Greek is not possible within the scope of this work; however, the problems connected with delimiting prefixed formations from compounds are briefly discussed in chapter 5.1. Special types of word formation such as acronymy, clipping and blending are not treated.
The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it to the independently developed TIGER treebank (Brants et al., 2002). Both treebanks, TIGER and TüBa-D/Z, use an annotation framework that is based on phrase structure grammar and that is enhanced by a level of predicate-argument structure. The comparison between the annotation schemes of the two treebanks focuses on the different treatments of free word order and discontinuous constituents in German as well as on differences in phrase-internal annotation.
The late 19th and early 20th centuries mark a clear break with the epistemological concepts of the preceding period. While, put very simply, philosophy until then believed the possibility of knowledge to lie in either the subjective or the objective dimension, and hardly questioned the function of language in the process of cognition, a tendency emerges around the turn of the century that, on the one hand, questions or at least addresses the adequacy of linguistic mediation and, on the other, reflects anew on the traditional modes of knowledge or even turns its back on them.
This article deals with loanwords of Venetian origin in the Northern Čakavian dialect of Boljun in north-eastern Istria. The aim of the work was to treat etymologically those adjectives and nouns from the semantic domain of character traits that were not included in Skok's Etimologijski rječnik or in Vinja's Jadranske etimologije. The source material was excerpted from the manuscript Rječnik boljunskih govora by Ivan Francetić and verified in the field. Through etymological and lexical analysis it was related to Istro-Venetian, Venetian, Triestine and Italian (etymologia proxima) and to a Latin or other etymon (etymologia remota), and, on the synchronic and diatopic level, to dictionary attestations in other Čakavian dialects of Istria, Kvarner and Dalmatia.
Until well into the 19th century, Polish family names were subject to only slight official control [...]. This situation favoured the successive build-up of onymic allomorphy from the […] inflectional and derivational morphemes that were originally used to form designations of origin, patronymics and bynames. The secondary use of these inflectional and word-formation morphemes as onymic suffixes drove forward the […] dissociation process of family names. The growing productivity of these onymic morphs, which continues to this day, secured them the top position among the markers of proper-name status in the Polish family-name system. Today the onymic allomorphs -ska, -ski, -icz, -ak are the most important means of marking a word as belonging to the onomasticon. […] This contribution presents the paths of origin and spread of the three most productive groups of Polish onymic suffixes. It also takes into account the extralinguistic factors that made the increase in productivity possible through successive expansion of the combinatory possibilities of the individual suffixes. It is shown that the original selectional restrictions between bases and suffixes (toponyms + -ska suffixes, appellatives and adjectives + k-containing suffixes, first names + -icz suffixes) were given up in the course of their spread and consolidation. The onymic allomorphs are freely combinable today and can be used to form a new name in the event of a name change.
Das hethitische Phonem /xw/
(2014)
In the Hittite phonological system there was a labialized velar fricative /xw/ beside the plain velar fricative /x/ parallel to the opposition between the velar stops /kw/ and /k/. The frequent syllable /xwa/ was spelled either hu-(u) or hu-wa. Evidence from the frequency of words with initial hu in the lexicon, from spelling variations and from ablaut alternations is presented to demonstrate the existence of /xw/. It is suggested that Hittite /xw/ regularly corresponds to the reflexes of *w in the non-Anatolian Indo-European languages.
To reach even language users who are not used to working with grammars, the Institut für Deutsche Sprache in Mannheim (Germany) looked for a new way to handle grammatical problems. Instead of confronting users with abstractions, frequent difficulties of German grammar are introduced in the form of exemplary questions such as "Which form should be used or preferred: Anfang dieses Jahres or Anfang diesen Jahres?" Looking through the long list of such questions, even laypeople may find solutions to grammatical problems that they might not be able to formulate as such.
The number of German loanwords in Croatian is smaller than might be expected, given that Croatia's centuries-long political and cultural ties with the Habsburg state brought about direct contact between German and Croatian. The reason for this is a language policy that consciously resisted the strong influence of German on Croatian, giving preference to Croatian words in the standard language. In the substandard language, however, a larger number of German loanwords has survived, even though Croatian equivalents for these words exist. This paper re-examines the relationship between the German loanword and its native replacement, i.e. to what extent the Croatian equivalent is a successful replacement for the German loanword, and what that success depends on.
The early acquisition of Greek compounds by two monolingual Greek girls aged between 1;8 and 3;0 years is studied in a usage-based theoretical framework. Special importance is attached to the morphological structure of Greek compound types occurring in child speech and child-directed speech. Greek nominal compound formation does not consist in the mere juxtaposition of words or roots, but involves stems as well as a compound marker. Major questions addressed are the transparency of compounds and productive nominal compound formation. Evidence for productivity of nominal compound formation has been found with only one of the two girls. In contrast to other languages, neoclassical nominal compounds by far exceed endocentric subordinative ones tokenwise in Greek child speech and child-directed speech providing evidence of entrenchment rather than productivity.
In a cross-linguistic comparison it is shown that, in spite of the fact that both Standard Modern Greek and German are rich in nominal compounds, their number is much more limited in Greek than in German child speech. An explanation for this apparent paradox is provided by an onomasiological approach to lexical typology based on a sample list of nominal compounds occurring in German child language and their Greek translational equivalents. It has been found that while use of nominal compounds is common in colloquial German including child-centered situations, it is more typical of Greek formal than colloquial registers.
Children […] growing up with highly inflected languages such as Modern Greek will frequently hear different grammatical forms of a given lexeme used in different grammatical and semantic-pragmatic contexts. In spite of the fact that the Greek noun is not as highly inflected as the verb, acquisition of nominal inflection of this inflecting-fusional language is quite complex, comprising the three categories of case, number, and gender. As is usual in this type of language, the formation of case-number forms obeys different patterns that apply to largely arbitrary classes of nominal lexemes partially based on gender. Further, frequency of the occurrence of the three gender classes and case-number forms of nouns greatly differs in spoken Greek, regarding both the types and tokens. […] [A] child learning an inflecting-fusional language like Greek must construct different inflectional patterns depending not only on parts of speech but also on subclasses within a given part of speech, such as gender classes of nouns and inflectional classes within or (exceptionally) across genders. It is therefore to be expected that the early development of case and number distinctions will apply to specific nouns and subclasses of nouns rather than the totality of Greek nouns. The two main theoretical approaches of morphological development that will be discussed in the present paper are the usage-based approach and the pre- and protomorphology approach.
The two papers included in this volume have developed from work with the CHILDES tools and the Media Editor in two research projects: "Second language acquisition of German by Russian learners", sponsored by the Max Planck Institute for Psycholinguistics, Nijmegen, from 1998 to 1999 (directed by Ursula Stephany, University of Cologne, and Wolfgang Klein, Max Planck Institute for Psycholinguistics, Nijmegen), and "The age factor in the acquisition of German as a second language", sponsored by the German Science Foundation (DFG), Bonn, since 2000 (directed by Ursula Stephany, University of Cologne, and Christine Dimroth, Max Planck Institute for Psycholinguistics, Nijmegen). The CHILDES Project has been developed and is being continuously improved at Carnegie Mellon University, Pittsburgh, under the supervision of Brian MacWhinney. Having used the CHILDES tools for more than ten years for transcribing and analyzing Greek child data, there was no question that I would also use them for research into the acquisition of German as a second language and analyze the large amount of spontaneous speech gathered from two Russian girls with the help of the CLAN programs. In the spring of 1997, Steven Gillis from the University of Antwerp (in collaboration with Gert Durieux) developed a lexicon-based automatic coding system based on the CLAN program MOR and suitable for coding languages with richer morphologies than English, such as Modern Greek. Coding huge amounts of data then became much quicker and more comfortable, so I decided to adopt this system for German as well. The paper "Working with the CHILDES Tools" is based on two earlier manuscripts which have grown out of my research on Greek child language and the many CHILDES workshops taught in Germany, Greece, Portugal, and Brazil over the years. Its contents have now been adapted to the requirements of research into the acquisition of German as a second language and for use on Windows.
In this paper we show an approach to the customization of GermaNet to the German HPSG grammar lexicon developed in the Verbmobil project. GermaNet has a broad coverage of the German base vocabulary and a fine-grained semantic classification, while the HPSG grammar lexicon is comparatively small and has a coarse-grained semantic classification. In our approach, we have developed a mapping algorithm to relate the synsets in GermaNet to the semantic sorts in HPSG. The evaluation result shows that this approach is useful for the lexical extension of our deep grammar development to cope with real-world text understanding.
An utterance may contain zero pronouns from several [...] groups. The [...] groups can be related to the levels of a layered dialogue model; on the other hand, they can indicate which information should be available in a dialogue model. This will have to be investigated more closely in future work. In the following, the types of zero pronouns mentioned are described in more detail and procedures for finding their referents are given.
Developing an individual standard "at the drawing board" rarely leads to satisfactory results. In automatic checking, one quickly finds that the "made-up" rules do not hold up under systematic application. When implementing such guidelines, one finds that they are often not formulated concretely enough, e.g. "formulate instructions concisely and precisely". But how can a standard be developed that fits a company, its industry and its target groups, and that can be implemented for automatic checking? Language technology helps efficiently in developing individual guidelines. Data analysis, sentence clustering and parametrization yield a text-specific individual standard. But does this resolve the opposition between creativity and standardization?
Japanese is often taken to be strictly head-final in its syntax. In our work on a broad-coverage, precision implemented HPSG for Japanese, we have found that while this is generally true, there are nonetheless a few minor exceptions to the broad trend. In this paper, we describe the grammar engineering project, present the exceptions we have found, and conclude that this kind of phenomenon motivates on the one hand the HPSG type hierarchical approach which allows for the statement of both broad generalizations and exceptions to those generalizations and on the other hand the usefulness of grammar engineering as a means of testing linguistic hypotheses.
We present a broad coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real world applications, such that robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. This grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages.
In this text, we describe the development of a broad coverage grammar for Japanese that has been built for and used in different application contexts. The grammar is based on work done in the Verbmobil project (Siegel 2000) on machine translation of spoken dialogues in the domain of travel planning. The second application for JACY was the automatic email response task. Grammar development was described in Oepen et al. (2002a). Third, it was applied to the task of understanding material on mobile phones available on the internet, while embedded in the project DeepThought (Callmeier et al. 2004, Uszkoreit et al. 2004). Currently, it is being used for treebanking and ontology extraction from dictionary definition sentences by the Japanese company NTT (Bond et al. 2004).
The problem of transfer in machine translation from Japanese to English is the lack of information on number and definiteness in Japanese, which is needed for choosing English articles and for noun marking. Although this problem is significant, the research literature hardly deals with it. [...] We base our investigations on data collected in an experiment on German-Japanese interpreted appointment-scheduling dialogues [...]. In this way, phenomena can be identified that are relevant to the VERBMOBIL domain. We regard our procedure as consistent with the 'sublanguage' approach [...].
One of the significant problems in machine translation from Japanese into German is the missing information on number and definiteness in the Japanese analysis output. An efficient solution to this problem is to integrate the search for the relevant information into transfer. Transfer rules are combined with preference rules and default rules. This makes accessible information about lexical restrictions of the target language, about the domain and about the discourse.
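How preference rules and a default can interact when a value such as definiteness is missing from the source analysis can be sketched as follows. This is an illustrative sketch only: the rule functions, the context keys, the romanized example nouns and the choice of "indefinite" as the default are hypothetical and are not taken from the VERBMOBIL transfer component.

```python
def resolve_definiteness(noun, context, preference_rules, default="indefinite"):
    """Return the first value proposed by a preference rule,
    or the default if no rule applies."""
    for rule in preference_rules:
        value = rule(noun, context)
        if value is not None:
            return value
    return default

def previously_mentioned(noun, context):
    # Discourse knowledge: a referent already mentioned is definite.
    return "definite" if noun in context.get("mentioned", set()) else None

def unique_in_domain(noun, context):
    # Domain knowledge: some referents are unique within the domain.
    return "definite" if noun in context.get("domain_unique", set()) else None

# Rules are tried in order, so earlier rules take precedence.
RULES = [previously_mentioned, unique_in_domain]

# Invented context for a scheduling dialogue.
context = {"mentioned": {"kaigi"}, "domain_unique": {"jimusho"}}
results = {
    noun: resolve_definiteness(noun, context, RULES)
    for noun in ["kaigi", "jimusho", "hon"]
}
```

In this toy setup `kaigi` and `jimusho` are resolved by a preference rule, while `hon` falls through to the default. Keeping the rules as an ordered list makes the precedence between discourse, domain and default knowledge explicit.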
The domain in VERBMOBIL is appointment-scheduling dialogues. For the syntax this means first of all that it has to be oriented towards spoken language. This includes zero anaphora, phrases that refer to the communicative situation, and phrases that would be considered ill-formed in written language. Beyond that, there are some domain-specific syntactic peculiarities, such as the realization of time expressions.
We present a solution for the representation of Japanese honorific information in the HPSG framework. Basically, there are three dimensions of honorification. We show that a treatment is necessary that involves both the syntactic and the contextual level of information. The Japanese grammar is part of a machine translation system.
Preferences and defaults for definiteness and number in Japanese to German machine translation
(1996)
A significant problem when translating Japanese dialogues into German is the missing information on number and definiteness in the Japanese analysis output. The integration of the search for such information into the transfer process provides an efficient solution. General transfer includes conditions to make it possible to consider external knowledge. Thereby, grammatical and lexical knowledge of the source language, knowledge of lexical restrictions on the target language, domain knowledge and discourse knowledge are accessible.
A comprehensive investigation of Japanese particles has been missing up to now. General claims have been made without a comprehensive analysis having been carried out. [...] We offer a lexicalist treatment of the problem. Instead of assuming different phrase structure rules, we state a type hierarchy of Japanese particles. This makes possible a uniform treatment of phrase structure as well as a differentiation of subcategorization patterns.
Particles fulfill several distinct central roles in the Japanese language. They can mark arguments as well as adjuncts, and can be functional or have semantic functions. There is, however, no straightforward mapping from particles to functions, as, e.g., 'ga' can mark the subject, the object or the adjunct of a sentence. Particles can cooccur. Verbal arguments that could be identified by particles can be omitted from the Japanese sentence. And finally, in spoken language the particles themselves are often omitted. A proper treatment of particles is thus necessary to make an analysis of Japanese sentences possible. Our treatment is based on an empirical investigation of 800 dialogues. We set up a type hierarchy of particles motivated by their subcategorizational and modificational behaviour. This type hierarchy is part of the Japanese syntax in VERBMOBIL.
Sprachtechnologie für übersetzungsgerechtes Schreiben am Beispiel Deutsch, Englisch, Japanisch
(2009)
We [...] have set ourselves the task of finding ways in which linguistically based software can support the process of writing technical documentation. On the one hand, we look at the difficulties that Japanese and German authors (and other non-native speakers of English) have when writing English texts. Japanese authors in particular struggle because they have to express highly complex ideas in a language that, from an informational point of view, is very different from their native language. On the other hand, we examine technical documentation written by authors in their native language. Although the foreign-language component is absent here, there is still considerable potential for improvement. The aim here is to write documents that are comprehensible, consistent and suitable for translation. The fundamental approach in developing linguistically based software is that good linguistic software is based on data and is oriented towards the concrete goals of better documentation.
The translation process for technical documentation is increasingly supported by machine translation (MT). We first look at the source texts and create automatically checkable rules with which these texts can be edited so that they yield optimal results in MT. These rules are based on research results on translatability, on research results on translation mismatches in MT, and on experiments.
The following account aims, on the one hand, to show by means of examples how far the Swiss German dialects and Standard German can diverge in pronunciation, morphology, syntax and vocabulary, and, on the other hand, always also to point out commonalities. Certain features of dialect structure are often hastily taken to be peculiarities of the dialect, although the same features are also found in spoken Standard German. Thus what is at issue is often not differences between dialect and standard language but differences between spoken and written language. [completely revised for a second edition]
Given their functions and the issues they address, selection posters can be used in language teaching. The present study therefore aims to investigate the didactic potential of selection posters in German language teaching. It seeks to show that selection posters can serve as course materials in German language teaching and can be used in line with learners' needs and interests. Accordingly, alternative ways of working with them in class are made concrete. In conclusion, selection posters offer a wide range of uses in German language teaching with regard to local culture, vocabulary knowledge, linguistic work, visualization, authenticity, topicality, and spoken and written exercises.