Annotating a low-resource language with LLOD technology: Sumerian morphology and syntax

This paper describes work on the morphological and syntactic annotation of Sumerian cuneiform as a model for low-resource languages in general. Cuneiform texts are invaluable sources for the study of the history, languages, economy, and cultures of Ancient Mesopotamia and its surrounding regions. Assyriology, the discipline dedicated to their study, has vast research potential but lacks the modern means for computational processing and analysis. Our project, Machine Translation and Automated Analysis of Cuneiform Languages, aims to fill this gap by bringing together corpus data, lexical data, linguistic annotations, and object metadata. The project’s main goal is to build a pipeline for machine translation and annotation of Sumerian Ur III administrative texts. The rich and structured data are then to be made accessible in the form of (Linguistic) Linked Open Data (LLOD), which should open them to a larger research community. Our contribution is two-fold. In terms of language technology, our work represents the first attempt to develop an integrative infrastructure for the annotation of morphology and syntax on the basis of RDF technologies and LLOD resources. With respect to Assyriology, we work towards producing the first syntactically annotated corpus of Sumerian.

Metadata
Author: Christian Chiarcos, Ilya Khait, Émilie Pagé-Perron, Niko Schenk, Jayanth Jayanth, Christian Fäth, Julius Steuer, William Mcgrath, Jinyan Wang
URN: urn:nbn:de:hebis:30:3-514893
DOI: http://dx.doi.org/10.3390/info9110290
ISSN: 2078-2489
Parent Title (English): Information
Publisher: MDPI Publ.
Place of publication: Basel
Document Type: Article
Language: English
Year of Completion: 2018
Date of first Publication: 2018/11/19
Publishing Institution: Universitätsbibliothek Johann Christian Senckenberg
Release Date: 2019/11/06
Tag: Cuneiform; RDF; SPARQL; Sumerian; linguistic linked open data; linked open data; low-resource languages; morphology; parsing; syntax
Volume: 9
Issue: 11, Art. 290
Pagenumber: 16
First Page: 1
Last Page: 16
Note: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
HeBIS PPN: 455691916
Institutes: Modern Languages; Computer Science
Dewey Decimal Classification: 540 Chemistry and allied sciences; 570 Life sciences; biology
Collections: University publications
Licence (German): Creative Commons - Attribution 4.0
