Reproducible annotations

  • This bachelor thesis presents a software solution that implements reproducible annotations in the context of the UIMA framework. This is achieved by automatically containerizing arbitrary analysis engines and recording the configuration of every analysis engine as an annotation in the processed CAS document. Any CAS document created by this solution is self-sufficient and can reproduce the exact environment in which it was created. A review of the state-of-the-art software in the field of UIMA reveals that there are many implementations that try to increase reproducibility for a given application relying on UIMA, but no publication that tries to increase the reproducibility of UIMA itself. This thesis addresses that gap and concludes with a thorough analysis, which shows a negligible overhead in memory consumption but a significant performance regression that depends on the complexity of the analysis engine examined.

Metadata
Author:Alexander Leonhardt
URN:urn:nbn:de:hebis:30:3-676886
Place of publication:Frankfurt am Main
Document Type:Bachelor Thesis
Language:English
Date of Publication (online):2022/01/04
Year of first Publication:2022
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Granting Institution:Johann Wolfgang Goethe-Universität
Release Date:2022/05/17
Tag:Docker; NLP; UIMA
Page Number:58
HeBIS-PPN:496553569
Institutes:Computer Science and Mathematics
Dewey Decimal Classification:0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Collections:University publications
Licence (German):German copyright law (Deutsches Urheberrecht)