The Specialized Information Service Biodiversity Research (BIOfid) has been launched to mobilize valuable biological data from printed literature that has been hidden in German libraries over the past 250 years. In this project, we annotate German texts converted by OCR from historical scientific literature on the biodiversity of plants, birds, moths and butterflies. Our work enables the automatic extraction of biological information previously buried in the mass of papers and volumes. For this purpose, we generated training data for the tasks of Named Entity Recognition (NER) and Taxa Recognition (TR) in biological documents. We use this data to train a number of leading machine learning tools and to create a gold standard for TR in biodiversity literature. More specifically, we perform a practical analysis of our newly generated BIOfid dataset through various downstream-task evaluations and establish a new state of the art for TR with an F-score of 80.23%. Our paper thus lays the foundations for future work on information extraction from biological texts.
Organizers: Bernadette Biedermann, Universitätsmuseum, Universität Graz; Judith Blume, Universitätsbibliothek J.C. Senckenberg, Goethe-Universität Frankfurt am Main; Franziska Hormuth, project „Digitales Netzwerk Sammlungen“, Berlin University Alliance / Humboldt-Universität zu Berlin
Date, venue: 22.04.2021–23.04.2021, online
The scientific innovation process embraces the steps from problem definition through the development and evaluation of innovative solutions to their successful exploitation. The challenges imposed by this process can be met by creating a powerful and flexible next-generation e-Science infrastructure, which exploits leading-edge information and knowledge technologies and provides a comprehensive and intelligent means of supporting this process. This paper describes our vision of a knowledge-based e-Science infrastructure, which is based on the results of an in-depth study of researchers' requirements. Furthermore, it introduces the Fraunhofer e-Science Cockpit as a first implementation of our vision.
The correspondence between the terminology used for querying and the terminology used in the content objects to be retrieved is a crucial prerequisite for effective retrieval technology. However, as terminology evolves over time, a growing gap opens up between older documents in (long-term) archives and the active language used for querying such archives. Thus, technologies for detecting and systematically handling terminology evolution are required to ensure "semantic" accessibility of (Web) archive content in the long run. As a starting point for dealing with terminology evolution, this paper formalizes the problem and discusses issues, first ideas and relevant technologies.
Web archives created by the Internet Archive (IA) (https://archive.org), national libraries and other archiving services contain large amounts of information collected over a period of more than twenty years. These archives constitute a valuable source for research in many disciplines, including the digital humanities and the historical sciences, by offering a unique possibility to look into past events and their representation on the Web.
Most Web archive services aim to capture the entire Web (IA) or national top-level domains and are therefore broad in scope and diverse in the topics they contain and the time intervals they cover. Due to this large size and broad scope, it is difficult for interested researchers to locate relevant information in the archives, as search facilities are very limited. Many users are more interested in studying smaller, topically coherent event-centric collections of documents contained in a Web archive [1,2]. Such collections can reflect specific events such as elections or natural disasters, e.g. the German federal elections or the Fukushima nuclear disaster (2011).
The Specialised Information Service Performing Arts (SIS PA) is part of a funding programme by the German Research Foundation that enables libraries to develop tailor-made services for individual disciplines in order to give researchers direct access to relevant materials and resources from their field. For the field of performing arts, the SIS PA aggregates metadata about theater and dance resources, currently mostly from German-speaking cultural heritage institutions, in a VuFind-based search portal.
In this article, we focus on metadata quality and its impact on the aggregation workflow by describing the different process stages of improving data quality, some of which may be specific to a data provider, in order to achieve a searchable, interlinked knowledge base. We also describe lessons learned and limitations of the process.
Under the title "Vade mecum! Nächste Schritte in den Historischen Grundwissenschaften" ("Vade mecum! Next Steps in the Historical Auxiliary Sciences"), a group of young scholars consisting mainly of doctoral candidates met on 8 and 9 April 2016 at the Universität zu Köln for a conference organized by Stefanie Menke and Lena Vosding. Conceived as an open discussion with short keynote talks, the event was also this year's meeting of the Netzwerk Historische Grundwissenschaften, an association of early-career researchers from various disciplines and qualification levels who work in the auxiliary sciences. The network has set itself two goals: to offer a platform for exchanging and showcasing its members' projects, and to bring the perspective of early-career researchers into the current debate on the future of the historical auxiliary sciences, which is also being conducted against the backdrop of digitization and developments within the Digital Humanities. ...