Refine
Year of publication
- 2021 (2) (remove)
Document Type
- Article (2)
Language
- English (2)
Has Fulltext
- yes (2)
Is part of the Bibliography
- no (2)
Keywords
- Annotation (1)
- BIOfid (1)
- Biodiversity (1)
- Inter-annotator agreement (1)
- Named entity recognition (1)
- Semantic portal (1)
- Specialized information service (1)
- Taxon (1)
- corpus study (1)
- economics (1)
Institute
Biodiversity information is contained in countless digitized and unprocessed scholarly texts. Although automated extraction of these data has been gaining momentum for years, there are still innumerable text sources that are poorly accessible and require a more advanced range of methods to extract relevant information. To improve the access to semantic biodiversity information, we have launched the BIOfid project (www.biofid.de) and have developed a portal to access the semantics of German language biodiversity texts, mainly from the 19th and 20th century. However, to make such a portal work, a couple of methods had to be developed or adapted first. In particular, text-technological information extraction methods were needed, which extract the required information from the texts. Such methods draw on machine learning techniques, which in turn are trained by learning data. To this end, among others, we gathered the BIOfid text corpus, which is a cooperatively built resource, developed by biologists, text technologists, and linguists. A special feature of BIOfid is its multiple annotation approach, which takes into account both general and biology-specific classifications, and by this means goes beyond previous, typically taxon- or ontology-driven proper name detection. We describe the design decisions and the genuine Annotation Hub Framework underlying the BIOfid annotations and present agreement results. The tools used to create the annotations are introduced, and the use of the data in the semantic portal is described. Finally, some general lessons, in particular with multiple annotation projects, are drawn.
The ongoing digitalization of educational resources and the use of the internet lead to a steady increase of potentially available learning media. However, many of the media which are used for educational purposes have not been designed specifically for teaching and learning. Usually, linguistic criteria of readability and comprehensibility as well as content-related criteria are used independently to assess and compare the quality of educational media. This also holds true for educational media used in economics. This article aims to improve the analysis of textual learning media used in economic education by drawing on threshold concepts. Threshold concepts are key terms in knowledge acquisition within a domain. From a linguistic perspective, however, threshold concepts are instances of specialized vocabularies, exhibiting particular linguistic features. In three kinds of (German) resources, namely in textbooks, in newspapers, and on Wikipedia, we investigate the distributive profiles of 63 threshold concepts identified in economics education (which have been collected from threshold concept research). We looked at the threshold concepts' frequency distribution, their compound distribution, and their network structure within the three kinds of resources. The two main findings of our analysis show that firstly, the three kinds of resources can indeed be distinguished in terms of their threshold concepts' profiles. Secondly, Wikipedia definitely shows stronger associative connections between economic threshold concepts than the other sources. We discuss the findings in relation to adequate media use for teaching and learning—not only in economic education.