Nature's non-material contributions to people are difficult to quantify, and one aspect in particular, nature's contributions to communication (NCC), has so far been neglected. Recent advances in automated language processing tools enable us to quantify diversity patterns underlying the distribution of plant and animal taxon labels in creative literature, which we term BiL (biodiversity in literature). We assume BiL to provide a proxy for people's openness to nature's non-material contributions, enhancing our understanding of NCC. We assembled a comprehensive list of 240,000 English biological taxon labels. We pre-processed and searched a subcorpus of digitised literature on Project Gutenberg for these labels. We quantified changes in biodiversity indices commonly used in ecological studies for 16,000 books, encompassing 4,000 authors, as proxies for BiL between 1705 and 1969. We observed hump-shaped patterns for taxon label richness, abundance and Shannon diversity, indicating a peak of BiL in the middle of the 19th century. This is also true for the ratio of biological to general lexical richness. The variation in label use between different sections within books, quantified as β-diversity, declined until the 1830s and recovered little, indicating a less specialised use of taxon labels over time. This pattern corroborates our hypothesis that before the onset of industrialisation BiL may have increased, reflecting several concomitant influences such as the general broadening of literary content, improved education and possibly an intensified awareness of the incipient loss of biodiversity during the period of Romanticism.
Given that these positive trends continued and that we do not find support for alternative processes reducing BiL, such as language streamlining, we suggest that this pronounced trend reversal and subsequent decline of BiL over more than 100 years may be the consequence of humans’ increasing alienation from nature owing to major societal changes in the wake of industrialisation. We conclude that our computational approach of analysing literary communication using biodiversity indices has high potential for understanding aspects of the non-material contributions of biodiversity to people. Our approach can be applied to other corpora and would benefit from additional metadata on taxa, works and authors.
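As a rough illustration of the indices named in this abstract (taxon label richness, abundance and Shannon diversity), the following Python sketch computes them for a single text. It is not the authors' pipeline: the four-word label set, the function names and the simple single-word tokenisation are all assumptions made for the example, and inflected forms such as plurals are not matched.

```python
import math
import re
from collections import Counter

# Hypothetical toy label set for illustration only; the study describes
# a list of about 240,000 English taxon labels.
TAXON_LABELS = {"oak", "rose", "sparrow", "wolf"}


def count_labels(text, labels=TAXON_LABELS):
    """Count occurrences of known single-word taxon labels in a text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in labels)


def richness(counts):
    """Number of distinct taxon labels found."""
    return len(counts)


def abundance(counts):
    """Total number of taxon label occurrences."""
    return sum(counts.values())


def shannon(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over label proportions."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


sample = "The wolf watched a sparrow in the oak; another sparrow sat on the rose."
counts = count_labels(sample)
print(richness(counts), abundance(counts), round(shannon(counts), 3))
```

Applied per book and grouped by publication year, values like these would give the kind of time series the abstract describes; multi-word labels and lemmatisation would need additional handling.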
With the ongoing loss of global biodiversity, long-term records of species distribution patterns are becoming increasingly important for investigating the causes and consequences of their change. Therefore, the digitization of scientific literature, both modern and historical, has been attracting growing attention in recent years. To meet this growing demand, the Specialised Information Service for Biodiversity Research (BIOfid) was launched in 2017 with the aim of increasing the availability and accessibility of biodiversity information. Closely tied to the research community, the interdisciplinary BIOfid team digitizes data sources of biodiversity-related research and provides a modern and professional infrastructure for hosting and sharing them. As a pilot project, German publications on the distribution and ecology of vascular plants, birds, moths and butterflies covering the past 250 years are prioritized. Large parts of the text corpus, defined in accordance with the needs of the relevant German research community, have already been transferred to a machine-readable format and will be publicly accessible soon. Software tools for text mining, semantic annotation and analysis that reflect current trends in machine learning are being developed to maximize bioscientific data output through user-specific queries that can be created via the BIOfid web portal (https://www.biofid.de/). To boost knowledge discovery, specific ontologies focusing on morphological traits and taxonomy are being prepared and will be continuously extended to keep up with an ever-expanding volume of literature sources.