Molluscs are the second most species-rich phylum in the animal kingdom, yet only 11 genomes of this group have been published so far. Here, we present the draft genome sequence of the pulmonate freshwater snail Radix auricularia. Six whole genome shotgun libraries with different layouts were sequenced. The resulting assembly comprises 4,823 scaffolds with a cumulative length of 910 Mb and an overall read coverage of 72×. The assembly contains 94.6% of a metazoan core gene collection, indicating an almost complete coverage of the coding fraction. The discrepancy of ∼690 Mb compared with the estimated genome size of R. auricularia (1.6 Gb) results from a high repeat content of 70% mainly comprising DNA transposons. The annotation of 17,338 protein coding genes was supported by the use of publicly available transcriptome data. This draft will serve as starting point for further genomic and population genetic research in this scientifically important phylum.
The website Sci-Hub provides access to scholarly literature via full text PDF downloads. The site enables users to access articles that would otherwise be paywalled. Since its creation in 2011, Sci-Hub has grown rapidly in popularity. However, until now, the extent of Sci-Hub's coverage was unclear. As of March 2017, we find that Sci-Hub's database contains 68.9% of all 81.6 million scholarly articles, which rises to 85.2% for those published in closed access journals. Furthermore, Sci-Hub contains 77.0% of the 5.2 million articles published by inactive journals. Coverage varies by discipline, with 92.8% coverage of articles in chemistry journals compared to 76.3% for computer science. Coverage also varies by publisher, with the coverage of the largest publisher, Elsevier, at 97.3%. Our interactive browser at https://greenelab.github.io/scihub allows users to explore these findings in more detail. Finally, we estimate that over a six-month period in 2015–2016, Sci-Hub provided access for 99.3% of valid incoming requests. Hence, the scope of this resource suggests the subscription publishing model is becoming unsustainable. For the first time, the overwhelming majority of scholarly literature is available gratis to anyone with an Internet connection.
Characterizing the hologenome of Lasallia pustulata and tracing genomic footprints of lichenization
(2017)
The lichen symbiosis – consisting of fungal mycobionts and photoautotrophic photobionts (green algae or cyanobacteria) – is globally successful. It covers an estimated 6% of the global surface with habitats ranging from deserts to the arctic. This success is reflected in the diversity of the mycobionts, with around 21% of all fungal species participating in lichen symbioses that can be facultative or obligate. Lichenization is furthermore evolutionarily old, with fossil evidence for lichens reaching back 415 million years. For an individual fungal lineage, the Lecanoromycetes, the lichenization happened around 300 million years ago. This longstanding symbiotic relationship and the diversity of observed symbiotic dependency make them promising models to study the genomic consequences that follow the establishment of symbioses. Despite this, little is known about the genomic effects of lichenization and extreme symbiotic dependency. To fill this gap we sequenced the hologenome of the lichen Lasallia pustulata, whose mycobiont has so far not been cultivated, suggesting that it might be more dependent on its symbionts.
As the poor culturability of lichen symbionts renders their genomes inaccessible to standard sequencing practices, we evaluated the extent to which different metagenome sequencing and de novo assembly strategies can be used to sequence and reconstruct the genomes of the individual symbionts. We find that the abundances of individual genomes present in the L. pustulata hologenome vary substantially, with the mycobiont being most abundant. Using in silico generated data sets and real Illumina sequencing data for L. pustulata we observe that the skewed abundances prevent a contiguous assembly of the underrepresented genomes when using only short-read sequencing. We conclude that short-read sequencing can offer first insights into lichen hologenomes. The fragmentation of the reconstructions, however, hinders downstream analyses into the genomic consequences of lichenization, as these are focused on identifying the gain and loss of genes.
We thus demonstrate a hybrid genome assembly strategy that is based on both short- and long-read sequencing. We show that this strategy is capable of creating highly contiguous genome reconstructions, not only for the L. pustulata mycobiont but also its photobiont Trebouxia sp., along with substantial amounts of the bacterial microbiome. A subsequent analysis of the microbiome of L. pustulata – performed over nine different samples collected in Germany and Italy – showed a stable taxonomic composition across the geographic range. We find that Acidobacteriaceae, which are known to thrive in nutrient-poor habitats, are the dominant taxa. This would make them well adapted for co-habitation with L. pustulata, which largely grows on rocks. Whether the Acidobacteriaceae are functionally involved in the lichen symbiosis is unclear so far.
As further comparative genomic studies rely on comprehensive genome annotations, we evaluate the completeness and fidelity of the gene annotations for the mycobiont L. pustulata as well as four further Lecanoromycetes. This reveals that un- and mis-annotated genes impact all evaluated genomes, with artificially joined genes and unannotated genes having the largest impact. In addition to these factors we find that the sequence composition – especially G/C-rich inverted repeats – leads to sequencing errors that interfere with the gene prediction. We minimize the effects of these artifacts through a rigorous curation.
Given the extremely sparse taxon sampling of available green alga genomes, we focus our search for the genomic footprints of lichenization on the mycobionts. We compare the genomes of the Lecanoromycetes to their closest relatives, the Eurotiomycetes and Dothideomycetes. This reveals that the last common ancestor of the Lecanoromycetes has lost around 10% of its genes after it split from the non-lichenized ancestor it shares with the Eurotiomycetes. These losses are furthermore functionally enriched, showing an excessive loss of genes involved in the degradation of polysaccharides. The loss of these genes fits a change from an ancestral saprotrophic lifestyle that depends on degrading complex plant matter, to the symbiotic lifestyle that relies on simpler nutrients provided by the photobionts. While the last common ancestor of the Lecanoromycetes additionally gained around 400 genes, these could so far not be further characterized due to a lack of functionally annotated reference data.
As the mycobiont L. pustulata has so far not been grown in axenic culture, we initially expected to find an extensive genomic remodeling compared to the other mycobionts that easily grow in culture. We do not find evidence for this. Analyzing both the contraction of gene families and the loss of genes, we observe that L. pustulata and Umbilicaria muehlenbergii – its close relative that is easily grown in culture – share most of these. Furthermore, L. pustulata does not show an excessive loss of evolutionarily old and well-conserved genes. These effects are mirrored on the functional level, as neither gene family contractions nor gene losses show a functional enrichment. This is partially due to the lack of functional reference data, analogous to the genes gained in the Lecanoromycetes, rendering their characterization hard. Thus, further studies on the genomic consequences of lichenization and differences in symbiotic dependence will have to be conducted, including larger taxon sets. This will be even more important for the photobionts, as the Chlorophyta are even more sparsely sampled today, hindering an effective functional and evolutionary study.
The website Sci-Hub provides access to scholarly literature via full text PDF downloads. The site enables users to access articles that would otherwise be paywalled. Since its creation in 2011, Sci-Hub has grown rapidly in popularity. However, until now, the extent of Sci-Hub’s coverage was unclear. As of March 2017, we find that Sci-Hub’s database contains 68.9% of all 81.6 million scholarly articles, which rises to 85.2% for those published in toll access journals. Coverage varies by discipline, with 92.8% coverage of articles in chemistry journals compared to 76.3% for computer science. Coverage also varies by publisher, with the coverage of the largest publisher, Elsevier, at 97.3%. Our interactive browser at greenelab.github.io/scihub allows users to explore these findings in more detail. We find Sci-Hub preferentially covers popular, paywalled content, containing 96.2% of citations to toll access journals since 2015. For recently requested articles by Unpaywall users, oaDOI provided access to 48.8% whereas Sci-Hub contained 81.5%. Together, oaDOI and Sci-Hub covered 94.1%, demonstrating that gaps in Sci-Hub’s coverage, especially for open access articles, can be filled using licit services. For the first time, nearly all scholarly literature is available gratis to anyone with an Internet connection. Sci-Hub’s scope suggests the subscription publishing model is becoming unsustainable.
Despite the growth of Open Access, potentially illegally circumventing paywalls to access scholarly publications is becoming a more mainstream phenomenon. The web service Sci-Hub is amongst the biggest facilitators of this, offering free access to around 62 million publications. So far, how and why its users access publications through Sci-Hub has not been well studied. By utilizing the recently released corpus of Sci-Hub and comparing it to the data of ~28 million downloads done through the service, this study tries to address some of these questions. The comparative analysis shows that both the usage and the complete corpus are largely made up of recently published articles, with users disproportionately favoring newer articles and 35% of downloaded articles being published after 2013. These results hint that embargo periods before publications become Open Access are frequently circumnavigated using Guerilla Open Access approaches like Sci-Hub. On a journal level, the downloads show a bias towards some scholarly disciplines, especially Chemistry, suggesting increased barriers to access for these. Comparing the use and corpus on a publisher level, it becomes clear that only 11% of publishers are highly requested in comparison to the baseline frequency, while 45% of all publishers are significantly less accessed than expected. Despite this, the oligopoly of publishers is even more remarkable on the level of content consumption, with 80% of all downloads being published through only 9 publishers. All of this suggests that Sci-Hub is used by different populations and for a number of different reasons, and that there is still a lack of access to the published scientific record. A further analysis of these openly available data resources will undoubtedly be valuable for the investigation of academic publishing.
Peer review of research articles is a core part of our scholarly communication system. In spite of its importance, the status and purpose of peer review are often contested. What is its role in our modern digital research and communications infrastructure? Does it perform to the high standards with which it is generally regarded? Studies of peer review have shown that it is prone to bias and abuse in numerous dimensions, frequently unreliable, and can fail to detect even fraudulent research. With the advent of Web technologies, we are now witnessing a phase of innovation and experimentation in our approaches to peer review. These developments prompted us to examine emerging models of peer review from a range of disciplines and venues, and to ask how they might address some of the issues with our current systems of peer review. We examine the functionality of a range of social Web platforms, and compare these with the traits underlying a viable peer review system: quality control, quantified performance metrics as engagement incentives, and certification and reputation. Ideally, any new systems will demonstrate that they out-perform current models while avoiding as many of the biases of existing systems as possible. We conclude that there is considerable scope for new peer review initiatives to be developed, each with their own potential issues and advantages. We also propose a novel hybrid platform model that, at least partially, resolves many of the technical and social issues associated with peer review, and can potentially disrupt the entire scholarly communication system. Success for any such development relies on reaching a critical threshold of research community engagement with both the process and the platform, and therefore cannot be achieved without a significant change of incentives in research environments.
We explored the characteristics and motivations of people who, having obtained their genetic or genomic data from Direct-To-Consumer genetic testing (DTC-GT) companies, voluntarily decide to share them on the publicly accessible web platform openSNP. The study is the first attempt to describe open data sharing activities undertaken by individuals without institutional oversight. In the paper we provide a detailed overview of the distribution of the demographic characteristics and motivations of people engaged in genetic or genomic open data sharing. The geographical distribution of the respondents showed the USA as dominant. There was no significant gender divide, the age distribution was broad, educational background varied and respondents with and without children were equally represented. Health, even though prominent, was not the respondents’ primary or only motivation to be tested. As to their motivations to openly share their data, 86.05% indicated wanting to learn about themselves as relevant, followed by contributing to the advancement of medical research (80.30%), improving the predictability of genetic testing (76.02%) and considering it fun to explore genotype and phenotype data (75.51%). Whereas most respondents were well aware of the privacy risks of their involvement in open genetic data sharing and considered the possibility of direct, personal repercussions troubling, they estimated the risk of this happening to be negligible. Our findings highlight the diversity of DTC-GT consumers who decide to openly share their data. Instead of focusing exclusively on health-related aspects of genetic testing and data sharing, our study emphasizes the importance of taking into account benefits and risks that stretch beyond the health spectrum. Our results thus lend further support to the call for a broader and multi-faceted conceptualization of genomic utility.