The success of social insects is largely intertwined with their highly advanced chemical communication system, which facilitates recognition and discrimination of species and nest-mates, recruitment, and division of labor. Hydrocarbons, which cover the cuticle of insects, not only serve as waterproofing agents but also constitute a major component of this communication system. Two cryptic Crematogaster species, which share their nest with Camponotus ants, show striking diversity in their cuticular hydrocarbon (CHC) profile. This mutualistic system therefore offers a great opportunity to study the genetic basis of CHC divergence between sister species. As a basis for further genome-wide studies, high-quality genomes are needed. Here, we present the annotated draft genome for Crematogaster levior A. By combining the three most commonly used sequencing techniques (Illumina, PacBio, and Oxford Nanopore), we constructed a high-quality de novo ant genome. We show that even low coverage of long reads can add significantly to overall genome contiguity. Annotation of desaturase and elongase genes, which play a role in CHC biosynthesis, revealed one of the largest repertoires in ants and a higher number of desaturases in general than in other Hymenoptera. This may provide a mechanistic explanation for the high diversity observed in C. levior CHC profiles.
Several members of the genus Legionella cause Legionnaires’ disease, a potentially debilitating form of pneumonia. Studies frequently focus on the large number of virulence factors present in this genus. However, the role of secondary metabolites from Legionella is often overlooked. Following whole genome sequencing, we assembled and annotated the Legionella parisiensis DSM 19216 genome. Using this genome together with those of 14 other members of Legionella, we performed comparative genomics and analysed the secondary metabolite potential of each strain. We found that Legionella contains a huge variety of biosynthetic gene clusters (BGCs) that potentially produce a significant number of novel natural products with undefined function. Surprisingly, only a single Sfp-like phosphopantetheinyl transferase is found in all Legionella strains analysed; it might be responsible for the activation of all carrier proteins in primary (fatty acid biosynthesis) and secondary metabolism (polyketide and non-ribosomal peptide synthesis). Using conserved active site motifs, we predict some novel compounds that are probably involved in cell-cell communication and differ from known communication systems. We identify several gene clusters that may represent novel signaling mechanisms, and demonstrate the natural product potential of Legionella.
The analysis of DNA sequences has been a focus of biological research at least since their central role in the inheritance of organismal traits was established. Recently, state-of-the-art methods have made it possible to study complete genomes. This opens access to genome-wide information, as opposed to marker-based analyses of limited informative value. A genome sequence is the ultimate source of organismal information. However, for technical and biological reasons this information is often complex and usually raises more questions than it answers.
Reconstructing a previously unknown genome sequence from short sequences is a technical challenge that rests on fundamental assumptions which do not necessarily hold in reality. In addition, biological factors such as repeat content or heterozygosity can strongly affect the error rate of an assembly. Assessing the quality of a de novo assembly is challenging but at the same time essential. Subsequently, a structural and functional annotation of genes, coding regions, and repeats is required before comprehensive biological questions can be answered. A high-quality, annotated assembly enables genome-wide analyses of individuals and populations. This thesis comprises the assembly and annotation of the genome of the freshwater snail Radix auricularia and a comparative genomics study of five individuals from different molecular groups (MOTUs).
After insects, molluscs harbour the greatest species diversity among the animal phyla and colonise a wide variety of habitats, some of them extreme. Despite their great importance for biodiversity research, comparatively few genomic data are publicly available. Moreover, species of the genus Radix are established as model organisms in several biological disciplines, partly because of their wide geographic distribution. Beyond the fields already studied, an annotated genome sequence enables research on fundamental biological questions, such as the mechanisms of hybridisation and speciation. By assembling and scaffolding six whole genome shotgun libraries with different insert sizes, together with transcript-based scaffolding, a comparatively contiguous assembly was obtained despite the high repeat content. The considerable difference between the total length of the assembly and the estimated genome size could largely be attributed to collapsed repeats.
The structural annotation, based on transcriptomes, proteins from a database, and species-specifically trained gene prediction models, resulted in 17,338 protein-coding genes covering about 12.5% of the estimated genome size. The annotation is considered to be of high quality on the basis of, among other things, the core orthologs it contains, conserved protein domain arrangements, and its agreement with de novo sequenced peptides.
Mapping the sequences of five Radix MOTUs against the R. auricularia assembly showed strongly reduced coverage outside coding regions in the non-reference MOTUs, owing to high nucleotide diversity. For 16,039 genes, topologies could be computed and a test for positive selection performed. Overall, positive selection was detected in 678 different genes across all MOTUs, with each MOTU harbouring an almost unique set of positively selected genes. Of all 16,039 genes examined, 56.4% could be functionally annotated. This low rate is probably caused by the lack of genomic information for molluscs. Subsequent functional enrichment analyses are therefore representative only to a limited extent.
In addition to the biological results, methods and optimisations for genomic analyses of non-model organisms were developed. These include custom scripts, for example to filter transcriptome alignments, to run the training of a gene prediction model in an automated and parallelised manner, and to extract orthogroups of particular species from an orthology prediction. In addition, workflows were developed to integrate as much of the available data as possible into the assembly and annotation. For example, an additional scaffolding step using specifically assembled transcripts of several MOTUs was carried out sequentially and on phylogenetic grounds.
Overall, a comprehensive and high-quality genome sequence of a freshwater mollusc is presented, which provides a foundation for future research projects, e.g. in biodiversity, population genomics, and molecular ecology. The results of this thesis add to the knowledge of mollusc genomics; despite their species richness, molluscs have so far been markedly underrepresented in terms of assembled and annotated genomes.
Vampire bats are the only mammals that feed exclusively on blood. To uncover genomic changes associated with this dietary adaptation, we generated a haplotype-resolved genome of the common vampire bat and screened 27 bat species for genes that were specifically lost in the vampire bat lineage. We found previously unknown gene losses that relate to reduced insulin secretion (FFAR1 and SLC30A8), limited glycogen stores (PPP1R3E), and a unique gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2 and CTRL) and distinct pathogen diversity of blood (RNASE7) and predict the complete lack of cone-based vision in these strictly nocturnal bats (PDE6H and PDE6C). Notably, REP15 loss likely helped vampire bats adapt to high dietary iron levels by enhancing iron excretion, and the loss of CYP39A1 could have contributed to their exceptional cognitive abilities. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to blood feeding.
The gradual heterogeneity of climatic factors poses varying selection pressures across geographic distances that leave signatures of clinal variation in the genome. Separating signatures of clinal adaptation from signatures of other evolutionary forces, such as demographic processes, genetic drift, and adaptation to non-clinal conditions of the immediate local environment, is a major challenge. Here, we examine climate adaptation in five natural populations of the harlequin fly Chironomus riparius sampled along a climatic gradient across Europe. Our study integrates experimental data, individual genome resequencing, Pool-Seq data, and population genetic modelling. Common-garden experiments revealed a positive correlation between population growth rates and population origin along the climate gradient, suggesting thermal adaptation on the phenotypic level. Based on a population genomic analysis, we derived empirical estimates of historical demography and migration. We used an FST outlier approach to infer positive selection across the climate gradient, in combination with an environmental association analysis. In total, we identified 162 candidate genes as the genomic basis of climate adaptation. Enriched functions among these candidate genes involved the apoptotic process and molecular response to heat, as well as functions identified in other studies of climate adaptation in other insects. Our results show that local climate conditions impose strong selection pressures and lead to genomic adaptation despite strong gene flow. Moreover, these results imply that selection to different climatic conditions seems to converge on a functional level, at least between different insect species.
Feeding exclusively on blood, vampire bats represent the only obligate sanguivorous lineage among mammals. To uncover genomic changes associated with adaptations to this unique dietary specialization, we generated a new haplotype-resolved reference-quality genome of the common vampire bat (Desmodus rotundus) and screened 26 bat species for genes that were specifically lost in the vampire bat lineage. We discovered previously unknown gene losses that relate to metabolic and physiological changes, such as reduced insulin secretion (FFAR1, SLC30A8), limited glycogen stores (PPP1R3E), and a distinct gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2, CTRL) and distinct pathogen diversity of blood (RNASE7). Interestingly, the loss of REP15 likely helped vampire bats to adapt to high dietary iron levels by enhancing iron excretion, and the loss of the 24S-hydroxycholesterol metabolizing enzyme CYP39A1 could contribute to their exceptional cognitive abilities. Finally, losses of key cone phototransduction genes (PDE6H, PDE6C) suggest that these strictly nocturnal bats completely lack cone-based vision. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to sanguivory.
Molluscs are the second most species-rich phylum in the animal kingdom, yet only 11 genomes of this group have been published so far. Here, we present the draft genome sequence of the pulmonate freshwater snail Radix auricularia. Six whole genome shotgun libraries with different layouts were sequenced. The resulting assembly comprises 4,823 scaffolds with a cumulative length of 910 Mb and an overall read coverage of 72×. The assembly contains 94.6% of a metazoan core gene collection, indicating an almost complete coverage of the coding fraction. The discrepancy of ∼690 Mb compared with the estimated genome size of R. auricularia (1.6 Gb) results from a high repeat content of 70%, mainly comprising DNA transposons. The annotation of 17,338 protein-coding genes was supported by the use of publicly available transcriptome data. This draft will serve as a starting point for further genomic and population genetic research in this scientifically important phylum.
Molluscs are the second most species-rich phylum in the animal kingdom, yet only eleven genomes of this group have been published so far. Here, we present the draft genome sequence of the pulmonate freshwater snail Radix auricularia. Six whole genome shotgun libraries with different layouts were sequenced. The resulting assembly comprises 4,823 scaffolds with a cumulative length of 910 Mb and an overall read coverage of 72×. The assembly contains 94.6% of a metazoan core gene collection, indicating an almost complete coverage of the coding fraction. The discrepancy of ~690 Mb compared to the estimated genome size of R. auricularia (1.6 Gb) results from a high repeat content of 70%, mainly comprising DNA transposons. The annotation of 17,338 protein-coding genes was supported by the use of publicly available transcriptome data. This draft will serve as a starting point for further genomic and population genetic research in this scientifically important phylum.
Precise estimates of genome sizes are important parameters for both theoretical and practical biodiversity genomics. We present here a fast, easy-to-implement and precise method to estimate genome size from the number of bases sequenced and the mean sequence coverage. To estimate the latter, we take advantage of the fact that the Poisson distribution parameter lambda can be estimated precisely from truncated data, restricted to the part of the coverage distribution that represents the true underlying distribution. Using simulations, we show that reasonable genome size estimates can be obtained even from low-coverage (10×), highly discontinuous genome drafts. Comparison of estimates from a wide range of taxa and sequencing strategies with flow-cytometry estimates of the same individuals showed a very good fit and suggested that both methods yield comparable, interchangeable results.
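The estimation idea in the abstract above can be illustrated with a short Python sketch: fit a Poisson to the part of a coverage histogram that lies inside a chosen depth window, then divide the number of sequenced bases by the fitted lambda. The function names, the histogram format (depth mapped to number of positions), the choice of window, and the ternary-search maximum-likelihood fit are all assumptions made for illustration; the abstract does not say how the authors implement the estimate.

```python
import math


def truncated_poisson_lambda(hist, lo, hi, iters=200):
    """Maximum-likelihood Poisson lambda using only depths lo..hi (inclusive).

    hist maps coverage depth -> number of positions observed at that depth.
    Depths outside the window (e.g. repeats, low-coverage errors) are ignored.
    """
    depths = list(range(lo, hi + 1))
    n = sum(hist.get(k, 0) for k in depths)      # positions inside the window
    s = sum(k * hist.get(k, 0) for k in depths)  # summed depth inside the window
    if n == 0:
        raise ValueError("no observations inside the truncation window")
    log_fact = [math.lgamma(k + 1) for k in depths]

    def loglik(theta):
        # theta = log(lambda). Log-likelihood of a Poisson truncated to lo..hi,
        # up to an additive constant; it is concave in theta.
        terms = [k * theta - lf for k, lf in zip(depths, log_fact)]
        m = max(terms)
        log_norm = m + math.log(sum(math.exp(t - m) for t in terms))
        return s * theta - n * log_norm

    # Ternary search for the maximum of the concave log-likelihood.
    a, b = math.log(1e-3), math.log(10.0 * max(hi, 1))
    for _ in range(iters):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if loglik(m1) < loglik(m2):
            a = m1
        else:
            b = m2
    return math.exp((a + b) / 2)


def genome_size(total_bases, lam):
    """Genome size as sequenced bases divided by mean coverage (lambda)."""
    return total_bases / lam
```

In this sketch the window would be set by hand around the main peak of the coverage distribution, so that positions inflated by collapsed repeats or depressed by sequencing errors do not bias the fit; how best to choose it is left open here.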
Background: Ever-decreasing costs, along with advances in sequencing and library preparation technologies, enable even small research groups to generate chromosome-level assemblies today. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens) that was carried out during a practical university Master’s course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behaviour. We updated the current genome assembly by generating a new long-read nanopore-based assembly with subsequent scaffolding to chromosome level using previously published HiC data.
Findings: The use of nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp and a total length of 441 Mbp. Scaffolding using previously published HiC data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly showed the presence of 95.8% complete BUSCO genes from the Actinopterygii dataset, indicating a high quality of the assembly.
Conclusion: We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university Master’s course. The use of ~35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing students to gain practical skills in modern genomics and generate high-quality results that benefit downstream research projects.
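The contiguity figures quoted in the Findings (contig N50, scaffold N50, and the share of the assembly held in the largest scaffolds) can be computed from a list of sequence lengths. The Python sketch below uses the common definition of N50, the length L such that sequences of length at least L together cover at least half of the total assembly length. The function names are hypothetical, and this is not the tooling the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("cannot compute N50 of an empty assembly")


def fraction_in_largest(lengths, k):
    """Fraction of the total assembly length contained in the k longest sequences."""
    total = sum(lengths)
    if total == 0:
        raise ValueError("assembly has zero total length")
    return sum(sorted(lengths, reverse=True)[:k]) / total
```

For example, `n50([2, 3, 4, 5, 6])` returns 5, because the two longest sequences (6 + 5 = 11) already cover at least half of the total length of 20.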