Background: Long sequencing reads allow increasing contiguity and completeness of fragmented, short-read–based genome assemblies by closing assembly gaps, ideally at high accuracy. While several gap-closing methods have been developed, these methods often close an assembly gap with sequence that does not accurately represent the true sequence.
Findings: Here, we present DENTIST, a sensitive, highly accurate, and automated pipeline to close gaps in short-read assemblies with long error-prone reads. DENTIST comprehensively determines repetitive assembly regions to identify reliable and unambiguous alignments of long reads to the correct loci, integrates a consensus sequence computation step to obtain a high base accuracy for the inserted sequence, and validates the accuracy of closed gaps. Unlike previous benchmarks, we generated test assemblies that have gaps at the exact positions where real short-read assemblies have gaps. Generating such realistic benchmarks for Drosophila (134 Mb genome), Arabidopsis (119 Mb), hummingbird (1 Gb), and human (3 Gb) and using simulated or real PacBio continuous long reads, we show that DENTIST consistently achieves substantially higher accuracy than previous methods while maintaining similar sensitivity.
Conclusion: DENTIST provides an accurate approach to improve the contiguity and completeness of fragmented assemblies with long reads. DENTIST's source code including a Snakemake workflow, conda package, and Docker container is available at https://github.com/a-ludi/dentist. All test assemblies as a resource for future benchmarking are at https://bds.mpi-cbg.de/hillerlab/DENTIST/.
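The benchmarking idea described in the findings, placing gaps in a test assembly at the positions where a real short-read assembly has them, can be illustrated with a minimal sketch. This is not DENTIST's code: the function names `find_gaps` and `insert_gaps` are hypothetical, gaps are assumed to be encoded as runs of `N`, and the gap coordinates are assumed to be already expressed in reference coordinates (in practice this requires aligning the short-read assembly to the reference).

```python
import re


def find_gaps(seq):
    """Return half-open (start, end) intervals for every run of N in seq."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", seq)]


def insert_gaps(reference, gaps):
    """Mask the given half-open intervals of reference with N.

    Assumes the intervals are already in reference coordinates.
    """
    chars = list(reference)
    for start, end in gaps:
        chars[start:end] = "N" * (end - start)
    return "".join(chars)


# Toy example: copy the gap layout of a scaffold onto a gap-free reference.
scaffold = "ACGTNNNNACGTACNNGT"
reference = "ACGTACGTACGTACGTGT"
gaps = find_gaps(scaffold)
test_assembly = insert_gaps(reference, gaps)
```

Because the masked sequence is known from the reference, each gap closed by a tool can afterwards be compared against the true sequence at the same interval.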
Feeding exclusively on blood, vampire bats represent the only obligate sanguivorous lineage among mammals. To uncover genomic changes associated with adaptations to this unique dietary specialization, we generated a new haplotype-resolved reference-quality genome of the common vampire bat (Desmodus rotundus) and screened 26 bat species for genes that were specifically lost in the vampire bat lineage. We discovered previously unknown gene losses that relate to metabolic and physiological changes, such as reduced insulin secretion (FFAR1, SLC30A8), limited glycogen stores (PPP1R3E), and a distinct gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2, CTRL) and distinct pathogen diversity of blood (RNASE7). Interestingly, the loss of REP15 likely helped vampire bats to adapt to high dietary iron levels by enhancing iron excretion, and the loss of the 24S-hydroxycholesterol metabolizing enzyme CYP39A1 could contribute to their exceptional cognitive abilities. Finally, losses of key cone phototransduction genes (PDE6H, PDE6C) suggest that these strictly nocturnal bats completely lack cone-based vision. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to sanguivory.
Vampire bats are the only mammals that feed exclusively on blood. To uncover genomic changes associated with this dietary adaptation, we generated a haplotype-resolved genome of the common vampire bat and screened 27 bat species for genes that were specifically lost in the vampire bat lineage. We found previously unknown gene losses that relate to reduced insulin secretion (FFAR1 and SLC30A8), limited glycogen stores (PPP1R3E), and a unique gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2 and CTRL) and distinct pathogen diversity of blood (RNASE7) and predict the complete lack of cone-based vision in these strictly nocturnal bats (PDE6H and PDE6C). Notably, REP15 loss likely helped vampire bats adapt to high dietary iron levels by enhancing iron excretion, and the loss of CYP39A1 could have contributed to their exceptional cognitive abilities. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to blood feeding.