Year of publication
- 2022 (3)
Keywords
- Laurasiatheria (1)
- Scrotifera (1)
- anomaly zone (1)
- assembly gaps (1)
- benchmarking (1)
- exon coalescence (1)
- exon concatenation (1)
- genome assembly (1)
- long sequencing reads (1)
- retrophylogenomics (1)
Background: Long sequencing reads allow increasing contiguity and completeness of fragmented, short-read–based genome assemblies by closing assembly gaps, ideally at high accuracy. While several gap-closing methods have been developed, these methods often close an assembly gap with sequence that does not accurately represent the true sequence.
Findings: Here, we present DENTIST, a sensitive, highly accurate, and automated pipeline for closing gaps in short-read assemblies with long error-prone reads. DENTIST comprehensively determines repetitive assembly regions to identify reliable and unambiguous alignments of long reads to the correct loci, integrates a consensus sequence computation step to obtain a high base accuracy for the inserted sequence, and validates the accuracy of closed gaps. Unlike previous benchmarks, we generated test assemblies that have gaps at the exact positions where real short-read assemblies have gaps. Generating such realistic benchmarks for Drosophila (134 Mb genome), Arabidopsis (119 Mb), hummingbird (1 Gb), and human (3 Gb) and using simulated or real PacBio continuous long reads, we show that DENTIST consistently achieves substantially higher accuracy than previous methods while maintaining similar sensitivity.
Conclusion: DENTIST provides an accurate approach to improve the contiguity and completeness of fragmented assemblies with long reads. DENTIST's source code, including a Snakemake workflow, conda package, and Docker container, is available at https://github.com/a-ludi/dentist. All test assemblies are available as a resource for future benchmarking at https://bds.mpi-cbg.de/hillerlab/DENTIST/.
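The benchmark idea described in the Findings, placing gaps in a test assembly at the positions where a real short-read assembly has them, can be illustrated with a minimal Python sketch. This is not DENTIST's code: the function names `find_gaps` and `mask_gaps` are hypothetical, the sequences are toy strings, and the sketch assumes the gap coordinates have already been mapped onto the reference's coordinate system.

```python
import re

def find_gaps(sequence, min_len=1):
    """Return half-open (start, end) intervals for runs of N/n in a scaffold."""
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]

def mask_gaps(reference, gaps):
    """Overwrite the given intervals of a gap-free sequence with N.

    Assumption: the intervals are already expressed in the reference's
    coordinates, which a real benchmark would have to establish by alignment.
    """
    chars = list(reference)
    for start, end in gaps:
        chars[start:end] = "N" * (end - start)
    return "".join(chars)

# Toy sequences for illustration only.
scaffold = "ACGTNNNNACGTTNNAC"    # stands in for a short-read scaffold with gaps
reference = "ACGTACGTACGTTGCAC"   # stands in for a gap-free reference
gaps = find_gaps(scaffold)
test_assembly = mask_gaps(reference, gaps)
```

A gap closer could then be run on `test_assembly`, and the sequence it inserts at each interval compared with the corresponding slice of `reference`.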
Relationships among laurasiatherian clades represent one of the most highly disputed topics in mammalian phylogeny. In this study, we attempt to disentangle laurasiatherian interordinal relationships using two independent genome-level approaches: (1) quantifying retrotransposon presence/absence patterns, and (2) comparisons of exon datasets at the levels of nucleotides and amino acids. The two approaches revealed contradictory phylogenetic signals, possibly due to a high level of ancestral incomplete lineage sorting. The positions of Eulipotyphla and Chiroptera as the first and second earliest divergences were consistent across the approaches. However, the phylogenetic relationships of Perissodactyla, Cetartiodactyla, and Ferae were contradictory. While retrotransposon insertion analyses suggest a clade with Cetartiodactyla and Ferae, the exon dataset favoured a clade with Cetartiodactyla and Perissodactyla. Future analyses of hitherto unsampled laurasiatherian lineages and synergistic analyses of retrotransposon insertions, exon and conserved intron/intergenic sequences might unravel the conflicting patterns of relationships in this major mammalian clade.
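As a rough illustration of what quantifying presence/absence patterns can mean, the following Python sketch tallies how many markers are shared by each pair among three lineages. It is only a sketch under stated assumptions: the function `count_support` is hypothetical, the lineage labels are placeholders, and the marker sets are invented toy input rather than data or results from the study.

```python
from collections import Counter

def count_support(markers, lineages):
    """Count markers present in exactly two of the given lineages.

    Each marker is the set of lineages in which an insertion is present.
    Markers found in one or in all three lineages do not discriminate
    between the competing pairings and are skipped.
    """
    wanted = frozenset(lineages)
    support = Counter()
    for present in markers:
        shared = frozenset(present) & wanted
        if len(shared) == 2:
            support[tuple(sorted(shared))] += 1
    return support

# Invented toy input with placeholder labels; not data from the study.
markers = [
    {"lineage_A", "lineage_B"},
    {"lineage_A", "lineage_B"},
    {"lineage_A", "lineage_C"},
    {"lineage_B", "lineage_C"},
    {"lineage_A", "lineage_B", "lineage_C"},
]
support = count_support(markers, ["lineage_A", "lineage_B", "lineage_C"])
```

When several pairings each receive non-trivial support, as in this toy input, the conflict is the kind of pattern that incomplete lineage sorting can produce.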
Vampire bats are the only mammals that feed exclusively on blood. To uncover genomic changes associated with this dietary adaptation, we generated a haplotype-resolved genome of the common vampire bat and screened 27 bat species for genes that were specifically lost in the vampire bat lineage. We found previously unknown gene losses that relate to reduced insulin secretion (FFAR1 and SLC30A8), limited glycogen stores (PPP1R3E), and a unique gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2 and CTRL) and distinct pathogen diversity of blood (RNASE7) and predict the complete lack of cone-based vision in these strictly nocturnal bats (PDE6H and PDE6C). Notably, REP15 loss likely helped vampire bats adapt to high dietary iron levels by enhancing iron excretion, and the loss of CYP39A1 could have contributed to their exceptional cognitive abilities. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to blood feeding.