Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of a few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 megabase pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically widespread brown bears, leading to large amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. Evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow.
Background: The current taxonomy of the African giraffe (Giraffa camelopardalis) is primarily based on pelage pattern and geographic distribution, and nine subspecies are currently recognized. Although genetic studies have been conducted, their resolution is low, mainly due to limited sampling. Detailed knowledge about the genetic variation and phylogeography of the South African giraffe (G. c. giraffa) and the Angolan giraffe (G. c. angolensis) is lacking. We investigate genetic variation among giraffe matrilines by increased sampling, with a focus on giraffe key areas in southern Africa.
Results: The 1,562-nucleotide-long mitochondrial DNA dataset (cytochrome b and partial control region) comprises 138 parsimony-informative sites among 161 giraffe individuals from eight populations. We additionally included two okapis as an outgroup. The analyses of the maternally inherited sequences reveal a deep divergence between northern and southern giraffe populations in Africa, and a general pattern of distinct matrilineal clades corresponding to their geographic distribution. Divergence time estimates among giraffe populations place the deepest splits at several hundred thousand years ago.
Conclusions: Our increased sampling in southern Africa suggests that the distribution ranges of the Angolan and South African giraffe need to be redefined. Knowledge about the phylogeography and genetic variation of these two maternal lineages is crucial for the development of appropriate management strategies.
Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and complicate phylogenetic inferences, but are not accounted for in phylogenetic analyses of concatenated data. We generated a high-resolution data set of autosomal introns from several individuals per species and of Y-chromosomal markers. Incorporating intraspecific variability in coalescence-based phylogenetic and gene flow estimation approaches, we traced the genealogical history of individual alleles. Considerable heterogeneity among nuclear loci and discordance between nuclear and mitochondrial phylogenies were found. A species tree with divergence time estimates indicated that ursine bears diversified within less than 2 My. Consistent with a complex branching order within a clade of Asian bear species, we identified unidirectional gene flow from Asian black into sloth bears. Moreover, gene flow detected from brown into American black bears can explain the conflicting placement of the American black bear in mitochondrial and nuclear phylogenies. These results highlight that both incomplete lineage sorting and introgression are prominent evolutionary forces even on time scales up to several million years. Complex evolutionary patterns are not adequately captured by strictly bifurcating models, and can only be fully understood when analyzing multiple independently inherited loci in a coalescence framework. Phylogenetic incongruence among gene trees hence needs to be recognized as a biologically meaningful signal.
All giraffe (Giraffa) were previously assigned to a single species (G. camelopardalis) and nine subspecies. However, multi‐locus analyses of all subspecies have shown that there are four genetically distinct clades and suggest four giraffe species. This conclusion might not be fully accepted due to limited data and lack of explicit gene flow analyses. Here, we present an extended study based on 21 independent nuclear loci from 137 individuals. Explicit gene flow analyses identify less than one migrant per generation, including between the closely related northern and reticulated giraffe. Thus, gene flow analyses and population genetics of the extended dataset confirm four genetically distinct giraffe clades and support four independent giraffe species. The new findings support a revision of the IUCN classification of giraffe taxonomy. Three of the four species are threatened with extinction and mostly occur in politically unstable regions; as such, they require the highest conservation support possible.
Background: A number of the deeper divergences in the placental mammal tree are still inconclusively resolved despite extensive phylogenomic analyses. A recent analysis of 200 kbp of protein coding sequences yielded only limited support for the relationships among Laurasiatheria (cow, dog, bat and shrew), probably because the divergences occurred only within a few million years from each other. It is generally expected that increasing the amount of data and improving the taxon sampling enhance the resolution of narrow divergences. Therefore these and other difficult splits were examined by phylogenomic analysis of the hitherto largest sequence alignment. The increasingly complete genome data of placental mammals also allowed developing a novel and stringent data search method.
Results: The rigorous data-handling method, recursive BLAST, successfully removed sequences from gene families, including the well-known hemoglobin, olfactory, myosin and HOX gene families, thus avoiding alignment of possibly paralogous sequences. The current phylogenomic analysis of 3,012 genes (2,844,615 nucleotides) from a total of 22 species yielded statistically significant support for most relationships. While some major clades were confirmed using genomic sequence data, the placement of the treeshrew and bat, and the relationship between Boreoeutheria, Xenarthra and Afrotheria, remained problematic to resolve despite the size of the alignment. Phylogenomic analysis of divergence times dated the basal placental mammal splits at 95–100 million years ago. Many of the following divergences occurred only a few (2–4) million years later. Relationships with narrow divergence time intervals received unexpectedly limited support even from the phylogenomic analyses.
Conclusion: The narrow temporal window within which some placental divergences took place suggests that inconsistencies and limited resolution of the mammalian tree may have their natural explanation in speciation processes such as lineage sorting, introgression from species hybridization or hybrid speciation. These processes obscure phylogenetic analysis, making some parts of the tree difficult to resolve even with genome data.
The massive amount of genomic sequence data that is now available for analyzing evolutionary relationships among 31 placental mammals reduces the stochastic error in phylogenetic analyses to virtually zero. One would expect that this would make it possible to finally resolve controversial branches in the placental mammalian tree. We analyzed a 2,863,797 nucleotide-long alignment (3,364 genes) from 31 placental mammals for reconstructing their evolution. Most placental mammalian relationships were resolved, and a consensus of their evolution is emerging. However, certain branches remain difficult or virtually impossible to resolve. These branches are characterized by short divergence times on the order of 1–4 million years. Computer simulations based on parameters from the real data show that as few as about 12,500 amino acid sites could be sufficient to confidently resolve short branches as old as about 90 million years. Thus, the amount of sequence data should no longer be a limiting factor in resolving the relationships among placental mammals. The timing of the early radiation of placental mammals coincides with a period of climate warming some 100–80 million years ago and with continental fragmentation. These global processes may have triggered the rapid diversification of placental mammals. However, the rapid radiations of certain mammalian groups complicate phylogenetic analyses, possibly due to incomplete lineage sorting and introgression. These speciation-related processes led to a mosaic genome and conflicting phylogenetic signals. Split network methods are ideal for visualizing these problematic branches and can therefore depict data conflict and possibly the true evolutionary history better than strictly bifurcating trees. Given the timing of tectonics, of placental mammalian divergences, and the fossil record, a Laurasian rather than Gondwanan origin of placental mammals seems the most parsimonious explanation.
Key words: continental drift, Cretaceous warming, genome analysis, hybridization, phylogenomics, split decomposition
Our large brain, long life span and high fertility are key elements of human evolutionary success and are often thought to have evolved in interplay with tool use, carnivory and hunting. However, the specific impact of carnivory on human evolution, life history and development remains controversial. Here we show in quantitative terms that dietary profile is a key factor influencing time to weaning across a wide taxonomic range of mammals, including humans. In a model encompassing a total of 67 species and genera from 12 mammalian orders, adult brain mass and two dichotomous variables reflecting species differences regarding limb biomechanics and dietary profile, accounted for 75.5%, 10.3% and 3.4% of variance in time to weaning, respectively, together capturing 89.2% of total variance. Crucially, carnivory predicted the time point of early weaning in humans with remarkable precision, yielding a prediction error of less than 5% with a sample of forty-six human natural fertility societies as reference. Hence, carnivory appears to provide both a necessary and sufficient explanation as to why humans wean so much earlier than the great apes. While early weaning is regarded as essentially differentiating the genus Homo from the great apes, its timing seems to be determined by the same limited set of factors in humans as in mammals in general, despite some 90 million years of evolution. Our analysis emphasizes the high degree of similarity of relative time scales in mammalian development and life history across 67 genera from 12 mammalian orders and shows that the impact of carnivory on time to weaning in humans is quantifiable, and critical. Since early weaning yields shorter interbirth intervals and higher rates of reproduction, with profound effects on population dynamics, our findings highlight the emergence of carnivory as a process fundamentally determining human evolution.
Recent phylogenomic studies have failed to conclusively resolve certain branches of the placental mammalian tree, despite the evolutionary analysis of genomic data from 32 species. Previous analyses of single genes and retroposon insertion data yielded support for different phylogenetic scenarios for the most basal divergences. The results indicated that some mammalian divergences were best interpreted not as a single bifurcating tree, but as an evolutionary network. In these studies the relationships among some orders of the super-clade Laurasiatheria were poorly supported, albeit not studied in detail. Therefore, 4,775 protein-coding genes (6,196,263 nucleotides) were collected and aligned in order to analyze the evolution of this clade. Additionally, over 200,000 introns were screened in silico, resulting in 32 phylogenetically informative long interspersed nuclear element (LINE) insertion events.
The present study shows that the genome evolution of Laurasiatheria may best be understood as an evolutionary network. Thus, contrary to the common expectation to resolve major evolutionary events as a bifurcating tree, genome analyses unveil complex speciation processes even in deep mammalian divergences. We exemplify this on a subset of 1159 suitable genes that have individual histories, most likely due to incomplete lineage sorting or introgression, processes that can make the genealogy of mammalian genomes complex.
These unexpected results have major implications for the understanding of evolution in general, because the evolution of even some higher level taxa such as mammalian orders may sometimes not be interpreted as a simple bifurcating pattern.
Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation.
Background: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis.
Result: The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow, size and number of loci. In addition, we examined the ability of the f-statistics, $\hat{f}_G$ and $\hat{f}_{hom}$, to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic background.
Conclusions: The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
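For readers unfamiliar with the statistic evaluated above, the following is a minimal sketch of how a D-statistic can be computed from ABBA and BABA site-pattern counts in a four-taxon alignment (P1, P2, P3, outgroup). It is an illustration only, not the simulation framework of the study: the function name `d_statistic` and the toy sequences are hypothetical, one haploid sequence per taxon is assumed, and the significance test (commonly a block jackknife) is omitted.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return D = (ABBA - BABA) / (ABBA + BABA) for four aligned sequences.

    Assumption: one haploid sequence per taxon, all of equal length.
    The outgroup allele is treated as ancestral ("A"), the other as derived ("B").
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        column = (a, b, c, o)
        # Use only biallelic sites without gaps or ambiguity codes.
        if any(base not in "ACGT" for base in column) or len(set(column)) != 2:
            continue
        if a == o and b == c:      # ABBA: P2 and P3 share the derived allele
            abba += 1
        elif b == o and a == c:    # BABA: P1 and P3 share the derived allele
            baba += 1

    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / (abba + baba)


# Toy alignment (hypothetical data): three ABBA sites, one BABA site,
# one invariant site and one BBAA site that is ignored.
P1 = "AAAGAG"
P2 = "GGGAAG"
P3 = "GGGGAA"
OUT = "AAAAAA"
print(d_statistic(P1, P2, P3, OUT))
```

With incomplete lineage sorting alone, ABBA and BABA patterns are expected in equal numbers and D is close to zero; an excess of one pattern is the signal that is interpreted as gene flow.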
Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes
(2015)
Background: The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species.
Results: The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago.
Conclusions: Transcriptome data are an economic way to generate genomic resources for evolutionary studies. Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic adaptation in foxes. Fat metabolism seems to play a central role in the adaptation of Arctic foxes to the cold climate, as has been identified in the polar bear, another arctic specialist.
The ancestors to the Australian marsupials entered Australia around 60 (54–72) million years ago from Antarctica, and radiated into the four living orders Peramelemorphia, Dasyuromorphia, Diprotodontia and Notoryctemorphia. The relationship between the four Australian marsupial orders has been a long-standing question, because different phylogenetic studies were not able to consistently reconstruct the same topology. Initial in silico analysis of the Tasmanian devil genome and experimental screening in the seven marsupial orders revealed 20 informative transposable element insertions for resolving the inter- and intraordinal relationships of Australian and South American orders. However, the retrotransposon insertions support three conflicting topologies regarding Peramelemorphia, Dasyuromorphia and Notoryctemorphia, indicating that the split between the three orders may be best understood as a network. This finding is supported by a phylogenetic re-analysis of nuclear gene sequences, using a consensus network approach that allows depicting hidden phylogenetic conflict, otherwise lost when forcing the data into a bifurcating tree. The consensus network analysis agrees with the transposable element analysis in that all possible topologies regarding Peramelemorphia, Dasyuromorphia, and Notoryctemorphia in a rooted four-taxon topology are equally well supported. In addition, retrotransposon insertion data support the South American order Didelphimorphia being the sister group to all other living marsupial orders. The four Australian orders originated within three million years at the Cretaceous-Paleogene boundary. The rapid divergences left conflicting phylogenetic information in the genome, possibly generated by incomplete lineage sorting or introgressive hybridisation, leaving the relationship among Australian marsupial orders unresolvable as a bifurcating process millions of years later.
Four species of true crocodile (genus Crocodylus) have been described from the Americas. Three of these crocodile species exhibit non-overlapping distributions: Crocodylus intermedius in South America, C. moreletii along the Caribbean coast of Mesoamerica, and C. rhombifer confined to Cuba. The fourth, C. acutus, is narrowly sympatric with each of the other three species. In this study, we sampled 113 crocodiles across Crocodylus populations in Cuba, as well as exemplar populations in Belize and Florida (USA), and sequenced three regions of the mitochondrial genome (D-loop, cytochrome b, cytochrome oxidase I; 3,626 base pair long dataset) that overlapped with published data previously collected from Colombia, Jamaica, and the Cayman Islands. Phylogenetic analyses of these data revealed two paraphyletic lineages of C. acutus. One lineage, found in the continental Americas, is the sister taxon to C. intermedius, while the Greater Antillean lineage is most closely related to C. rhombifer. In addition to the paraphyly of the two C. acutus lineages, we recovered a 5.4% estimate of Tamura-Nei genetic divergence between the Antillean and continental clades. The reconstructed paraphyly, distinct phylogenetic affinities and high genetic divergence between Antillean and continental C. acutus populations are consistent with interspecific differentiation within the genus and suggest that the current taxon recognized as C. acutus is more likely a complex of cryptic species warranting a reassessment of current taxonomy. Moreover, the inclusion, for the first time, of samples from the western population of the American crocodile in Cuba revealed evidence for continental mtDNA haplotypes in the Antilles, suggesting this area may constitute a transition zone between distinct lineages of C. acutus. Further study using nuclear character data is warranted to more fully characterize this cryptic diversity, resolve taxonomic uncertainty, and inform conservation planning in this system.
The iconic Australasian kangaroos and wallabies represent a successful marsupial radiation. However, the evolutionary relationship within the two genera, Macropus and Wallabia, is controversial: mitochondrial and nuclear genes, and morphological data have produced conflicting scenarios regarding the phylogenetic relationships, which in turn impact the classification and taxonomy. We sequenced and analyzed the genomes of 11 kangaroos to investigate the evolutionary cause of the observed phylogenetic conflict. A multilocus coalescent analysis using ∼14,900 genome fragments, each 10 kb long, significantly resolved the species relationships between and among the sister-genera Macropus and Wallabia. The phylogenomic approach reconstructed the swamp wallaby (Wallabia) as nested inside Macropus, making this genus paraphyletic. However, the phylogenomic analyses indicate multiple conflicting phylogenetic signals in the swamp wallaby genome. This is interpreted as at least one introgression event between the ancestor of the genus Wallabia and a now extinct ghost lineage outside the genus Macropus. Additional phylogenetic signals must therefore be caused by incomplete lineage sorting and/or introgression, but available statistical methods cannot convincingly disentangle the two processes. In addition, the relationships inside the Macropus subgenus M. (Notamacropus) represent a hard polytomy. Thus, the relationships between tammar, red-necked, agile, and parma wallabies remain unresolvable even with whole-genome data. Even if most methods resolve bifurcating trees from genomic data, hard polytomies, incomplete lineage sorting, and introgression complicate the interpretation of the phylogeny and thus taxonomy.
Background: The genome of the carnivorous marsupial, the Tasmanian devil (Sarcophilus harrisii, Order: Dasyuromorphia), was sequenced in the hopes of finding a cure for or gaining a better understanding of the contagious devil facial tumor disease that is threatening the species’ survival. To better understand the Tasmanian devil genome, we screened it for transposable elements and investigated the dynamics of short interspersed element (SINE) retroposons.
Results: The temporal history of Tasmanian devil SINEs, elucidated using a transposition-in-transposition analysis, indicates that WSINE1, a CORE-SINE present in around 200,000 copies, is the most recently active element. Moreover, we discovered a new subtype of WSINE1 (WSINE1b) that comprises at least 90% of all Tasmanian devil WSINE1s. The frequencies of WSINE1 subtypes differ in the genomes of two of the other Australian marsupial orders. A co-segregation analysis indicated that at least 66 subfamilies of WSINE1 evolved during the evolution of Dasyuromorphia. Using a substitution rate derived from WSINE1 insertions, the ages of the subfamilies were estimated and correlated with a newly established phylogeny of Dasyuromorphia. Phylogenetic analyses and divergence time estimates of mitochondrial genome data indicate a rapid radiation of the Tasmanian devil and its closest relatives, the quolls (Dasyurus), around 14 million years ago.
Conclusions: The radiation and abundance of CORE-SINEs in marsupial genomes indicate that they may be a major player in the evolution of marsupials. It is evident that the early phases of evolution of the carnivorous marsupial order Dasyuromorphia were characterized by a burst of SINE activity. A correlation between a speciation event and a major burst of retroposon activity is shown for the first time in a marsupial genome.
Despite numerous large-scale phylogenomic studies, certain parts of the mammalian tree are extraordinarily difficult to resolve. We used the coding regions from 19 completely sequenced genomes to study the relationships within the super-clade Euarchontoglires (Primates, Rodentia, Lagomorpha, Dermoptera and Scandentia) because the placement of Scandentia within this clade is controversial. The difficulty in resolving this issue is due to the short time spans between the early divergences of Euarchontoglires, which may cause incongruent gene trees. The conflict in the data can be depicted by network analyses and the contentious relationships are best reconstructed by coalescent-based analyses. This method is expected to be superior to analyses of concatenated data in reconstructing a species tree from numerous gene trees. The total concatenated dataset used to study the relationships in this group comprises 5,875 protein-coding genes (9,799,170 nucleotides) from all orders except Dermoptera (flying lemurs). Reconstruction of the species tree from 1,006 gene trees using coalescent models placed Scandentia as sister group to the primates, which is in agreement with maximum likelihood analyses of concatenated nucleotide sequence data. Additionally, both analytical approaches favoured the Tarsier to be sister taxon to Anthropoidea, thus belonging to the Haplorrhine clade. When divergence times are short such as in radiations over periods of a few million years, even genome scale analyses struggle to resolve phylogenetic relationships. On these short branches processes such as incomplete lineage sorting and possibly hybridization occur and make it preferable to base phylogenomic analyses on coalescent methods.
A range-wide synthesis and timeline for phylogeographic events in the red fox (Vulpes vulpes)
(2013)
Background: Many boreo-temperate mammals have a Pleistocene fossil record throughout Eurasia and North America, but only few have a contemporary distribution that spans this large area. Examples of Holarctic-distributed carnivores are the brown bear, grey wolf, and red fox, all three ecological generalists with large dispersal capacity and a high adaptive flexibility. While the two former have been examined extensively across their ranges, no phylogeographic study of the red fox has been conducted across its entire Holarctic range. Moreover, no study included samples from central Asia, leaving a large sampling gap in the middle of the Eurasian landmass.
Results: Here we provide the first mitochondrial DNA sequence data of red foxes from central Asia (Siberia), and new sequences from several European populations. In a range-wide synthesis of 729 red fox mitochondrial control region sequences, including 677 previously published and 52 newly obtained sequences, this manuscript describes the pattern and timing of major phylogeographic events in red foxes, using a Bayesian coalescence approach with multiple fossil tip and root calibration points. In a 335 bp alignment we found in total 175 unique haplotypes. All newly sequenced individuals belonged to the previously described Holarctic lineage. Our analyses confirmed the presence of three Nearctic- and two Japan-restricted lineages that were formed since the Mid/Late Pleistocene.
Conclusions: The phylogeographic history of red foxes is highly similar to that previously described for grey wolves and brown bears, indicating that climatic fluctuations and habitat changes since the Pleistocene had similar effects on these highly mobile generalist species. All three species originally diversified in Eurasia and later colonized North America and Japan. North American lineages persisted through the last glacial maximum south of the ice sheets, meeting more recent colonizers from Beringia during postglacial expansion into the northern Nearctic. Both brown bears and red foxes colonized Japan’s northern island Hokkaido at least three times, all lineages being most closely related to different mainland lineages. Red foxes, grey wolves, and brown bears thus represent an interesting case where species that occupy similar ecological niches also exhibit similar phylogeographic histories.
Species is the fundamental taxonomic unit in biology and its delimitation has implications for conservation. In giraffe (Giraffa spp.), multiple taxonomic classifications have been proposed since the early 1900s [1]. However, one species with nine subspecies has been generally accepted [2], likely due to limited in-depth assessments, subspecies hybridizing in captivity [3,4], and anecdotal reports of hybrids in the wild [5]. Giraffe taxonomy received new attention after population genetic studies using traditional genetic markers suggested at least four species [6,7]. This view has been met with controversy [8], setting the stage for debate [9,10]. Genomics is significantly enhancing our understanding of biodiversity and speciation relative to traditional genetic approaches and thus has important implications for species delineation and conservation [11]. We present a high-quality de novo genome assembly of the critically endangered Kordofan giraffe (G. camelopardalis antiquorum) [12] and a comprehensive whole-genome analysis of 50 giraffe representing all traditionally recognized subspecies. Population structure and phylogenomic analyses support four separately evolving giraffe lineages, which diverged 230–370 ka ago. These lineages underwent distinct demographic histories and show different levels of heterozygosity and inbreeding. Our results strengthen previous findings of limited gene flow and admixture among putative giraffe species [6,7,9] and establish a genomic foundation for recognizing four species and seven subspecies, the latter of which should be considered as evolutionarily significant units. Achieving a consensus over the number of species and subspecies in giraffe is essential for adequately assessing their threat level and will improve conservation efforts for these iconic taxa.
Reconstructing the evolution of baleen whales (Mysticeti) has been problematic because morphological and genetic analyses have produced different scenarios. This might be caused by genomic admixture that may have taken place among some rorquals. We present the genomes of six whales, including the blue whale (Balaenoptera musculus), to reconstruct a species tree of baleen whales and to identify phylogenetic conflicts. Evolutionary multilocus analyses of 34,192 genome fragments reveal a fast radiation of rorquals at 10.5 to 7.5 million years ago coinciding with oceanic circulation shifts. The evolutionarily enigmatic gray whale (Eschrichtius robustus) is placed among rorquals, and the blue whale genome shows a high degree of heterozygosity. The nearly equal frequency of conflicting gene trees suggests that rorqual speciation occurred under gene flow, which is best depicted by evolutionary networks. Especially in marine environments, sympatric speciation might be common; our results raise questions about how genetic divergence can be established.
Highlights
• Genomes for all five Natrix species, two represented by two distinct subspecies each, were sequenced.
• Two genomes were de-novo assembled to their 1.7 Gb length with a contig N50 of 4.6 Mbp and 1.5 Mbp.
• Evidence for interspecific hybridization, both between allopatric and widely sympatric species.
• Fossil-calibrated molecular clock using genomes indicates that species are ancient, several-million-year-old lineages.
• Our findings imply that speciation took place despite continued gene flow.
Abstract
Understanding speciation is one of the cornerstones of biological diversity research. Currently, speciation is often understood as a continuous process of divergence that continues until genetic or other incompatibilities minimize or prevent interbreeding. The Palearctic snake genus Natrix is an ideal group to study speciation, as it comprises taxa representing distinct stages of the speciation process, ranging from widely interbreeding parapatric taxa through parapatric species with very limited gene flow in narrow hybrid zones to widely sympatric species. To understand the evolution of reproductive isolation through time, we have sequenced the genomes of all five species within this genus and two additional subspecies. We used both long-read and short-read methods to sequence and de-novo-assemble two high-quality genomes (Natrix h. helvetica, Natrix n. natrix) to their 1.7 Gb length with a contig N50 of 4.6 Mbp and 1.5 Mbp, respectively, and used these as references to assemble the remaining short-read-based genomes. Our phylogenomic analyses yielded a well-supported dated phylogeny and evidence for a surprisingly complex history of interspecific gene flow, including between widely sympatric species. Furthermore, evidence for gene flow was also found for currently allopatric species pairs. Genetic exchange among these well-defined, distinct, and several million-year-old reptile species emphasizes that speciation and maintenance of species distinctness can occur despite continued genetic exchange.
All giraffe (Giraffa) were previously assigned to a single species (G. camelopardalis) and nine subspecies. However, multi-locus analyses of all subspecies have shown that there are four genetically distinct clades and suggest four giraffe species. This conclusion might not be fully accepted due to limited data and lack of explicit gene flow analyses. Here we present an extended study based on 21 independent nuclear loci from 137 individuals. Explicit gene flow analyses identify less than one migrant per generation, including between the closely related northern and reticulated giraffe. Thus, gene flow analyses and population genetics of the extended dataset confirm four genetically distinct giraffe clades and support four independent giraffe species. The new findings call for a revision of the IUCN classification of giraffe taxonomy. Three of the four species are threatened with extinction, mostly occurring in politically unstable regions, and as such, require the highest conservation support possible.
Background: Ever decreasing costs along with advances in sequencing and library preparation technologies enable even small research groups to generate chromosome-level assemblies today. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens) that was carried out during a practical university Master’s course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behaviour. We updated the current genome assembly by generating a new long-read nanopore-based assembly with subsequent scaffolding to chromosome-level using previously published HiC data.
Findings: The use of nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp, and a total length of 441 Mbp. Scaffolding using previously published HiC data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly showed the presence of 95.8% complete BUSCO genes from the Actinopterygii dataset, indicating a high quality of the assembly.
Conclusion: We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university Master’s course. The use of ~35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing the students to gain practical skills in modern genomics and generate high-quality results that benefit downstream research projects.
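The contig and scaffold N50 values quoted in several of these records follow the standard definition: the length of the sequence at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total assembly length. L50 is the number of sequences needed to get there. The following is a minimal sketch of that definition only; the function name `n50_l50` and the toy lengths are illustrative assumptions and are not taken from any of the listed studies.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    N50: length of the sequence at which the running total of lengths,
         sorted longest first, first reaches half of the assembly length.
    L50: number of sequences needed to reach that point.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (hypothetical lengths, arbitrary units)
print(n50_l50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A higher N50 together with a lower L50 means that a larger share of the assembly sits in fewer, longer sequences, which is why both values are reported as measures of contiguity.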
Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to the closely related polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using three different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains numerous uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to massive amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. The increasing evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow.
Compared to sequence analyses, phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) obtained 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Screening for single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, even with strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun and sloth bear form a monophyletic clade. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it easy to confidently extract thousands of TE insertions even from low-coverage genomes of non-model organisms, opening new possibilities for biologists to study phylogenies, evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation.
It is generally recognized that large-scale whaling in the 19th and 20th centuries led to a substantial reduction of the size of many cetacean populations, particularly those of the baleen whales (Mysticeti). The impact of these operations on genomic diversity of one of the most hunted whales, the fin whale (Balaenoptera physalus), has remained largely unaddressed because of the paucity of adequate samples and the limitation of applicable techniques. Here, we have examined the effect of whaling on the North Atlantic fin whale based on genomes of 51 individuals from Icelandic waters, representing three temporally separated intervals (1989, 2009 and 2018), and provide a reference genome for the species. Demographic models suggest a noticeable drop of the effective population size of the North Atlantic fin whale around a century ago. The present results suggest that the genome-wide heterozygosity is not markedly reduced and has remained comparable with other baleen whale species. Similarly, there are no signs of apparent inbreeding, as measured by the proportion of long runs of homozygosity, or of a distinctively increased mutational load, as measured by the amount of putative deleterious mutations. The North Atlantic fin whale appears to be less affected by anthropogenic influences than other baleen whales such as the North Atlantic right whale, consistent with the presence of long runs of homozygosity and higher levels of mutational load in an otherwise more heterozygous genome. Thus, genome-wide assessments of other species and populations are essential for future, more specific, conservation efforts.
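Inbreeding "measured by the proportion of long runs of homozygosity" is commonly summarized as F_ROH: the summed length of runs of homozygosity (ROH) above a length threshold, divided by the length of the genome that was screened. The sketch below illustrates that ratio only. The function name `f_roh`, the 1 Mb default threshold and the example values are assumptions chosen for illustration; they are not the settings or data of the study above.

```python
def f_roh(roh_lengths_bp, genome_length_bp, min_length_bp=1_000_000):
    """Fraction of the screened genome covered by long runs of homozygosity.

    roh_lengths_bp   : lengths (bp) of all ROH detected in one individual
    genome_length_bp : length (bp) of the genome region that was screened
    min_length_bp    : only ROH at least this long count as "long"
                       (assumed default of 1 Mb, for illustration)
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    long_total = sum(length for length in roh_lengths_bp
                     if length >= min_length_bp)
    return long_total / genome_length_bp


# Hypothetical individual: three ROH, of which two pass the 1 Mb threshold
print(f_roh([2_000_000, 500_000, 3_000_000], 100_000_000))
```

Because long ROH tend to arise from recent common ancestry of the two parental haplotypes, raising the threshold focuses the statistic on more recent inbreeding.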
Three of the four species of giraffe are threatened, particularly the northern giraffe (Giraffa camelopardalis), which collectively have the smallest known wild population estimates. Among the three subspecies of the northern giraffe, the West African giraffe (Giraffa camelopardalis peralta) had declined to 49 individuals by 1996 and only recovered due to conservation efforts undertaken in the past 25 years, while the Kordofan giraffe (Giraffa camelopardalis antiquorum) remains at <2300 individuals distributed in small, isolated populations over a large geographical range in Central Africa. These combined factors could lead to genetically depauperated populations. We analyzed 119 mitochondrial sequences and 26 whole genomes of northern giraffe individuals to investigate their population structure and assess the recent demographic history and current genomic diversity of West African and Kordofan giraffe. Phylogenetic and population structure analyses separate the three subspecies of northern giraffe and suggest genetic differentiation between populations from eastern and western areas of the Kordofan giraffe’s range. Both West African and Kordofan giraffe show a gradual decline in effective population size over the last 10 ka and have moderate genome-wide heterozygosity compared to other giraffe species. Recent inbreeding levels are higher in the West African giraffe and in Kordofan giraffe from Garamba National Park, Democratic Republic of Congo. Although numbers for both West African and some populations of Kordofan giraffe have increased in recent years, the threat of habitat loss, climate change impacts, and illegal hunting persists. Thus, future conservation actions should consider close genetic monitoring of populations to detect and, where practical, counteract negative trends that might develop.
Background: Genome sequencing of all known eukaryotes on Earth promises unprecedented advances in biological sciences and in biodiversity-related applied fields such as environmental management and natural product research. Advances in long-read DNA sequencing make it feasible to generate high-quality genomes for many non–genetic model species. However, long-read sequencing today relies on sizable quantities of high-quality, high molecular weight DNA, which is mostly obtained from fresh tissues. This is a challenge for biodiversity genomics of most metazoan species, which are tiny and need to be preserved immediately after collection. Here we present de novo genomes of 2 species of submillimeter Collembola. For each, we prepared the sequencing library from high molecular weight DNA extracted from a single specimen and using a novel ultra-low input protocol from Pacific Biosciences. This protocol requires a DNA input of only 5 ng, permitted by a whole-genome amplification step.
Results: The 2 assembled genomes have N50 values >5.5 and 8.5 Mb, respectively, and both contain ∼96% of BUSCO genes. Thus, they are highly contiguous and complete. The genomes are supported by an integrative taxonomy approach including placement in a genome-based phylogeny of Collembola and designation of a neotype for 1 of the species. Higher heterozygosity values are recorded in the more mobile species. Both species are devoid of the biosynthetic pathway for β-lactam antibiotics known in several Collembola, confirming the tight correlation of antibiotic synthesis with the species' way of life.
Conclusions: It is now possible to generate high-quality genomes from single specimens of minute, field-preserved metazoans, exceeding the minimum contig N50 (1 Mb) required by the Earth BioGenome Project.
Background: In the speciation continuum, the strength of reproductive isolation varies, and species boundaries are blurred by gene flow. Interbreeding among giraffe (Giraffa spp.) in captivity is known, and anecdotal reports of natural hybrids exist. In Kenya, Nubian (G. camelopardalis camelopardalis), reticulated (G. reticulata), and Masai giraffe sensu stricto (G. tippelskirchi tippelskirchi) are parapatric, and thus, the country might be a melting pot for these taxa. We analyzed 128 genomes of wild giraffe, 113 newly sequenced, representing these three taxa.
Results: We found varying levels of Nubian ancestry in 13 reticulated giraffe sampled across the Laikipia Plateau most likely reflecting historical gene flow between these two lineages. Although comparatively weaker signs of ancestral gene flow and potential mitochondrial introgression from reticulated into Masai giraffe were also detected, estimated admixture levels between these two lineages are minimal. Importantly, contemporary gene flow between East African giraffe lineages was not statistically significant. Effective population sizes have declined since the Late Pleistocene, more severely for Nubian and reticulated giraffe.
Conclusions: Despite historically hybridizing, these three giraffe lineages have maintained their overall genomic integrity suggesting effective reproductive isolation, consistent with the previous classification of giraffe into four species.
The snake pipefish, Entelurus aequoreus (Linnaeus, 1758), is a slender, up to 60 cm long, northern Atlantic fish that dwells in open seagrass habitats and has recently expanded its distribution range. The snake pipefish is part of the family Syngnathidae (seahorses and pipefish) that has undergone several characteristic morphological changes, such as loss of pelvic fins and elongated snout. Here, we present a highly contiguous, near chromosome-scale genome of the snake pipefish assembled as part of a university master’s course. The final assembly has a length of 1.6 Gbp in 7,391 scaffolds, a scaffold and contig N50 of 62.3 Mbp and 45.0 Mbp and L50 of 12 and 14, respectively. The largest 28 scaffolds (>21 Mbp) span 89.7% of the assembly length. A BUSCO completeness score of 94.1% and a mapping rate above 98% suggest a high assembly completeness. Repetitive elements cover 74.93% of the genome, one of the highest proportions so far identified in vertebrate genomes. Demographic modeling using the PSMC framework indicates a peak in effective population size (50 – 100 kya) during the last interglacial period and suggests that the species might largely benefit from warmer water conditions, as seen today. Our updated snake pipefish assembly forms an important foundation for further analysis of the morphological and molecular changes unique to the family Syngnathidae.
Background: In the speciation continuum the strength of reproductive isolation varies, and species boundaries are blurred by gene flow. Interbreeding among giraffe (Giraffa spp.) in captivity is known and anecdotal reports of natural hybrids exist. In Kenya, Nubian (G. camelopardalis camelopardalis), reticulated (G. reticulata), and Masai giraffe sensu stricto (G. tippelskirchi tippelskirchi) are parapatric, and thus the country might be a melting pot for these taxa. We analyzed 128 genomes of wild giraffe, 113 newly sequenced, representing these three taxa.
Results: We found varying levels of Nubian ancestry in 13 reticulated giraffe sampled across the Laikipia Plateau most likely reflecting historical gene flow between these two lineages. Although comparatively weaker signs of ancestral gene flow and potential mitochondrial introgression from reticulated into Masai giraffe were also detected, estimated admixture levels between these two lineages are minimal. Importantly, contemporary gene flow between East African giraffe lineages was not statistically significant. Effective population sizes have declined since the Late Pleistocene, more severely for Nubian and reticulated giraffe.
Conclusions: Despite historically hybridizing, these three giraffe lineages have maintained their overall genomic integrity suggesting effective reproductive isolation, consistent with the previous classification of giraffe into four species.
Feeding exclusively on blood, vampire bats represent the only obligate sanguivorous lineage among mammals. To uncover genomic changes associated with adaptations to this unique dietary specialization, we generated a new haplotype-resolved reference-quality genome of the common vampire bat (Desmodus rotundus) and screened 26 bat species for genes that were specifically lost in the vampire bat lineage. We discovered previously unknown gene losses that relate to metabolic and physiological changes, such as reduced insulin secretion (FFAR1, SLC30A8), limited glycogen stores (PPP1R3E), and a distinct gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2, CTRL) and distinct pathogen diversity of blood (RNASE7). Interestingly, the loss of REP15 likely helped vampire bats to adapt to high dietary iron levels by enhancing iron excretion, and the loss of the 24S-hydroxycholesterol metabolizing enzyme CYP39A1 could contribute to their exceptional cognitive abilities. Finally, losses of key cone phototransduction genes (PDE6H, PDE6C) suggest that these strictly nocturnal bats completely lack cone-based vision. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to sanguivory.
Phylogenetic analyses of nuclear and mitochondrial genomes have shown that polar bears captured the mitochondrial genome of brown bears some 160,000 years ago. This hybridization event likely led to an extinction of the original polar bear mitochondrial genome. However, parts of the mitochondrial DNA occasionally integrate into the nuclear genome, forming pseudogenes called numts (nuclear mitochondrial integrations). Screening the polar bear genome for numts, we identified only 13 such integrations. Analyses of whole-genome sequences from additional polar bears, brown and American black bears as well as the giant panda indicate that the discovered numts entered the bear lineage before the initial ursid radiation some 14 million years ago. Our findings suggest a low integration rate of numts in the bear lineage and a complete loss of the original polar bear mitochondrial genome.
Phylogenetic analyses of nuclear and mitochondrial genomes indicate that polar bears captured the brown bear mitochondrial genome 160,000 years ago, leading to an extinction of the original polar bear mitochondrial genome. However, mitochondrial DNA occasionally integrates into the nuclear genome, forming pseudogenes called numts (nuclear mitochondrial integrations). Screening the polar bear genome identified only 13 numts. Genomic analyses of two additional ursine bears and giant panda indicate that all except one of the discovered numts entered the bear lineage at least 14 million years ago. However, short read genome assemblies might lead to an under-representation of numts or other repetitive sequences. Our findings suggest low integration rates of numts in bears and a loss of the original polar bear mitochondrial genome.
Bird-mediated seed dispersal is crucial for the regeneration and viability of ecosystems, often resulting in complex mutualistic species networks. Yet, how this mutualism drives the evolution of seed dispersing birds is still poorly understood. In the present study we combine whole genome re-sequencing analyses and morphometric data to assess the evolutionary processes that shaped the diversification of the Eurasian nutcracker (Nucifraga), a seed disperser known for its mutualism with pines (Pinus). Our results show that the divergence and phylogeographic patterns of nutcrackers resemble those of other non-mutualistic passerine birds and suggest that their early diversification was shaped by similar biogeographic and climatic processes. The limited variation in foraging traits indicates that local adaptation to pines likely played a minor role. Our study shows that close mutualistic relationships between bird and plant species might not necessarily act as a primary driver of evolution and diversification in resource-specialized birds.
Vampire bats are the only mammals that feed exclusively on blood. To uncover genomic changes associated with this dietary adaptation, we generated a haplotype-resolved genome of the common vampire bat and screened 27 bat species for genes that were specifically lost in the vampire bat lineage. We found previously unknown gene losses that relate to reduced insulin secretion (FFAR1 and SLC30A8), limited glycogen stores (PPP1R3E), and a unique gastric physiology (CTSE). Other gene losses likely reflect the biased nutrient composition (ERN2 and CTRL) and distinct pathogen diversity of blood (RNASE7) and predict the complete lack of cone-based vision in these strictly nocturnal bats (PDE6H and PDE6C). Notably, REP15 loss likely helped vampire bats adapt to high dietary iron levels by enhancing iron excretion, and the loss of CYP39A1 could have contributed to their exceptional cognitive abilities. These findings enhance our understanding of vampire bat biology and the genomic underpinnings of adaptations to blood feeding.