Institut für Ökologie, Evolution und Diversität
Reconstructing the evolution of baleen whales (Mysticeti) has been problematic because morphological and genetic analyses have produced different scenarios. This might be caused by genomic admixture that may have taken place among some rorquals. We present the genomes of six whales, including the blue whale (Balaenoptera musculus), to reconstruct a species tree of baleen whales and to identify phylogenetic conflicts. Evolutionary multilocus analyses of 34,192 genome fragments reveal a fast radiation of rorquals at 10.5 to 7.5 million years ago coinciding with oceanic circulation shifts. The evolutionarily enigmatic gray whale (Eschrichtius robustus) is placed among rorquals, and the blue whale genome shows a high degree of heterozygosity. The nearly equal frequency of conflicting gene trees suggests that speciation during rorqual evolution occurred under gene flow, which is best depicted by evolutionary networks. Especially in marine environments, sympatric speciation might be common; our results raise questions about how genetic divergence can be established.
Background: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis.
Results: The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow, size and number of loci. In addition, we examined the ability of the f-statistics, $\hat{f}_G$ and $\hat{f}_{hom}$, to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic background.
Conclusions: The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
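For readers unfamiliar with the statistic, the following is a minimal Python sketch of the standard ABBA-BABA site-pattern count that underlies D, where D = (nABBA - nBABA) / (nABBA + nBABA) for the four-taxon arrangement (((P1, P2), P3), outgroup). It assumes one aligned haploid sequence per taxon; the function name `d_statistic` and the toy sequences are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence, Tuple


def d_statistic(p1: Sequence[str], p2: Sequence[str],
                p3: Sequence[str], outgroup: Sequence[str]) -> Tuple[float, int, int]:
    """Return (D, n_ABBA, n_BABA) for four aligned haploid sequences.

    The outgroup allele is treated as ancestral ("A"); the other allele
    at a biallelic site is treated as derived ("B").
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("all four sequences must have the same length")

    abba = 0
    baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Only biallelic sites where P3 carries the derived allele are informative.
        if c == o or len({a, b, c, o}) != 2:
            continue
        if a == o and b == c:      # pattern A B B A: P2 shares the derived allele with P3
            abba += 1
        elif a == c and b == o:    # pattern B A B A: P1 shares the derived allele with P3
            baba += 1

    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites found")
    return (abba - baba) / (abba + baba), abba, baba


# Toy alignment: three ABBA sites, one BABA site, two uninformative sites.
d, n_abba, n_baba = d_statistic("ACATAG", "GTGCAG", "GTGTAG", "ACACAA")
print(d, n_abba, n_baba)  # 0.5 3 1
```

Under incomplete lineage sorting alone the two patterns are expected at equal frequency, so D is expected to be near zero; an excess of one pattern is what the statistic interprets as gene flow. In practice, significance is usually assessed with a resampling scheme such as a block jackknife, which this sketch omits.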
It is generally recognized that large-scale whaling in the 19th and 20th century led to a substantial reduction of the size of many cetacean populations, particularly those of the baleen whales (Mysticeti). The impact of these operations on genomic diversity of one of the most hunted whales, the fin whale (Balaenoptera physalus), has remained largely unaddressed because of the paucity of adequate samples and the limitation of applicable techniques. Here, we have examined the effect of whaling on the North Atlantic fin whale based on genomes of 51 individuals from Icelandic waters, representing three temporally separated intervals (1989, 2009 and 2018), and provide a reference genome for the species. Demographic models suggest a noticeable drop of the effective population size of the North Atlantic fin whale around a century ago. The present results suggest that the genome-wide heterozygosity is not markedly reduced and has remained comparable with other baleen whale species. Similarly, there are no signs of apparent inbreeding, as measured by the proportion of long runs of homozygosity, or of a distinctively increased mutational load, as measured by the amount of putative deleterious mutations. The North Atlantic fin whale appears to be less affected by anthropogenic influences than other baleen whales such as the North Atlantic right whale, consistent with the presence of long runs of homozygosity and higher levels of mutational load in the latter's otherwise more heterozygous genome. Thus, genome-wide assessments of other species and populations are essential for future, more specific, conservation efforts.
The snake pipefish, Entelurus aequoreus (Linnaeus, 1758), is a slender, up to 60 cm long, northern Atlantic fish that dwells in open seagrass habitats and has recently expanded its distribution range. The snake pipefish is part of the family Syngnathidae (seahorses and pipefish) that has undergone several characteristic morphological changes, such as loss of pelvic fins and elongated snout. Here, we present a highly contiguous, near chromosome-scale genome of the snake pipefish assembled as part of a university master’s course. The final assembly has a length of 1.6 Gbp in 7,391 scaffolds, a scaffold and contig N50 of 62.3 Mbp and 45.0 Mbp and L50 of 12 and 14, respectively. The largest 28 scaffolds (>21 Mbp) span 89.7% of the assembly length. A BUSCO completeness score of 94.1% and a mapping rate above 98% suggest a high assembly completeness. Repetitive elements cover 74.93% of the genome, one of the highest proportions so far identified in vertebrate genomes. Demographic modeling using the PSMC framework indicates a peak in effective population size (50 – 100 kya) during the last interglacial period and suggests that the species might largely benefit from warmer water conditions, as seen today. Our updated snake pipefish assembly forms an important foundation for further analysis of the morphological and molecular changes unique to the family Syngnathidae.
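The N50 and L50 values quoted above follow the usual definitions: with sequences sorted from longest to shortest, N50 is the length of the sequence at which the running total first reaches half of the assembly length, and L50 is the number of sequences needed to get there. A minimal Python sketch, assuming a plain list of scaffold or contig lengths (the function name `n50_l50` and the toy lengths are illustrative, not data from the assembly):

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running total to half the
        # assembly length defines N50; its rank is L50.
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: running total always reaches half")


# Toy example: total length 300, so half is 150, reached after 80 + 70.
print(n50_l50([10, 20, 30, 40, 50, 70, 80]))  # (70, 2)
```

A high N50 combined with a low L50, as reported for this assembly, indicates that most of the assembly length sits in a small number of long sequences.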
All giraffe (Giraffa) were previously assigned to a single species (G. camelopardalis) and nine subspecies. However, multi‐locus analyses of all subspecies have shown that there are four genetically distinct clades and suggest four giraffe species. This conclusion might not be fully accepted due to limited data and lack of explicit gene flow analyses. Here, we present an extended study based on 21 independent nuclear loci from 137 individuals. Explicit gene flow analyses identify less than one migrant per generation, including between the closely related northern and reticulated giraffe. Thus, gene flow analyses and population genetics of the extended dataset confirm four genetically distinct giraffe clades and support four independent giraffe species. The new findings support a revision of the IUCN classification of giraffe taxonomy. Three of the four species are threatened with extinction, occur mostly in politically unstable regions and, as such, require the highest conservation support possible.
All giraffe (Giraffa) were previously assigned to a single species (G. camelopardalis) and nine subspecies. However, multi-locus analyses of all subspecies have shown that there are four genetically distinct clades and suggest four giraffe species. This conclusion might not be fully accepted due to limited data and lack of explicit gene flow analyses. Here we present an extended study based on 21 independent nuclear loci from 137 individuals. Explicit gene flow analyses identify less than one migrant per generation, including between the closely related northern and reticulated giraffe. Thus, gene flow analyses and population genetics of the extended dataset confirm four genetically distinct giraffe clades and support four independent giraffe species. The new findings call for a revision of the IUCN classification of giraffe taxonomy. Three of the four species are threatened with extinction, mostly occurring in politically unstable regions, and as such, require the highest conservation support possible.
Highlights
• Genomes for all five Natrix species, two represented by two distinct subspecies each, were sequenced.
• Two genomes were de-novo assembled to their 1.7 Gb length with a contig N50 of 4.6 Mbp and 1.5 Mbp.
• Evidence for interspecific hybridization, both between allopatric and widely sympatric species.
• Fossil-calibrated molecular clock using genomes indicates that species are ancient lineages, several million years old.
• Our findings imply that speciation took place despite continued gene flow.
Abstract
Understanding speciation is one of the cornerstones of biological diversity research. Currently, speciation is often understood as a continuous process of divergence that continues until genetic or other incompatibilities minimize or prevent interbreeding. The Palearctic snake genus Natrix is an ideal group to study speciation, as it comprises taxa representing distinct stages of the speciation process, ranging from widely interbreeding parapatric taxa through parapatric species with very limited gene flow in narrow hybrid zones to widely sympatric species. To understand the evolution of reproductive isolation through time, we have sequenced the genomes of all five species within this genus and two additional subspecies. We used both long-read and short-read methods to sequence and de-novo-assemble two high-quality genomes (Natrix h. helvetica, Natrix n. natrix) to their 1.7 Gb length with a contig N50 of 4.6 Mbp and 1.5 Mbp, respectively, and used these as references to assemble the remaining short-read-based genomes. Our phylogenomic analyses yielded a well-supported dated phylogeny and evidence for a surprisingly complex history of interspecific gene flow, including between widely sympatric species. Furthermore, evidence for gene flow was also found for currently allopatric species pairs. Genetic exchange among these well-defined, distinct, and several million-year-old reptile species emphasizes that speciation and maintenance of species distinctness can occur despite continued genetic exchange.
Our large brain, long life span and high fertility are key elements of human evolutionary success and are often thought to have evolved in interplay with tool use, carnivory and hunting. However, the specific impact of carnivory on human evolution, life history and development remains controversial. Here we show in quantitative terms that dietary profile is a key factor influencing time to weaning across a wide taxonomic range of mammals, including humans. In a model encompassing a total of 67 species and genera from 12 mammalian orders, adult brain mass and two dichotomous variables reflecting species differences regarding limb biomechanics and dietary profile, accounted for 75.5%, 10.3% and 3.4% of variance in time to weaning, respectively, together capturing 89.2% of total variance. Crucially, carnivory predicted the time point of early weaning in humans with remarkable precision, yielding a prediction error of less than 5% with a sample of forty-six human natural fertility societies as reference. Hence, carnivory appears to provide both a necessary and sufficient explanation as to why humans wean so much earlier than the great apes. While early weaning is regarded as essentially differentiating the genus Homo from the great apes, its timing seems to be determined by the same limited set of factors in humans as in mammals in general, despite some 90 million years of evolution. Our analysis emphasizes the high degree of similarity of relative time scales in mammalian development and life history across 67 genera from 12 mammalian orders and shows that the impact of carnivory on time to weaning in humans is quantifiable, and critical. Since early weaning yields shorter interbirth intervals and higher rates of reproduction, with profound effects on population dynamics, our findings highlight the emergence of carnivory as a process fundamentally determining human evolution.
The iconic Australasian kangaroos and wallabies represent a successful marsupial radiation. However, the evolutionary relationship within the two genera, Macropus and Wallabia, is controversial: mitochondrial and nuclear genes, and morphological data have produced conflicting scenarios regarding the phylogenetic relationships, which in turn impact the classification and taxonomy. We sequenced and analyzed the genomes of 11 kangaroos to investigate the evolutionary cause of the observed phylogenetic conflict. A multilocus coalescent analysis using ∼14,900 genome fragments, each 10 kb long, significantly resolved the species relationships between and among the sister-genera Macropus and Wallabia. The phylogenomic approach reconstructed the swamp wallaby (Wallabia) as nested inside Macropus, making this genus paraphyletic. However, the phylogenomic analyses indicate multiple conflicting phylogenetic signals in the swamp wallaby genome. This is interpreted as at least one introgression event between the ancestor of the genus Wallabia and a now extinct ghost lineage outside the genus Macropus. Additional phylogenetic signals must therefore be caused by incomplete lineage sorting and/or introgression, but available statistical methods cannot convincingly disentangle the two processes. In addition, the relationships inside the Macropus subgenus M. (Notamacropus) represent a hard polytomy. Thus, the relationships between tammar, red-necked, agile, and parma wallabies remain unresolvable even with whole-genome data. Even if most methods resolve bifurcating trees from genomic data, hard polytomies, incomplete lineage sorting, and introgression complicate the interpretation of the phylogeny and thus taxonomy.
Four species of true crocodile (genus Crocodylus) have been described from the Americas. Three of these crocodile species exhibit non-overlapping distributions: Crocodylus intermedius in South America, C. moreletii along the Caribbean coast of Mesoamerica, and C. rhombifer confined to Cuba. The fourth, C. acutus, is narrowly sympatric with each of the other three species. In this study, we sampled 113 crocodiles across Crocodylus populations in Cuba, as well as exemplar populations in Belize and Florida (USA), and sequenced three regions of the mitochondrial genome (D-loop, cytochrome b, cytochrome oxidase I; 3,626 base pair long dataset) that overlapped with published data previously collected from Colombia, Jamaica, and the Cayman Islands. Phylogenetic analyses of these data revealed two paraphyletic lineages of C. acutus. One lineage, found in the continental Americas, is the sister taxon to C. intermedius, while the Greater Antillean lineage is most closely related to C. rhombifer. In addition to the paraphyly of the two C. acutus lineages, we recovered a 5.4% estimate of Tamura-Nei genetic divergence between the Antillean and continental clades. The reconstructed paraphyly, distinct phylogenetic affinities and high genetic divergence between Antillean and continental C. acutus populations are consistent with interspecific differentiation within the genus and suggest that the current taxon recognized as C. acutus is more likely a complex of cryptic species warranting a reassessment of current taxonomy. Moreover, the inclusion, for the first time, of samples from the western population of the American crocodile in Cuba revealed evidence for continental mtDNA haplotypes in the Antilles, suggesting this area may constitute a transition zone between distinct lineages of C. acutus. Further study using nuclear character data is warranted to more fully characterize this cryptic diversity, resolve taxonomic uncertainty, and inform conservation planning in this system.
Phylogenetic analyses of nuclear and mitochondrial genomes have shown that polar bears captured the mitochondrial genome of brown bears some 160,000 years ago. This hybridization event likely led to an extinction of the original polar bear mitochondrial genome. However, parts of the mitochondrial DNA occasionally integrate into the nuclear genome, forming pseudogenes called numts (nuclear mitochondrial integrations). Screening the polar bear genome for numts, we identified only 13 such integrations. Analyses of whole-genome sequences from additional polar bears, brown and American black bears as well as the giant panda indicate that the discovered numts entered the bear lineage before the initial ursid radiation some 14 million years ago. Our findings suggest a low integration rate of numts in the bear lineage and a complete loss of the original polar bear mitochondrial genome.
Phylogenetic analyses of nuclear and mitochondrial genomes indicate that polar bears captured the brown bear mitochondrial genome 160,000 years ago, leading to an extinction of the original polar bear mitochondrial genome. However, mitochondrial DNA occasionally integrates into the nuclear genome, forming pseudogenes called numts (nuclear mitochondrial integrations). Screening the polar bear genome identified only 13 numts. Genomic analyses of two additional ursine bears and giant panda indicate that all except one of the discovered numts entered the bear lineage at least 14 million years ago. However, short read genome assemblies might lead to an under-representation of numts or other repetitive sequences. Our findings suggest low integration rates of numts in bears and a loss of the original polar bear mitochondrial genome.
Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) identified 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Analysis of single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, despite strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun, and sloth bear form a monophyletic clade, in which phylogenetic incongruence originates from incomplete lineage sorting. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it possible to confidently extract thousands of TE insertions even from low-coverage genomes (∼10×) of nonmodel organisms. This opens new possibilities for biologists to study phylogenies and evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation.
Compared to sequence analyses, phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high throughput sequencing data to detect thousands of TE insertions. The newly developed pipeline for TE detection called TeddyPi (TE detection and discovery for Phylogenetic Inference) obtained 150,513 high-quality TE insertions in the genomes of ursine and tremarctine bears. By integrating different TE insertion callers and using a stringent filtering approach, the TeddyPi pipeline produced highly reliable TE insertion calls, which were confirmed by extensive in vitro validation experiments. Screening for single nucleotide substitutions in the flanking regions of the TEs shows that these substitutions correlate with the phylogenetic signal from the TE insertions. Our phylogenomic analyses show that TEs are a major driver of genomic variation in bears and enabled phylogenetic reconstruction of a well-resolved species tree, even with strong signals for incomplete lineage sorting and introgression. The analyses show that the Asiatic black, sun and sloth bear form a monophyletic clade. TeddyPi is open source and can be adapted to various TE and structural variation callers. The pipeline makes it easy to confidently extract thousands of TE insertions even from low coverage genomes of non-model organisms, opening new possibilities for biologists to study phylogenies, evolutionary processes as well as rates and patterns of (retro-)transposition and structural variation.
Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and complicate phylogenetic inferences, but are not accounted for in phylogenetic analyses of concatenated data. We generated a high-resolution data set of autosomal introns from several individuals per species and of Y-chromosomal markers. Incorporating intraspecific variability in coalescence-based phylogenetic and gene flow estimation approaches, we traced the genealogical history of individual alleles. Considerable heterogeneity among nuclear loci and discordance between nuclear and mitochondrial phylogenies were found. A species tree with divergence time estimates indicated that ursine bears diversified within less than 2 My. Consistent with a complex branching order within a clade of Asian bear species, we identified unidirectional gene flow from Asian black into sloth bears. Moreover, gene flow detected from brown into American black bears can explain the conflicting placement of the American black bear in mitochondrial and nuclear phylogenies. These results highlight that both incomplete lineage sorting and introgression are prominent evolutionary forces even on time scales up to several million years. Complex evolutionary patterns are not adequately captured by strictly bifurcating models, and can only be fully understood when analyzing multiple independently inherited loci in a coalescence framework. Phylogenetic incongruence among gene trees hence needs to be recognized as a biologically meaningful signal.
Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to large amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. Evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow.
Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to the closely related polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using three different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains numerous uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to massive amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. The increasing evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow.
Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes
(2015)
Background: The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species.
Results: The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago.
Conclusions: Transcriptome data are an economical way to generate genomic resources for evolutionary studies. Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic adaptation in foxes. Fat metabolism seems to play a central role in adaptation of Arctic foxes to the cold climate, as has been identified in the polar bear, another arctic specialist.
Despite numerous large-scale phylogenomic studies, certain parts of the mammalian tree are extraordinarily difficult to resolve. We used the coding regions from 19 completely sequenced genomes to study the relationships within the super-clade Euarchontoglires (Primates, Rodentia, Lagomorpha, Dermoptera and Scandentia) because the placement of Scandentia within this clade is controversial. The difficulty in resolving this issue is due to the short time spans between the early divergences of Euarchontoglires, which may cause incongruent gene trees. The conflict in the data can be depicted by network analyses and the contentious relationships are best reconstructed by coalescent-based analyses. This method is expected to be superior to analyses of concatenated data in reconstructing a species tree from numerous gene trees. The total concatenated dataset used to study the relationships in this group comprises 5,875 protein-coding genes (9,799,170 nucleotides) from all orders except Dermoptera (flying lemurs). Reconstruction of the species tree from 1,006 gene trees using coalescent models placed Scandentia as sister group to the primates, which is in agreement with maximum likelihood analyses of concatenated nucleotide sequence data. Additionally, both analytical approaches favoured the Tarsier to be sister taxon to Anthropoidea, thus belonging to the Haplorrhine clade. When divergence times are short such as in radiations over periods of a few million years, even genome scale analyses struggle to resolve phylogenetic relationships. On these short branches processes such as incomplete lineage sorting and possibly hybridization occur and make it preferable to base phylogenomic analyses on coalescent methods.
Recent phylogenomic studies have failed to conclusively resolve certain branches of the placental mammalian tree, despite the evolutionary analysis of genomic data from 32 species. Previous analyses of single genes and retroposon insertion data yielded support for different phylogenetic scenarios for the most basal divergences. The results indicated that some mammalian divergences were best interpreted not as a single bifurcating tree, but as an evolutionary network. In these studies the relationships among some orders of the super-clade Laurasiatheria were poorly supported, albeit not studied in detail. Therefore, 4775 protein-coding genes (6,196,263 nucleotides) were collected and aligned in order to analyze the evolution of this clade. Additionally, over 200,000 introns were screened in silico, resulting in 32 phylogenetically informative long interspersed nuclear elements (LINE) insertion events.
The present study shows that the genome evolution of Laurasiatheria may best be understood as an evolutionary network. Thus, contrary to the common expectation to resolve major evolutionary events as a bifurcating tree, genome analyses unveil complex speciation processes even in deep mammalian divergences. We exemplify this on a subset of 1159 suitable genes that have individual histories, most likely due to incomplete lineage sorting or introgression, processes that can make the genealogy of mammalian genomes complex.
These unexpected results have major implications for the understanding of evolution in general, because the evolution of even some higher level taxa such as mammalian orders may sometimes not be interpreted as a simple bifurcating pattern.