Keywords
- Collembola (1)
- Diplura (1)
- ESTs (1)
- Ellipura (1)
- Entognatha (1)
- Nonoculata (1)
- Protura (1)
- conflicting hypotheses (1)
- likelihood quartet mapping (1)
- missing data (1)
Phylogenetic relationships of the primarily wingless insects are still considered unresolved. Even the most comprehensive phylogenomic studies that addressed this question did not yield congruent results. To address these problems, we analyzed the sources of incongruence in these phylogenomic studies using an extended transcriptome dataset. Our analyses showed that unevenly distributed missing data can be severely misleading by inflating node support despite the absence of phylogenetic signal. Consequently, only decisive datasets should be used, that is, datasets composed exclusively of data blocks that contain all taxa whose relationships are addressed. Additionally, we employed Four-cluster Likelihood-Mapping (FcLM) to measure the degree of congruence among the genes of a dataset, as a measure of support alternative to the bootstrap. FcLM showed incongruent signal among genes, which in our case is correlated neither with the functional class assignment of these genes nor with model misspecification due to unpartitioned analyses. The dataset analyzed here is currently the largest covering primarily wingless insects, but it failed to elucidate their interordinal phylogenetic relationships. While this is unsatisfying from a phylogenetic perspective, we try to show that analyses of structure and signal within phylogenomic data can protect us from biased phylogenetic inferences caused by analytical artefacts.
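The notion of a decisive dataset can be illustrated with a minimal sketch. It assumes that taxon coverage per data block is available as a simple mapping; the function name `decisive_blocks`, the gene names, and the taxa "Pterygota" and "Crustacea" are hypothetical and chosen for illustration only. The sketch is not the procedure used in the study. With real data the check would probably be made per group rather than per exact name, requiring at least one representative of each group in question.

```python
from typing import Dict, Iterable, List, Set


def decisive_blocks(presence: Dict[str, Set[str]],
                    focal_taxa: Iterable[str]) -> List[str]:
    """Return the names of data blocks that contain every focal taxon.

    presence   -- hypothetical mapping: block (gene) name -> taxa with data
    focal_taxa -- taxa whose relationships are being addressed
    """
    required = set(focal_taxa)
    # Keep a block only if no required taxon is missing from it.
    return sorted(block for block, taxa in presence.items()
                  if required <= taxa)


# Toy input (assumed, not taken from the study).
presence = {
    "geneA": {"Collembola", "Diplura", "Protura", "Pterygota"},
    "geneB": {"Collembola", "Diplura", "Pterygota"},  # Protura missing
    "geneC": {"Collembola", "Diplura", "Protura", "Pterygota", "Crustacea"},
}
focal = ["Collembola", "Diplura", "Protura", "Pterygota"]

print(decisive_blocks(presence, focal))  # ['geneA', 'geneC']
```

Blocks such as `geneB` are dropped because they cannot contribute signal on the placement of the missing taxon, yet may still influence support values.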
Background: Molecular phylogenies are being published at an increasing rate, and many biologists rely on the most recent topologies. However, different phylogenetic trees often contain conflicting results and contradict significant background data. Because it is not known how reliable traditional knowledge is, a crucial question concerns the quality of newly produced molecular data. The information content of DNA alignments is rarely discussed, as quality statements are mostly restricted to the statistical support of clades. Here we present a case study of a recently published mollusk phylogeny that contains surprising groupings, based on five genes and 108 species, and we apply new or rarely used tools for the analysis of the information content of alignments and for the filtering of noise (masking of random-like alignment regions, split decomposition, phylogenetic networks, quartet mapping). Results: The data are very fragmentary and contain contaminations. We show that signal-like patterns in the data set are conflicting and partly not distinct, and that the reported strong support for a "rather surprising result" (monoplacophorans and chitons form a monophylum Serialia) does not exist at the level of primary homologies. Split-decomposition, quartet mapping and neighbornet analyses reveal conflicting nucleotide patterns and a lack of distinct phylogenetic signal for the deeper phylogeny of mollusks. Conclusion: Even though a majority of molecular phylogenies are currently justified with reference to the 'statistical' support of clades in tree topologies, this confidence seems to be unfounded. Contradictions between phylogenies based on different analyses are already a strong indication of unnoticed pitfalls. The use of tree-independent tools for exploratory analyses of data quality is highly recommended. Concerning the new mollusk phylogeny, more convincing evidence is needed.
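Quartet-based, tree-independent tools of the kind named in these abstracts rest on one idea: four taxa (or four clusters) admit only three unrooted topologies, so the support for each can be compared directly. The following is a minimal sketch of the likelihood-weight variant of that idea, assuming the three log-likelihoods have already been computed by some external program. The function names, the region labels, and the nearest-attractor rule for the seven regions are assumptions made for illustration; neither abstract specifies its exact procedure.

```python
import math
from typing import Sequence, Tuple

# Reference points of the seven regions: three corners (one topology
# clearly preferred), three edge midpoints (two topologies tied), and
# the centre (no topology preferred, star-like signal).
ATTRACTORS = {
    "T1": (1.0, 0.0, 0.0),
    "T2": (0.0, 1.0, 0.0),
    "T3": (0.0, 0.0, 1.0),
    "T1|T2": (0.5, 0.5, 0.0),
    "T1|T3": (0.5, 0.0, 0.5),
    "T2|T3": (0.0, 0.5, 0.5),
    "star": (1 / 3, 1 / 3, 1 / 3),
}


def quartet_weights(log_likelihoods: Sequence[float]) -> Tuple[float, ...]:
    """Turn three log-likelihoods into weights that sum to 1."""
    if len(log_likelihoods) != 3:
        raise ValueError("expected one log-likelihood per quartet topology")
    top = max(log_likelihoods)
    # Subtract the maximum before exponentiating to avoid underflow.
    scaled = [math.exp(value - top) for value in log_likelihoods]
    total = sum(scaled)
    return tuple(value / total for value in scaled)


def nearest_region(weights: Sequence[float]) -> str:
    """Assign a weight vector to the closest of the seven reference points."""
    def squared_distance(name: str) -> float:
        return sum((w - a) ** 2 for w, a in zip(weights, ATTRACTORS[name]))
    return min(ATTRACTORS, key=squared_distance)


# Hypothetical log-likelihoods for one quartet.
weights = quartet_weights([-100.0, -110.0, -110.0])
print(nearest_region(weights))  # T1
```

Repeating this over many quartets and counting how often each region is hit gives a rough picture of how much of the data supports one topology, how much is conflicting, and how much is uninformative.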