Gene flow analysis method, the D-statistic, is robust in a wide parameter space

  • Background: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space, and thus its applicability across a wide taxonomic range, has not been systematically studied. Divergence time, population size, time of gene flow, distance of the outgroup and number of loci were examined in a sensitivity analysis.
  • Results: The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow and by the size and number of loci. In addition, we examined the ability of the f-statistics, $\hat{f}_G$ and $\hat{f}_{hom}$, to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology because the time of gene flow is generally unknown, they can be used to compare datasets with identical or similar demographic backgrounds.
  • Conclusions: The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
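For readers unfamiliar with the method, the following is a minimal sketch of how a D-statistic is commonly computed from site-pattern counts. It is not taken from the article. It assumes the standard ABBA-BABA definition, D = (ABBA - BABA) / (ABBA + BABA), for a four-taxon tree (((P1, P2), P3), O), with one aligned sequence per taxon. The function name `d_statistic` and the toy sequences are hypothetical, and significance testing (for example a block jackknife) is deliberately left out.

```python
import math


def d_statistic(p1, p2, p3, outgroup):
    """Return (ABBA count, BABA count, D) for four aligned sequences.

    Assumes the topology (((P1, P2), P3), O); the outgroup base is
    treated as the ancestral allele "A", the other base as derived "B".
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip gaps, ambiguity codes and anything that is not a plain base.
        if any(x not in "ACGT" for x in (a, b, c, o)):
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    if total == 0:
        return abba, baba, math.nan  # no informative sites
    return abba, baba, (abba - baba) / total


# Toy alignment (hypothetical data): three ABBA sites and one BABA site.
p1 = "ACTAGA"
p2 = "GTCAGC"
p3 = "GTTAAC"
og = "ACCAAA"
print(d_statistic(p1, p2, p3, og))  # (3, 1, 0.5)
```

Under incomplete lineage sorting alone, ABBA and BABA sites are expected in equal numbers, so D is expected to be near zero; an excess of one pattern is what is read as a signal of gene flow between P3 and either P1 or P2.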
Author:Yichen Zheng, Axel Janke
Parent Title (English):BMC bioinformatics
Publisher:BioMed Central ; Springer
Place of publication:London ; Berlin ; Heidelberg
Document Type:Article
Year of Completion:2018
Date of first Publication:2018/01/08
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Release Date:2018/01/16
Tag:Gene flow; Parameter space; Population size; Sensitivity; Simulation; The D-statistic
Issue:1, Art. 10
Page Number:19
First Page:1
Last Page:19
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.
Institutes:Affiliated and cooperating institutions / Senckenbergische Naturforschende Gesellschaft
Biosciences / Institut für Ökologie, Evolution und Diversität
Dewey Decimal Classification:5 Natural sciences and mathematics / 57 Life sciences; biology / 570 Life sciences; biology
Licence (German):Creative Commons - Namensnennung 4.0 (Attribution 4.0)