Genetics 2016; 203(1); 493-511; doi: 10.1534/genetics.116.187278

Bayesian Inference of Natural Selection from Allele Frequency Time Series.

Abstract: The advent of accessible ancient DNA technology now allows the direct ascertainment of allele frequencies in ancestral populations, thereby enabling the use of allele frequency time series to detect and estimate natural selection. Such direct observations of allele frequency dynamics are expected to be more powerful than inferences made using patterns of linked neutral variation obtained from modern individuals. We developed a Bayesian method to make use of allele frequency time series data and infer the parameters of general diploid selection, along with allele age, in nonequilibrium populations. We introduce a novel path augmentation approach, in which we use Markov chain Monte Carlo to integrate over the space of allele frequency trajectories consistent with the observed data. Using simulations, we show that this approach has good power to estimate selection coefficients and allele age. Moreover, when applying our approach to data on horse coat color, we find that ignoring a relevant demographic history can significantly bias the results of inference. Our approach is made available in a C++ software package.
Publication Date: 2016-03-23; PubMed ID: 27010022; PubMed Central: PMC4858794; DOI: 10.1534/genetics.116.187278
The Equine Research Bank provides access to a large database of publicly available scientific literature. Inclusion in the Research Bank does not imply endorsement of study methods or findings by Mad Barn.
  • Journal Article
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural

Summary

This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.

The researchers developed a Bayesian method that uses allele frequency time series, made available by advances in ancient DNA technology, to infer the parameters of natural selection. Simulations show that the method has good power to estimate selection coefficients and allele age. An application to data on horse coat color further shows that ignoring a relevant demographic history can significantly bias the results of inference.

Introduction

This research builds on ancient DNA technology, which allows allele frequencies in ancestral populations to be ascertained directly. The resulting allele frequency time series can be used to detect and estimate natural selection. Such direct observations of allele frequency dynamics are expected to be more powerful than inferences based on patterns of linked neutral variation obtained from modern individuals.

Development of the Bayesian Method

  • The research team developed a Bayesian method that is able to use allele frequency time series data to infer the parameters of general diploid selection, in addition to allele age, in nonequilibrium populations.
  • A distinctive feature of their approach is a novel path augmentation step, which uses Markov chain Monte Carlo to integrate over the space of allele frequency trajectories consistent with the observed data.
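
To make the ingredients above concrete, the following is a minimal Python sketch of a discrete Wright-Fisher model with general diploid selection, a population size that can change over time, and a binomial sampling likelihood that scores how consistent a candidate trajectory is with sampled allele counts. It is an illustration only and not the authors' method or their C++ package: the paper works with a path augmentation scheme and Markov chain Monte Carlo, whereas this sketch simply simulates forward in time. The fitness parameterization (aa = 1, Aa = 1 + s1, AA = 1 + s2) and all function and variable names are assumptions made for this example.

```python
import math
import random


def next_expected_freq(x, s1, s2):
    """Expected frequency of allele A after one generation of selection.

    Assumed relative fitnesses (a choice made for this sketch):
    aa = 1, Aa = 1 + s1, AA = 1 + s2.
    """
    mean_fitness = x * x * (1 + s2) + 2 * x * (1 - x) * (1 + s1) + (1 - x) ** 2
    return (x * x * (1 + s2) + x * (1 - x) * (1 + s1)) / mean_fitness


def simulate_trajectory(x0, s1, s2, pop_sizes, rng):
    """Simulate allele frequencies forward in time.

    pop_sizes[t] is the diploid population size of the generation produced
    at step t, so a nonconstant list encodes a demographic history.
    """
    trajectory = [x0]
    x = x0
    for n in pop_sizes:
        p = next_expected_freq(x, s1, s2)
        # Binomial sampling of 2N gene copies models genetic drift.
        copies = sum(1 for _ in range(2 * n) if rng.random() < p)
        x = copies / (2 * n)
        trajectory.append(x)
    return trajectory


def log_binomial_likelihood(trajectory, samples):
    """Log-probability of sampled allele counts given a trajectory.

    samples is a list of (generation, allele_count, sample_size) tuples.
    """
    total = 0.0
    for t, k, n in samples:
        x = trajectory[t]
        # A trajectory at 0 or 1 cannot produce a polymorphic sample.
        if (x == 0.0 and k > 0) or (x == 1.0 and k < n):
            return -math.inf
        total += math.log(math.comb(n, k))
        if k > 0:
            total += k * math.log(x)
        if k < n:
            total += (n - k) * math.log(1 - x)
    return total
```

A trajectory can be generated with, for example, `simulate_trajectory(0.1, 0.01, 0.02, [500] * 50, random.Random(1))` and then scored against hypothetical samples with `log_binomial_likelihood`. An inference scheme in the spirit of the summary above would explore many such trajectories together with the selection parameters rather than rely on a single simulated path.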

Performance of the Bayesian Method

  • Simulations show that the approach has good power to estimate selection coefficients and allele age.
  • The researchers have provided this method to the public via a C++ software package.

Importance of Considering Relevant Demographic History

  • Applying the method to data on horse coat color shows that ignoring a relevant demographic history can significantly bias the results of inference.
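
One way to see why demography matters is that, in the standard Wright-Fisher model, the per-generation variance of an allele frequency due to drift is x(1 - x) / (2N), so the same frequency change is far more plausible by chance in a small population than in a large one. The Python sketch below illustrates this under a hypothetical two-epoch history. The population sizes, the change point, and the function names are assumptions chosen for illustration and are not taken from the study.

```python
def drift_variance(freq, diploid_size):
    """Variance of the next-generation allele frequency under pure drift."""
    return freq * (1.0 - freq) / (2.0 * diploid_size)


def two_epoch_size(generation, change_point, size_before, size_after):
    """Hypothetical demographic history: the size changes at change_point."""
    return size_before if generation < change_point else size_after


# Drift at frequency 0.5 before and after a hypothetical bottleneck
# from 10,000 to 100 diploid individuals at generation 100.
var_before = drift_variance(0.5, two_epoch_size(0, 100, 10_000, 100))
var_after = drift_variance(0.5, two_epoch_size(150, 100, 10_000, 100))
ratio = var_after / var_before
```

Here the drift variance after the bottleneck is 100 times larger than before it. A model that assumes a constant large population could therefore attribute frequency changes caused by drift to selection.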

Cite This Article

APA
Schraiber JG, Evans SN, Slatkin M. (2016). Bayesian Inference of Natural Selection from Allele Frequency Time Series. Genetics, 203(1), 493-511. https://doi.org/10.1534/genetics.116.187278

Publication

ISSN: 1943-2631
NlmUniqueID: 0374636
Country: United States
Language: English
Volume: 203
Issue: 1
Pages: 493-511

Researcher Affiliations

Schraiber, Joshua G
  • Department of Genome Sciences, University of Washington, Seattle, Washington 98195 schraib@uw.edu.
Evans, Steven N
  • Department of Statistics, University of California, Berkeley, California
  • Department of Mathematics, University of California, Berkeley, California.
Slatkin, Montgomery
  • Department of Integrative Biology, University of California, Berkeley, California 94720.

MeSH Terms

  • Animals
  • Bayes Theorem
  • Diploidy
  • Gene Frequency
  • Horses / genetics
  • Models, Genetic
  • Selection, Genetic
  • Skin Pigmentation / genetics
  • Software

Grant Funding

  • R01 GM040282 / NIGMS NIH HHS
  • R01 GM109454 / NIGMS NIH HHS

References

This article includes 40 references

Citations

This article has been cited 60 times.