Bayesian Inference of Natural Selection from Allele Frequency Time Series.
- Journal Article
- Research Support, U.S. Gov't, Non-P.H.S.
- Research Support, N.I.H., Extramural
Summary
The researchers developed a Bayesian method that uses allele frequency time series to infer the parameters of natural selection, an approach made possible by advances in ancient DNA technology. Simulations show that the method has good power to estimate selection coefficients and allele age. The researchers also found that ignoring relevant demographic history can significantly bias the results of inference.
Introduction
Ancient DNA technology makes it possible to observe allele frequencies in ancestral populations directly. The researchers use the resulting allele frequency time series to detect and estimate natural selection. Such direct observations of allele frequency dynamics are expected to be more powerful than traditional inferences based on patterns of linked neutral variation in modern individuals.
Development of the Bayesian Method
- The research team developed a Bayesian method that uses allele frequency time series data to infer the parameters of general diploid selection, along with allele age, in nonequilibrium populations.
- A distinguishing feature of the approach is a novel path augmentation scheme, which uses Markov chain Monte Carlo to integrate over the allele frequency trajectories consistent with the observed data.
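The idea of path augmentation can be illustrated with a small toy sampler. The sketch below is not the authors' algorithm or their C++ package: it assumes a discrete-generation Wright-Fisher model rather than a diffusion, fitnesses of 1, 1 + s1 and 1 + s2 for zero, one and two copies of the allele, a fixed dominance `h` (so s1 = h * s2), a Normal(0, 1) prior on s2, an implicit uniform prior on the first allele count, and it does not estimate allele age. All function names and the toy data are hypothetical. It alternates between updating the unobserved allele counts in each generation and updating the selection coefficient, so the trajectory is integrated over rather than fixed.

```python
import math
import random


def log_binom_pmf(k, n, p):
    """Log Binomial(n, p) probability of k successes, safe at p = 0 or 1."""
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def next_freq(x, s1, s2):
    """Expected allele frequency after one generation of diploid selection.

    Assumed fitnesses: 1 (no copies), 1 + s1 (one copy), 1 + s2 (two copies).
    """
    mean_fitness = x * x * (1 + s2) + 2 * x * (1 - x) * (1 + s1) + (1 - x) ** 2
    return (x * x * (1 + s2) + x * (1 - x) * (1 + s1)) / mean_fitness


def log_step(k_from, k_to, two_n, s1, s2):
    """Log Wright-Fisher probability of moving between allele counts."""
    return log_binom_pmf(k_to, two_n, next_freq(k_from / two_n, s1, s2))


def log_path(path, two_n, s1, s2):
    """Log probability of the whole latent path given its first count."""
    return sum(log_step(a, b, two_n, s1, s2) for a, b in zip(path, path[1:]))


def run_mcmc(obs, n_gen, two_n, h=0.5, n_iter=1000, seed=1):
    """Sample s2 (with s1 = h * s2) while integrating over latent paths.

    obs maps a generation index to (derived allele count, sample size).
    """
    rng = random.Random(seed)
    pooled = sum(c for c, _ in obs.values()) / sum(n for _, n in obs.values())
    start = min(max(round(pooled * two_n), 1), two_n - 1)
    path = [start] * (n_gen + 1)  # flat initial path, kept off the boundaries
    s2, samples = 0.0, []

    def local(t, k):
        """Terms of the joint log density that involve the count at time t."""
        lp = 0.0
        if t > 0:
            lp += log_step(path[t - 1], k, two_n, h * s2, s2)
        if t < n_gen:
            lp += log_step(k, path[t + 1], two_n, h * s2, s2)
        if t in obs:
            lp += log_binom_pmf(obs[t][0], obs[t][1], k / two_n)
        return lp

    for _ in range(n_iter):
        # Path update: symmetric single-site Metropolis move on each count.
        for t in range(n_gen + 1):
            new = path[t] + rng.choice([-3, -2, -1, 1, 2, 3])
            if 0 <= new <= two_n:
                log_ratio = local(t, new) - local(t, path[t])
                if math.log(1.0 - rng.random()) < log_ratio:
                    path[t] = new
        # Parameter update: random-walk Metropolis on s2, Normal(0, 1) prior.
        prop = s2 + rng.gauss(0.0, 0.05)
        if prop > -1.0 and h * prop > -1.0:  # keep all fitnesses positive
            log_ratio = (log_path(path, two_n, h * prop, prop)
                         - log_path(path, two_n, h * s2, s2)
                         - 0.5 * prop ** 2 + 0.5 * s2 ** 2)
            if math.log(1.0 - rng.random()) < log_ratio:
                s2 = prop
        samples.append(s2)
    return samples


if __name__ == "__main__":
    # Made-up toy data: generation -> (derived allele count, sample size).
    toy_obs = {0: (2, 20), 20: (9, 20), 40: (17, 20)}
    draws = run_mcmc(toy_obs, n_gen=40, two_n=100, n_iter=500)
    print(sum(draws[250:]) / 250)  # posterior mean of s2 after burn-in
```

Single-site updates are chosen here only for brevity. They mix slowly on long trajectories, so a practical sampler would likely propose larger stretches of the path at once.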
Performance of the Bayesian Method
- Simulations show that the approach has good power to estimate selection coefficients and allele age.
- The method is publicly available as a C++ software package.
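A simulation study of this kind needs synthetic time series with known selection coefficients and allele age. The sketch below shows one way such data might be generated. It is not the authors' simulation design: it assumes a discrete-generation Wright-Fisher model with fitnesses 1, 1 + s1 and 1 + s2, an allele that arises as a single copy `age` generations before the present, and binomial sampling of gene copies at chosen generations. The function names and parameter values are hypothetical.

```python
import random


def next_freq(x, s1, s2):
    """Expected allele frequency after one generation of diploid selection."""
    mean_fitness = x * x * (1 + s2) + 2 * x * (1 - x) * (1 + s1) + (1 - x) ** 2
    return (x * x * (1 + s2) + x * (1 - x) * (1 + s1)) / mean_fitness


def binomial(rng, n, p):
    """Draw from Binomial(n, p) as a sum of Bernoulli trials (fine for small n)."""
    return sum(rng.random() < p for _ in range(n))


def simulate_series(two_n, s1, s2, age, sample_gens, sample_size, rng,
                    max_tries=100000):
    """Simulate sampled allele counts for an allele of known age.

    Generation 0 is the origin of the allele (one copy) and generation `age`
    is the present. Trajectories in which the allele is lost before the
    present are discarded, so the output is conditioned on the allele
    surviving. Returns a list of (generation, derived count, sample size).
    """
    for _ in range(max_tries):
        count, traj = 1, [1]
        for _ in range(age):
            count = binomial(rng, two_n, next_freq(count / two_n, s1, s2))
            traj.append(count)
            if count == 0:
                break  # allele lost, start a new trajectory
        if count > 0:
            return [(g, binomial(rng, sample_size, traj[g] / two_n), sample_size)
                    for g in sample_gens]
    raise RuntimeError("allele was lost in every simulated trajectory")


if __name__ == "__main__":
    rng = random.Random(0)
    series = simulate_series(two_n=200, s1=0.05, s2=0.1, age=100,
                             sample_gens=[20, 60, 100], sample_size=20, rng=rng)
    print(series)
```

Running an inference method on many such series and comparing the estimates with the known values of s1, s2 and `age` gives a rough picture of its power.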
Importance of Considering Relevant Demographic History
- Applying the method to data on horse coat color shows that ignoring relevant demographic history can significantly bias the results of inference.
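One way to see why demography matters is to compare how much allele frequency change neutral drift alone can produce under different population size histories. The sketch below uses the standard result for a neutral Wright-Fisher population, Var(x_T) = x0 (1 - x0) (1 - prod_t (1 - 1 / 2N_t)), where 2N_t is the number of gene copies in generation t. The two demographies are hypothetical and are not the horse demographic history analyzed in the paper.

```python
def drift_variance(x0, gene_copies):
    """Variance of a neutral allele's frequency after drifting through the given sizes.

    gene_copies lists the number of gene copies (2N) in each successive
    generation. Uses Var(x_T) = x0 (1 - x0) (1 - prod_t (1 - 1 / 2N_t)).
    """
    retained = 1.0  # fraction of the initial heterozygosity still expected
    for two_n in gene_copies:
        retained *= 1.0 - 1.0 / two_n
    return x0 * (1.0 - x0) * (1.0 - retained)


# Hypothetical demographies over 100 generations.
constant = [2000] * 100
bottleneck = [100] * 20 + [2000] * 80

sd_constant = drift_variance(0.2, constant) ** 0.5
sd_bottleneck = drift_variance(0.2, bottleneck) ** 0.5
print(f"drift s.d. with constant size: {sd_constant:.3f}")
print(f"drift s.d. with a bottleneck:  {sd_bottleneck:.3f}")
```

A frequency change that would be surprising under drift in a large constant population can be unremarkable when the population passed through a small size, so a model that assumes the wrong demography may misjudge how much of the observed change is due to selection.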
Researcher Affiliations
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195 (schraib@uw.edu).
- Department of Statistics, University of California, Berkeley, California; Department of Mathematics, University of California, Berkeley, California.
- Department of Integrative Biology, University of California, Berkeley, California 94720.
MeSH Terms
- Animals
- Bayes Theorem
- Diploidy
- Gene Frequency
- Horses / genetics
- Models, Genetic
- Selection, Genetic
- Skin Pigmentation / genetics
- Software
Grant Funding
- R01 GM040282 / NIGMS NIH HHS
- R01 GM109454 / NIGMS NIH HHS