The annals of applied statistics 2015; 8(4); 2203-2222; doi: 10.1214/14-aoas764

A Novel Spectral Method for Inferring General Diploid Selection from Time Series Genetic Data.

Abstract: The increased availability of time series genetic variation data from experimental evolution studies and ancient DNA samples has created new opportunities to identify genomic regions under selective pressure and to estimate their associated fitness parameters. However, it is a challenging problem to compute the likelihood of non-neutral models for the population allele frequency dynamics, given the observed temporal DNA data. Here, we develop a novel spectral algorithm to analytically and efficiently integrate over all possible frequency trajectories between consecutive time points. This advance circumvents the limitations of existing methods which require fine-tuning the discretization of the population allele frequency space when numerically approximating requisite integrals. Furthermore, our method is flexible enough to handle general diploid models of selection where the heterozygote and homozygote fitness parameters can take any values, while previous methods focused on only a few restricted models of selection. We demonstrate the utility of our method on simulated data and also apply it to analyze ancient DNA data from genetic loci associated with coat coloration in horses. In contrast to previous studies, our exploration of the full fitness parameter space reveals that a heterozygote-advantage form of balancing selection may have been acting on these loci.
Publication Date: 2015-01-20; PubMed ID: 25598858; PubMed Central: PMC4295721; DOI: 10.1214/14-aoas764
  • Journal Article

Summary

This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.

This study introduces a novel spectral algorithm that analytically integrates over all possible allele frequency trajectories between consecutive time points in time series genetic variation data. It highlights the method's flexibility and efficiency relative to earlier methods and shows how it can be used to estimate the fitness parameters of general diploid selection models.

Objective of the Research

  • The study presents a new spectral algorithm for analyzing time series genetic variation data in order to identify genomic regions under selective pressure and estimate their associated fitness parameters.
  • The researchers aim to overcome the difficulty of computing the likelihood of non-neutral models of population allele frequency dynamics, given the observed temporal DNA data.
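
To make the data setting concrete, the following minimal sketch simulates the kind of time series the study works with: a Wright-Fisher population under diploid selection, with a small sample of alleles drawn at a few generations. The function names, the fitness parameterization (aa = 1, Aa = 1 + s_het, AA = 1 + s_hom), and all numeric values are illustrative assumptions, not taken from the paper.

```python
import random


def next_frequency(x, s_het, s_hom):
    """Expected frequency of allele A after one generation of selection.

    Assumed (hypothetical) relative fitnesses: aa = 1, Aa = 1 + s_het, AA = 1 + s_hom.
    """
    w_AA, w_Aa, w_aa = 1.0 + s_hom, 1.0 + s_het, 1.0
    mean_fitness = w_AA * x * x + w_Aa * 2 * x * (1 - x) + w_aa * (1 - x) ** 2
    return (w_AA * x * x + w_Aa * x * (1 - x)) / mean_fitness


def binomial(n, p, rng):
    """Draw a Binomial(n, p) variate by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_time_series(n_diploid, x0, s_het, s_hom, sample_times, sample_size, seed=0):
    """Simulate a Wright-Fisher population and sample alleles at the given generations.

    Returns a list of (generation, sample_size, count_of_A) tuples.
    """
    rng = random.Random(seed)
    x = x0
    observations = []
    for generation in range(max(sample_times) + 1):
        if generation in sample_times:
            observations.append((generation, sample_size, binomial(sample_size, x, rng)))
        # Selection shifts the expected frequency; drift resamples 2N gene copies.
        x = binomial(2 * n_diploid, next_frequency(x, s_het, s_hom), rng) / (2 * n_diploid)
    return observations


data = simulate_time_series(n_diploid=100, x0=0.2, s_het=0.05, s_hom=0.02,
                            sample_times=[0, 20, 40, 60], sample_size=20)
for generation, n, k in data:
    print(generation, n, k)
```

Only the sampled counts are observed; the population frequency between sampling times is hidden, which is why the likelihood has to account for every trajectory the population could have taken.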

Spectral Algorithm Development

  • The study develops a spectral algorithm that analytically integrates over all possible frequency trajectories between consecutive time points. This advance circumvents a limitation of existing methods, which require fine-tuning the discretization of the population allele frequency space when numerically approximating the required integrals.
  • A key benefit of the method is its flexibility: it handles general diploid models of selection, in which the heterozygote and homozygote fitness parameters can take any values.
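
For orientation, the sketch below computes the same kind of likelihood by brute force on a discrete state space. It is not the paper's spectral algorithm: it runs a forward recursion over the exact Wright-Fisher Markov chain for a small population, which is the style of state-space computation the spectral approach is designed to avoid. All names, the fitness parameterization (aa = 1, Aa = 1 + s_het, AA = 1 + s_hom), the uniform initial distribution, and the example counts are assumptions made for illustration.

```python
from math import comb


def next_frequency(x, s_het, s_hom):
    """Expected frequency of allele A after selection (aa = 1, Aa = 1 + s_het, AA = 1 + s_hom)."""
    w_AA, w_Aa = 1.0 + s_hom, 1.0 + s_het
    mean_fitness = w_AA * x * x + w_Aa * 2 * x * (1 - x) + (1 - x) ** 2
    return (w_AA * x * x + w_Aa * x * (1 - x)) / mean_fitness


def binomial_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def transition_matrix(n_copies, s_het, s_hom):
    """One-generation transition probabilities between allele counts 0..n_copies."""
    matrix = []
    for i in range(n_copies + 1):
        p = next_frequency(i / n_copies, s_het, s_hom)
        matrix.append([binomial_pmf(j, n_copies, p) for j in range(n_copies + 1)])
    return matrix


def step(dist, matrix):
    """Propagate a distribution over allele counts forward by one generation."""
    size = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def likelihood(observations, n_copies, s_het, s_hom):
    """Probability of the sampled counts, summed over all hidden allele-count trajectories.

    observations: list of (generation, sample_size, count_of_A), sorted by generation.
    The initial population allele count is assumed uniform on 0..n_copies.
    """
    matrix = transition_matrix(n_copies, s_het, s_hom)
    dist = [1.0 / (n_copies + 1)] * (n_copies + 1)
    current = observations[0][0]
    for generation, n, k in observations:
        for _ in range(generation - current):
            dist = step(dist, matrix)
        current = generation
        # Weight each hidden state by the probability of the observed sample.
        dist = [w * binomial_pmf(k, n, i / n_copies) for i, w in enumerate(dist)]
    return sum(dist)


# Hypothetical sampled counts: (generation, sample size, copies of allele A).
observations = [(0, 20, 4), (20, 20, 9), (40, 20, 11), (60, 20, 10)]
for s_het in (0.0, 0.05, 0.1):
    for s_hom in (0.0, 0.05, 0.1):
        print(s_het, s_hom, likelihood(observations, 40, s_het, s_hom))
```

Because s_het and s_hom are free parameters here, the same routine covers additive, dominant, recessive, and heterozygote-advantage selection. The cost grows quickly with the number of states, and in diffusion-based methods the choice of grid additionally affects accuracy.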

Utility of the New Method

  • The utility of the method is demonstrated on simulated data. Unlike previous methods, which focused on only a few restricted models of selection, it covers general diploid selection.
  • The authors applied the method to ancient DNA data from genetic loci associated with coat coloration in horses. Exploring the full fitness parameter space revealed that a heterozygote-advantage form of balancing selection may have been acting on these loci, in contrast to previous studies.
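
As a brief illustration of why heterozygote advantage counts as balancing selection, the sketch below iterates the standard deterministic selection recursion (ignoring drift) and compares the result with the textbook interior equilibrium. The parameterization (aa = 1, Aa = 1 + s_het, AA = 1 + s_hom), the function names, and the numeric values are assumptions for illustration and are not estimates from the study.

```python
def next_frequency(x, s_het, s_hom):
    """Expected frequency of allele A after selection (aa = 1, Aa = 1 + s_het, AA = 1 + s_hom)."""
    w_AA, w_Aa = 1.0 + s_hom, 1.0 + s_het
    mean_fitness = w_AA * x * x + w_Aa * 2 * x * (1 - x) + (1 - x) ** 2
    return (w_AA * x * x + w_Aa * x * (1 - x)) / mean_fitness


def equilibrium_frequency(s_het, s_hom):
    """Interior equilibrium of allele A when the heterozygote is the fittest genotype."""
    if not s_het > max(0.0, s_hom):
        raise ValueError("heterozygote advantage requires s_het > max(0, s_hom)")
    return s_het / (2 * s_het - s_hom)


# Start from a rare allele and let selection act for many generations.
x = 0.05
for _ in range(2000):
    x = next_frequency(x, s_het=0.05, s_hom=0.02)

print(x, equilibrium_frequency(0.05, 0.02))
```

Under this model neither allele is driven to fixation: the frequency settles at an intermediate value, so both alleles are maintained in the population.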

Conclusions

  • The findings underscore the potential of the spectral algorithm for identifying genomic regions under selective pressure and estimating their associated fitness parameters.
  • Because the method explores the full space of heterozygote and homozygote fitness parameters rather than a few restricted models, it can reveal forms of selection, such as heterozygote advantage, that more restricted analyses could miss.

Cite This Article

APA
Steinrücken M, Bhaskar A, Song YS. (2015). A Novel Spectral Method for Inferring General Diploid Selection from Time Series Genetic Data. Ann Appl Stat, 8(4), 2203-2222. https://doi.org/10.1214/14-aoas764

Publication

ISSN: 1932-6157
NlmUniqueID: 101479511
Country: United States
Language: English
Volume: 8
Issue: 4
Pages: 2203-2222

Researcher Affiliations

Steinrücken, Matthias
  • University of California, Berkeley.
Bhaskar, Anand
  • University of California, Berkeley.
Song, Yun S
  • University of California, Berkeley.

Grant Funding

  • R01 GM094402 / NIGMS NIH HHS

Citations

This article has been cited 29 times.
  1. Whitehouse LS, Schrider DR. Timesweeper: accurately identifying selective sweeps using population genomic time series. Genetics 2023 Jul 6;224(3).
    doi: 10.1093/genetics/iyad084; pubmed: 37157914
  2. He Z, Dai X, Lyu W, Beaumont M, Yu F. Estimating Temporally Variable Selection Intensity from Ancient DNA Data. Mol Biol Evol 2023 Mar 4;40(3).
    doi: 10.1093/molbev/msad008; pubmed: 36661852
  3. Barata C, Borges R, Kosiol C. Bait-ER: A Bayesian method to detect targets of selection in Evolve-and-Resequence experiments. J Evol Biol 2023 Jan;36(1):29-44.
    doi: 10.1111/jeb.14134; pubmed: 36544394
  4. Mathieson I, Terhorst J. Direct detection of natural selection in Bronze Age Britain. Genome Res 2022 Nov-Dec;32(11-12):2057-2067.
    doi: 10.1101/gr.276862.122; pubmed: 36316157
  5. Sohail MS, Louie RHY, Hong Z, Barton JP, McKay MR. Inferring Epistasis from Genetic Time-series Data. Mol Biol Evol 2022 Oct 7;39(10).
    doi: 10.1093/molbev/msac199; pubmed: 36130322
  6. Friedlander E, Steinrücken M. A numerical framework for genetic hitchhiking in populations of variable size. Genetics 2022 Mar 3;220(3).
    doi: 10.1093/genetics/iyac012; pubmed: 35143667
  7. Roodgar M, Good BH, Garud NR, Martis S, Avula M, Zhou W, Lancaster SM, Lee H, Babveyh A, Nesamoney S, Pollard KS, Snyder MP. Longitudinal linked-read sequencing reveals ecological and evolutionary responses of a human gut microbiome during antibiotic treatment. Genome Res 2021 Aug;31(8):1433-1446.
    doi: 10.1101/gr.265058.120; pubmed: 34301627
  8. Croze M, Kim Y. Inference of population genetic parameters from an irregular time series of seasonal influenza virus sequences. Genetics 2021 Feb 9;217(2).
    doi: 10.1093/genetics/iyaa039; pubmed: 33724414
  9. He Z, Dai X, Beaumont M, Yu F. Detecting and Quantifying Natural Selection at Two Linked Loci from Time Series Data of Allele Frequencies with Forward-in-Time Simulations. Genetics 2020 Oct;216(2):521-541.
    doi: 10.1534/genetics.120.303463; pubmed: 32826299
  10. He Z, Dai X, Beaumont M, Yu F. Estimation of Natural Selection and Allele Age from Time Series Allele Frequency Data Using a Novel Likelihood-Based Approach. Genetics 2020 Oct;216(2):463-480.
    doi: 10.1534/genetics.120.303400; pubmed: 32769100
  11. Dehasque M, Ávila-Arcos MC, Díez-Del-Molino D, Fumagalli M, Guschanski K, Lorenzen ED, Malaspinas AS, Marques-Bonet T, Martin MD, Murray GGR, Papadopulos AST, Therkildsen NO, Wegmann D, Dalén L, Foote AD. Inference of natural selection from ancient DNA. Evol Lett 2020 Apr;4(2):94-108.
    doi: 10.1002/evl3.165; pubmed: 32313686
  12. Paris C, Servin B, Boitard S. Inference of Selection from Genetic Time Series Using Various Parametric Approximations to the Wright-Fisher Model. G3 (Bethesda) 2019 Dec 3;9(12):4073-4086.
    doi: 10.1534/g3.119.400778; pubmed: 31597676
  13. Zinger T, Gelbart M, Miller D, Pennings PS, Stern A. Inferring population genetics parameters of evolving viruses using time-series data. Virus Evol 2019 Jan;5(1):vez011.
    doi: 10.1093/ve/vez011; pubmed: 31191979
  14. Liu J, Champer J, Langmüller AM, Liu C, Chung J, Reeves R, Luthra A, Lee YL, Vaughn AH, Clark AG, Messer PW. Maximum Likelihood Estimation of Fitness Components in Experimental Evolution. Genetics 2019 Mar;211(3):1005-1017.
    doi: 10.1534/genetics.118.301893; pubmed: 30679262
  15. Sackman AM, Harris RB, Jensen JD. Inferring Demography and Selection in Organisms Characterized by Skewed Offspring Distributions. Genetics 2019 Mar;211(3):1019-1028.
    doi: 10.1534/genetics.118.301684; pubmed: 30651284
  16. Ferguson JM, Buzbas EO. Inference from the stationary distribution of allele frequencies in a family of Wright-Fisher models with two levels of genetic variability. Theor Popul Biol 2018 Jul;122:78-87.
    doi: 10.1016/j.tpb.2018.03.004; pubmed: 29574050
  17. Rousseau E, Moury B, Mailleret L, Senoussi R, Palloix A, Simon V, Valière S, Grognard F, Fabre F. Estimating virus effective population size and selection without neutral markers. PLoS Pathog 2017 Nov;13(11):e1006702.
    doi: 10.1371/journal.ppat.1006702; pubmed: 29155894
  18. Taus T, Futschik A, Schlötterer C. Quantifying Selection with Pool-Seq Time Series Data. Mol Biol Evol 2017 Nov 1;34(11):3023-3034.
    doi: 10.1093/molbev/msx225; pubmed: 28961717
  19. Nené NR, Mustonen V, Illingworth CJR. Evaluating genetic drift in time-series evolutionary analysis. J Theor Biol 2018 Jan 21;437:51-57.
    doi: 10.1016/j.jtbi.2017.09.021; pubmed: 28958783
  20. Iranmehr A, Akbari A, Schlötterer C, Bafna V. Clear: Composition of Likelihoods for Evolve and Resequence Experiments. Genetics 2017 Jun;206(2):1011-1023.
    doi: 10.1534/genetics.116.197566; pubmed: 28396506
  21. Tataru P, Simonsen M, Bataillon T, Hobolth A. Statistical Inference in the Wright-Fisher Model Using Allele Frequency Data. Syst Biol 2017 Jan 1;66(1):e30-e46.
    doi: 10.1093/sysbio/syw056; pubmed: 28173553
  22. Jewett EM, Steinrücken M, Song YS. The Effects of Population Size Histories on Estimates of Selection Coefficients from Time-Series Genetic Data. Mol Biol Evol 2016 Nov;33(11):3002-3027.
    doi: 10.1093/molbev/msw173; pubmed: 27550904
  23. Ferrer-Admetlla A, Leuenberger C, Jensen JD, Wegmann D. An Approximate Markov Model for the Wright-Fisher Diffusion and Its Application to Time Series Data. Genetics 2016 Jun;203(2):831-46.
    doi: 10.1534/genetics.115.184598; pubmed: 27038112
  24. Schraiber JG, Evans SN, Slatkin M. Bayesian Inference of Natural Selection from Allele Frequency Time Series. Genetics 2016 May;203(1):493-511.
    doi: 10.1534/genetics.116.187278; pubmed: 27010022
  25. Steinrücken M, Jewett EM, Song YS. SpectralTDF: transition densities of diffusion processes with time-varying selection parameters, mutation rates and effective population sizes. Bioinformatics 2016 Mar 1;32(5):795-7.
    doi: 10.1093/bioinformatics/btv627; pubmed: 26556388
  26. Schraiber JG, Akey JM. Methods and models for unravelling human evolutionary history. Nat Rev Genet 2015 Dec;16(12):727-40.
    doi: 10.1038/nrg4005; pubmed: 26553329
  27. Tataru P, Bataillon T, Hobolth A. Inference Under a Wright-Fisher Model Using an Accurate Beta Approximation. Genetics 2015 Nov;201(3):1133-41.
    doi: 10.1534/genetics.115.179606; pubmed: 26311474
  28. Živković D, Steinrücken M, Song YS, Stephan W. Transition Densities and Sample Frequency Spectra of Diffusion Processes with Selection and Variable Population Size. Genetics 2015 Jun;200(2):601-17.
    doi: 10.1534/genetics.115.175265; pubmed: 25873633
  29. Terhorst J, Schlötterer C, Song YS. Multi-locus analysis of genomic time series data from experimental evolution. PLoS Genet 2015 Apr;11(4):e1005069.
    doi: 10.1371/journal.pgen.1005069; pubmed: 25849855