Genomic prediction of unordered categorical traits: an application to subpopulation assignment in German Warmblood horses.
Abstract: Categorical traits without ordinal representation of classes do not qualify for threshold models. Alternatively, the multinomial problem can be assessed by a sequence of independent binary contrasts using schemes such as one-vs-all or one-vs-one. Class probabilities can be arrived at by normalization or pair-wise coupling strategies. We assessed the predictive ability of whole-genome regression models and support vector machines for the classification of horses into four German Warmblood breeds.
Results: Prediction accuracies of leave-one-out cross-validation were high and ranged from 0.75 to 0.97 depending on the binary classifier and breeds incorporated in the training. An analysis of the population structure using eigenvectors of the genomic relationship matrix revealed clustering of individuals beyond the given breed labels. Admixture between two breeds became apparent which had substantial impact on the prediction accuracies between those two breeds and also influenced the contrasts between other breeds.
Conclusions: Genomic prediction of unordered categorical traits was successfully applied to subpopulation assignment of German Warmblood horses. The applied methodology is a straightforward extension of existing binary threshold models for genomic prediction.
Publication Date: 2016-02-11
PubMed ID: 26867647
PubMed Central: PMC4751658
DOI: 10.1186/s12711-016-0192-2
- Journal Article
- Research Support, Non-U.S. Gov't
Summary
This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.
The study uses genomic prediction of unordered categorical traits to assign German Warmblood horses to subpopulations (breeds). The method is a straightforward extension of existing binary threshold models for genomic prediction.
Introduction and Methodology
- The study addresses categorical traits whose classes have no ordinal representation. Threshold models assume ordered classes, so they do not apply to such traits.
- As an alternative, the authors treat the multinomial problem as a sequence of independent binary contrasts, using schemes such as one-vs-all or one-vs-one.
- Class probabilities are then obtained by normalization or by pair-wise coupling. The study assessed the predictive ability of whole-genome regression models and support vector machines for classifying horses into four German Warmblood breeds.
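The two ways of turning binary contrasts into class probabilities can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that normalization is applied to one-vs-all outputs and that coupling is applied to one-vs-one outputs, and it uses a simple averaging rule for the coupling step, which may differ from the coupling strategy used in the paper. The function names and the example numbers are hypothetical.

```python
import numpy as np


def normalize_one_vs_all(scores):
    """Turn one-vs-all scores (n_samples x n_classes) into class probabilities.

    Each column holds the output of one independent binary contrast
    (class k versus all other classes). Rows are rescaled to sum to 1.
    """
    scores = np.clip(np.asarray(scores, dtype=float), 1e-12, None)
    return scores / scores.sum(axis=1, keepdims=True)


def couple_pairwise(r):
    """Combine one-vs-one estimates into class probabilities for one sample.

    r[i, j] is the estimated probability of class i given that the sample
    belongs to class i or class j, so r[i, j] + r[j, i] == 1 for i != j.
    Assumed rule (simple averaging):
        p_i = 2 * sum_{j != i} r[i, j] / (k * (k - 1))
    """
    r = np.asarray(r, dtype=float)
    k = r.shape[0]
    off_diagonal = r.copy()
    np.fill_diagonal(off_diagonal, 0.0)  # ignore the meaningless i == j entries
    return 2.0 * off_diagonal.sum(axis=1) / (k * (k - 1))


# Made-up outputs of four one-vs-all contrasts for a single horse.
one_vs_all_probs = normalize_one_vs_all([[0.8, 0.4, 0.2, 0.2]])

# Made-up outputs of the three pairwise contrasts among three breeds.
pairwise = np.array([
    [0.0, 0.9, 0.8],
    [0.1, 0.0, 0.6],
    [0.2, 0.4, 0.0],
])
coupled_probs = couple_pairwise(pairwise)

print(one_vs_all_probs)
print(coupled_probs)
```

In both cases the result is a probability vector that sums to 1, and the horse would be assigned to the class with the largest entry.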
Results
- Prediction accuracies from leave-one-out cross-validation were high, ranging from 0.75 to 0.97 depending on the binary classifier and the breeds included in training.
- An analysis of population structure based on eigenvectors of the genomic relationship matrix revealed clustering of individuals beyond the given breed labels.
- Admixture between two breeds became apparent. It had a substantial impact on the prediction accuracies between those two breeds and also influenced the contrasts between other breeds.
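The population-structure analysis mentioned above can be illustrated with a small sketch. It assumes the genomic relationship matrix is built in the widely used form G = ZZ' / (2 Σ p_j(1 − p_j)), with Z the allele counts centered by twice the allele frequency; the summary does not state which construction the authors used. The genotypes are a hypothetical toy data set of six animals and four markers, not data from the study.

```python
import numpy as np


def genomic_relationship_matrix(genotypes):
    """G = Z Z' / (2 * sum_j p_j * (1 - p_j)), with Z = M - 2p.

    genotypes: (n_individuals x n_markers) array of allele counts 0/1/2.
    Allele frequencies p are estimated from the sample itself (an assumption).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0
    z = m - 2.0 * p
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def leading_eigenvectors(g, n_components=2):
    """Return the largest eigenvalues of G and their eigenvectors (as columns)."""
    eigenvalues, eigenvectors = np.linalg.eigh(g)  # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1][:n_components]
    return eigenvalues[order], eigenvectors[:, order]


# Hypothetical toy genotypes: rows 0-2 and rows 3-5 are two separate groups.
toy_genotypes = np.array([
    [2, 2, 0, 0],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
    [0, 0, 2, 2],
    [0, 1, 2, 1],
    [1, 0, 1, 2],
])

g = genomic_relationship_matrix(toy_genotypes)
values, vectors = leading_eigenvectors(g)
pc1 = vectors[:, 0]  # coordinates of each animal on the first eigenvector

print(np.round(values, 3))
print(np.round(pc1, 3))
```

In this toy example the first eigenvector takes opposite signs for the two groups, which is the kind of clustering that a plot of the leading eigenvectors makes visible. Clusters that do not line up with the recorded breed labels would hint at substructure or admixture.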
Conclusions
- Genomic prediction of unordered categorical traits was successfully applied to the subpopulation assignment of German Warmblood horses.
- The methodology is a straightforward extension of existing binary threshold models for genomic prediction: a trait without ordinal classes is handled as a series of independent binary contrasts.
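How a binary model extends to an unordered multi-class trait, and how leave-one-out cross-validation scores it, can be sketched end to end. The binary classifier here is a plain ridge regression on marker genotypes, used only as a stand-in; it is not one of the whole-genome regression models or support vector machines evaluated in the paper. The breed labels and genotypes are hypothetical, and animals within a breed are given identical genotypes purely to keep the toy example easy to follow.

```python
import numpy as np


def ridge_scores(x_train, y_train, x_new, ridge=1.0):
    """Fit one ridge regression per class (class versus rest) and score x_new."""
    classes = np.unique(y_train)
    center = x_train.mean(axis=0)
    xc = x_train - center
    lhs = xc.T @ xc + ridge * np.eye(x_train.shape[1])
    scores = []
    for c in classes:
        y = (y_train == c).astype(float)  # binary contrast: class c vs. all others
        beta = np.linalg.solve(lhs, xc.T @ (y - y.mean()))
        scores.append(y.mean() + (x_new - center) @ beta)
    return classes, np.array(scores)


def leave_one_out_accuracy(x, labels, ridge=1.0):
    """Share of animals assigned to their recorded class when left out of training."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    hits = 0
    for i in range(len(labels)):
        keep = np.arange(len(labels)) != i  # train on everyone except animal i
        classes, scores = ridge_scores(x[keep], labels[keep], x[i], ridge)
        probs = np.clip(scores, 1e-12, None)
        probs = probs / probs.sum()  # normalize the one-vs-all scores
        hits += classes[np.argmax(probs)] == labels[i]
    return hits / len(labels)


# Hypothetical toy data: three breeds, three animals each, three markers.
toy_x = np.repeat(2.0 * np.eye(3), 3, axis=0)
toy_labels = np.repeat(["breed_a", "breed_b", "breed_c"], 3)

accuracy = leave_one_out_accuracy(toy_x, toy_labels)
print(accuracy)
```

Because the toy breeds are perfectly separated, every left-out animal is assigned correctly. Real data with related or admixed breeds would give lower accuracies for the affected contrasts, as the study reports.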
Cite This Article
Heuer C, Scheel C, Tetens J, Kühn C, Thaller G. (2016). Genomic prediction of unordered categorical traits: an application to subpopulation assignment in German Warmblood horses. Genet Sel Evol, 48, 13. https://doi.org/10.1186/s12711-016-0192-2
Researcher Affiliations
- Institute of Animal Breeding and Husbandry, University of Kiel, Hermann-Rodewald-Strasse 6, 24098, Kiel, Germany. cheuer@tierzucht.uni-kiel.de.
- Institute of Animal Breeding and Husbandry, University of Kiel, Hermann-Rodewald-Strasse 6, 24098, Kiel, Germany. cscheel@tierzucht.uni-kiel.de.
- Institute of Animal Breeding and Husbandry, University of Kiel, Hermann-Rodewald-Strasse 6, 24098, Kiel, Germany. jtetens@tierzucht.uni-kiel.de.
- Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, Wilhelm-Stahl-Allee 2, 18196, Dummerstorf, Germany. kuehn@fbn-dummerstorf.de.
- Faculty of Agricultural and Environmental Sciences, University Rostock, Justus-von-Liebig-Weg 6, 18059, Rostock, Germany. kuehn@fbn-dummerstorf.de.
- Institute of Animal Breeding and Husbandry, University of Kiel, Hermann-Rodewald-Strasse 6, 24098, Kiel, Germany. gthaller@tierzucht.uni-kiel.de.
MeSH Terms
- Animals
- Bayes Theorem
- Breeding
- Genetics, Population
- Genome
- Genomics / methods
- Genotype
- Germany
- Horses / genetics
- Models, Genetic
- Phenotype
- Polymorphism, Single Nucleotide
- Quantitative Trait, Heritable
- Regression Analysis
- Support Vector Machine
Citations
This article has been cited 6 times.
- Lindsay-McGee V, Sanchez-Molano E, Banos G, Clark EL, Piercy RJ, Psifidi A. Genetic characterisation of the Connemara pony and the Warmblood horse using a within-breed clustering approach. Genet Sel Evol 2023 Aug 17;55(1):60.
- Reinoso-Peláez EL, Gianola D, González-Recio O. Genome-Enabled Prediction Methods Based on Machine Learning. Methods Mol Biol 2022;2467:189-218.
- Jiang Y, Weise S, Graner A, Reif JC. Using Genome-Wide Predictions to Assess the Phenotypic Variation of a Barley (Hordeum sp.) Gene Bank Collection for Important Agronomic Traits and Passport Information. Front Plant Sci 2020;11:604781.
- Nolte W, Thaller G, Kuehn C. Selection signatures in four German warmblood horse breeds: Tracing breeding history in the modern sport horse. PLoS One 2019;14(4):e0215913.
- Chen T, Brewster P, Tuttle KR, Dworkin LD, Henrich W, Greco BA, Steffes M, Tobe S, Jamerson K, Pencina K, Massaro JM, D'Agostino RB Sr, Cutlip DE, Murphy TP, Cooper CJ, Shapiro JI. Prediction of cardiovascular outcomes with machine learning techniques: application to the Cardiovascular Outcomes in Renal Atherosclerotic Lesions (CORAL) study. Int J Nephrol Renovasc Dis 2019;12:49-58.
- Montesinos-López OA, Montesinos-López A, Luna-Vázquez FJ, Toledo FH, Pérez-Rodríguez P, Lillemo M, Crossa J. An R Package for Bayesian Analysis of Multi-environment and Multi-trait Multi-environment Data for Genome-Based Prediction. G3 (Bethesda) 2019 May 7;9(5):1355-1369.