Animals: an open access journal from MDPI. 2024; 14(18); doi: 10.3390/ani14182723

Supervised Machine Learning Techniques for Breeding Value Prediction in Horses: An Example Using Gait Visual Scores.

Abstract: Gait scores are widely used in the genetic evaluation of horses. However, the nature of such measurement may limit genetic progress since there is subjectivity in phenotypic information. This study aimed to assess the application of machine learning techniques in the prediction of breeding values for five visual gait scores in Campolina horses: dissociation, comfort, style, regularity, and development. The dataset contained over 5000 phenotypic records with 107,951 horses (14 generations) in the pedigree. A fixed model was used to estimate least-square solutions for fixed effects and adjusted phenotypes. Variance components and breeding values (EBV) were obtained via a multiple-trait model (MTM). Adjusted phenotypes and fixed effects solutions were used to train machine learning models (using the EBV from MTM as target variable): artificial neural network (ANN), random forest regression (RFR) and support vector regression (SVR). To validate the models, the linear regression method was used. Accuracy was comparable across all models (but it was slightly higher for ANN). The highest bias was observed for ANN, followed by MTM. Dispersion varied according to the trait; it was higher for ANN and the lowest for MTM. Machine learning is a feasible alternative to EBV prediction; however, this method will be slightly biased and over-dispersed for young animals.
Publication Date: 2024-09-20
PubMed ID: 39335312
PubMed Central: PMC11429212
DOI: 10.3390/ani14182723
The Equine Research Bank provides access to a large database of publicly available scientific literature. Inclusion in the Research Bank does not imply endorsement of study methods or findings by Mad Barn.
  • Journal Article

Summary

This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.

This research paper explores the application of machine learning techniques to predicting breeding values in horses for five visual gait scores. It concludes that machine learning is a feasible alternative to traditional breeding value prediction, but that its predictions may be slightly biased and over-dispersed for young animals.

Study Objective and Approach

  • The objective of this study was to assess the application of machine learning techniques in the prediction of breeding values for five visual gait scores in Campolina horses. These scores were dissociation, comfort, style, regularity, and development.
  • The dataset contained over 5000 phenotypic records, with 107,951 horses (14 generations) in the pedigree.
  • A fixed model was fitted to estimate least-squares solutions for the fixed effects and to compute adjusted phenotypes.
  • A multiple-trait model (MTM) was used to obtain variance components and estimated breeding values (EBV).
  • The adjusted phenotypes and fixed-effect solutions were used to train three types of machine learning models, with the EBV from the MTM as the target variable: artificial neural networks (ANN), random forest regression (RFR), and support vector regression (SVR).
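
The data-preparation steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the authors' pipeline: it uses a single class effect with no intercept (so the least-squares solutions reduce to class means), and the field names (`horse`, `group`, `score`), the toy records, and the EBV values are all hypothetical. Fitting the ANN, RFR, or SVR itself is not shown.

```python
from collections import defaultdict
from statistics import mean


def fixed_effect_solutions(records):
    """Least-squares solutions for a one-way class model without intercept.

    In this simplified case the solution for each class is its mean score.
    """
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec["group"]].append(rec["score"])
    return {group: mean(scores) for group, scores in by_class.items()}


def adjusted_phenotypes(records, solutions):
    """Phenotype minus the fixed-effect solution of the record's class."""
    return [rec["score"] - solutions[rec["group"]] for rec in records]


def training_rows(records, solutions, ebv):
    """Pair features (adjusted phenotype, fixed-effect solution) with the
    EBV target, one row per record."""
    adjusted = adjusted_phenotypes(records, solutions)
    return [
        {
            "horse": rec["horse"],
            "features": (adj, solutions[rec["group"]]),
            "target": ebv[rec["horse"]],
        }
        for rec, adj in zip(records, adjusted)
    ]


# Hypothetical toy data: gait scores recorded in two contemporary groups.
records = [
    {"horse": "A", "group": "show1", "score": 7.0},
    {"horse": "B", "group": "show1", "score": 9.0},
    {"horse": "C", "group": "show2", "score": 5.0},
    {"horse": "D", "group": "show2", "score": 6.0},
]
# Hypothetical EBV standing in for the output of the multiple-trait model.
ebv = {"A": -0.4, "B": 0.5, "C": -0.2, "D": 0.3}

solutions = fixed_effect_solutions(records)
rows = training_rows(records, solutions, ebv)
```

Each entry of `rows` could then be passed to any regression learner, with `features` as inputs and `target` as the value to predict.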

Model Validation and Results

  • The models were validated using the linear regression (LR) method.
  • Accuracy was comparable across all models, including the MTM, although it was slightly higher for ANN.
  • The highest bias, however, was observed in the ANN model, followed by the multiple-trait model (MTM).
  • Model dispersion varied according to the trait, with ANN displaying the highest and MTM the lowest.
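
As a rough illustration of the three validation statistics named above, the sketch below follows the way the LR method is commonly described: predictions for the same animals obtained from a partial dataset are compared with predictions from the whole dataset. The function name `lr_statistics` and the toy values are hypothetical, and the study's exact definitions may differ.

```python
from math import sqrt
from statistics import mean


def _cov(x, y):
    """Sample covariance of two equally long sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def lr_statistics(ebv_partial, ebv_whole):
    """Bias, dispersion, and correlation between partial and whole predictions.

    bias        : mean(partial) - mean(whole), expected value 0
    dispersion  : slope of the regression of whole on partial, expected value 1
    correlation : Pearson correlation between whole and partial predictions
    """
    cov_wp = _cov(ebv_whole, ebv_partial)
    var_p = _cov(ebv_partial, ebv_partial)
    var_w = _cov(ebv_whole, ebv_whole)
    return {
        "bias": mean(ebv_partial) - mean(ebv_whole),
        "dispersion": cov_wp / var_p,
        "correlation": cov_wp / sqrt(var_p * var_w),
    }


# Hypothetical predictions for four young animals.
ebv_partial = [0.0, 1.0, 2.0, 3.0]  # predicted without their own records
ebv_whole = [0.5, 1.0, 1.5, 2.0]    # predicted with all available records

stats = lr_statistics(ebv_partial, ebv_whole)
```

In this toy case the slope is below 1, which is usually read as over-dispersion (inflation) of the partial-data predictions, and the positive bias means the partial predictions are on average higher than the whole-data ones.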

Conclusions

  • The study indicated that machine learning is a feasible alternative to traditional methods for predicting estimated breeding values (EBV) in horses.
  • However, machine learning predictions may be slightly biased and over-dispersed, especially for young animals.

Despite these limitations, the study suggests that machine learning can reach accuracy comparable to the conventional multiple-trait model for gait scores, with ANN slightly ahead, at the cost of more bias and dispersion.

Cite This Article

APA
Bussiman F, Alves AAC, Richter J, Hidalgo J, Veroneze R, Oliveira T. (2024). Supervised Machine Learning Techniques for Breeding Value Prediction in Horses: An Example Using Gait Visual Scores. Animals (Basel), 14(18). https://doi.org/10.3390/ani14182723

Publication

ISSN: 2076-2615
NlmUniqueID: 101635614
Country: Switzerland
Language: English
Volume: 14
Issue: 18

Researcher Affiliations

Bussiman, Fernando
  • Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA.
Alves, Anderson A C
  • Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA.
Richter, Jennifer
  • Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA.
Hidalgo, Jorge
  • Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA.
Veroneze, Renata
  • Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA.
  • Animal Science Department, Federal University of Viçosa, Viçosa 36570-900, Brazil.
Oliveira, Tiago
  • Statistics Department, State University of Paraíba, Campina Grande 58429-500, Brazil.

Conflict of Interest Statement

The authors declare no conflicts of interest.
