BMC Genomics 2025; 26(1); 790; doi: 10.1186/s12864-025-11985-0

High-quality, haplotype-resolved reference genomes of the Dutch warmblood horse and Friesian horse using trio binning.

Abstract: In horses, genetic diversity is predominantly observed between breeds, with little variation within breeds. The studbooks of the two largest horse populations in the Netherlands, the Dutch Warmblood horse and Friesian horse population, have ongoing conservation projects including collecting large-scale genotype and sequence data. The current reference genome, derived from a Thoroughbred horse, can lead to bias in genetic analyses of other horse breeds. Therefore, the aim of this study was to create high-quality breed-specific reference genomes of Dutch Warmblood and Friesian horses. We performed nanopore long-read sequencing (R10.4, Q20+) of an F1 cross between a Dutch Warmblood horse and a Friesian horse to create two breed-specific reference genomes by trio binning. This resulted in high-quality, haplotype-resolved reference genomes with contig N50 of 37 and 35 Mb and single copy gene completeness of 99.2 and 99.3% for the Friesian and Warmblood, respectively. The majority of the chromosomes contained telomeric and/or centromeric sequences. The Ensembl gene annotation resulted in 19,750 and 19,872 protein coding genes for the Friesian and Warmblood, respectively. No large chromosomal rearrangements were observed between the Friesian and Warmblood genomes. However, a total of 722 large structural variations (> 10 kb) were identified, of which 14 affect the coding sequence of protein-coding genes. The novel breed-specific reference genomes provide a valuable resource for future genetic analysis and breed conservation efforts and will contribute to ongoing equine pangenome efforts. The online version contains supplementary material available at 10.1186/s12864-025-11985-0.
Publication Date: 2025-09-01; PubMed ID: 40890628; PubMed Central: PMC12400632; DOI: 10.1186/s12864-025-11985-0
The Equine Research Bank provides access to a large database of publicly available scientific literature. Inclusion in the Research Bank does not imply endorsement of study methods or findings by Mad Barn.
  • Journal Article

Summary

This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.

Overview

  • This study developed high-quality, breed-specific reference genomes for the Dutch Warmblood and Friesian horses using nanopore long-read sequencing and trio binning.
  • The new genomes are intended to reduce the bias that the current Thoroughbred-derived reference can introduce in genetic analyses of other breeds, and to support conservation efforts for these Dutch horse breeds.

Background

  • Horses exhibit most genetic diversity between different breeds rather than within a single breed.
  • The Dutch Warmblood and Friesian horses are the two largest horse populations in the Netherlands, and their studbooks run ongoing conservation projects that include collecting large-scale genotype and sequence data.
  • Previous horse genetic studies relied on a reference genome from a Thoroughbred horse, which may not adequately represent other breeds and can introduce biases.

Research Objective

  • To generate high-quality, breed-specific, haplotype-resolved reference genomes for the Dutch Warmblood and Friesian horses.
  • These references are expected to enhance the accuracy of genetic studies and support breed conservation.

Methods

  • Nanopore long-read sequencing (R10.4, Q20+) was performed on the F1 offspring of a cross between a Dutch Warmblood horse and a Friesian horse.
  • A trio binning approach was employed, which uses parental genetic information to separate (bin) the offspring’s sequencing reads into two groups, one for each parental (breed) haplotype.
  • This method enables the construction of two separate breed-specific genome assemblies from the hybrid individual.
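
The binning step described above can be illustrated with a minimal sketch. It assumes the general k-mer idea behind trio binning (Koren et al. 2018): k-mers found in only one parent act as markers, and each offspring read goes to the parent whose markers it carries more of. The function names, the toy sequences, and the small k are hypothetical and chosen for illustration only. Real tools use much longer k-mers, handle reverse complements and sequencing errors, and normalise the hit counts; this summary does not describe the exact pipeline used in the study.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(parent_a_seqs, parent_b_seqs, k):
    """Return (k-mers seen only in parent A, k-mers seen only in parent B)."""
    a = set().union(*(kmers(s, k) for s in parent_a_seqs))
    b = set().union(*(kmers(s, k) for s in parent_b_seqs))
    return a - b, b - a


def bin_read(read, only_a, only_b, k):
    """Assign a read to 'A', 'B', or 'unassigned' by counting marker k-mers.

    Simplification: raw hit counts are compared directly, without the
    normalisation or reverse-complement handling a real tool would apply.
    """
    read_kmers = kmers(read, k)
    hits_a = len(read_kmers & only_a)
    hits_b = len(read_kmers & only_b)
    if hits_a > hits_b:
        return "A"
    if hits_b > hits_a:
        return "B"
    return "unassigned"


if __name__ == "__main__":
    # Toy parental sequences that differ at a single base (hypothetical data).
    only_a, only_b = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
    for read in ["CGTACGT", "CGTTCGT", "ACGT"]:
        print(read, bin_read(read, only_a, only_b, k=4))
```

Reads that carry no parent-specific k-mers stay unassigned in this sketch. Each of the two binned read sets would then be assembled separately to give one haplotype-resolved genome per parent.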

Results

  • Two high-quality, haplotype-resolved reference genomes were created:
    • Friesian genome: contig N50 of 37 Mb, 99.2% completeness of single-copy genes
    • Dutch Warmblood genome: contig N50 of 35 Mb, 99.3% completeness of single-copy genes
  • Most chromosomes in both genomes included telomeric and/or centromeric sequences, indicating that chromosome ends and centromeric regions were largely assembled.
  • The Ensembl gene annotation identified 19,750 protein-coding genes in the Friesian genome and 19,872 in the Warmblood genome.
  • No large-scale chromosomal rearrangements were detected between the two breed genomes, indicating structural similarity at the chromosome level.
  • However, 722 large structural variations (>10 kb) were found between the two breeds; 14 of these affect protein-coding gene sequences, potentially influencing breed-specific traits.
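
The contig N50 values reported above summarise assembly contiguity. As a hedged illustration of the standard definition (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length), here is a minimal sketch. The function name `n50` and the toy lengths are assumptions for illustration and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the summed
    length of all contigs.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    # Toy lengths: total is 30, and the two longest contigs (10 + 8) pass 15.
    print(n50([10, 8, 6, 4, 2]))  # prints 8
```

Under this definition, a contig N50 of 37 Mb means that contigs of at least 37 Mb together make up at least half of the assembled sequence.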

Implications and Significance

  • The newly produced breed-specific reference genomes provide a more accurate framework for genetic analyses within Dutch Warmblood and Friesian horses.
  • They are expected to reduce the bias that a Thoroughbred reference genome can introduce when studying other breeds.
  • The genomes support ongoing breed conservation programs by offering detailed genetic information for managing breed diversity.
  • This work contributes to the broader equine pangenome initiative, which aims to capture the full genetic diversity across horse breeds.
  • Supplementary material is available with the online version of the article.

Cite This Article

APA
Steensma MJ, Ducro BJ, Dibbits B, Doekes HP, van Schipstal JGC, Kalbfleisch T, Groenen MAM, Derks MFL. (2025). High-quality, haplotype-resolved reference genomes of the Dutch warmblood horse and Friesian horse using trio binning. BMC Genomics, 26(1), 790. https://doi.org/10.1186/s12864-025-11985-0

Publication

ISSN: 1471-2164
NlmUniqueID: 100965258
Country: England
Language: English
Volume: 26
Issue: 1
Pages: 790
PII: 790

Researcher Affiliations

Steensma, Marije J
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands. marije.steensma@wur.nl.
Ducro, Bart J
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Dibbits, Bert
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Doekes, Harmen P
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
van Schipstal, Job G C
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Kalbfleisch, Ted
  • Maxwell H. Gluck Equine Research Center, University of Kentucky, Lexington, KY, 40546, USA.
Groenen, Martien A M
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Derks, Martijn F L
  • Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.

Grant Funding

  • 4164023400 / Topconsortium voor Kennis en Innovatie

Conflict of Interest Statement

Declarations. Ethics approval and consent to participate: The biological material used in this study was collected as part of routine data collection from the KFPS, and not specifically for the purpose of this project. Therefore, approval of an ethics committee was not mandatory. Blood sample collection was done by a licensed vet following the “Code of Good Veterinary Practice”. No animals were euthanized/sacrificed or anaesthetized for this study. Sample collection was conducted strictly in line with Dutch law on the protection of animals (Gezondheids- en welzijnswet voor dieren). Informed consent was obtained by the owner of the F1 and both parents to collect blood samples for DNA extraction. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

References

This article includes 62 references

Citations

This article has been cited 0 times.