Abstract: In horses, genetic diversity is predominantly observed between breeds, with little variation within breeds. The studbooks of the two largest horse populations in the Netherlands, the Dutch Warmblood horse and Friesian horse population, have ongoing conservation projects including collecting large-scale genotype and sequence data. The current reference genome, derived from a Thoroughbred horse, can lead to bias in genetic analyses of other horse breeds. Therefore, the aim of this study was to create high-quality breed-specific reference genomes of Dutch Warmblood and Friesian horses. We performed nanopore long-read sequencing (R10.4, Q20+) of an F1 cross between a Dutch Warmblood horse and a Friesian horse to create two breed-specific reference genomes by trio binning. This resulted in high-quality, haplotype-resolved reference genomes with contig N50 of 37 and 35 Mb and single-copy gene completeness of 99.2 and 99.3% for the Friesian and Warmblood, respectively. The majority of the chromosomes contained telomeric and/or centromeric sequences. The Ensembl gene annotation resulted in 19,750 and 19,872 protein-coding genes for the Friesian and Warmblood, respectively. No large chromosomal rearrangements were observed between the Friesian and Warmblood genomes. However, a total of 722 large structural variations (>10 kb) were identified, of which 14 affect the coding sequence of protein-coding genes. The novel breed-specific reference genomes provide a valuable resource for future genetic analysis and breed conservation efforts and will contribute to ongoing equine pangenome efforts. The online version contains supplementary material available at 10.1186/s12864-025-11985-0.
This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.
Overview
This study developed high-quality, breed-specific reference genomes for the Dutch Warmblood and Friesian horses using nanopore long-read sequencing and trio binning.
The new genomes are intended to reduce the bias that the current Thoroughbred-derived reference can introduce when other breeds are analysed, and to support conservation efforts for these Dutch horse breeds.
Background
Horses exhibit most genetic diversity between different breeds rather than within a single breed.
The Dutch Warmblood and Friesian horses are two of the largest horse populations in the Netherlands and have active breeding and conservation programs.
Previous horse genetic studies relied on a reference genome from a Thoroughbred horse, which may not adequately represent other breeds and can introduce biases.
Research Objective
To generate high-quality, breed-specific, haplotype-resolved reference genomes for the Dutch Warmblood and Friesian horses.
These references are expected to enhance the accuracy of genetic studies and support breed conservation.
Methods
Nanopore long-read sequencing (R10.4, Q20+) was performed on an F1 cross between a Dutch Warmblood horse and a Friesian horse.
A trio binning approach was employed, which uses parental genetic information to separate (bin) the offspring's sequencing reads into two groups, one for each parental haplotype.
This method enables the construction of two separate breed-specific genome assemblies from the hybrid individual.
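The binning step described above can be illustrated with a minimal sketch. It assumes the commonly described k-mer formulation of trio binning: k-mers seen in only one parent act as markers, and each offspring read is assigned to the parent whose markers it carries more of. The function names, the toy sequences, and k = 4 are hypothetical choices for illustration. This is not the study's pipeline, and it ignores reverse complements and sequencing errors, which real tools have to handle.

```python
from typing import Iterable, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(
    reads_a: Iterable[str], reads_b: Iterable[str], k: int
) -> Tuple[Set[str], Set[str]]:
    """Return (k-mers seen only in parent A, k-mers seen only in parent B)."""
    all_a: Set[str] = set()
    for read in reads_a:
        all_a |= kmers(read, k)
    all_b: Set[str] = set()
    for read in reads_b:
        all_b |= kmers(read, k)
    return all_a - all_b, all_b - all_a


def bin_read(read: str, only_a: Set[str], only_b: Set[str], k: int) -> str:
    """Assign one offspring read to 'parent_a', 'parent_b', or 'unassigned'."""
    read_kmers = kmers(read, k)
    hits_a = len(read_kmers & only_a)
    hits_b = len(read_kmers & only_b)
    # Assumption of this sketch: scale hits by the marker-set size so that
    # a parent with more marker k-mers is not favoured automatically.
    score_a = hits_a / len(only_a) if only_a else 0.0
    score_b = hits_b / len(only_b) if only_b else 0.0
    if score_a > score_b:
        return "parent_a"
    if score_b > score_a:
        return "parent_b"
    return "unassigned"


if __name__ == "__main__":
    # Toy parental reads that differ only near their 3' end.
    only_a, only_b = parent_specific_kmers(["ACGTACGTTT"], ["ACGTACGGGA"], 4)
    for offspring_read in ["TACGTTT", "TACGGGA", "ACGTAC"]:
        print(offspring_read, bin_read(offspring_read, only_a, only_b, 4))
```

Reads that carry no parent-specific k-mers stay unassigned in this sketch. Each bin of reads would then be assembled separately to give one genome per parental haplotype.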
Results
Two high-quality, haplotype-resolved reference genomes were created:
Friesian genome: contig N50 of 37 Mb, 99.2% completeness of single-copy genes
Dutch Warmblood genome: contig N50 of 35 Mb, 99.3% completeness of single-copy genes
Most chromosomes in both genomes included telomeric and/or centromeric sequences, indicating well-assembled chromosome ends and central parts.
Gene annotation with the Ensembl pipeline identified 19,750 protein-coding genes in the Friesian genome and 19,872 in the Warmblood genome.
No large-scale chromosomal rearrangements were detected between the two breed genomes, indicating structural similarity at the chromosome level.
However, 722 large structural variations (>10 kb) were found between the two breeds; 14 of these affect protein-coding gene sequences, potentially influencing breed-specific traits.
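The contig N50 values reported above summarise assembly contiguity. N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name and the toy contig lengths are illustrative and are not taken from the study.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # total 20; 6 + 5 = 11 covers half
    print(n50(toy_lengths))
```

A higher N50 means that half of the assembled sequence sits in longer contigs, which is why the metric is used as a contiguity indicator alongside gene completeness.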
Implications and Significance
The newly produced breed-specific reference genomes provide a more accurate framework for genetic analyses within Dutch Warmblood and Friesian horses.
They are intended to reduce the bias that can arise when a Thoroughbred reference genome is used to study other breeds.
The genomes support ongoing breed conservation programs by offering detailed genetic information for managing breed diversity.
This work contributes to the broader equine pangenome initiative, which aims to capture the full genetic diversity across horse breeds.
Supplementary material is available with the online version of the article.
Cite This Article
Steensma MJ, Ducro BJ, Dibbits B, Doekes HP, van Schipstal JGC, Kalblfleisch T, Groenen MAM, Derks MFL. (2025). High-quality, haplotype-resolved reference genomes of the Dutch warmblood horse and Friesian horse using trio binning. BMC Genomics, 26(1), 790. https://doi.org/10.1186/s12864-025-11985-0
Steensma, Marije J
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands. marije.steensma@wur.nl.
Ducro, Bart J
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Dibbits, Bert
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Doekes, Harmen P
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
van Schipstal, Job G C
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Kalblfleisch, Ted
Maxwell H. Gluck Equine Research Center, University of Kentucky, Lexington, KY, 40546, USA.
Groenen, Martien A M
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Derks, Martijn F L
Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
Grant Funding
4164023400 / Topconsortium voor Kennis en Innovatie
Conflict of Interest Statement
Declarations.
Ethics approval and consent to participate: The biological material used in this study was collected as part of routine data collection from the KFPS, and not specifically for the purpose of this project. Therefore, approval of an ethics committee was not mandatory. Blood sample collection was done by a licensed vet following the “Code of Good Veterinary Practice”. No animals were euthanized/sacrificed or anaesthetized for this study. Sample collection was conducted strictly in line with Dutch law on the protection of animals (Gezondheids- en welzijnswet voor dieren). Informed consent was obtained from the owner of the F1 and both parents to collect blood samples for DNA extraction.
Consent for publication: Not applicable.
Competing interests: The authors declare no competing interests.