Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB.
- Journal Article
Summary
The paper describes the Horse Single Nucleotide Polymorphism and Expression Database (HSDB), a new database of horse genomic variants that also includes expression data and other information to support a better understanding of horse genetics.
About The Database
The HSDB was created to address the lack of comprehensive horse genome databases. It has been used to identify previously unexplored variants in the horse genome, including rare ones.
- The database was constructed using population genome sequences and RNA-seq data from a total of twenty-two horses.
- The identified single nucleotide polymorphisms (SNPs), a type of genetic variation, were validated against SNP chip data and RNA-seq variants, with high agreement (99.2% and 96.6%, respectively).
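The agreement figures above are concordance rates between two sets of genotype calls. As a minimal sketch of how such a rate could be computed, assuming genotypes are stored as allele strings keyed by chromosome and position (the data layout, function names, and toy values here are illustrative and not taken from the paper):

```python
from typing import Dict, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position); hypothetical layout


def normalize(genotype: str) -> str:
    """Sort alleles so that 'A/G' and 'G/A' compare as equal."""
    return "/".join(sorted(genotype.replace("|", "/").split("/")))


def concordance(seq_calls: Dict[Site, str], chip_calls: Dict[Site, str]) -> float:
    """Fraction of sites typed by both methods whose genotypes agree."""
    shared = seq_calls.keys() & chip_calls.keys()
    if not shared:
        raise ValueError("no overlapping sites to compare")
    matches = sum(
        normalize(seq_calls[site]) == normalize(chip_calls[site]) for site in shared
    )
    return matches / len(shared)


# Toy genotypes for one animal (made-up values, for illustration only).
seq = {("chr1", 100): "A/G", ("chr1", 250): "C/C", ("chr2", 75): "T/T", ("chr3", 10): "G/G"}
chip = {("chr1", 100): "G/A", ("chr1", 250): "C/C", ("chr2", 75): "T/C", ("chr4", 5): "A/A"}

rate = concordance(seq, chip)
print(f"Concordance over shared sites: {rate:.1%}")
```

Only sites genotyped by both methods enter the denominator, so sites covered by one platform alone do not count as disagreements.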
Importance Of The Database
The HSDB offers several notable features that make it a valuable resource in horse genetics and breeding.
- By providing a vast amount of genetic variance data, the database aids in a more thorough understanding of the horse genome.
- The database is unique in that it links the genomic variants it includes to their corresponding transcriptional profiles, providing important functional context for these variants.
- By making this comprehensive data readily accessible, the HSDB is expected to contribute to the genetic improvement of Thoroughbreds and to more precise breeding strategies.
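The link between variants and transcriptional profiles mentioned above can be pictured as a join on a shared gene identifier. The following is a rough sketch under that assumption; the record fields, gene IDs, sample labels, and expression values are all hypothetical and do not reflect the actual HSDB schema:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    gene_id: str  # gene the variant is annotated to (hypothetical field)


def link_variants_to_expression(
    variants: List[Variant],
    expression: Dict[str, Dict[str, float]],
) -> Dict[str, dict]:
    """Group variants by gene and attach that gene's expression profile."""
    by_gene: Dict[str, List[Variant]] = defaultdict(list)
    for variant in variants:
        by_gene[variant.gene_id].append(variant)
    return {
        gene: {"variants": hits, "expression": expression.get(gene, {})}
        for gene, hits in by_gene.items()
    }


# Made-up records: two variants in GENE_A, one in GENE_B.
variants = [
    Variant("chr1", 1200, "C", "T", "GENE_A"),
    Variant("chr1", 1850, "G", "A", "GENE_A"),
    Variant("chr2", 400, "A", "C", "GENE_B"),
]
# Expression values per sample; GENE_B has no profile in this toy set.
expression = {"GENE_A": {"sample_1": 12.5, "sample_2": 30.1}}

linked = link_variants_to_expression(variants, expression)
for gene, record in linked.items():
    print(gene, len(record["variants"]), record["expression"])
```

Genes with variants but no expression profile are kept with an empty profile rather than dropped, so the absence of functional context stays visible.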
Potential Applications
The HSDB’s combination of variant and expression data positions it as a tool that could substantially improve horse breeding and selection.
- The database could be used to identify specific genetic markers for desirable traits, significantly enhancing the efficiency and accuracy of horse breeding programs.
- Additionally, the HSDB could aid in tracking and combating genetic diseases common in horse populations.
- The methodology used in the creation of the HSDB could also be applied to the creation of similar databases for other species, thereby further expanding our understanding of genetics and breeding in a broader context.
Researcher Affiliations
- Genomic Informatics Center, Hankyong National University, Anseong 456-749, Korea.
- Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Korea.
- Genomic Informatics Center, Hankyong National University, Anseong 456-749, Korea.
- Department of Animal Science, College of Life Sciences, Pusan National University, Miryang 627-702, Korea.
- Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Korea.
- Department of Equine Sciences, Sorabol College, Gyeongju 780-711, Korea.
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea.
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Korea.
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea.
- Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Korea.
- C&K Genomics, Seoul National University Research Park, Seoul 151-919, Korea.
- Genomic Informatics Center, Hankyong National University, Anseong 456-749, Korea.