Annotation of the Protein Coding Regions of the Equine Genome.
Abstract: Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons.
Publication Date: 2015-06-24
PubMed ID: 26107351
PubMed Central: PMC4481266
DOI: 10.1371/journal.pone.0124375
- Journal Article
- Research Support, N.I.H., Extramural
- Research Support, Non-U.S. Gov't
Summary
This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.
The research article expands the annotation of the protein coding regions of the horse genome using direct equine evidence. The authors sequenced mRNA pooled from forty-three tissues and used it to derive transcript structures and identify protein coding variants relative to the EquCab2 reference genome.
Research Methodology
- The researchers acknowledged that most of the current gene annotation of the horse genome is based on in silico predictions and cross-species alignments. This means that much of what is understood about the horse genome relies on computational tools and comparisons with other species.
- To augment this, the researchers proceeded to sequence the messenger RNA (mRNA) from a pool of forty-three different tissues. When genes are expressed, they are transcribed into mRNA, which carries the code from the DNA form of the gene to the site of protein synthesis. By doing this, they aimed to use direct equine experimental evidence to expand the number of annotated equine genes.
Findings
- As a result of the sequencing, the researchers were able to derive the structures of 68,594 transcripts. Transcripts are the RNA molecules that result from genetic transcription.
- They also identified 301,829 positions within these transcripts that had single nucleotide polymorphisms (SNPs) or small insertions and deletions (indels) relative to the EquCab2 reference genome. Such variants can, in some cases, affect gene function and expression.
- The researchers identified 780 variants that extend the open reading frame of the transcript. These appear to be small errors in the equine reference genome rather than true variation, because they were also found as homozygous variants when the genomic DNA of the reference horse itself was resequenced. A homozygous variant is one where both copies of the locus carry the same non-reference allele, so the reference horse does not actually carry the sequence recorded in the reference assembly at that position.
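The reasoning behind the last finding can be illustrated with a toy sketch. The sequences, positions, and function names below are hypothetical and are not taken from the study, and the genotype string `1/1` is assumed to follow the common VCF convention for a homozygous alternate call. The sketch classifies a variant as a SNP or an indel, applies it to a reference transcript, and flags it as a likely reference error when it lengthens the open reading frame and is homozygous in the reference animal.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant as a SNP (single-base substitution) or an indel."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    return "indel"  # simplification: anything that is not a 1-for-1 swap


def apply_variant(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with ref replaced by alt at 0-based position pos."""
    if seq[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")
    return seq[:pos] + alt + seq[pos + len(ref):]


def orf_length(seq: str) -> int:
    """Count codons from the first ATG up to (not including) the first
    in-frame stop codon. Returns 0 if there is no ATG; if no stop codon
    is found, counts every complete codon to the end of the sequence."""
    start = seq.find("ATG")
    if start == -1:
        return 0
    codons = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


def likely_reference_error(seq: str, pos: int, ref: str, alt: str,
                           genotype: str) -> bool:
    """True if the variant extends the ORF and the reference animal is
    homozygous for the alternate allele (assumed VCF-style '1/1')."""
    extended = orf_length(apply_variant(seq, pos, ref, alt)) > orf_length(seq)
    return extended and genotype == "1/1"


# Toy reference transcripts (hypothetical), each with a premature stop.
snp_ref = "ATGGCCTGAAAGGGCTTTTAA"      # ATG GCC TGA ...
indel_ref = "ATGGCTTGAAGGGCTTTTAAG"    # ATG GCT TGA ...

print(classify_variant("T", "C"), orf_length(snp_ref),
      orf_length(apply_variant(snp_ref, 6, "T", "C")))
print(classify_variant("T", "TC"), orf_length(indel_ref),
      orf_length(apply_variant(indel_ref, 5, "T", "TC")))
print(likely_reference_error(snp_ref, 6, "T", "C", "1/1"))
```

A heterozygous call (`0/1`) on the same variant would not be flagged, since the reference horse would then still carry the sequence recorded in the assembly on one chromosome.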
Significance
- The current study provides a rich resource for understanding equine mRNA structures and protein coding variants, contributing to a more nuanced understanding of the equine genome.
- This knowledge supports equine genomic studies and also aids cross-species transcriptional and genomic comparisons. A well-annotated horse genome gives researchers an additional mammalian reference point when comparing gene structures across species, including humans.
Cite This Article
APA
Hestand MS, Kalbfleisch TS, Coleman SJ, Zeng Z, Liu J, Orlando L, MacLeod JN. (2015). Annotation of the Protein Coding Regions of the Equine Genome. PLoS One, 10(6), e0124375. https://doi.org/10.1371/journal.pone.0124375
Researcher Affiliations
- Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.
- Biochemistry and Molecular Biology Department, School of Medicine, University of Louisville, Louisville, Kentucky, United States of America.
- Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.
- Department of Computer Science, University of Kentucky, Lexington, Kentucky, United States of America.
- Department of Computer Science, University of Kentucky, Lexington, Kentucky, United States of America.
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.
- Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.
MeSH Terms
- Animals
- Base Sequence
- Genome
- Horses / genetics
- Molecular Sequence Annotation
- Open Reading Frames / genetics
- RNA, Messenger / genetics
Grant Funding
- P20 GM103436 / NIGMS NIH HHS
- P20 RR016481 / NCRR NIH HHS
- R01 HG006272 / NHGRI NIH HHS
- 5P20RR016481-09 / NCRR NIH HHS
Conflict of Interest Statement
The authors have declared that no competing interests exist.
References
This article includes 28 references
Citations
This article has been cited 20 times.