PLoS One 2015; 10(6): e0124375; doi: 10.1371/journal.pone.0124375

Annotation of the Protein Coding Regions of the Equine Genome.

Abstract: Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons.
Publication Date: 2015-06-24
PubMed ID: 26107351
PubMed Central: PMC4481266
DOI: 10.1371/journal.pone.0124375
The Equine Research Bank provides access to a large database of publicly available scientific literature. Inclusion in the Research Bank does not imply endorsement of study methods or findings by Mad Barn.
  • Journal Article
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

Summary

This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.

The article reports an expanded annotation of the protein coding regions of the horse genome, based on sequencing mRNA from a pool of forty-three tissues. The result is a resource of equine mRNA structures and protein coding variants derived from equine experimental evidence rather than from prediction alone.

Research Methodology

  • The researchers note that current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments, with only a small number of genes annotated from equine EST and mRNA sequences. In other words, much of the annotation relies on computational tools and comparisons with other species rather than on direct equine evidence.
  • To expand the number of genes annotated from equine experimental evidence, they sequenced messenger RNA (mRNA) from a pool of forty-three different tissues. When a gene is expressed, it is transcribed into mRNA, which carries the gene's code from the DNA to the site of protein synthesis, so sequenced mRNA shows directly which regions of the genome are transcribed and how transcripts are structured.

Findings

  • As a result of the sequencing, the researchers were able to derive the structures of 68,594 transcripts. Transcripts are the RNA molecules that result from genetic transcription.
  • They also identified 301,829 positions within these transcripts that carry single nucleotide polymorphisms (SNPs) or small insertions and deletions (indels) relative to the EquCab2 reference genome. Variants of this kind can affect gene function and expression.
  • Of these, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome. The reasoning is that the same variants were also found as homozygous variants when the genomic DNA of the reference horse was resequenced. A homozygous variant is one where both copies of the chromosome carry the same allele, so if the reference horse itself is homozygous for the variant, the reference sequence does not match the animal it was derived from at that position.
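
To make the idea of an ORF-extending variant concrete, here is a minimal Python sketch. It is not the study's pipeline, which this summary does not describe. The toy sequence, the variant position, and the helper names (orf_length, apply_snp, extends_orf) are all hypothetical and chosen only for illustration. The sketch shows one way a single-base change can lengthen an open reading frame: by turning a premature stop codon into a sense codon.

```python
# Stop codons in the DNA alphabet (standard genetic code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length(seq: str) -> int:
    """Count codons from the first ATG up to, but not including,
    the first in-frame stop codon. Returns 0 if there is no ATG."""
    start = seq.find("ATG")
    if start == -1:
        return 0
    codons = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


def apply_snp(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def extends_orf(reference: str, variant: str) -> bool:
    """True if the variant sequence has a longer ORF than the reference."""
    return orf_length(variant) > orf_length(reference)


# Hypothetical toy reference: ATG GCC TGA GGT CTT TAA
# The third codon (TGA) is a premature stop.
reference = "ATGGCCTGAGGTCTTTAA"

# Hypothetical SNP at position 6 (T -> C) turns TGA into CGA.
variant = apply_snp(reference, 6, "C")

print(orf_length(reference))            # 2
print(orf_length(variant))              # 5
print(extends_orf(reference, variant))  # True
```

In this toy case the reference ORF stops after two codons, while the variant reads through to the next stop codon. Whether such a difference is a true polymorphism or an error in the reference is a separate question, which the study addressed by checking whether the reference horse is homozygous for the variant.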

Significance

  • The study provides a resource of equine mRNA structures and protein coding variants that improves on annotation based mainly on prediction.
  • This resource supports equine genomic studies and also cross-species transcriptional and genomic comparisons, since better-annotated horse transcripts can be compared more reliably with those of other species, including humans.

Cite This Article

APA
Hestand MS, Kalbfleisch TS, Coleman SJ, Zeng Z, Liu J, Orlando L, MacLeod JN. (2015). Annotation of the Protein Coding Regions of the Equine Genome. PLoS One, 10(6), e0124375. https://doi.org/10.1371/journal.pone.0124375

Publication

ISSN: 1932-6203
NlmUniqueID: 101285081
Country: United States
Language: English
Volume: 10
Issue: 6
Pages: e0124375
PII: e0124375

Researcher Affiliations

Hestand, Matthew S
  • Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.
Kalbfleisch, Theodore S
  • Biochemistry and Molecular Biology Department, School of Medicine, University of Louisville, Louisville, Kentucky, United States of America.
Coleman, Stephen J
  • Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.
Zeng, Zheng
  • Department of Computer Science, University of Kentucky, Lexington, Kentucky, United States of America.
Liu, Jinze
  • Department of Computer Science, University of Kentucky, Lexington, Kentucky, United States of America.
Orlando, Ludovic
  • Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.
MacLeod, James N
  • Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, Kentucky, United States of America.

MeSH Terms

  • Animals
  • Base Sequence
  • Genome
  • Horses / genetics
  • Molecular Sequence Annotation
  • Open Reading Frames / genetics
  • RNA, Messenger / genetics

Grant Funding

  • P20 GM103436 / NIGMS NIH HHS
  • P20 RR016481 / NCRR NIH HHS
  • R01 HG006272 / NHGRI NIH HHS
  • 5P20RR016481-09 / NCRR NIH HHS

Conflict of Interest Statement

The authors have declared that no competing interests exist.

References

This article includes 28 references

Citations

This article has been cited 20 times.