University of Maryland DRUM  
University of Maryland Digital Repository at the University of Maryland

Please use this identifier to cite or link to this item: http://hdl.handle.net/1903/8007

Title: Full-length messenger RNA sequences greatly improve genome annotation
Authors: Haas, Brian J
Volfovsky, Natalia
Town, Christopher D
Troukhan, Maxim
Alexandrov, Nickolai
Feldmann, Kenneth A
Flavell, Richard B
White, Owen
Salzberg, Steven L
Type: Article
Keywords: eukaryotic genomes
genome sequence
DNA
gene models
mRNA
exons
introns
Issue Date: 30-May-2002
Publisher: Genome Biology
Citation: Full-length messenger RNA sequences greatly improve genome annotation. B.J. Haas, N. Volfovsky, C.D. Town, M. Troukhan, N. Alexandrov, K.A. Feldmann, R.B. Flavell, O. White, and S.L. Salzberg. Genome Biology 3:6 (2002), research0029.1-12.
Abstract: Background: Annotation of eukaryotic genomes is a complex endeavor that requires the integration of evidence from multiple, often contradictory, sources. With the ever-increasing amount of genome sequence data now available, methods for accurate identification of large numbers of genes have become urgently needed. In an effort to create a set of very high-quality gene models, we used the sequence of 5,000 full-length gene transcripts from Arabidopsis to re-annotate its genome. We have mapped these transcripts to their exact chromosomal locations and, using alignment programs, have created gene models that provide a reference set for this organism.
Results: Approximately 35% of the transcripts indicated that previously annotated genes needed modification, and 5% of the transcripts represented newly discovered genes. We also discovered that multiple transcription initiation sites appear to be much more common than previously known, and we report numerous cases of alternative mRNA splicing. We include a comparison of different alignment software and an analysis of how the transcript data improved the previously published annotation.
Conclusions: Our results demonstrate that sequencing of large numbers of full-length transcripts followed by computational mapping greatly improves identification of the complete exon structures of eukaryotic genes. In addition, we are able to find numerous introns in the untranslated regions of the genes.
URI: http://hdl.handle.net/1903/8007
Appears in Collections: Computer Science Research Works

Files in This Item:

File: Full-length.pdf
Size: 122.86 kB
Format: Adobe PDF
No. of Downloads: 65

All items in DRUM are protected by copyright, with all rights reserved.

 

DRUM is brought to you by the University of Maryland Libraries
University of Maryland, College Park, MD 20742-7011 (301)314-1328.