Donald Danforth Plant Science Center
Consortium for Maize Genomics

Danforth Goals

The role of the Donald Danforth Plant Science Center in the Consortium for Maize Genomics is to apply a bioinformatics approach to analyze and assess two methods to isolate and sequence maize genes.

The Danforth Center analysis goals include:
1.) Evaluation of gene-hit rate for the individual technologies, and for the resource as a whole.
2.) Estimation of gene coverage.
3.) Estimation of maize genome coverage.
4.) Comparison of GeneThresher® vs. high Cot.
5.) Correlation of gene islands to the maize genetic/physical map.

Results of these studies will be available on the DATA ANALYSIS page when completed.
The Danforth Center is in the process of establishing the bioinformatics resources required to perform these analyses. The tasks include downloading and storing searchable collections of genome sequence data from public sources, licensing and installing software, installing and configuring a relational database to store results and allow more complex comparisons, and establishing a hardware infrastructure to underpin these activities.

Gene hit rate:
Gene hit rate will be assessed by identifying confident sequence overlap between maize genomic sequence assemblies and collections of known or proposed proteins/genes obtained from public sources. We have been establishing local copies of sequence databanks, including portions of GenBank, the TIGR Gene Indices, SWISS-PROT, Arabidopsis whole-genome sequence gene predictions, and the rice whole-genome sequence and gene predictions. Homology searches for gene detection, Pfam searches for domains, and an in-depth examination of the repeat content of the methyl-filtered assemblies are in progress. We have established a collaboration with Dr. Sue Wessler, who has an interest in examining the island sequence for transposon and MITE representation.
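As an illustration of the hit-rate idea described above, the following minimal sketch counts the fraction of assemblies that have at least one database match passing a significance cutoff. The function name, the input layout (pairs of assembly identifier and E-value, as might be parsed from tabular similarity-search output) and the 1e-10 cutoff are assumptions made for this example, not the criteria used in the project.

```python
from typing import Iterable, Tuple


def gene_hit_rate(
    assembly_ids: Iterable[str],
    hits: Iterable[Tuple[str, float]],
    max_evalue: float = 1e-10,  # hypothetical cutoff, chosen only for illustration
) -> float:
    """Return the fraction of assemblies with at least one hit at or below max_evalue."""
    ids = set(assembly_ids)
    if not ids:
        return 0.0
    # An assembly counts once, no matter how many database entries it matches.
    matched = {
        query
        for query, evalue in hits
        if query in ids and evalue <= max_evalue
    }
    return len(matched) / len(ids)


# Illustrative, made-up data: four assemblies, two of which have a confident hit.
assemblies = ["island_1", "island_2", "island_3", "island_4"]
hits = [
    ("island_1", 1e-50),
    ("island_1", 1e-5),
    ("island_3", 1e-3),
    ("island_4", 1e-12),
]
rate = gene_hit_rate(assemblies, hits)
print(f"gene hit rate: {rate:.2f}")
```

Counting each assembly once keeps the rate from being inflated by assemblies that match many members of the same gene family.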

Gene Coverage:
Gene coverage will be evaluated in three ways:
1.) Assembled maize genome island sequence will be aligned to rice gene models and the degree of coverage assessed.
2.) Assembled maize genome island sequence will be aligned to predicted genic regions of the maize genome sequence (BAC clones), which will become available through the efforts of Messing, Wing and Soderlund.
3.) Assembled maize genome island sequence will be aligned to maize mRNAs or potentially full-length maize cDNAs culled from public sources.
Alignment of maize gene assemblies to rice genes is part of the "assessment of hit rate" process. Analysis of these alignments for coverage will be most informative when the majority of the sequence data has been acquired and assembled. An examination of the redundancy within the assembly component sequences may provide an indicator for completeness of a given island, which is required to robustly project the coverage potential of each method. We have acquired several ab initio gene prediction software packages that will be useful to help identify maize gene models within islands, and may be necessary to obtain maize gene models from maize BAC sequence. We are currently examining these methods to gain familiarity and determine their limitations. A pipeline will be built to automate their use and collect their outputs.
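One simple way to express the coverage of a single gene model is the fraction of its bases overlapped by at least one aligned island. The sketch below assumes half-open coordinates on the gene's reference sequence and uses hypothetical function names; it illustrates the calculation and is not the project's pipeline.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates


def merge_intervals(intervals: Sequence[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals into a sorted, disjoint list."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_fraction(gene_start: int, gene_end: int,
                     alignments: Sequence[Interval]) -> float:
    """Fraction of the gene model covered by at least one alignment."""
    length = gene_end - gene_start
    if length <= 0:
        raise ValueError("gene_end must be greater than gene_start")
    # Clip each alignment to the gene boundaries and drop those outside it.
    clipped = [
        (max(s, gene_start), min(e, gene_end))
        for s, e in alignments
        if e > gene_start and s < gene_end
    ]
    covered = sum(e - s for s, e in merge_intervals(clipped))
    return covered / length


# Made-up example: a 1000 bp gene model and three aligned island segments.
fraction = covered_fraction(100, 1100, [(50, 300), (250, 400), (900, 1200)])
print(f"covered fraction: {fraction:.2f}")
```

Merging the intervals before summing prevents redundant, overlapping islands from being counted twice, which matters given the redundancy within assemblies noted above.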

Estimation of maize genome coverage:
We are looking at two methods to assess genome coverage. One method relies on examining the extent to which maize genome island DNA overlaps gene boundaries predicted within the maize BAC sequence. A second method relies on a mathematical prediction of coverage. We have established a collaboration with Dr. Mike Wendl at the Genome Sequencing Center at Washington University. Dr. Wendl has examined the question of mathematically predicting the clone coverage of large genomes and has developed equations for this purpose. Dr. Wendl will aid in developing a mathematical model to predict genome coverage by these methods.
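For orientation only, the classical Lander-Waterman / Clarke-Carbon approximation gives a baseline for what a mathematical coverage prediction looks like: with N reads of length L sampled uniformly at random from a target of size G, the expected covered fraction is 1 - exp(-N·L/G). This is not the model being developed with Dr. Wendl, and gene-enriched libraries are by design not uniform samples of the genome, so the sketch below is an assumption-laden starting point. All numbers in it are illustrative.

```python
import math


def expected_coverage_fraction(num_reads: int, read_length: int,
                               target_size: int) -> float:
    """Expected covered fraction under uniform random sampling: 1 - exp(-N*L/G)."""
    if target_size <= 0 or read_length <= 0 or num_reads < 0:
        raise ValueError("sizes must be positive and num_reads non-negative")
    redundancy = num_reads * read_length / target_size
    return 1.0 - math.exp(-redundancy)


def reads_needed(fraction: float, read_length: int, target_size: int) -> int:
    """Smallest read count whose expected coverage reaches the given fraction."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return math.ceil(-math.log(1.0 - fraction) * target_size / read_length)


# Illustrative values only: 500 bp reads against a hypothetical 1 Mb target.
print(expected_coverage_fraction(2000, 500, 1_000_000))
print(reads_needed(0.5, 500, 1_000_000))
```

The exponential form shows the diminishing return from additional reads: each added read is increasingly likely to land on sequence that is already covered.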

Comparison of GeneThresher® vs. high Cot:
We will compare the GeneThresher® and high Cot methods for gene content and for representation of gene families and gene classes. We will assess the growth of sequence assembly size for each of the two methods individually, and examine the content of each composite assembly island for components from each method. These analyses should aid in determining the correct mix of each sequence type to maximize gene detection, coverage and island contiguity.
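The composition of composite islands can be summarized by tallying which method contributed the component reads of each island. In this sketch the labels "MF" (methyl-filtered) and "HC" (high Cot), the function name and the input layout are hypothetical conventions chosen for the example.

```python
from collections import Counter
from typing import Dict, List


def classify_islands(islands: Dict[str, List[str]]) -> Dict[str, int]:
    """Count islands built from MF reads only, HC reads only, or a mix of both.

    `islands` maps an island identifier to the method label of each
    component read ("MF" or "HC"; labels are assumptions for this sketch).
    """
    counts: Counter = Counter()
    for reads in islands.values():
        methods = set(reads)
        if methods == {"MF"}:
            counts["MF only"] += 1
        elif methods == {"HC"}:
            counts["HC only"] += 1
        elif methods == {"MF", "HC"}:
            counts["mixed"] += 1
        else:
            # Empty islands or unexpected labels are kept visible, not dropped.
            counts["other"] += 1
    return dict(counts)


# Made-up example data.
example = {
    "island_a": ["MF", "MF"],
    "island_b": ["HC"],
    "island_c": ["MF", "HC", "MF"],
}
print(classify_islands(example))
```

A high share of mixed islands would suggest that the two methods sample overlapping regions, while many single-method islands would suggest they are complementary.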

Correlation of gene islands to the maize genetic/physical map:
Throughout this process we will strive to associate as many maize genome islands with the genetic and physical maps as possible. We will work in close collaboration with Dr. Cari Soderlund at the University of Arizona to anchor the maize assemblies to the physical map through BAC-end sequence generated by the Messing, Wing and Soderlund team. These data will potentially be useful in closing gaps and evaluating accuracy. We will work in collaboration with Cari Soderlund and with Mary Polacco at MaizeGDB to provide end-user access to this mapping information.
In the spotlight
Release 4.0 - the fourth and final assembly of methyl-filtered, high Cot and combined sequence reads is now available.



Release 3.0 - the third methyl-filtered, high Cot and combined assembly is now available.



Maize BAC annotations and gene predictions are now available.



The Maize Repeat Database can be downloaded now.




975 N. Warson Rd. · St. Louis, Missouri  63132 · 314-587-1211
Karel Schubert: Project Coordinator · maize@danforthcenter.org
2007© Donald Danforth Plant Science Center · All rights reserved