Genome Sequencing and Analysis

Sample sequencing of the soybean genome
In a collaborative NSF-funded project between the Univ. of Missouri, the Washington Univ. Genome Center and Orion Genomics, sample sequencing of the soybean genome was carried out to test methylfiltration as a means of enriching for gene-rich segments of the genome. A total of 24,224 sequences were generated from non-filtered, randomly sheared soybean DNA, and another 8,632 sequences were generated from a methylfiltered library. A comparison of these two libraries indicates that methylfiltration enriches for genes approximately 3.2-fold, which puts the estimated soybean genespace at ~342 Mbp. Gene ontology annotation of these sequences indicated that both libraries gave a similar distribution and, thus, that methylfiltration does not bias against any particular class of gene sequences. The data were also used to identify soybean repetitive sequences. A number of classes of known repeats were identified, and a further 348 novel repeats were found using the program RECON. More recently, more than 21,000 BAC-end sequences were generated and screened for repetitive sequences. These have now been merged with the repeats identified by random sequencing. All of the repeats identified are provided in a FASTA-format file suitable for use with RepeatMasker for gene annotation. The file is available for download at the following:

* Repeats found in soybean genome
* TIGR plant repbase library
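
Before using a downloaded repeat library for masking, it can help to check that the FASTA file parses cleanly and to see how many repeat sequences it holds. The following is a minimal sketch using only the Python standard library. The function names, the file name in the usage note and the example records are hypothetical and are not part of the distributed files.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-format text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle):
    """Return the record count, total length and longest sequence length."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "count": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths, default=0),
    }


# Made-up records for illustration only, not real soybean repeats.
example = io.StringIO(
    ">repeat_1 hypothetical\nACGTACGT\nACGT\n>repeat_2 hypothetical\nGGCC\n"
)
print(summarize(example))
```

To summarize a real download, open the file and pass the handle, for example `with open("soybean_repeats.fasta") as fh: print(summarize(fh))`, where the file name is a placeholder for wherever the library was saved. A custom library of this kind is typically supplied to RepeatMasker through its `-lib` option.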