Church DM
National Center for Biotechnology Information

Co-Authors: Agarwala R, Chen H-C, Chetvernin S, Ermolaeva O, Geer R, Hlavina W, Jang W, Johnson P, Kans J, Katz K, Kitts P, Lipman D, Meric P, Ostell J, Pruitt K, Resenchuk S, Sapojnikov V, Schuler G, Sherry S, Shkeda A, Wagner L, Tatusova T, Maglott D
Institutions: National Center for Biotechnology Information

Genomic resources for the mouse genome have increased greatly. A Whole Genome Shotgun (WGS) assembly generated by the Mouse Genome Sequencing Consortium (MGSC) was released and published. During the past year, sequencing resources for mouse have shifted toward clone-based (HTGS) sequence. As of July 25, 2003, 1.03 Gb of non-redundant finished sequence and 2.28 Gb of redundant draft sequence were available. More than 95% of the HTGS sequence generated is from the reference strain (C57BL/6J). To leverage all available sequence data, NCBI has been performing composite assemblies that integrate HTGS sequence from C57BL/6J into the MGSCv3 assembly. NCBI Build 30 (based on data from Jan. 27, 2003) integrated 0.736 Gb of HTGS phase 3 (finished) sequence. It is anticipated that future builds will utilize all C57BL/6J HTGS sequence. In addition to producing the reference assembly, NCBI has been producing alternate, strain-specific assemblies. In Mouse Build 30, six alternate assemblies were produced.

NCBI also provides annotation for all assemblies via a suite of software tools available from our website. Current annotation provides gene model predictions (based on alignment and ab initio prediction), clone placement (BACs and fosmids, based on end-sequence alignment), variation, STSs, human and rat transcripts, and phenotypes (via STS connections). Annotation information will be provided for the current mouse assemblies. Improvements in the NCBI MapViewer and associated resources with respect to clone identification and comparative mapping will also be discussed.