International Mammalian Genome Society

The 15th International Mouse Genome Conference (2001)


Eugene W. Myers
Celera Genomics
45 West Gude Drive
Rockville, Maryland 20850, USA

Co-Authors: Sutton G, Mural R, Li P, Yandell M, Halpern A, Smith H, Venter JC
Institution: Celera Genomics

The assembly of a shotgun data set of end-sequence reads was controversial in 1998, with critics claiming that it would require an impossibly large computation and result in a very fragmented and error-laden assembly.

In 1999, the informatics research team at Celera produced an assembly of the 130 Mbp Drosophila genome from a 13X whole genome shotgun data set, followed in 2000 by an assembly of the human genome from a 5.1X data set and synthetic reads from public data. This year we produced an assembly of the mouse genome from a 5.3X data set of three different mouse strains in equal proportions.

Our results from the mouse project prove unequivocally that whole genome shotgun sequencing is effective at delivering a highly reliable reconstruction with long-range order and orientation and dense polymorphism information. The fact that assembly is achieved at only 5.3X coverage implies that a relatively complete picture of a large vertebrate genome can be obtained in six to nine months at a very competitive cost with current technology. We demonstrate that the 5.3X assembly is a solid substrate for annotation and that it has great syntenic power when compared against the human genome.

