Sequencing Overview


Sequencing means determining the exact genetic code of an organism: the order of the nucleotides that make up its genome.

There are several methods for small-scale sequencing. However, most of these do not scale well to sequencing entire genomes. The Human Genome Project used a slow, steady approach, focusing on completeness and accuracy. Meanwhile, private companies such as Palo Alto's Incyte Genomics focused their sequencing efforts only on those portions of the genome that held promise for medical treatments. Celera Genomics ambitiously aimed to sequence the entire genome using a highly efficient "whole genome shotgunning" method, which critics derided as inaccurate and incomplete.

Initially, scientists at the Human Genome Project scoffed at using automated gene sequencing machines, but they soon joined the private sector in recognizing that such machines were necessary for large-scale sequencing. The machine pictured (image from avery.rutgers.edu/WSSP/StudentScholars/Session14/Session14-1.html) is manufactured by Amersham-Pharmacia Biotech and is similar to machines from other companies. The gel looks the same as a gel prepared manually. The difference is that the primer carries a fluorescent molecule: after the products of DNA synthesis run down the gel, a laser excites the fluorescent label on each molecule, and a detector measures the amount of light emitted in each lane. These measurements go to a computer, which reports a DNA sequence.
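The last step, turning per-lane light readings into a sequence, can be illustrated with a minimal sketch. It assumes four lanes (one per base) and one intensity reading per lane at each band position; the function name call_bases, the min_signal threshold, and the readings are all hypothetical, and this is not the software shipped with any actual sequencer.

```python
# Toy base caller: at each band position, report the base whose lane is brightest.
# Assumption: four lanes ordered A, C, G, T; all intensity values are made up.
LANES = ("A", "C", "G", "T")


def call_bases(intensities, min_signal=0.5):
    """Return one letter per position; 'N' where no lane reaches min_signal."""
    sequence = []
    for reading in intensities:
        # Index of the lane with the strongest signal at this position.
        best = max(range(len(LANES)), key=lambda i: reading[i])
        sequence.append(LANES[best] if reading[best] >= min_signal else "N")
    return "".join(sequence)


readings = [
    (0.9, 0.1, 0.0, 0.1),  # A lane brightest
    (0.1, 0.0, 0.8, 0.1),  # G lane brightest
    (0.0, 0.1, 0.1, 0.9),  # T lane brightest
    (0.2, 0.3, 0.1, 0.2),  # no lane is bright enough: ambiguous
    (0.1, 0.7, 0.1, 0.0),  # C lane brightest
]
print(call_bases(readings))  # AGTNC
```

Real instruments must also correct for band spacing, noise, and overlapping signals, which this sketch ignores.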


The Human Genome Project's Approach

Human Genome Project scientists divided the genome into 22,000 segments, each 150,000 letters long, whose positions were mapped. The segments were then cloned many times and the clones decoded by automatic gene sequencers. The process was repeated several times to ensure completeness and accuracy.
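As a quick check on these figures, the short calculation below multiplies the segment count by the segment length and compares the result with the 3.12 billion letters cited later in this overview. The variable names are illustrative, and the reading that the small excess reflects overlap between neighboring segments is an assumption, not something the sources state.

```python
# Figures taken from the text above.
segments = 22_000
letters_per_segment = 150_000

total_letters = segments * letters_per_segment
print(f"{total_letters:,}")  # 3,300,000,000

# Genome size cited in the Celera section of this overview.
genome_letters = 3_120_000_000
ratio = round(total_letters / genome_letters, 2)
print(ratio)  # 1.06
```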


Celera's Approach

Craig Venter adopted whole genome shotgunning. In this method, DNA is cloned several times and then shredded into 60 million bits, each 2,000 to 10,000 letters long. Each fragment is decoded by machines, which then send the results to Celera's massive computers. Now comes the difficult, controversial step: the computers attempt to reassemble the minuscule fragments back into the 23 pairs of human chromosomes. Whole genome shotgunning is far faster than more traditional approaches, but critics charge that reassembly produces incomplete, inaccurate results. (Venter admits some flaws but says they can be rectified and that HGP's method also yields holes.)

Until 1995, shotgunning had been used only to sequence small parts of a genome. Then, along with Nobel winner Hamilton Smith, who had the idea, Venter tried the method on an entire genome. Together, they determined the entire genome of the bacterium that causes ear infections and meningitis.

Celera began sequencing in September 1999. The effort consumed more than 20,000 CPU hours, involved 500 million trillion base-to-base comparisons, and managed more than 80 terabytes of data. The algorithms and data for sequencing and assembling the 3.12 billion letters of the genetic code alone required 64 gigabytes of shared memory.
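The reassembly step described above can be sketched with a toy greedy overlap assembler: repeatedly find the two fragments whose end and start overlap the most, and merge them. This is not Celera's assembler. The function names, the min_len threshold, and the example fragments are hypothetical, and the sketch assumes error-free fragments from a single strand with no repeated regions, which is exactly where real assembly becomes hard.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise, best overlap first; return the remaining pieces."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that lie entirely inside another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no usable overlaps left: the result stays in several pieces
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Made-up fragments that overlap one another by five letters.
fragments = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(fragments))  # ['ATGGCGTACGTTAGC']
```

The pairwise search compares every fragment with every other one, so the work grows with the square of the fragment count. With tens of millions of fragments, that growth suggests why the comparison counts and computing resources quoted above are so large.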



Sources:
avery.rutgers.edu/WSSP/StudentScholars/Session14/Session14-1.html
Thompson, Dick. January 24, 2000 issue of Time Magazine (vol. 155, no. 3). Viewed online 9-16-00.