Concept


Mendel described a gene as a discrete unit of heredity that influences a visible trait. Beadle and Tatum defined a gene as the discrete directions for making a single protein, which influences a metabolic trait. Early sequencing efforts showed that proteins are, in turn, long chains of amino acids arranged in a specific order. The triplet genetic code further refined the definition of a gene as a discrete sequence of DNA encoding a protein — beginning with a "start" codon and ending with a "stop" codon. Gene analysis took a giant step forward with the discovery of methods to determine the exact sequence of nucleotides that compose a specific gene. DNA sequencing was built upon earlier knowledge of DNA polymerases and cell-free systems for replicating DNA. The chain-termination method, which makes clever use of a "defective" DNA nucleotide, now dominates DNA sequencing technology.
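
As a rough illustration of the "start codon to stop codon" definition above, the following minimal Python sketch scans a DNA string for stretches that begin with ATG and run, three letters at a time, to the first in-frame stop codon. The function name find_orfs, the min_codons parameter, and the example sequence are made up for illustration. The sketch only looks at one strand and ignores introns and other complications of real genes.

```python
# The usual start codon and the three stop codons of the standard genetic code,
# written as DNA (coding strand, 5' to 3').
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each start-to-stop stretch found.

    Each of the three reading frames is walked codon by codon. A stretch
    opens at the first ATG seen and closes at the next in-frame stop codon.
    Positions are 0-based; 'end' is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i  # open a candidate gene at this start codon
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # close it and look for the next start codon
    return orfs


# Hypothetical example sequence: one short stretch, ATG GCT TGA.
print(find_orfs("CCATGGCTTGATAA"))
```

Real gene-finding programs do considerably more than this, but the sketch shows why knowing the exact nucleotide sequence is what makes such an analysis possible.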

Animation


I'm Fred Sanger. In the 50's I was the first to determine the sequence of amino acids in a protein. I showed that the 51 amino acids of the insulin protein are arranged in a specific order. Since the genetic code determines the order of amino acids, the DNA sequence is colinear with the amino acid sequence. However, knowing the amino acid sequence of a protein does not tell us the exact nucleotide sequence of its gene. Remember that the genetic code is redundant: more than one codon can code for an amino acid. For example, here are the redundant codons for these amino acids.
In the mid-70's, I developed a method to determine the exact sequence of nucleotides in a gene. At about the same time, Allan Maxam and Walter Gilbert developed a different sequencing method. As it turns out, my "chain termination" method is the one most commonly used in labs today. My method was based on Arthur Kornberg's earlier work on DNA replication.
Remember that a new DNA strand is synthesized using an existing strand as a template. Here, I am using poly-thymine DNA as an example. The 5' carbon of an "incoming" deoxynucleotide (dNTP) is joined to the 3' carbon at the end of the chain. Hydroxyl groups in each position form ester linkages with a central phosphate. In this way, the nucleotide chain elongates. The key to my sequencing method is the peculiar chemistry of dideoxynucleotides (didNTP). Like a deoxynucleotide, a didNTP is incorporated into a chain by forming a phosphodiester linkage at its 5' end. However, the didNTP lacks the 3' hydroxyl group (OH) necessary to form the linkage with an incoming nucleotide. So, the addition of a didNTP halts elongation.
I set up four separate reactions, one to provide sequence information about each of the nucleotides. Each reaction contained template DNA, a short primer (about 20 nucleotides), DNA polymerase, and the four dNTPs (one radioactively labeled). One type of didNTP (A, T, C, or G) was added to each. Then I carried out a sequence of steps.
Let's focus on the adenine (A) terminator reaction. First, the DNA is denatured into single strands at near-boiling temperature. At high temperatures, the kinetic motion of the DNA molecule disrupts the weak hydrogen bonds that join complementary DNA strands together. When the temperature is lowered, the primer binds to its complementary sequence in the template DNA. Kornberg's earlier work showed that a primer is needed for DNA polymerase to start replication.
The DNA polymerase makes no distinction between dNTPs and didNTPs. Each time a didNTP is incorporated, in this case didATP, synthesis is "terminated" and a DNA strand of a discrete size is generated. In this sequencing example, the didATP (purple) has terminated the reaction. The dATP happens to be the radioactive tracer, but this has no effect on elongation. Because billions of DNA molecules are present, the elongation reaction can be terminated at any adenine position. This results in collections of DNA strands of different lengths. The same is true for the other three terminator reactions.
Each reaction is then loaded into a separate lane of a polyacrylamide gel containing urea, which prevents the DNA strands from renaturing during electrophoresis. Ionized phosphates give the DNA molecule a negative charge, so DNA migrates toward the positive pole of an electric field. The movement of DNA molecules through the polyacrylamide matrix is size-dependent. A blue dye, which also migrates to the positive pole, is added to the samples to track the progress of DNA in the sequencing gel. The dye runs just a little faster than the smallest fragments of DNA. Over the course of electrophoresis, shorter DNA molecules will move farther down the gel than longer ones. Millions of terminated molecules of the same size will migrate to the same place and "band" in the gel. After electrophoresis, the gel is sandwiched against X-ray film.
The radioactive adenine in the synthesized DNA emits beta particles that expose the film, making a record of the positions of DNA bands in the gel. The sequencing gel is then "read" from bottom to top. The sequence of bands in the various terminator lanes gives the sequence of nucleotides in the newly synthesized strand, which is complementary to the template DNA. SEQUENCE: T G T A G A A G A A A C C A C G T T and on... A typical sequencing reaction will yield 200-500 bases of readable sequence.
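
The logic of the four terminator reactions and of reading the gel can be sketched in a few lines of Python. This is an idealized model, not a simulation of the chemistry: it assumes every possible termination position produces a band, it ignores the constant length added by the primer, and the template shown is simply the complement of the sequence read in the animation. The function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesized_strand(template_3to5):
    """Return the new strand (5' to 3') copied from a template written 3' to 5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def terminator_lanes(new_strand):
    """Return, for each didNTP reaction, the fragment lengths that form bands.

    A fragment of length n appears in the lane of whichever base sits at
    position n of the new strand, because that is where the matching didNTP
    can be incorporated and stop elongation. Lengths count only bases added
    after the primer.
    """
    lanes = {base: [] for base in "ATCG"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the bands from the bottom (shortest) to the top (longest)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Assumed template (3' to 5'): the complement of the read shown in the animation.
template = "ACATCTTCTTTGGTGCAA"
lanes = terminator_lanes(synthesized_strand(template))
print(lanes["A"])       # band positions in the A terminator lane
print(read_gel(lanes))  # sequence read from bottom to top
```

Because each fragment length shows up in exactly one of the four lanes, sorting the bands by size and noting the lane of each one recovers the sequence one base at a time.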

Gallery


Fred Sanger (middle) at age 11 with his older brother and younger sister.
Fred Sanger and his wife, 1940.
Fred Sanger at a 1949 Cold Spring Harbor Symposium meeting.
Fred Sanger, late 1940's.
Fred Sanger in his lab, late 1950's. He is looking at sequencing results.

Audio/Video


Audio Glossary

DNA sequencing, Genetic code (ATGC), Nucleotide

Video Interviews

Richard McCombie

Richard McCombie is a research scientist and the Director of the Lita Annenberg Hazen Genome Center at Cold Spring Harbor Laboratory.

Clip 1 (00:28)
Comments on how sequencing was done as developed by Fred Sanger and how it is done now.

Clip 2 (01:46)
What do DNA sequences actually tell us?

Clip 3 (00:23)
Some genome size comparisons.

Clip 4 (02:23)
Various steps involved in sequencing a genome.

Biography


 

Frederick Sanger received two Nobel Prizes, both in Chemistry, for his work on protein sequencing and DNA sequencing.

FREDERICK SANGER (1918-)

Frederick Sanger was born in Rendcomb, England. His father was a medical doctor, and it was expected that Fred would also enter the medical field. As Sanger grew up, he became very interested in nature and science, and when he went to Cambridge University, he decided not to study medicine. He felt that a career in science would give him a better chance to become a problem solver.

Sanger was a conscientious objector during World War II because of his Quaker upbringing. After his B.A. in 1939, he stayed at Cambridge to do a Ph.D. with Albert Neuberger on amino acid metabolism. After his Ph.D. in 1943, Sanger started working for A. C. Chibnall on identifying the free amino groups in insulin. In the course of identifying the amino groups, Sanger figured out ways to order the amino acids. He was the first person to obtain a protein sequence. By doing so, Sanger proved that proteins were ordered molecules and suggested that, by analogy, the genes and DNA that make these proteins should have an order or sequence as well. Sanger won his first Nobel Prize in Chemistry in 1958 for his work on the structure of proteins.

By 1951, Sanger was on the staff of the Medical Research Council at Cambridge University. In 1962, he moved with the Medical Research Council to the Laboratory of Molecular Biology in Cambridge, where Francis Crick, John Kendrew, Aaron Klug and others were all working on a DNA-related problem. Solving the problem of DNA sequencing became a natural extension of his work in protein sequencing. Sanger initially investigated ways to sequence RNA because it was smaller. Eventually, this led to techniques that were applicable to DNA and finally to the dideoxy method most commonly used in sequencing reactions today. Sanger won a second Nobel Prize in Chemistry in 1980, sharing it with Walter Gilbert, for their contributions concerning the determination of base sequences in nucleic acids, and with Paul Berg, for his work on recombinant DNA.

Sanger retired in 1983 and spends most of his time working in his garden. He gives his wife, Margaret Joan, a lot of credit for being a supportive helpmate in the non-science part of his life. In 1992, the Wellcome Trust and the Medical Research Council established the Sanger Centre, a research center for furthering the knowledge of genomes. The Sanger Centre is one of the main sequencing centers of the Human Genome Sequencing Project, and sequencing projects for other organisms are also underway there. By all accounts, Sanger is a true "gentle" man, extremely courteous and charming.

Links

The Sanger Centre

Named for Fred Sanger, the Sanger Centre is one of the centers of the Human Genome Sequencing Project. Its homepage has news and facts about the progress of various sequencing projects.

National Center for Biotechnology Information

An extensive site that has a lot of web resources used by research scientists to analyze sequence data. A good place to start is their Human Genome Resources, which lists news events, techniques and projects relating to the Human Genome Project.

Genome Sequencing Center

Located in St. Louis, the Genome Sequencing Center is one of the major sequencing centers in the United States. They have a nice overview of the overall process involved in sequencing a genome.

Bibliography

  • Hargittai, I., 1999, Interview with Frederick Sanger, The Chemical Intelligencer, Springer-Verlag, New York.

  • Sanger, F., and Coulson, A.R., 1975, A Rapid Method for Determining Sequences in DNA by Primed Synthesis with DNA Polymerase, Journal of Molecular Biology, 94: 441-448.

  • Sanger, F., Nicklen, S., and Coulson, A.R., 1977, DNA Sequencing with Chain-terminating Inhibitors, Proc. Natl. Acad. Sci. U.S.A., 74: 5463-5467.

Glossary


DNA sequencing -
Genetic code (ATGC) -
Nucleotide - One of the structural components, or building blocks, of DNA and RNA. A nucleotide consists of a base (in DNA, one of four chemicals: adenine, thymine, guanine, and cytosine; in RNA, uracil takes the place of thymine) plus a molecule of sugar and one of phosphoric acid.
