Understanding the strategy of Sanger DNA sequencing

The Sanger sequencing method creates large numbers of sequences of all possible lengths, ending with a specific nucleotide, by terminating with a tagged (fluorescent) nucleotide at the end.

But if you already have fluorescent nucleotides of a specific base, why not just do regular PCR with them, create a huge number of full-length copies of the original sequence, and then simply see where the nucleotide fluoresces to determine all the locations of that base? Why do we need a large number of copies ending at each possible location, as with the Sanger method?

If you mark the full-length strand of the DNA with fluorescent labels, you will get many signals from the same kind of nucleotide, with no way to tell where each labelled nucleotide is located on the strand.

Sanger sequencing doesn't end with the preparation of the terminated and labelled DNA strands; the following step is crucial for determining where the labelled base actually is. After labelling, the fragments are run through high-resolution capillary gel electrophoresis to sort them by size. The smallest fragment comes out first, then the one that is one base longer, and so on. The detector, which identifies the fragment coming through, is located at the end of the capillary.

This works as shown in the schematic image (from here):

If you labelled a complete strand, no discrimination by size would be possible (ideally, all the DNA would run at the same height on a gel), and the fluorescent signals for all positions would appear at the same spot. Not very helpful.
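The argument above can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions: exactly one fragment per stop position, perfect separation by length, and a made-up `DYE` colour table (the names and colours are illustrative, not taken from any instrument).

```python
# Toy model: every fragment ends in a dye-labelled terminator, and the
# detector sees the fragments in order of increasing length.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}  # hypothetical colours
COLOUR_TO_BASE = {colour: base for base, colour in DYE.items()}

def terminated_fragments(synthesized):
    """Return one fragment per position, each ending at that position."""
    return [synthesized[:i] for i in range(1, len(synthesized) + 1)]

def read_by_size(fragments):
    """Sort fragments by length and read the dye on each 3' end."""
    colours = [DYE[f[-1]] for f in sorted(fragments, key=len)]
    return "".join(COLOUR_TO_BASE[c] for c in colours)

def full_length_signal(synthesized, base):
    """Labelling one base in full-length copies yields only a total count."""
    return synthesized.count(base)

strand = "ATGCTCAG"
fragments = terminated_fragments(strand)
recovered = read_by_size(fragments)          # order of colours gives the sequence
signal_T = full_length_signal(strand, "T")   # a single number, no positions
```

Sorting by length is what turns "which dye" into "which dye at which position"; the full-length labelling only reports how many T's there are.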


I originally suggested closing this question as I think it confuses methodological details with strategy. There is an accepted answer that clarifies things with a clear diagram of the Sanger method. In the circumstances I decided to add a different sort of answer: one that emphasises the principles involved in the different approaches, rather than practical details which are not intimately associated with the logic of the different approaches.

General Strategy of DNA sequencing

Irrespective of the method, the determination of DNA (or any linear) sequence requires two pieces of conceptual information:

  1. Knowledge of a particular position in the sequence.
  2. The identity of the base at that position.

This is illustrated in the graphic. In addition, there is a practical requirement:

  3. Means to uniquely detect this.

Let us compare these in three different sequencing approaches.

A. Chemical fragmentation (Maxam & Gilbert)

Here the DNA to be sequenced, labelled at one end, is treated with different chemicals that cleave at different bases in separate reactions.

  1. The position is indicated by the length of the fragment.
  2. The base is known from the chemical used.
  3. Detection is performed by separating the different fragments by length and using their radioactivity to visualize them (by autoradiography).

B. Chain termination during synthesis (Sanger)

This method uses enzyme-catalysed copying of the DNA to be sequenced, with termination of the copying produced by the four di-deoxy analogues of the deoxynucleotide triphosphates (dNTPs) in separate reactions.

  1. The position is indicated by the length of the fragment (as in A).
  2. The base is known from the particular di-deoxy analogue used to terminate the reaction.
  3. Detection is performed by separating the products of synthesis by length and using the radioactivity or fluorescence of the dNTPs used in synthesis to visualize them.

(Note to the poster: obviously you need fragments of all different lengths if you are going to identify the base at each position in the chain.)

C. Phased synthesis ('Next Generation Sequencing')

This type of method also uses enzyme-catalysed copying of the DNA to be sequenced. However, there is no termination of the chains, as in B, but the cycles of addition are followed by various techniques.

  1. The position of a base in a sequence is indicated by the cycle of addition at which the insertion of the base is detected.
  2. The base is known from which of the different dNTPs allow the extension reaction to occur.
  3. Detection methods vary but include detecting the release of pyrophosphate by coupling to a light-emitting reaction (pyrosequencing) and detecting hydrogen ions released during the polymerization using a semiconductor chip (ion torrent).
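As a rough illustration of approach C, the following Python sketch simulates offering one dNTP at a time in a fixed cycle and recording how many bases are incorporated at each step. The function names, the `flow_order` default and the idea of returning a count per flow are assumptions made for this sketch; real instruments detect light or pH changes rather than counts.

```python
def flow_sequencing(new_strand, flow_order="TACG", max_flows=100):
    """Offer one dNTP per flow; record how many bases extend the chain.

    new_strand is the strand being synthesized (complementary to the template).
    Returns a list of (flow number, base offered, bases incorporated).
    """
    pos = 0
    signals = []
    for flow in range(max_flows):
        if pos >= len(new_strand):
            break
        base = flow_order[flow % len(flow_order)]
        incorporated = 0
        # Extension continues while the next required base matches the dNTP offered.
        while pos < len(new_strand) and new_strand[pos] == base:
            incorporated += 1
            pos += 1
        signals.append((flow + 1, base, incorporated))
    return signals

def decode(signals):
    """Rebuild the sequence: the position comes from the order of the flows."""
    return "".join(base * count for _, base, count in signals)

signals = flow_sequencing("ATGCTCAG")
decoded = decode(signals)
```

Here the position is never measured as a length; it follows from the cycle at which a signal appears, and the base follows from which dNTP was offered in that cycle.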

Coda: Principle and Practice

In stressing the general principles of DNA sequencing, the methodology mentioned has been limited to that required for detection; there was no need even to mention PCR. Of course, different methodologies for preparing multiple samples for sequencing have hugely influenced the speed and cost of DNA sequencing. However, the student should understand the principles before getting bogged down in the details, which can be found e.g. in this Wikipedia article on sequencing and on the website of ATDBIO (albeit a not disinterested company).

How Does Sanger Sequencing Work?

Let’s go back to the basics and explore the technology platform that has been regarded as the gold standard for many years. You guessed it – we’re talking about Sanger Sequencing by capillary electrophoresis. Many might ask, “why is it called Sanger Sequencing?” Sanger Sequencing is named after the inventor of this ground breaking technology, Dr. Frederick Sanger, who developed this method over 40 years ago in the mid-70s. So, what are the basics of Sanger Sequencing?

It all starts with a short primer binding next to the region of interest. In the presence of the 4 nucleotides, the polymerase will extend the primer by adding the nucleotide complementary to the template DNA strand. To find the exact composition of the DNA sequence, we need to bring this reaction to a defined stop that allows us to identify the base at the very end of this particular DNA fragment. Sanger did this by using a nucleotide that lacks the oxygen atom at the 3′ position of its deoxyribose sugar. Such a nucleotide is called a dideoxynucleotide. This is analogous to throwing a wrench into a gear: once a dideoxynucleotide is incorporated, the polymerase enzyme can no longer add normal nucleotides onto this DNA chain. The extension has stopped, and we now need to identify the terminating base. We identify the chain-terminating nucleotide by a specific fluorescent dye, 4 specific colors to be exact, one for each base. Sanger sequencing results in the formation of extension products of various lengths terminated with dideoxynucleotides at the 3′ end.

The extension products are then separated by Capillary Electrophoresis, or CE. The molecules are injected by an electrical current into a long glass capillary filled with a gel polymer. During CE, an electrical field is applied so that the negatively charged DNA fragments move toward the positive electrode. The speed at which a DNA fragment migrates through the medium decreases as its molecular weight increases, so the shortest fragments arrive first. This process can separate the extension products by size at a resolution of one base. A laser excites the dye-labeled DNA fragments as they pass through a tiny window at the end of the capillary. The excited dye emits light at a characteristic wavelength that is detected by a light sensor. Software can then interpret the detected signal and translate it into a base call. When the sequencing reaction is performed in the presence of all four terminator nucleotides, you eventually get a pool of DNA fragments that are separated and measured base by base. What you will get in the end is a data file showing the sequence of the DNA in a colorful electropherogram and a text file which you can use to answer the questions you may be asking.
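To make the "signal to base call" step concrete, here is a deliberately naive Python sketch. It assumes the trace has already been reduced to one set of four channel intensities per peak, and it uses an arbitrary 50% threshold to flag ambiguous peaks as "N". The function name, the threshold and the numbers in `trace` are all invented for illustration; real base-calling software is far more elaborate.

```python
def call_bases(peaks, ambiguity_ratio=0.5):
    """Call one base per peak from four-channel intensities.

    peaks: list of dicts mapping base -> intensity, in order of elution.
    A peak is called "N" when the runner-up channel exceeds
    ambiguity_ratio times the strongest channel (arbitrary rule).
    """
    calls = []
    for intensities in peaks:
        ranked = sorted(intensities, key=intensities.get, reverse=True)
        best, second = ranked[0], ranked[1]
        if intensities[second] > ambiguity_ratio * intensities[best]:
            calls.append("N")
        else:
            calls.append(best)
    return "".join(calls)

# Made-up intensities: three clean peaks and one mixed A/C peak.
trace = [
    {"A": 950, "C": 30, "G": 20, "T": 40},
    {"A": 25, "C": 40, "G": 35, "T": 870},
    {"A": 30, "C": 20, "G": 910, "T": 45},
    {"A": 400, "C": 380, "G": 30, "T": 20},
]
called = call_bases(trace)
```

The last peak is flagged rather than guessed, which mirrors why mixed or overlapping peaks in an electropherogram deserve a manual look.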

And that, in a nutshell is Sanger Sequencing.


* Discovery power is the ability to identify novel variants.
† Mutation resolution is the size of the mutation identified. NGS can identify large chromosomal rearrangements down to single nucleotide variants.
‡ 10 ng DNA will produce 1 kb with Sanger sequencing or 300 kb with targeted resequencing (250 bp amplicon length × 1536 amplicons with an AmpliSeq for Illumina workflow).
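The arithmetic behind the targeted resequencing figure can be checked directly. The variable names below are illustrative only. The product comes to 384 kb, so the 300 kb quoted in the footnote is presumably a rounded or conservative figure; the table it belongs to is not available here to confirm.

```python
amplicon_length_bp = 250
amplicon_count = 1536

# Total target sequence covered if every amplicon contributes its full length.
total_target_bp = amplicon_length_bp * amplicon_count
total_target_kb = total_target_bp / 1000
```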

Options for Sanger vs. Next-Generation Sequencing

Sanger sequencing is an effective approach for variant screening studies when the total number of samples is low. For variant screening studies where the sample number is high, amplicon sequencing with NGS is more efficient and cost-effective. For discovery-related applications, any NGS approach will provide higher discovery power compared to Sanger sequencing. The discovery power will increase as the total target sequence size increases. Targeted resequencing (amplicon or enrichment methods) is the most cost-effective solution when sequencing more than 20 target regions.

1. You can use any of the following programs to view your .ab1 chromatogram file

2. You should see individual, sharp and evenly spaced peaks

3. Expect to get 500-700 bases of clean reliable DNA sequence

Anything less and you might suspect contamination in your sample or consider asking your sequencing facility to apply a special protocol for a difficult template. Anything more and you’re venturing into uncertain terrain.

4. Never trust the first 20-30 bases of a DNA sequencing read

The peaks here are usually unresolved and small, so I suggest designing your primer at least 50bp upstream of the sequence of interest.

5. Use a silica spin column for purification of the samples you send for DNA sequencing

If your sequencing facility requires you to perform your own Big Dye PCR amplification reaction (as opposed to using the all inclusive service some companies offer), you can purify the product either via the Sodium Acetate/isopropanol precipitation method or using a silica spin column available from several vendors. The precipitation method has an unfortunate side effect of messing up the reaction around base 70-75 of the read (see image below), so I would strongly recommend using a silica spin column. They can be pricey, but well worth it.

6. Edit your DNA sequence

Finally, when you do see a miscalled peak, don’t be shy. Feel free to edit it. Most chromatogram viewing programs (even the free ones) allow you to edit the sequence.
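Tips 3 and 4 can be turned into a simple trimming step. The sketch below is an assumption-laden illustration, not a facility's protocol: it discards a fixed number of leading bases and caps the read at a fixed position, using the upper ends of the ranges quoted above (30 and 700) as defaults. The function names and the made-up `raw_read` are hypothetical.

```python
def trim_read(bases, skip_start=30, max_length=700):
    """Keep positions skip_start .. max_length-1 of a raw read (0-based)."""
    return bases[skip_start:max_length]

def primer_end_position(target_start, offset=50):
    """Suggest where the primer should end, `offset` bases upstream of the target."""
    return max(0, target_start - offset)

raw_read = "N" * 30 + "ACGT" * 200   # made-up 830-base read with a poor start
clean = trim_read(raw_read)
primer_end = primer_end_position(120)
```

A quality-based trim using the per-base scores in the trace file would be more faithful to the data; fixed cut-offs are just the simplest reading of the two tips.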

I hope these tips will help you get the most out of your DNA sequencing results and to troubleshoot any problems that come up. Good luck analyzing your sequences!

    Thank you BitesizeBio for originally publishing this and allowing us to share it with our readers!

DNA Sequencing

David P. Clark, Nanette J. Pazdernik, in Molecular Biology (Second Edition), 2013

4 Cycle Sequencing

Most sequencing reactions that are used for automated sequencing are a combination of PCR and regular sequencing. As in automated sequencing, a mixture of double-stranded DNA template, primer, and deoxynucleotides (dNTPs) are added. In addition, fluorescently labeled ddNTPs are mixed in a reaction tube. Instead of using Sequenase or Klenow polymerase, a heat resistant DNA polymerase, such as a modified Taq polymerase, is used. The reactions are cycled through three different temperatures to generate the labeled fragments. Just as in PCR, the first step is to denature the template DNA at a high temperature (90°C). The next step is to anneal the primer at a lower temperature (50–60°C) and then finally increase the temperature to the optimum for Taq polymerase (70°C). These three steps are repeated over and over in order to generate a large number of labeled fragments for automated sequencing. As before, each of these fragments varies by size in single base increments due to the random incorporation of the dideoxynucleotides. The final sequencing products are separated by size and recorded for the fluorescent dye as described in automated sequencing (above).

The Sanger Sequencing Method

The Sanger sequencing method relies on dideoxynucleotides (ddNTPs), analogues of the deoxynucleoside triphosphates (dNTPs) that lack a 3′ hydroxyl group and have a hydrogen atom instead. When one of these is incorporated into the growing DNA strand, it terminates replication because no further nucleotide can be attached to it. To perform Sanger sequencing, you add your primer to a solution containing the DNA to be sequenced, then divide the solution into four PCR reactions. Each reaction contains a dNTP mix together with one of the four ddNTPs (ddATP, ddTTP, ddGTP or ddCTP). At the end of the PCR, each of your four reactions will yield products of various lengths because replication is randomly terminated. By running the samples on a gel with 4 lanes, you can piece together the sequence, as each fragment has been replicated from the same original material. Here is an example in which the last base of each fragment is the ddNTP of its reaction:

Your sequence is ATGCTCAG.

Your four reactions give you:
Reaction with ddATP: A, ATGCTCA
Reaction with ddCTP: ATGC, ATGCTC
Reaction with ddGTP: ATG, ATGCTCAG
Reaction with ddTTP: AT, ATGCT

All the reactions, once run on a gel, would look something like this (Image by Olwen Reina):

Each band corresponds to a fragment of a different length. For example, the band right under the “A” represents the sequence “ATGCTCA”.
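The four-lane readout can be sketched in Python. The sketch assumes that every fragment in a lane ends with that lane's ddNTP and that the gel resolves every length perfectly; `lane_fragments` and `read_gel` are names invented for this illustration.

```python
def lane_fragments(synthesized, base):
    """Fragments produced in the reaction containing the ddNTP for `base`."""
    return [synthesized[:i + 1] for i, b in enumerate(synthesized) if b == base]

def read_gel(lanes):
    """Read bands from shortest (bottom) to longest (top) across all lanes."""
    bands = [(len(frag), base) for base, frags in lanes.items() for frag in frags]
    return "".join(base for _, base in sorted(bands))

sequence = "ATGCTCAG"
lanes = {base: lane_fragments(sequence, base) for base in "ACGT"}
gel_read = read_gel(lanes)
```

Each band's lane gives the base and its height gives the position, so walking up the gel one band at a time spells out the sequence.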

Let’s imagine a party game. The game is a guessing game. Here is how it is played:

You are thinking of a number and the group has to guess it. The tricky part is that the number is 200 digits long. You are reading the digits of the number in your head without making a sound. Every so often a person interrupts you, and you tell them the single digit you were just thinking of and where it is in the sequence of 200. Each time you are interrupted, you have to start again. You leave after a few hours and the group has to figure out the 200-digit number. They have to piece together the information you gave them, for example that the 25th digit was 5, the 40th digit was 0, and so on. Using the information from their interruptions, they can reconstruct the number you were thinking of.

While this sounds like the lamest game in the world, it works very well for sequencing!

Unfortunately, it is slow and expensive, and it previously relied on radioactive materials. This pushed scientists to develop new and better forms of genome sequencing.

2. Size Separation by Gel Electrophoresis

In the second step, the chain-terminated oligonucleotides are separated by size via gel electrophoresis. In gel electrophoresis, DNA samples are loaded into one end of a gel matrix, and an electric current is applied. DNA is negatively charged, so the oligonucleotides will be pulled toward the positive electrode on the opposite side of the gel. Because all DNA fragments have the same charge per unit of mass, the speed at which the oligonucleotides move will be determined only by size. The smaller a fragment is, the less friction it will experience as it moves through the gel, and the faster it will move. As a result, the oligonucleotides will be arranged from smallest to largest, reading the gel from bottom to top.

In manual Sanger sequencing, the oligonucleotides from each of the four PCR reactions are run in four separate lanes of a gel. This allows the user to know which oligonucleotides correspond to each ddNTP.

In automated Sanger sequencing, all oligonucleotides are run in a single capillary gel electrophoresis within the sequencing machine.

DNA Sequencing Techniques

DNA sequencing techniques are used to determine the order of nucleotides (A,T,C,G) in a DNA molecule.

Learning Objectives

Differentiate among the techniques used to sequence DNA

Key Takeaways

Key Points

  • Genome sequencing will greatly advance our understanding of genetic biology and has vast potential for medical diagnosis and treatment.
  • DNA sequencing technologies have gone through at least three “generations”: Sanger sequencing and Gilbert sequencing were first-generation, pyrosequencing was second-generation, and Illumina sequencing is next-generation.
  • Sanger sequencing is based on the use of chain terminators, ddNTPs, that are added to growing DNA strands and terminate synthesis at different points.
  • Illumina sequencing involves running up to 500,000,000 different sequencing reactions simultaneously on a single small slide. It makes use of a modified replication reaction and uses fluorescently-tagged nucleotides.
  • Shotgun sequencing is a technique for determining the sequence of entire chromosomes and entire genomes based on producing random fragments of DNA that are then assembled by computers which order fragments by finding overlapping ends.

Key Terms

  • DNA sequencing: a technique used in molecular biology that determines the sequence of nucleotides (A, C, G, and T) in a particular region of DNA
  • dideoxynucleotide: any nucleotide formed from a deoxynucleotide by loss of a second hydroxyl group from the deoxyribose group
  • in vitro: any biochemical process done outside of its natural biological environment, such as in a test tube, petri dish, etc. (from the Latin for “in glass”)

DNA Sequencing Techniques

While techniques to sequence proteins have been around since the 1950s, techniques to sequence DNA were not developed until the mid-1970s, when two distinct sequencing methods were developed almost simultaneously, one by Walter Gilbert’s group at Harvard University, the other by Frederick Sanger’s group at Cambridge University. However, until the 1990s, the sequencing of DNA was a relatively expensive and long process. Using radiolabeled nucleotides also compounded the problem through safety concerns. With currently available technology and automated machines, the process is cheaper, safer, and can be completed in a matter of hours. The Sanger sequencing method was used for the human genome sequencing project, which finished its sequencing phase in 2003, but today both it and the Gilbert method have been largely replaced by better methods.

Sanger Method: In Frederick Sanger’s dideoxy chain termination method, fluorescent-labeled dideoxynucleotides are used to generate DNA fragments that terminate at each nucleotide along the template strand. The DNA is separated by capillary electrophoresis on the basis of size. From the order of fragments formed, the DNA sequence can be read. The smallest fragments were terminated earliest, and they come out of the column first, so the order in which different fluorescent tags exit the column is also the sequence of the strand. The DNA sequence readout is shown on an electropherogram that is generated by a laser scanner.

Sanger Sequencing

The Sanger method is also known as the dideoxy chain termination method. This sequencing method is based on the use of chain terminators, the dideoxynucleotides (ddNTPs). The ddNTPs differ from deoxynucleotides by the lack of a free 3′ OH group on the five-carbon sugar. If a ddNTP is added to a growing DNA strand, the chain is not extended any further because the free 3′ OH group needed to add another nucleotide is not available. By using a predetermined ratio of deoxyribonucleotides to dideoxynucleotides, it is possible to generate DNA fragments of different sizes when replicating DNA in vitro.

A Sanger sequencing reaction is just a modified in vitro DNA replication reaction. As such, the following components are needed: template DNA (the DNA whose sequence will be determined), DNA polymerase to catalyze the replication reactions, a primer that base-pairs just before the portion of the DNA you want to sequence, dNTPs, and ddNTPs. The ddNTPs are what distinguish a Sanger sequencing reaction from an ordinary replication reaction. Most of the time in a Sanger sequencing reaction, DNA polymerase will add a proper dNTP to the growing strand it is synthesizing in vitro. But at random locations, it will instead add a ddNTP. When it does, that strand will be terminated at the ddNTP just added. If enough template DNAs are included in the reaction mix, each one will have the ddNTP inserted at a different random location, and there will be at least one DNA terminated at each different nucleotide along its length for as long as the in vitro reaction can take place (about 900 nucleotides under optimal conditions).
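The effect of the dNTP-to-ddNTP ratio can be illustrated with a small probability sketch. It assumes, as a simplification, that at every position the polymerase picks a ddNTP with the same fixed probability `p`, independent of the base and of what happened before; `p = 0.01` is a hypothetical value, not a recommended ratio. Under that assumption the stop position follows a geometric distribution with a mean fragment length of 1/p.

```python
def termination_probability(n, p):
    """Chance that a chain stops exactly at position n (1-based)."""
    return p * (1 - p) ** (n - 1)

def fraction_longer_than(n, p):
    """Fraction of chains that pass position n without terminating."""
    return (1 - p) ** n

p = 0.01  # hypothetical per-position chance of incorporating a ddNTP
mean_length = 1 / p

# Share of chains still growing after 100 and after 900 positions.
still_growing_100 = fraction_longer_than(100, p)
still_growing_900 = fraction_longer_than(900, p)
```

Raising `p` (more ddNTP) shifts the products toward short fragments and starves the long end of the read; lowering it does the opposite, which is why the ratio has to be chosen in advance.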

The ddNTPs which terminate the strands have fluorescent labels covalently attached to them. Each of the four ddNTPs carries a different label, so each different ddNTP will fluoresce a different color.

After the reaction is over, the products are subjected to capillary electrophoresis. All the newly synthesized fragments, each terminated at a different nucleotide and so each a different length, are separated by size. As each differently sized fragment exits the capillary column, a laser excites the fluorescent tag on its terminal nucleotide. From the color of the resulting fluorescence, a computer can keep track of which nucleotide was present as the terminating nucleotide. The computer also keeps track of the order in which the terminating nucleotides appeared, which is the sequence of the DNA used in the original reaction.

Second Generation and Next-generation Sequencing

The Sanger and Gilbert methods of sequencing DNA are often called “first-generation” sequencing because they were the first to be developed. In the late 1990s, new methods, called second-generation sequencing methods, that were faster and cheaper, began to be developed. The most popular, widely-used second-generation sequencing method was one called Pyrosequencing.

Today a number of newer sequencing methods are available and others are in the process of being developed. These are often called next-generation sequencing methods. The most widely-used sequencing method currently is one called Illumina sequencing (after the name of the company which commercialized the technique), but numerous competing methods are in the developmental pipeline and may supplant Illumina sequencing.

In Illumina sequencing, up to 500,000,000 separate sequencing reactions are run simultaneously on a single slide (the size of a microscope slide) put into a single machine. Each reaction is analyzed separately and the sequences generated from all 500 million DNAs are stored in an attached computer. Each sequencing reaction is a modified replication reaction involving fluorescently tagged nucleotides, but no chain-terminating dideoxynucleotides are needed.

When the human genome was first sequenced using Sanger sequencing, it took several years, hundreds of labs working together, and a cost of around $100 million to sequence it to near completion. Next-generation sequencing can sequence a comparably sized genome in a matter of days, using a single machine, at a cost of under $10,000. Many researchers have set a goal of improving sequencing methods even more until a single human genome can be sequenced for under $1000.

Shotgun Sequencing

Sanger sequencing can only produce several hundred nucleotides of sequence per reaction. Most next-generation sequencing techniques generate even smaller blocks of sequence. Genomes are made up of chromosomes which are tens to hundreds of millions of base pairs long. They can only be sequenced in tiny fragments, and the tiny fragments have to be put in the correct order to generate the uninterrupted genome sequence. Most genomic sequencing projects today make use of an approach called whole genome shotgun sequencing.

Whole genome shotgun sequencing involves isolating many copies of the chromosomal DNA of interest. The chromosomes are all fragmented at random locations into pieces small enough to be sequenced (a few hundred base pairs). As a result, each copy of the same chromosome is fragmented at different locations, and the fragments from the same part of the chromosome will overlap each other. Each fragment is sequenced, and sophisticated computer algorithms compare all the different fragments to find which overlaps with which. By lining up the overlapped regions, a process called tiling, the computer can find the largest possible continuous sequences that can be generated from the fragments. Ultimately, the sequences of entire chromosomes are assembled.
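The overlap-and-merge idea can be sketched with a small greedy routine in Python. This is a teaching toy, not how production assemblers work: it assumes error-free reads from one strand only, no repeated regions, and no read fully contained in another. The function names and the `min_len` cut-off are choices made for this sketch.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # nothing overlaps any more; remaining contigs stay separate
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)] + [merged]
    return frags

reads = ["GTACCTGA", "ATGCTCAG", "TCAGGTAC"]  # made-up overlapping reads
contigs = greedy_assemble(reads)
```

With these three reads the routine joins them into a single contig; reads that share no overlap of at least `min_len` bases are left as separate contigs, which is the toy version of a gap in an assembly.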

Whole genome shotgun sequencing: In shotgun sequencing, multiple copies of the same chromosome are isolated and then fragmented in random locations. The different copies of the chromosome end up generating different length fragments. When the complete collection of fragments has been sequenced, comparing the sequences of all the fragments will reveal which fragments have ends that overlap with other fragments. The complete sequence from one end of the original DNA to the other can be assembled by following the sequence from the first overlapping fragment to the last.

Genome sequencing will greatly advance our understanding of genetic biology. It has vast potential for medical diagnosis and treatment.

Molecular Cloning and Recombinant DNA Technology

DNA Sequencing

DNA sequencing is used to determine the exact sequence of nucleotides (A, G, C, T) in a strand of DNA. The Sanger dideoxy chain-termination method can determine the sequence of nucleotides with high fidelity for a stretch of approximately 200–500 base pairs in any purified DNA sample. Based on in vitro DNA synthesis, the Sanger method synthesizes short pieces of DNA in the presence of nucleotides to which other nucleotides cannot be added: chain-terminating dideoxyribonucleoside triphosphates. These chain-terminating nucleotides are labeled and mixed with regular nucleotides so that fragments of DNA will be created at many different lengths, each randomly stopped by the addition of a chain-terminating nucleotide. The four chain-terminating nucleotides are each labeled with a different colored fluorescent dye. Thousands of fragments of different lengths are run on a gel, and an automated fluorescence detector can quickly scan the gel to read the identity of the last, terminating nucleotide (Figure 10.6). Modern molecular biology laboratories tend not to run sequencing reactions themselves, as it is typically faster and less expensive to send DNA samples off to a dedicated facility for sequencing.

Figure 10.6. DNA sequencing.

Each chain-terminating nucleotide is labeled with a different fluorescent dye, and many fragments of DNA are synthesized. When all of the synthesized DNA fragments are separated using gel electrophoresis, they will be separated by size and a detector can automatically detect which label is at each position by measuring the intensity of each fluorescent signal.

The Sanger method is the gold standard for DNA sequencing, and it is still commonly used for the day-to-day verification of molecular biology experiments. However, DNA sequencing methods have evolved rapidly over the past decade in what are referred to as next-generation sequencing (NGS) techniques, which make sequencing large stretches of DNA cheaper and faster. NGS techniques have enabled large-scale sequencing of entire genomes and allowed for the rapid development of the field of genomics.
