
The first step in gene cloning is cutting DNA into appropriately sized pieces. The size of DNA molecules is measured by subjecting the DNA to a technique called "electrophoresis". An electrophoretic apparatus gives DNA an electric shove, forcing it to move through a porous, water-filled gel. Bigger pieces of DNA find it more difficult to find their way through the gel and so move more slowly. Smaller pieces move more quickly. Gel electrophoresis thereby separates DNA according to length. Two types of gels are commonly used.
Agarose gels
The most popular variant of electrophoresis is carried out with
agarose gels. In the laboratory, suspensions of agarose in
buffer are heated to boiling (often in microwave ovens), and the hot
solution is poured on plastic or glass plates to a height of a few
millimeters or so. Agarose is a derivative of agar -- an edible
polysaccharide extracted from seaweed. As the hot solution
cools, the agarose sets into a nearly transparent gel that
looks like a slab of gelatin dessert. After the gel sets, DNA
solutions are placed in little depressions formed by leaving a comb
in the gel. The gel is then flooded with a weak salt solution, and a
voltage is applied. Since DNA is negatively charged (due to its
phosphate groups), it moves toward the positive pole.
Large DNA molecules advance slowly in the gel because they are
impeded by the gel matrix, while smaller ones encounter fewer barriers and
move more quickly. By plotting the distance migrated against the
reciprocal of size (in base pairs) of a group of DNA standards, a
fairly straight line can be obtained. From these data, the length of
unknowns can be easily calculated. For ease of comparison, the
standards are usually run in the same gel beside the
unknowns.
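The size calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming that migration distance really is a straight-line function of 1/size over the range of the standards. The function names and the standards themselves are invented for the example; they are not real measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_size(standards, unknown_distance):
    """Estimate the length (bp) of an unknown from its migration distance.

    standards: list of (size_bp, distance_mm) pairs run in the same gel.
    """
    xs = [1.0 / size for size, _ in standards]   # reciprocal of size
    ys = [distance for _, distance in standards]
    intercept, slope = fit_line(xs, ys)
    # distance = intercept + slope / size  =>  size = slope / (distance - intercept)
    return slope / (unknown_distance - intercept)


# Invented, idealized standards (hypothetical values, not real data).
STANDARDS = [(500, 50.0), (1000, 30.0), (2000, 20.0), (4000, 15.0), (10000, 12.0)]

estimate = estimate_size(STANDARDS, 18.0)
print(round(estimate), "base pairs (approximately)")
```

With real gels the line is only "fairly" straight, so an estimate is most trustworthy when the unknown falls between two standards rather than beyond them.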
Gels of differing porosities can be made by adjusting the concentration of agarose. With 2% and 3% solutions, double-stranded DNAs as small as 50 or 100 base pairs can be resolved. More dilute gels can resolve fragments as large as about 30,000 base pairs (30 kilobase pairs, or 30 kb).
The picture at the right, and the photograph of the gel below, come from Amersham Pharmacia Biotech (APB)'s web site. APB is a manufacturer and distributor of biotechnology products.
Polyacrylamide gels
When small DNA molecules need to be analyzed or when
high-resolution separations are required, electrophoresis is carried
out in polyacrylamide gels. They, like the gels made from
agarose, are water white. Unlike agarose, polyacrylamide gels are
polymers of acrylamide, a small synthetic organic compound.
Polyacrylamide gel electrophoresis (PAGE) is most often used to
separate fragments of DNA that are between 6 and a few thousand base
pairs in length. The technique is used mostly in DNA sequencing, a
topic that we'll come back to in a later session.
Detecting DNA
DNA doesn't have any color; it only absorbs light in the
ultraviolet range. How is it visualized in a clear gel after
electrophoresis? DNA can be detected in both agarose and
polyacrylamide gels after staining with various dyes, including
ethidium bromide, a dye that forms a fluorescent complex upon
binding to DNA. Usually, the ethidium
bromide is added to the gel before electrophoresis. After the run,
the gel is examined and often photographed under an ultraviolet
light. Yellow-orange zones (bands) of fluorescence appear (similar to
those in the picture at the right), indicating migration of discrete
pieces of DNA. Other agents -- like methylene blue or silver stains
-- can also be used to visualize DNA, but they are either less
sensitive or less convenient, and aren't used as often. By the way,
many dyes that bind to DNA are mutagenic; i.e., they cause mutations.
Be sure to wear gloves when working with substances such as ethidium
bromide.
Separating large fragments of DNA
Contrary to what you might expect, very large fragments of DNA
readily move in agarose gels when an electrical field is applied. But
they don't separate according to size. In fact, very big pieces of
DNA all migrate at about the same velocity regardless of whether they
are 20kb in length or 2000kb.
PFE
Since many techniques, particularly recently derived ones, depend
on working with very large DNA fragments, even pieces the size of
some chromosomes, several new electrophoretic techniques have been
developed. In one of these, called pulsed field
electrophoresis (PFE), DNA molecules are analyzed on agarose gels
but with a modified apparatus. Instead of having only two sets of
electrodes situated at opposite ends of the electrophoresis device as
in the picture shown above, the pulsed field apparatus bears four
sets of electrodes. As shown in the figure at the right, two are
designated A- and A+ and the other two are called B- and B+.
The two A's and
the two B's are set at an angle of 120° with respect to one
another. During electrophoresis, the DNA molecules are subjected to
alternating bursts of current from the two A and B pairs, thereby
pulling the DNA alternately to the right and to the left as it
advances (from south to north in this case) through the gel. The DNA
seems to reorient itself each time the current is switched, and the
time that it takes to turn itself around is dependent on its length.
The result is that very large pieces of DNA can be separated from
one another on the basis of their size. The technique also allows one to
estimate the length of an unknown when run beside standards of known
size. One of the most impressive accomplishments of this technique is
the separation of the 16 chromosomes of yeast in a single
electrophoretic run, as shown at the right.
FIE
Another procedure, field inversion electrophoresis, is also
effective at separating large molecules of DNA. In it, the electrodes
are set up as in ordinary electrophoresis, but the current is
switched so that the DNA first moves forward and then backward. Of
course, if the switching were to be done in equal time intervals --
say, 1 second forward and 1 second back -- the DNA wouldn't move at
all. To produce net forward movement, electrophoresis is conducted forward
for a longer time than backward (or at a greater voltage forward
versus backward). One might expect that moving ahead three steps and
back two would be equivalent to moving forward one step at a time,
but that's not what happens. Apparently the DNA reorients itself
during the switching cycles, just as it does in pulsed field
electrophoresis. Again, the ability to change directions seems to be
dependent on size. This allows different lengths of DNA to be
separated from one another.
One critically important advance that greatly stimulated progress in molecular biology and genetic engineering was the discovery of a set of enzymes capable of cutting DNA at defined sequences. These enzymes are found in a variety of microorganisms and are called restriction endonucleases, or more simply, restriction enzymes. The first specific restriction enzyme was discovered by Hamilton Smith in 1970, an accomplishment for which he (and two others) was awarded the Nobel Prize. Since then, more than 3,000 similar enzymes have been reported, many of which have different specificities.
Richard Roberts has set up a database, called REBASE, describing the properties of all the known restriction endonucleases. (As of today, 3080 enzymes are officially in the REBASE database, 71 possible enzymes are in the wings, and there is 1 weird enzyme that is not officially classified.)
Nomenclature
Smith and another Nobel laureate, Daniel Nathans, devised a
nomenclature for these enzymes. In brief, the name of each
restriction enzyme derives from the organism from which it is
isolated. The first letter of the genus name plus the first two
letters of the species name form the first three letters of the
restriction enzyme's name. If necessary, a letter indicating strain
designation is added, and finally a number is appended that stands
for the order in which the enzyme was discovered in each organism.
For example, BamHI is the name of an enzyme that is isolated from the
bacterium Bacillus amyloliquefaciens, strain H, and it was,
presumably, the first restriction endonuclease identified from that
source. The first three letters of the name should be italicized, but
because italicized letters are often hard to read on a computer, I
have left them in plain text.
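The naming rule is mechanical enough to write down as a small function. The sketch below is only an illustration of the rule as stated above: the function names are made up for this example, and it assumes that the discovery order is written as a Roman numeral, as it is in BamHI and BglII.

```python
def to_roman(number):
    """Convert a small positive integer to a Roman numeral (good for 1-39)."""
    numerals = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    result = ""
    for value, symbol in numerals:
        while number >= value:
            result += symbol
            number -= value
    return result


def restriction_enzyme_name(genus, species, strain="", order=1):
    """Build an enzyme name from its source organism.

    genus, species: the organism's Latin name
    strain: optional strain letter (empty string if none is used)
    order: order of discovery in that organism (1 = first)
    """
    prefix = genus[0].upper() + species[:2].lower()
    return prefix + strain + to_roman(order)


print(restriction_enzyme_name("Bacillus", "amyloliquefaciens", "H", 1))
print(restriction_enzyme_name("Escherichia", "coli", "R", 1))
```

The two calls print BamHI and EcoRI. Note that the strain letter has to be supplied by hand: nothing in the rule says which part of a strain designation such as RY13 ends up in the name.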
What they do
The restriction enzymes owe their usefulness to the fact that they
bind to DNA at specific DNA sequences, four to eight nucleotides in
size, called recognition sites. Once bound, restriction
enzymes cut the DNA at or near this site. With a little thought, it
should be clear that an enzyme that has a six base pair recognition
site will, on the average, produce larger pieces of DNA than one that
recognizes a four base site. Expressed quantitatively, the
approximate size of the fragments produced by a particular enzyme,
given that it is cutting a DNA containing an equal proportion of all
four nucleotides, can be calculated from the formula:
average size of fragment = 4^N
where N is the number of bases that the enzyme recognizes. Hence, a four-cutter (the shorthand name for an enzyme that recognizes a site containing four base pairs) is expected to cleave random DNA into fragments of about 4^4 (256) base pairs, while an enzyme with a recognition site of six bases will produce pieces (on the average) of about 4^6 (4096) base pairs.
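The arithmetic is easy to check with a couple of lines of Python. This is a minimal sketch under the same assumption as the formula (random DNA with equal proportions of the four nucleotides); the function names and the 48,000 base pair example molecule are hypothetical.

```python
def expected_fragment_size(site_length):
    """Average fragment size (bp) for an enzyme recognizing site_length bases,
    assuming random DNA with equal proportions of A, C, G and T."""
    return 4 ** site_length


def expected_site_count(dna_length, site_length):
    """Rough number of recognition sites expected in dna_length base pairs."""
    return dna_length / expected_fragment_size(site_length)


for n in (4, 6, 8):
    print(n, "base site:", expected_fragment_size(n), "bp on average")

# A hypothetical 48,000 bp molecule cut with a six-cutter.
print(expected_site_count(48000, 6), "sites expected")
```

Real genomes are not random sequences, so the actual number of sites for a given enzyme can differ considerably from this estimate.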
Properties of restriction enzymes
Ends
As illustrated, restriction enzymes invariably cut DNA in such a
way as to leave a 3' hydroxyl on one end, and a 5' phosphate on the
other.
In addition, most (but not all) enzymes recognize a symmetrical site (see the figures below).
Another interesting property of restriction enzymes is that while they often recognize a symmetrical site, they do not always cut at the axis of symmetry. For instance, the enzyme EcoRI (from Escherichia coli strain RY13; I guess they didn't want to put all those letters in the name of the enzyme), pronounced "echo are one", recognizes the site GAATTC and cuts the DNA between the G and the A in the manner depicted below.
Note that the cut produces an overhanging 5' single-stranded end of four nucleotides on each of the two pieces that are newly liberated (this is clearer if you look at the next two illustrations).
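If you would rather see this in code than in a picture, here is a small sketch. It works on the top strand only, and the function names and the short test sequence are invented for the illustration. It checks that GAATTC is symmetrical (it reads the same on both strands), cuts a sequence between the G and the A, and reports the four-nucleotide 5' overhang.

```python
def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def digest(seq, site, cut_offset):
    """Cut the top strand of seq at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    position = seq.find(site)
    while position != -1:
        fragments.append(seq[start:position + cut_offset])
        start = position + cut_offset
        position = seq.find(site, position + 1)
    fragments.append(seq[start:])
    return fragments


def overhang(site, cut_offset):
    """Single-stranded bases left between the staggered cuts of a symmetrical site."""
    return site[cut_offset:len(site) - cut_offset]


ECORI_SITE = "GAATTC"
print(reverse_complement(ECORI_SITE) == ECORI_SITE)   # symmetrical site?
print(digest("ATGCGAATTCGGTA", ECORI_SITE, 1))         # made-up test sequence
print(overhang(ECORI_SITE, 1))                         # the 5' overhang
```

Because the site is symmetrical, the cut on the bottom strand falls at the mirror-image position, which is what leaves AATT single-stranded on both new ends.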
Similarly, the enzyme BglII (from the microorganism Bacillus globigii, and universally and irreverently pronounced BAGEL TWO) recognizes the sequence AGATCT and cuts between the first A and G residues.
Notice that BglII also yields a single-stranded 5' end.
A third enzyme, BamHI (from the bacterium Bacillus amyloliquefaciens), cuts similarly.
In fact, the four-nucleotide single-stranded ends are the same for both BglII and BamHI. Moreover, at least two other six-cutters have been discovered that leave the same four-nucleotide overhang: BclI and XhoII. These overhanging ends are very useful because -- under the proper conditions -- they may base pair with each other. In fact, because of their affinity for one another, they are often called cohesive or sticky ends. Moreover, if molecules with these ends are treated with the appropriate enzyme -- DNA ligase -- their phosphodiester bonds may be rejoined (ligated). When two ends that originate from digestion by a single enzyme are ligated, the resulting molecule can be cut by the same enzyme again. But if an end that originated with a BamHI cut and an end that originated with a BglII cut are joined together, the new sequence will not be cut by either enzyme (Why?).
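One way to convince yourself of that last point is to build the joined sequences and look at them. The sketch below is only an illustration, with invented function names. It uses the BglII site given above (A^GATCT) and the standard BamHI site G^GATCC, which is not spelled out in the text above, and it looks at the top strand only, which is enough because both sites are symmetrical.

```python
# Recognition site and number of bases left upstream of the cut.
SITES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC (standard site; not given in the text above)
    "BglII": ("AGATCT", 1),   # A^GATCT
}


def overhang(enzyme):
    """Single-stranded bases left by the enzyme's staggered cut."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def ligated_junction(left_enzyme, right_enzyme):
    """Top-strand sequence made by joining the upstream piece of a
    left_enzyme cut to the downstream piece of a right_enzyme cut."""
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


def enzymes_that_cut(junction):
    """Names of the enzymes whose recognition site appears in the junction."""
    return [name for name, (site, _) in SITES.items() if site in junction]


print(overhang("BamHI"), overhang("BglII"))
for left, right in [("BamHI", "BamHI"), ("BamHI", "BglII"), ("BglII", "BamHI")]:
    junction = ligated_junction(left, right)
    print(left, "+", right, "->", junction, enzymes_that_cut(junction))
```

The overhangs match, so the ends can pair and be ligated, but the bases flanking the overhang come from two different sites, and the hybrid is a perfect match for neither one.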
Note that not all restriction endonucleases generate 5' single-strand overhangs. In fact, some don't produce an overhang at all. Several enzymes -- like SacI -- produce 3' single-stranded sticky ends. And some enzymes -- like PvuII -- cut at the axis of symmetry, leaving perfectly aligned ends. DNA molecules without overhangs are said to have blunt ends.
In addition, there are restriction enzymes that cleave DNA some distance away from the sequence that they recognize. For example, the enzyme HgaI makes staggered cuts that lie 5 and 10 nucleotides away from a 5 base pair sequence, GACGC. This leaves 5' overhanging ends, but, in contrast to the enzymes described above, these will be different almost every time the enzyme cuts (Why?).
The utility of the restriction enzymes
The discovery of these many restriction endonucleases has allowed
genetic engineers to cut pieces of DNA at specific sites and into
defined sizes. The result has been that a scientist can work with a
collection of molecules all of the same size and with ends of known
sequence. Restriction enzymes have proved to be valuable analytical
and diagnostic tools as well.
Some questions concerning the utility of the restriction enzymes to bacteria
Why do many organisms carry enzymes that cut DNA, especially in
view of the importance of DNA as the genetic material?
But how does the organism's own DNA avoid this fate? Do the bacteria lack the sequences at which these enzymes cut?