Tangles and DNA
In the 1950s, it was realized that the genetic code is stored in the double-helix structure of DNA. Deoxyribonucleic acid (DNA) is a molecule formed by pairs of long molecular strands that are bonded together by ladder rungs and that spiral around each other, forming the so-called double helix. The molecular strands are made up of alternating sugars and phosphates. Each sugar is bonded to one of four bases, A = Adenine, T = Thymine, C = Cytosine, and G = Guanine (Figure 1).
The rungs of the ladder are formed by hydrogen bonding between pairs of bases, where A always binds to T and C always binds to G. Note that the sequence of bases as we move down one strand is then mirrored by the other strand, except that the As and Ts have been exchanged, and likewise the Cs and Gs. The sequence of As, Ts, Cs, and Gs as we run down one of the strands is the genetic code, giving a blueprint for life. These molecules contain on the order of millions of individual atoms, all of which are packed into the tiny nucleus of a cell.
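The pairing rule can be written down directly. The following is a minimal sketch, assuming a strand is given as a plain string of the letters A, T, C, and G; the function name and the example sequence are hypothetical and only illustrate the rule.

```python
# Base pairing as stated in the text: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the bases facing `strand`, read in the same direction down the ladder."""
    return "".join(PAIRING[base] for base in strand)


example = "ATTCGG"  # hypothetical sequence, for illustration only
print(complementary_strand(example))  # TAAGCC
```

Applying the function twice returns the original sequence, which reflects the fact that each strand determines the other.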
But the DNA has to be utilized in order to perform various biological functions, such as replication, transcription, and recombination. These are the processes of reproducing a given DNA molecule, copying segments of DNA, and modifying DNA molecules, respectively. All three of these are necessary for life. The knotting and tangling in the DNA molecules make the performance of these processes difficult. In order for these biological mechanisms to function, there must be some way of manipulating the tangled masses of DNA molecules.
Nature gets around this problem by providing enzymes called topoisomerases. These enzymes manipulate the DNA topologically. In Figure 2 you can see three of the possible actions of the enzymes. However, a particular enzyme may have a much more sophisticated action.
Knots
Let us now change the topic for a moment and look for a mathematical description of such tangled masses.
Take a piece of string and tie a knot in it. Now glue the two ends of the string together to form a knotted loop. The result is a string that has no loose ends and that is truly knotted. Unless we use scissors or just rip the string apart, there is no way that we can untangle this string.
The simplest knot is the unknot, also called the trivial knot. It is the first one in Figure 3.
The most crucial point here is that simple deformations of such a knot will not change it. We will not distinguish between the original closed knotted curve and the continuous deformations of that curve through space. All of these deformed curves will be considered to be the same knot.
There are many different pictures of the same knot. In Figure 5, we
see three different pictures of a new knot, called the figure-eight knot. We call such a picture of a knot a projection of the knot.
Much of knot theory is concerned with telling which knots are the same and which are different. What if we play with a knot for two weeks and still have not transformed it into another one? Does that prove the two knots are different?
There are three Reidemeister moves that can be performed on a knot without changing the knot itself; they change only its visible projection. Each move operates on a small region of the knot projection. Keep in mind that the knot is highly deformable and that any continuous deformation is allowed.
In 1926, the German mathematician Kurt Reidemeister (1893–1971)
proved that if we have two distinct projections of the same knot, we can get from one projection to the other by a series of Reidemeister moves and planar isotopies.
The first Reidemeister move allows us to put in or take out a twist in the knot, as in Figure 6.
The second Reidemeister move allows us to either add two crossings or remove two crossings as in Figure 7.
The third Reidemeister move allows us to slide a strand of the knot from one side of a crossing to the other side of the crossing, as in Figure 8.
Notice that although each of these moves changes the projection of the knot, it does not change the knot represented by the projection. Each such move is an ambient isotopy.
Such a sequence of Reidemeister moves and planar isotopies can show that two different projections represent a single knot.
If you want to see a proof that these Reidemeister moves preserve the knot, one appears in Burde and Zieschang (1986).
It might now seem that the problem of determining whether two projections represent the same knot would be easy to solve. We just check whether or not there is a sequence of Reidemeister moves to get us from one projection to the other. Unfortunately, there is no practical limit on the number of Reidemeister moves that it might take to get from one projection to the other. So a sequence of moves exists whenever the projections are equivalent, but for large knots finding it is not easy.
There are also links (several loops knotted together) and braids, but we will not discuss them in detail here.
Tangles
A tangle in a knot or link projection is a region in the projection plane
surrounded by a circle such that the knot or link crosses the circle exactly four times (Figure 3). We will always think of the four points where the knot or link crosses the circle as occurring in the four compass directions NW, NE, SW, and SE.
Tangles are the building blocks of knot and link projections (Figure 4).
Two tangles are equivalent if we can get from one to the other by a sequence of Reidemeister moves while the four endpoints of the strings in the tangle remain fixed and while the strings of the tangle never journey outside the circle defining the tangle.
Let’s look at some particular tangles that are easy to form.
We denote the first two of them (a and b) by infinity and 0, which is convenient for the algebra of their transformations. We denote tangle c by the number of right-handed twists we put in. In this case, the number is 3. If we had twisted the other way around, we would have denoted the resulting tangle by -3. Note that for a positive-integer twist, the overstrand always has a positive slope, if we think of it as a small segment of a line.
Return to DNA and supercoiling
But how does all of this connect to biology and the structure of DNA? In nature we meet not only linear DNA molecules but also circular ones. In the early 1960s, it was discovered that the DNA of a polyomavirus forms a ring, that is, circular DNA. It consists of two antiparallel circular single strands winding helically around one another; in other words, the ends of each strand of a linear double-stranded molecule are joined in the conventional 5' to 3' manner to form a double-stranded circle. This configuration is now known as closed-circular DNA.
One important addition is needed here: two components of the DNA, I and II, were found in these viruses. Component I is the form discussed above, closed-circular DNA made of two circular single strands. Component II was at first taken to be a linear form. However, a problem arose with the discovery that a single break in one of the two strands of component I, caused by the endonuclease DNase I, converted it directly to component II. How could the cleavage of one strand of a closed-circular duplex convert it to a linear molecule? Subsequently, electron micrographs of the component II molecules formed in this way showed them also to be circular. So, how could two circular DNA molecules, I and II, distinguished only by the breakage of one backbone phosphodiester bond, have such different properties?
The clue required to solve this conundrum came from the electron micrographs. Micrographs of the component I molecules showed many crossings of the DNA double strands (Figure 17, Left), whereas the component II molecules were mainly open rings (Figure 17, Right). Such crossings had been tentatively ascribed to protein cross-links, but Jerome Vinograd and his colleagues suggested that they could result from a ‘twisted circular form’ of the DNA, which explained the more compact nature of component I.
One of the best models of this behaviour of DNA is a length of rubber tubing. If the unconstrained tubing represents the double helix of a linear DNA, then a relative twisting of the ends, followed by the closure of the tubing into a circle with a connector, will result, when external constraints are released, in something that looks like Figure 18. Such coiling of the DNA helix upon itself is the literal meaning of the term supercoiling; that is, higher-order coiling of the DNA helix.
Thus, with the powerful but (in retrospect) simple insight that a closed circular DNA molecule could exist in a ‘twisted circular’ or supercoiled form, Vinograd and his colleagues neatly explained the properties of polyomavirus DNA and analogous results pertaining to other circular DNA molecules.
A quantitative measure of DNA supercoiling
Can we devise a quantitative measure of the supercoiling of a DNA molecule?
In mathematics, the linking number is a numerical invariant that describes the linking of two closed curves in three-dimensional space. Intuitively, the linking number represents the number of times that each curve winds around the other, just as one strand winds around the other in circular DNA. The linking number is always an integer, but may be positive or negative depending on the orientation of the two curves. There is an algorithm to compute the linking number of two curves from a link diagram. Label each crossing as positive or negative, according to the following rule:
Crossings of a curve with itself are not counted. The total number of positive crossings minus the total number of negative crossings is equal to twice the linking number. That is:
Lk = 1/2 * (positive crossings - negative crossings)
Notice that we use a particular projection of the link in order to compute the linking number. In fact, one can show that the computed linking number is always the same, no matter which projection of the link we use. In particular, Reidemeister moves do not change the linking number.
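The crossing count can be sketched in a few lines. This is only an illustration, assuming the crossings between the two curves have already been read off a diagram and labelled +1 or -1; the function name and the two sample diagrams are hypothetical.

```python
def linking_number(crossing_signs):
    """Lk = (positive - negative) / 2 over crossings between the two curves.

    `crossing_signs` holds +1 or -1 for each crossing where one curve
    passes over the other; self-crossings must already be left out.
    """
    if any(sign not in (+1, -1) for sign in crossing_signs):
        raise ValueError("each crossing sign must be +1 or -1")
    total = sum(crossing_signs)
    if total % 2 != 0:
        # Two closed curves cross each other an even number of times in a diagram.
        raise ValueError("expected an even number of crossings between the curves")
    return total // 2


two_positive = [+1, +1]  # hypothetical diagram with two positive crossings
cancelling = [+1, -1]    # one positive and one negative crossing
print(linking_number(two_positive))  # 1
print(linking_number(cancelling))    # 0
```

The second example shows why the signs matter: two curves can cross in a projection and still have linking number 0.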
We can calculate Lk for a closed-circular DNA molecule too. DNA molecules have a natural orientation. Each of the phosphates along the ladder edge is bonded to two different sugar molecules. Each of the sugars is bonded to a base molecule, being one of the C, T, G, or A molecules, and also to two phosphates, which occur at two different sites on the sugar molecule, called the 3' and the 5' sites (Figure 23). A single phosphate will be bonded to the 3' site of one sugar and to the 5' site of another sugar. Hence we can think of the phosphate as a connector, sticking a 3' site on one sugar to a 5' site on a second sugar. Thus, a linear strand of DNA will have two ends, one of which is a sugar with an open 3' site and one of which is a sugar with an open 5' site. This gives an orientation to a single strand of DNA, determined by the convention that we start at the 5' end of the strand and head toward the 3' end.
In reality, DNA replication also proceeds in the 5' to 3' direction, in that new nucleotides are added to the C3' hydroxyl group, so that the strand grows from the 3' end (Figure 24).
This means that the DNA polymerase enzyme responsible for adding new nucleotides moves along the original template strand in the 3' to 5' direction.
Interestingly enough, linear duplex DNA has no such orientation. If an end of one strand has an open 3' site, the corresponding end of the other strand will have an open 5' site. In particular, each end of the linear duplex DNA will have both an open 3' and an open 5' site. The two strands are oppositely oriented, giving us no way to orient the duplex linear molecule.
These sites have an even more important consequence. Namely, if the ends of the linear duplex DNA are brought together to form a cyclic duplex DNA molecule, each 3' site must be glued to a 5' site and vice versa. This forces each strand of the DNA to glue its head to its own tail rather than to the tail of the other strand. And if we want to put a twist into the DNA, we need to twist both strands at the same time.
This implies that the linking number of a double-stranded DNA molecule changes by 1 with each full twist of the molecule (the sign depends only on the orientation chosen for the strands at the start). Because there is no preferred orientation, we will work with the positive one (clockwise movement in Figure 27). Of course, a given length of real DNA has an inherent number of double-helical turns by virtue of its structure. This number is the length of the DNA in base pairs, N (usually thousands for natural DNAs), divided by the number of base pairs per turn of the helix, h. The value of h depends on the conditions, although a standard value, h°, is defined under standard conditions (0.2 M NaCl, pH 7, 37 °C) and is often taken to be 10.5 bp/turn. The value of N/h will not in general be an integer, so when the DNA is bent into a simple, planar circle, the strand ends will not line up precisely, although the slight twisting required to join the ends is relatively insignificant over thousands of base pairs. Hence a DNA molecule joined into a circle with the minimum of torsional stress will have a linking number that is the closest integer to N/h. Let us call this number Lk_r:
Lk_r = round(N/h) = round(N/10.5)
Thus, for example, for plasmid pBR322, with 4361 base pairs (N), in its relaxed form, Lk_r is +415 under conditions where the helical repeat (h) is 10.5 bp/turn.
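The arithmetic for this example can be checked with a short sketch. It assumes the helical repeat of 10.5 bp/turn quoted above and takes the closest integer to N/h; the function name is hypothetical.

```python
def relaxed_linking_number(n_base_pairs: int, helical_repeat: float = 10.5) -> int:
    """Closest integer to N/h, the linking number of a relaxed closed circle."""
    return round(n_base_pairs / helical_repeat)


# pBR322 has 4361 base pairs: 4361 / 10.5 = 415.33..., so Lk_r = 415.
print(relaxed_linking_number(4361))  # 415
```

The default value of `helical_repeat` is only the standard figure from the text; under other conditions a different value of h would have to be passed in.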
DNA molecules differing only in linking number are known as topological isomers or topoisomers.
If we round h down to 10 base pairs per turn and place that many base pairs between consecutive twists, the twisted double-stranded DNA looks like Figure 30. This is a simplification; sometimes the number of nucleotides between twists is much smaller.
If further twists are forced into the molecule, however, its behaviour in space changes: the DNA is uncomfortably overwound, and the axis of the ribbon becomes contorted in space. This effect is known as supercoiling. You can see the same effect by twisting a strip of rubber too much: if you keep twisting while holding the ends tight, the rubber starts to form supercoils in the centre.
So instead of staying in the plane, the ribbon decreases its twist Tw(R) by one and therefore increases its writhe Wr(R) by one (both numbers are defined in the next section). This means the axis of the ribbon tangles with itself and no longer lies flat.
Lk number components — Wr and Tw numbers
We model the closed-circular DNA molecule as a ribbon whose two boundary curves are the two strands. The curve that runs along the centre of the ribbon is called the axis of the ribbon. Although it does not model a part of the molecule, it does tell us how contorted the molecule is in space. It is possible to choose an orientation on the axis and then give the two boundaries of the ribbon orientations that match it.
First, we define the twist number of the ribbon, denoted Tw(R). It measures how much the ribbon twists around its axis. When the axis lies flat in the plane, without crossing itself, the twist of the ribbon is simply one-half of the sum of the +1s and -1s occurring at the crossings between the axis and a particular one of the two link components bounding the ribbon.
Next, we define the writhe number of the ribbon, denoted Wr(R). It measures how much the axis of the ribbon is contorted in space. For any particular projection of the axis, define the signed crossover number to be the sum of all the ±1s occurring at crossings where the axis crosses itself. The writhe is the average of the signed crossover number over all possible projections of the axis. It is therefore trickier to compute the writhe when the axis is not in a plane, as some projections will have crossings and others will not.
Finally, we can treat the two boundaries of the ribbon as components of a link and then compute the linking number of the two components, denoting the result by Lk(R). Remember the linking number is just one half of the sum of the ±1s occurring at the crossings between the two components. The three quantities are related by the formula:
Lk(R) = Tw(R) + Wr(R)
Of these three quantities, only the linking number Lk(R) is independent of the particular placement of the ribbon in space.
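The relation Lk(R) = Tw(R) + Wr(R) can be used as simple bookkeeping. The sketch below assumes a ribbon whose linking number stays at 415 (the pBR322 value used earlier) while its twist is lowered one turn at a time; the function name and the chosen twist values are hypothetical.

```python
def writhe_from(lk: int, tw: float) -> float:
    """Solve Lk = Tw + Wr for the writhe."""
    return lk - tw


lk = 415  # linking number of the ribbon, kept fixed in this illustration
for tw in (415.0, 414.0, 413.0):
    # Each turn removed from the twist reappears as one unit of writhe.
    print(f"Tw = {tw}, Wr = {writhe_from(lk, tw)}")
```

This matches the earlier observation that decreasing Tw(R) by one increases Wr(R) by one when the linking number does not change.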
Therefore, when supercoiling occurs, the absolute value of each of these three numbers increases.
Experimentally, supercoiled molecules can be separated from normal ones: the molecules are placed in a gel, and an electric current is passed through the gel to attract the molecules toward an electrode. The molecules with greater supercoiling are more compact and hence move more quickly through the gel, allowing their separation.
Now we can return to the original question, which was how to determine
the action of an enzyme on DNA. We discuss a particular type of action by an enzyme called site-specific recombination, which is a process whereby an enzyme attaches to two specific sites on two strands of DNA, called recombination sites, each of which corresponds to a particular sequence of base pairs that the enzyme recognizes.
There are two types of supercoiling, negative and positive, as we can see in Figure 35. Negative supercoils favour local unwinding of the DNA, allowing processes such as transcription, DNA replication, and recombination.