Why do Alu repeats form secondary structures?

I've been doing a lot of research on Alu repeats and how they mediate gene expression. I read the article "Useful junk: Alu RNAs in the human transcriptome".

It says that Alu repeats embedded in the 5'UTR form a stable secondary structure that prevents the assembly of ribosomal subunits, thus blocking translation. I've been reading various articles to find out why Alu repeats form this secondary structure, and why only in the 5'UTR and not in the 3'UTR, but I haven't been able to find an answer.

Single-stranded RNA can easily form secondary structures; tRNAs are a very important example of this.

From a quick look at the article you read, it seems that these Alu repeats work similarly to riboswitches in the 5'-UTR. They can form secondary structures that block translation initiation; riboswitches commonly do this by hiding the Shine-Dalgarno sequence in a stable helix.

This kind of regulation of translation can obviously only work in the 5'-UTR as that is the place where the translation starts. The 3'-UTR can't affect translation in this way as the translation starts at the other end of the mRNA.

The Alu repeats still can form secondary structures in the 3'-UTR, and the article suggests that those might have an effect on mRNA stability.
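To make the idea of a stable helix concrete, here is a minimal sketch that scans an RNA string for perfect stem-loops (hairpins). It is only an illustration: the function names, the stem length and the loop-size limits are assumptions chosen for the example, only Watson-Crick pairs are considered (no G-U wobble), and real structure prediction relies on thermodynamic models rather than exact matching.

```python
# Toy hairpin finder: a stem forms when a stretch of RNA is followed,
# after a short loop, by its own reverse complement.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/U/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna: str, stem: int = 4, min_loop: int = 3, max_loop: int = 8):
    """Return (stem_start, loop_length) for every perfect stem-loop found.

    stem, min_loop and max_loop are illustrative defaults, not measured values.
    """
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = rna[j:j + stem]
            if len(right) < stem:
                break  # ran off the end of the sequence
            if right == reverse_complement(left):
                hits.append((i, loop))
    return hits


if __name__ == "__main__":
    # GGGC pairs with GCCC across a four-base AAAA loop.
    print(find_hairpins("GGGCAAAAGCCC"))
```

A hairpin like this in the 5'-UTR sits between the cap and the start codon, which is why it can interfere with initiation, while the same structure in the 3'-UTR cannot act that way.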

Transposons or Jumping Genes: Types, Structure, Mechanism and Functions

Let us make an in-depth study of the transposons or jumping genes. After reading this article you will learn about: 1. Introduction to Transposons 2. Types of Transposons 3. Structure of a Transposon 4. Target Sequence 5. Mechanism of Transposition 6. Functions of Transposons 7. Formation of New Genes and 8. Pseudogenes.

Introduction to Transposons:

Earlier it was thought that genes were static and had a definite and fixed locus. However, several types of gene rearrangement and recombination have since come to light. Segments of DNA that can jump to target sites in the genome were first discovered by Barbara McClintock, who observed that cobs of Indian corn (maize) carry kernels of different colours.

According to her, the light coloured kernels were caused by a segment of DNA that jumped into the genes coding for kernel pigment, thus inactivating those genes. These mobile genes are called transposons or transposable elements.

Transposons can jump within the genome thus affecting the expression of genes. They are quite different from the reciprocal or homologous exchanges of DNA. The movement of DNA segments is called transposition.

It is a specific form of genetic recombination that causes movement of certain genetic elements from one DNA site to another. A transposon inserted within a gene disrupts the function of that gene. When transposons are inserted within the regulatory sequence of a gene, they change its expression.

Transposons are present in all life forms. They are the main components of the moderately repetitive DNA. In human beings, more than 50% of the genome is composed of transposable elements.

Types of Transposons:

Transposition occurs by two methods:

1. Simple non-replicative transposition:

It involves excision of the transposon from its original location and its insertion at a new DNA site. This is also known as cut-and-paste transposition.

2. DNA transposition by replicative mechanism:

Transposable segments generate a new copy by replication. The first copy remains at the original site and the second copy moves to a new site anywhere within the genome.

The movement to the target site requires breaking up of the chromosome at the new site and inserting the transposon between the two ends generated. The enzymes required for breaking and re-joining the chromosome are present in the transposon itself.

Structure of a Transposon:

Transposons are stretches of DNA that have repeated DNA segments at either end. A transposon consists of a central sequence that carries the transposase gene and additional genes. This is flanked on both sides by short repeated DNA segments. The repeated segments may be direct repeats or inverted repeats. These terminal repeats help in identifying transposons.

The number of repeated nucleotides is odd (5, 7 or 9 nucleotides), which is a consequence of the method of insertion at the target site.

Target Sequence:

The site where a transposon is inserted is called the target site or recipient site. Before the transposon is moved into the target site, the target sequence is duplicated.

The two copies formed move apart. The transposon is inserted in between the two copies of the target sequences.

Mechanism of Transposition:

The enzyme transposase present in the transposon itself makes nicks or cuts in each strand of the target sequence. The target sequence is duplicated and two copies move away to make way for the transposon in the centre. The transposon then fixes itself into the two free ends generated. The nicks are sealed by ligase and two strands become continuous.
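The duplication of the target sequence described above can be sketched at the level of plain strings. This is an illustrative model only: the function name `insert_transposon`, the example sequences and the 5-nucleotide target length are assumptions, and the code does not model the two DNA strands or the enzymes involved.

```python
def insert_transposon(target: str, transposon: str, cut: int, site_len: int) -> str:
    """Insert `transposon` into `target`, duplicating the target site.

    The `site_len` nucleotides starting at `cut` end up on both sides of the
    inserted element, mimicking the flanking direct repeats of a transposon.
    All argument names are hypothetical and chosen for this example.
    """
    site = target[cut:cut + site_len]
    if len(site) < site_len:
        raise ValueError("target site runs past the end of the sequence")
    return target[:cut] + site + transposon + site + target[cut + site_len:]


if __name__ == "__main__":
    host = "AAAATGCATTTT"      # made-up host sequence
    element = "ccccgggg"       # made-up transposon, lower case for visibility
    print(insert_transposon(host, element, cut=4, site_len=5))
```

The result is longer than the host by the length of the element plus one extra copy of the target site, which is what makes the flanking repeats a useful signature of past insertions.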

Functions of Transposons:

Mutation Caused by Transposons:

Transposons inserted within genes disrupt the function of those genes. When they are inserted within the regulatory sequences of genes, they change gene expression. They are among the most common sources of mutation. Transposons may also introduce stop codons, producing truncated proteins.

In Drosophila, the majority of spontaneous mutations are caused by transposons jumping into a gene. The mutant white-eyed Drosophila is produced by a transposable element inserted into the gene that normally produces red pigment.

In human beings, transposons cause many genetic diseases. Transposons have also contributed to the development of a functional immune system in vertebrates.

In bacteria, transposable elements are present on extrachromosomal DNA called plasmids. Transposable elements on plasmids carry genes for proteins that nullify the effects of antibacterial drugs and toxins.

Formation of New Genes:

How do cells make new genes? Often it is done by exon shuffling by which the functional units of two existing genes recombine generating a new gene. Exon duplication and divergence also play their role in formation of new genes.


Transposons introduce great genome flexibility. Sometimes duplicated genes and other transposons do not succeed in making a functional gene, and they therefore become pseudogenes, the dead by-products of evolution. Pseudogenes are frequent in mammalian genomes, and they are unable to produce mRNA for translation.

Beta Structure

Beta Structure: Parallel and antiparallel beta strands are much more extended than alpha helices (phi/psi of -57, -47) but not as extended as a fully extended polypeptide chain (with phi/psi angles of +/- 180). The beta sheets are not quite so extended (parallel: -119, +113; antiparallel: -139, +135), and can be envisioned as rippled sheets. They can be visualized by laying thin, pleated strips of paper side by side to make a "pleated sheet" of paper. Each strip of paper can be pictured as a single peptide strand in which the peptide backbone makes a zig-zag along the strip, with the alpha carbons lying at the folds of the pleats. Each single strand of the beta-sheet can be pictured as a twofold helix, i.e. a helix with 2 residues/turn. The arrangement of each successive peptide plane is pleated due to the tetrahedral nature of the alpha C. The H bonds are interstrand, not intrastrand as in the alpha helix.

Figure: Parallel beta strands (image made with Spartan)

Figure: Antiparallel beta strands (image made with Spartan)

Consider a strand as a continuous and contiguous polypeptide backbone propagating in one direction. Hence, using this definition, a helix consists of a single strand, and all the H-bonds are within the strand (or intrastrand). A beta sheet would then consist of multiple strands, since each "strand" is separated from other "strands" by an intervening contiguous stretch of amino acids which bends within the protein in a way which allows the next section of the peptide backbone, the next "strand", to H-bond with the first "strand". But remember, even in this case, all the H-bonds holding the alpha and beta structure together are intramolecular.

In a parallel beta sheet structure, the optimal H bond pattern leads to a less extended structure (phi/psi of -119, +113) than the optimal arrangement of the H bonds in the antiparallel structure (phi/psi of -139, +135). Also, the H bonds in the parallel sheet are bent significantly (i.e. the carbonyl O on one strand is not exactly opposite the amide H on the adjacent strand, as it is in the antiparallel sheet). Hence antiparallel beta strands are presumably more stable, even though both are abundantly found in nature. Short parallel beta sheets of 4 strands or fewer are not common, which might reflect their lower stability.

The side chains in the beta sheet are normal to the plane of the sheet, extending out from the plane on alternating sides. Parallel sheets characteristically distribute hydrophobic side chains on both sides of the sheet, while antiparallel sheets are usually arranged with all the hydrophobic residues on one side. This requires an alternation of hydrophilic and hydrophobic side chains in the primary sequence. Antiparallel sheets are found in silk, with the sheets running parallel to the silk fibers. The following repeat is found in the primary sequence: (Ser-Gly-Ala-Gly)n, with Gly pointing out from one face, and Ser or Ala from the other.
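The phi/psi values quoted in this section can be used as reference points for a rough classifier. The sketch below assigns a backbone conformation to whichever quoted reference pair is closest in angle space. The function names and the nearest-reference rule are assumptions made for illustration; real secondary-structure assignment also relies on hydrogen-bond patterns.

```python
import math

# Reference (phi, psi) pairs in degrees, taken from the text above.
REFERENCE = {
    "alpha helix": (-57.0, -47.0),
    "parallel beta": (-119.0, 113.0),
    "antiparallel beta": (-139.0, 135.0),
    "fully extended": (180.0, 180.0),
}


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_conformation(phi: float, psi: float) -> str:
    """Return the name of the reference conformation closest to (phi, psi)."""
    def distance(name: str) -> float:
        ref_phi, ref_psi = REFERENCE[name]
        return math.hypot(angle_diff(phi, ref_phi), angle_diff(psi, ref_psi))

    return min(REFERENCE, key=distance)


if __name__ == "__main__":
    for phi, psi in [(-60, -45), (-120, 115), (-140, 130), (-180, 180)]:
        print(phi, psi, nearest_conformation(phi, psi))
```

Because angles wrap around, -180 and +180 are treated as the same value, which is why `angle_diff` is used instead of plain subtraction.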

Beta strands have a tendency to twist in the right hand direction. This leads to important consequences in how the beta strands are connected. Parallel strands can form twisted sheets or saddles as well as beta barrels.

Figure: Twisted Beta Sheet/Saddle (image made with VMD)

Figure: Beta Barrel (image made with VMD)

  • in parallel strands, right handed connectivity is common.
  • in a protein with parallel strands in register, and given the inherent twist in the strands, the strands arrange in a way to have the H bonds stretched equally at the ends of the chains, giving rise to a twisted saddle shape (top structure above).

Jmol: Twisted beta sheet from Arabinose Binding Protein

  • in a protein with parallel strands out of register, and given the inherent twist in the strands, the strands arrange in a way to have the H bonds stretched equally at the ends of the chains, giving rise to a beta barrel (bottom structure above).

Jmol: Beta barrel from triose phosphate isomerase

7.3: Prokaryotic Replication

  • Contributed by E. V. Wong
  • Axolotl Academica Publishing (Biology) at Axolotl Academica Publishing

DNA replication begins at an origin of replication. There is only one origin in prokaryotes (in E. coli, oriC) and it is characterized by arrays of repeated sequences. These sequences wrap around a DNA-binding protein, and in doing so, exert pressure on the H-bonds between the strands of DNA, and the chromosome begins to unzip in an AT-rich area wrapped around this protein. Remember that A-T pairs are 33% weaker than G-C pairs due to fewer hydrogen bonds. The use of AT-rich stretches of DNA as points of strand separation is a recurring theme through a variety of DNA operations. The separation of the two strands is bidirectional, and DNA polymerases will act in both directions in order to finish the process as quickly as possible. Speed is important here because while replication is happening, the DNA is vulnerable to breakage, and most metabolic processes are shut down to devote the energy to the replication. Even in prokaryotes, where DNA molecules are orders of magnitude smaller than in eukaryotes, the size of the DNA molecule when it is unraveled from protective packaging proteins makes it highly susceptible to physical damage just from movements of the cell.
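Since AT-rich stretches recur as points of strand separation, a simple way to look for candidate regions is a sliding-window A+T count. The following is a minimal sketch; the window size and threshold are arbitrary assumptions for the example, and the function names are hypothetical.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are A or T (seq must be non-empty)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 13, threshold: float = 0.75):
    """Return start positions of windows whose A+T fraction meets the threshold.

    window and threshold are illustrative defaults, not biological constants.
    """
    return [
        i
        for i in range(len(seq) - window + 1)
        if at_fraction(seq[i:i + window]) >= threshold
    ]


if __name__ == "__main__":
    # Made-up sequence: a 13-base A/T block flanked by GC-rich DNA.
    demo = "GCGCGCGC" + "ATATATATATATA" + "GCGCGCGC"
    print(at_rich_windows(demo, window=13, threshold=1.0))
```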

The first oriC binding protein, DnaA, binds to DnaA boxes, which are 9 base pair segments with a consensus sequence of TTATCCACA. OriC has five of these repeats, and one DnaA protein binds to each of them. HU and IHF are histone-like proteins that associate with DnaA and together bend that part of the DNA into a circular loop, situating it just over the other major feature of oriC, the 13-bp AT-rich repeats (GATCTNTTNTTTT). DnaA hydrolyzes ATP and breaks the H-bonds between strands in the 13mer repeats, also known as melting the DNA. This allows complexes of DnaB to bind to each single-stranded region of the DNA on opposite sides of the newly opened replication bubble. [DnaC is a loading protein that helps attach the DnaB hexamer to the strand, with accompanying hydrolysis of ATP. Also, five more DnaA are recruited to stabilize the loop.]

DnaB is a helicase; its enzymatic activity is to unzip/unwind the DNA ahead of the DNA polymerase, to give it single-stranded DNA to read and copy. It does so in association with single-stranded-DNA binding proteins (SSBs), and DNA gyrase. The function of SSB is nearly self-explanatory: single-stranded DNA is like RNA in its ability to form complex secondary structures by internal base-pairing, so SSB prevents that. DNA gyrase is a type II topoisomerase, and is tasked with introducing negative supercoiling to the DNA. This is necessary because the unzipping of the DNA by helicase also unwinds it (since it is a double helix) and causes the introduction of positive supercoiling. This means that the entire circular molecule twists on itself: imagine holding a rubber band in two hands and twisting it. As the supercoiling accumulates, the DNA becomes more tightly coiled, to the point that it would be impossible for helicase to unzip it. DNA gyrase can relieve this stress by temporarily cutting the double-stranded DNA, passing a loop of the molecule through the gap, and resealing it.

Figure 8. Type II DNA topoisomerases like DNA gyrase relieve supercoiling by making temporary double-strand cuts.
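The two sequence features named above, the 9-bp DnaA box consensus and the 13-bp AT-rich repeat, can be searched for with a short script. This is a sketch under simplifying assumptions: it reports only exact matches to the consensus (real DnaA boxes can deviate from it), N is treated as any base, and the function names are hypothetical.

```python
import re

DNAA_BOX = "TTATCCACA"           # 9-bp consensus quoted in the text
AT_RICH_13MER = "GATCTNTTNTTTT"  # 13-bp repeat quoted in the text


def revcomp(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Turn a consensus with N wildcards into a regex that finds overlapping hits."""
    body = "".join("[ACGT]" if base == "N" else base for base in consensus)
    return re.compile("(?=(" + body + "))")


def find_sites(dna: str, consensus: str):
    """Return sorted (start, strand) pairs; start is always on the given strand."""
    pattern = consensus_to_regex(consensus)
    size = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(dna)]
    for m in pattern.finditer(revcomp(dna)):
        # Convert a position on the reverse complement back to forward coordinates.
        hits.append((len(dna) - (m.start() + size), "-"))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + DNAA_BOX + "CC" + revcomp(DNAA_BOX) + "GG"  # made-up test sequence
    print(find_sites(demo, DNAA_BOX))
```

Searching both strands matters because a binding site can be written on either strand of the duplex.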

This (hopefully) makes a lot more sense looking at the diagram. Or, going back to our rubber band, give the rubber band a twist or two, then tape down the two ends. If you snip the rubber band, and pass an adjacent portion of the rubber band through that snip, then reconnect the cut ends, you will find that there is one less twist. Nifty, eh? At this point, some of you are going to say, but if you twist a free-floating rubber band, as one might imagine a free-floating circular DNA chromosome in E. coli, you would expect it to naturally untwist. Technically, yes, but due to the large mass of the chromosome, its association with various proteins and the cell membrane, and the viscosity of its environment, it does not behave as though it were completely free.

Figure 9. Detail of DNA topoisomerase type II action. (A) First, the enzyme binds to the DNA and initiates an endonuclease activity, cutting both strands of the DNA at that point. (B) This complete transection of the DNA (as opposed to the single-strand cut by type I topoisomerases) allows another part of the same DNA molecule to slip through the gap. (C) Finally, the two temporarily broken ends of the DNA, which had been held closely in place by the enzyme, are rejoined.

Once oriC has been opened and the helicases have attached to the two sides of the replication fork, the replication machine, aka the replisome, can begin to form. However, before the DNA polymerases take positions, they need to be primed. DNA polymerases are unable to join two individual free nucleotides together to begin forming a nucleic acid; they can only add onto a pre-existing strand of at least two nucleotides. Therefore, a specialized RNA polymerase (RNAPs do not have this limitation) known as primase is a part of the replisome, and creates a short RNA strand termed the primer for the DNA polymerase to add onto. Although only a few nucleotides are needed, the prokaryotic primers may be as long as 60 nt depending on the species.

At least five prokaryotic DNA polymerases have been discovered to date. The primary DNA polymerase for replication in E. coli is DNA Polymerase III (Pol III). Pol I is also involved in the basic mechanism of DNA replication, primarily to fill in gaps created during lagging strand synthesis (defined below) or through error-correcting mechanisms. DNA polymerase II and the recently discovered Pol IV and Pol V do not participate in chromosomal replication, but rather are used to synthesize DNA when certain types of repair are needed at other times in the cellular life cycle.

DNA polymerase III is a multi-subunit holoenzyme, with α, ε, and θ subunits comprising the core polymerase, and τ, γ, δ, δ′, χ, ψ, and β coming together to form the complete holoenzyme. The core polymerase has two activities: the α subunit is the polymerase function, reading a strand of DNA and synthesizing a complementary strand with great speed, around 150 nt/sec; the ε subunit is a 3′-5′ "proofreading" exonuclease and acts as an immediate proofreader, removing the last nucleotide if it is incorrect. This proofreading does not reach any further back: it only acts on the most recently added nucleotide to correct misincorporation. Other mechanisms and enzymes are used to correct DNA lesions that arise at other times. [As a matter of nomenclature, exonucleases only cut off nucleotides from DNA or RNA from either end, but not in the middle. Endonucleases cleave phosphodiester bonds located deeper within a nucleic acid strand.] The θ subunit has no enzymatic activity and regulates the exonuclease function. Although it has polymerase activity, the Pol III core polymerase has poor processivity; that is, it can only add up to 15 nucleotides before dissociating from the template DNA. Since genomes of E. coli strains average near 5 million base pairs, replication in little 15 nt segments would be extraordinarily inefficient.

Figure 10. The dimeric β clamp holds DNA Polymerase III on the template, allowing it to polymerize more nucleotides before dissociating.

The clamp loader complex is an ATPase assembly that binds to the β-clamp unit upon binding of ATP (but the ATPase activity is not turned on). When the complex then binds to DNA, it activates the ATPase, and the resulting hydrolysis of ATP leads to conformational changes that open up the clamp temporarily (to encircle or to move off of the DNA strand), and then dissociation of the clamp loader from the clamp assembly.

This is where the β subunit is needed. Also known as the β clamp, it is a dimer of semicircular subunits that has a central hole through which the DNA is threaded. The core polymerase, via an α-β interaction, is attached to this β clamp so that it stays on the DNA longer, increasing the processivity of Pol III to over 5000 nt. The β clamp is loaded onto (and unloaded off of) the DNA by a clamp loader complex (also called γ complex) consisting of γ (x3), δ, δ′, χ, and ψ subunits.

The replication bubble has two replication forks: once the DNA is opened up (unzipped) at the origin, a replication machine can form on each end, with the helicases heading in opposite directions. For simplification, we will consider just one fork (opening left to right) in this discussion, with the understanding that the same thing is happening with the other fork, but in the opposite direction.

The first thing to notice when looking at a diagram of a replication fork (Figure 11) is that the two single-stranded portions of template DNA are anti-parallel. This should come as no surprise at this point in the course, but it does introduce an interesting mechanical problem. Helicase opens up the double stranded DNA and leads the rest of the replication machine along. So, in the single-stranded region trailing the helicase, if we look left to right, one template strand is 3′ to 5′ (in blue), while the other is 5′ to 3′ (in red). Since we know that nucleic acids are polymerized by adding the 5′ phosphate of a new nucleotide to the 3′ hydroxyl of the previous nucleotide (5′ to 3′, in green), this means that one of the strands, called the leading strand, is being synthesized in the same direction that the replication machine moves. No problem there.

The other strand is problematic: looked at linearly, the newly synthesized strand would be going 3′ to 5′ from left to right, but DNA polymerases cannot add nucleotides that way. How do cells resolve this problem? A number of possibilities have been proposed, but the current model is depicted here. The replication machine consists of the helicase, primases, and two DNA polymerase III holoenzymes moving in the same physical direction (following the helicase). In fact, the pol III complexes are physically linked through τ subunits.

Figure 11. DNA Replication in prokaryotes.
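The effect of the clamp on processivity is easy to put into numbers using the figures quoted in the text (a genome of roughly 5 million base pairs, 15 nt per binding event for the core alone, and 5000 nt with the clamp). The sketch below is back-of-the-envelope arithmetic only; it ignores the two forks and the difference between leading and lagging strands, and the function name is hypothetical.

```python
import math

GENOME_BP = 5_000_000  # approximate E. coli genome size quoted in the text


def binding_events(genome_bp: int, processivity_nt: int) -> int:
    """Minimum number of polymerase binding events needed to copy one strand."""
    return math.ceil(genome_bp / processivity_nt)


if __name__ == "__main__":
    core_only = binding_events(GENOME_BP, 15)      # core polymerase alone
    with_clamp = binding_events(GENOME_BP, 5000)   # with the beta clamp
    print(core_only, with_clamp, round(core_only / with_clamp))
```

Since the text gives 5000 nt as a lower bound on clamp-assisted processivity, the second number is an upper bound on the binding events required.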

In order for the template strand that is 5′ to 3′ from left to right to be replicated, the strand must be fed into the polymerase backwards. This can be accomplished either by turning the polymerase around or by looping the DNA around. As the Figure shows, the current model is that the primase is also moving along left to right, so it has just a short time to quickly synthesize a short primer before having to move forward with the replisome and starting up again, leaving intermittent primers in its wake. Because of this, Pol III is forced to synthesize only short fragments of the chromosome at a time, called Okazaki fragments after their discoverer. Pol III begins synthesizing by adding nucleotides onto the 3′ end of a primer and continues until it hits the 5′ end of the next primer. It does not (and can not) connect the strand it is synthesizing with the 5′ primer end.

DNA replication is called a semi-discontinuous process because while the leading strand is being synthesized continuously, the lagging strand is synthesized in fragments. This leads to two major problems: first, there are little bits of RNA left behind in the newly made strands (just at the 5′ end for the leading strand, in many places for the lagging) and second, Pol III can only add free nucleotides to a fragment of single stranded DNA; it cannot connect another fragment. Therefore, the new "strand" is not whole, but riddled with missing phosphodiester bonds.

The first problem is resolved by DNA polymerase I. Unlike Pol III, Pol I is a monomeric protein and acts alone, without additional proteins. There are also 10-20 times as many Pol I molecules as there are Pol III molecules, since they are needed for so many Okazaki fragments. DNA Polymerase I has three activities: (1) like Pol III, it can synthesize a DNA strand based on a DNA template, (2) also like Pol III, it is a 3′-5′ proofreading exonuclease, but unlike Pol III, (3) it is also a 5′-3′ exonuclease. The 5′-3′ exonuclease activity is crucial in removing the RNA primer (Figure 12). The 5′-3′ exonuclease binds to double-stranded DNA that has a single-stranded break in the phosphodiester backbone such as what happens after Okazaki fragments have been synthesized from one primer to the next, but cannot be connected. This 5′-3′ exonuclease then removes the RNA primer. The polymerase activity then adds new DNA nucleotides to the upstream Okazaki fragment, filling in the gap created by the removal of the RNA primer. The proofreading exonuclease acts just like it does for Pol III, immediately removing a newly incorporated incorrect nucleotide. After proofreading, the overall error rate of nucleotide incorporation is approximately 1 in 10^7.

Technically, the 5′-3′ exonuclease cleaves the DNA at a double-stranded region downstream of the nick, and may then remove anywhere from 1-10 nt at a time. Experimentally, the 5′-3′ exonuclease activity can be cleaved from the rest of Pol I by the protease trypsin. This generates the "Klenow fragment" containing the polymerase and 3′-5′ proofreading exonuclease.

Figure 12. Lagging Strand Synthesis. After DNA polymerase III has extended the primers (yellow), DNA polymerase I removes the primer and replaces it by adding onto the previous fragment. When it finishes removing RNA, and replacing it with DNA, it leaves the DNA with a missing phosphodiester bond between the pol III-synthesized DNA downstream and the pol I-synthesized DNA upstream. This break in the sugar-phosphate backbone is repaired by DNA ligase.

Even though the RNA has been replaced with DNA, this still leaves a fragmented strand. The last major player in the DNA replication story finally appears: DNA ligase. This enzyme has one simple but crucial task: it catalyzes the attack of the 3′-OH from one fragment on the 5′ phosphate of the next fragment, generating a phosphodiester bond. This reaction requires energy in the form of hydrolysis of either ATP or NAD+ depending on the species (E. coli uses NAD+), generating AMP and either PPi or NMN+.
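The bookkeeping behind the two problems just described (RNA left in the new strand, and missing phosphodiester bonds between fragments) can be sketched with a toy model. Everything numeric here is an assumption for illustration: the region length, fragment length and primer length are made-up values, the function names are hypothetical, and coordinates simply run along the stretch of lagging-strand template being considered.

```python
def okazaki_fragments(template_len: int, fragment_len: int, primer_len: int):
    """Split a lagging-strand region into (start, primer_end, end) fragments.

    Each fragment begins with an RNA primer covering [start, primer_end)
    followed by DNA covering [primer_end, end). All lengths are illustrative.
    """
    fragments = []
    for start in range(0, template_len, fragment_len):
        end = min(start + fragment_len, template_len)
        primer_end = min(start + primer_len, end)
        fragments.append((start, primer_end, end))
    return fragments


def count_nicks(fragments) -> int:
    """Number of missing phosphodiester bonds between adjacent fragments."""
    return max(len(fragments) - 1, 0)


def rna_nucleotides(fragments) -> int:
    """Total RNA (primer) nucleotides left behind across all fragments."""
    return sum(primer_end - start for start, primer_end, _ in fragments)


if __name__ == "__main__":
    frags = okazaki_fragments(template_len=5000, fragment_len=2000, primer_len=10)
    print(frags, count_nicks(frags), rna_nucleotides(frags))
```

In this picture the primer nucleotides are what must be replaced with DNA, and each nick is a backbone break that still has to be sealed afterwards.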

Noncanonical DNA Structures Based on Guanine–Guanine Interactions Seem to Have Played a Role in Telomere Origin and Evolution

In most eukaryotic chromosomes, telomere DNA sequences are arrays of short guanine-rich repetitive sequences that terminate in a 3′-single-strand G-rich overhang (150–200 nucleotides). The G-rich strand is synthesized by a telomere-specific RT, called telomerase, using a small region of its RNA subunit as template and the 3′-OH on the end of the chromosome as a primer (Blackburn 1992). Found in animals, fungi, and Amoebozoa, TTAGGG was the telomeric simple repeat sequence present in the ancestral Unikont. Moreover, its occurrence in some species of the supergroups Plantae, Chromalveolata, Excavata, and Rhizaria suggests that TTAGGG could be the ancestral telomeric repeat sequence for eukaryotes (Fulnecková et al. 2013).

On the other hand, the use of prokaryotic retroelements to root a RT phylogenetic tree shows that the telomerase seems to have evolved from the RT of an ancestral non-LTR retrotransposon (Eickbush 1997). Furthermore, it is believed that the ability of non-LTR RTs to use the 3′-OH of chromosome ends to prime reverse transcription was crucial for the birth of early telomerases (Moore and Haber 1996; Morrish et al. 2002, 2007; Curcio and Belfort 2007).

Telomerase-based telomeres brought two principal advantages: facilitated telomere homeostasis and a greater structural protection by the incorporation of simple G-rich repeats with the inherent ability to form G-quadruplex structures (Henderson et al. 1987; Arthanari and Bolton 2003; Teixeira and Gilson 2005). G-quadruplexes consist of stacked G-quartets, which are planar arrangements of four guanines held together by Hoogsteen hydrogen bonds (Neidle 2009) (fig. 2A). G-quadruplex formation may occur within the terminal G-rich 3′-overhang (fig. 2B) or when the overhang invades the adjacent double-stranded region of the telomere to form T-loop structures (fig. 2C) (Maizels 2006; Rhodes 2006; Xu et al. 2008; Bochman et al. 2012). Nevertheless, it has been hypothesized that after the appearance of telomerase, the maintenance of telomeres by the primitive T-loop-replication mechanism became less relevant (de Lange 2004). The first visualization of telomeric G-quadruplex formation in vivo was performed in the ciliate Stylonychia (Schaffitzel et al. 2001; Paeschke et al. 2005). Most recently, a highly specific DNA G-quadruplex antibody has been employed to visualize G-quadruplex structures at human telomeres (Biffi et al. 2013).

Schematic diagrams of a G-quartet and two telomeric G-quadruplexes. (A) Four guanines assemble in a planar arrangement to form a G-quartet. Hydrogen bonds are in dashed lines. (B) Diagram of an intramolecular G-quadruplex at a telomere end. (C) Diagram of a G-quadruplex at a T-loop. The G-quadruplexes in the figure are composed of three stacked G-quartets (shaded squares).

It is important to point out that the putative ancestral telomerase-synthesized sequence, (TTAGGG)n, is not only capable of folding into a G-quadruplex structure but is the best one at doing so in vitro (Tran et al. 2011). In addition, recent biophysical studies on the folding of these telomeric G-quadruplexes have shown that structure formation occurs in milliseconds. These folding kinetics are biologically relevant because they are comparable to those of transcription and DNA replication (Zhang and Balasubramanian 2012).

During evolution, mutations in the telomerase RNA template have given rise to repeat variants with different lengths of guanine motifs (G2, G4, and more). Recent experiments have found that the G-quadruplexes formed by telomeric repeats with only two consecutive guanines (TTAGG in arthropods and TTAGGC in nematodes) are in equilibrium with G-hairpins and other noncanonical structures (Tran et al. 2011).

In the silk moth Bombyx mori (Lepidoptera) and the flour beetle Tribolium castaneum (Coleoptera), the telomerase activity is weak, and telomere-specific non-LTR retroelements (TRAS and SART family elements in B. mori and SART family elements in T. castaneum) are inserted into the telomeric repeats in a specific manner (Fujiwara et al. 2005; Osanai et al. 2006) that preserves the G/C strand bias (fig. 3). The massive integration of these elements into the proximal regions of the TTAGG repeat arrays of B. mori and the TCAGG arrays of T. castaneum (an alternative telomere variant in insects) gives rise to huge telomeres with sizes larger than 200 kb.
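The difference between repeats with three consecutive guanines and those with only two can be illustrated with a simple sequence scan. The sketch below uses a widely used heuristic pattern for three-quartet G-quadruplex motifs (four runs of at least three G's separated by loops of 1 to 7 bases). This pattern is an assumption brought in for illustration and is not taken from the text; it says nothing about the two-quartet structures and G-hairpins discussed above, and a match only indicates potential to fold.

```python
import re

# Heuristic three-quartet motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
# (an assumed, commonly used rule of thumb, not a result from the text).
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def has_g4_motif(dna: str) -> bool:
    """True if the sequence contains a match to the heuristic G4 motif."""
    return G4_PATTERN.search(dna.upper()) is not None


if __name__ == "__main__":
    for name, repeat in [("TTAGGG", "TTAGGG"), ("TTAGG", "TTAGG"), ("TTAGGC", "TTAGGC")]:
        print(name, has_g4_motif(repeat * 4))
```

Four copies of TTAGGG match the motif, while arrays of TTAGG or TTAGGC never do, because they contain no run of three G's.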

Distribution of telomeric sequences within Bilateria. Most eukaryotes have G-rich telomerase-synthesized repeats with adjacent complex subtelomeric repeats called telomere-associated sequences (TAS). In most arthropods, telomere-specific retrotransposons are inserted into telomerase-synthesized repeats. As can be seen in the diagram, TRAS elements insert in reverse orientation to that of the SART elements. In an ancestor of dipteran insects, the telomerase gene was lost (green line). In Chironomus tentans (lower Diptera), the telomeric sequences consist of complex tandem repeats maintained by recombination. However, Drosophila species have multiple telomere-specific retrotransposons (autonomous and nonautonomous) that transpose to chromosomal ends. The deletion event in the ancestral TAHRE element is shown with dashed lines.

The telomeres of the honey bee Apis mellifera (Hymenoptera) are exceptional among the arthropods because they do not have non-LTR elements inserted into their telomeric repeats. Instead, the telomere sequence consists of TTAGG repeat arrays (Robertson and Gordon 2006) interspersed with TCAGGCTGGG, TCAGGCTGGGTTGGG, and TCAGGCTGGGTGAGGATGGG higher order repeat arrays (Garavís M, Villasante A, unpublished results) (fig. 3). These higher order repeats arose by amplification of the mutated repeats present in proximal telomeric regions, and the interspersed pattern developed by further amplifications of the 5-bp repeat arrays together with higher order repeat arrays. However, the TTAGG repeats of Acyrthosiphon pisum (Hemiptera) and Pediculus humanus (Phthiraptera) contain insertions of non-LTR retrotransposons of the TRAS and SART family, respectively (International Aphid Genomics Consortium 2010; Kirkness et al. 2010) (fig. 3). As Hemiptera and Phthiraptera are basal to Hymenoptera, Coleoptera, Lepidoptera, and Diptera (fig. 3), the telomeres of A. mellifera seem to represent a case where the TRAS and/or SART retrotransposons were lost at a later stage in evolution. It is tempting to speculate that the appearance of those higher order repeat arrays with propensity to form 3-quartet G-quadruplexes caused the decay, and eventual loss, of the telomeric retrotransposons.
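To illustrate why the higher order repeats differ from plain TTAGG arrays in this respect, the following minimal sketch scans a sequence for a putative G-quadruplex motif. The regular expression is a commonly used heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) and is an assumption introduced here, not a method taken from the text; a match indicates sequence propensity only and does not show that a structure actually forms. The function name and the number of repeat copies are hypothetical.

```python
import re

# Heuristic motif (an assumption, not taken from the text): four runs of at
# least three guanines, separated by loops of 1 to 7 nucleotides.
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def has_g4_motif(seq: str) -> bool:
    """Return True if the sequence contains a putative 3-quartet G4 motif."""
    return G4_MOTIF.search(seq.upper()) is not None


# Tandem arrays built from repeat units quoted in the text.
simple_array = "TTAGG" * 10            # canonical insect telomeric repeat
higher_order = "TCAGGCTGGGTTGGG" * 2   # one A. mellifera higher order repeat

print(has_g4_motif(simple_array))   # only GG runs are present, so no motif
print(has_g4_motif(higher_order))   # GGG runs with short loops fit the motif
```

Under this heuristic the TTAGG array never matches because it contains no run of three guanines, whereas two copies of the 15-nt higher order repeat already supply four GGG runs.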

Because the telomeres of the spider mite Tetranychus urticae (from the basal branch Chelicerata) are also a mosaic of short TTAGG repeats interrupted by non-LTR retrotransposons closely related to TRAS (Grbić et al. 2011) (fig. 3), the telomeres of the arthropods seem to be maintained by telomerase, by insertion of specific non-LTR retrotransposons into the TTAGG repeat array, and by recombination. The same system of telomere maintenance has also been found in some nonarthropod species (Arkhipova and Morrison 2001; Yamamoto et al. 2003; Gladyshev and Arkhipova 2007; Starnes et al. 2012). It has not escaped our notice that the appearance of this apparently suboptimal mechanism of telomere maintenance, which might have created chromosome instability, seems to have coincided with the great arthropod radiation into Chelicerata and Mandibulata.

On the other hand, certain yeasts from the Ascomycota phylum have telomeric repeats that are diverse in terms of their sequence, length, and homogeneity (McEachern and Blackburn 1994). In these yeasts, the degenerate repeats result from the nonprocessivity of their telomerases (Prescott and Blackburn 1997). Importantly, these repeats, despite their TG-richness, are less prone to fold into G-quadruplexes (Tran et al. 2011), and it has been shown that in these organisms, the telomere-binding proteins are fast evolving (Teixeira and Gilson 2005). Therefore, it is possible that these yeasts are using an ancestral system of chromosome end protection where the ssDNA-binding proteins facilitate the folding of the 3′-overhangs into G-quadruplex-like structures. This yeast-capping mechanism likely arose de novo by convergent evolution.

Do Noncanonical Secondary Structures Have a Role in the Maintenance of Telomeres without Telomerase?

Once telomerase becomes completely dysfunctional, the gene encoding telomerase could be lost if telomeres are maintained by the ancestral alternative mechanism of homologous recombination. Apparently, this is what happened in the ancestor of Diptera about 260 Ma (Wiegmann et al. 2011). In the lower Diptera, Anopheles, Rhynchosciara, and Chironomus, long tandem repeats are present at chromosome ends, suggesting that telomere maintenance takes place by homologous recombination (Nielsen and Edstrom 1993; Biessmann et al. 1998; Madalena et al. 2010) (fig. 3). In Drosophila, however, telomere maintenance occurs primarily by transposition of telomere-specific retrotransposons to receding chromosome ends (fig. 3). In addition to retrotransposition, Drosophila telomeres are also maintained, as in any eukaryote, by recombination/gene conversion (Kahn et al. 2000).

In D. melanogaster, three telomeric retrotransposons, TART, TAHRE, and HeT-A (a nonautonomous element derived from an ancestral TAHRE that lost its RT), transpose occasionally to chromosome ends using the free 3′-OH at chromosome termini to prime reverse transcription (Biessmann et al. 1990, 1992; Sheen and Levis 1994; Abad et al. 2004a, 2004b). In agreement with this mechanism, the telomeric elements appear randomly mixed in head-to-tail arrangements and variably truncated at the 5′-end (Mason et al. 2008; Villasante et al. 2008; Pardue and DeBaryshe 2011) (fig. 3). It is noteworthy that deletion of the RT coding region of the telomeric elements has occurred recurrently during Drosophila evolution, and multiple nonautonomous elements appear at the telomeres of the Drosophila species examined. As an example, up to four nonautonomous elements along with their corresponding autonomous elements have been found in D. mojavensis telomeres (Villasante et al. 2007). Interestingly, similar situations occur with group II introns, whose RTs also act in trans to mobilize multiple deleted introns (Mohr et al. 2010).

Because Drosophila telomeres consist of retrotransposon arrays in constant flux, there is no specific terminal sequence, and their telomere-capping proteins (the “terminin” complex) have evolved to bind chromosome ends independently of the primary DNA sequence (Raffa et al. 2009, 2010). The “terminin” complex is functionally analogous to the “shelterin” complex (human telomere-capping proteins), but their components are not evolutionarily conserved (Palm and de Lange 2008; Raffa et al. 2009, 2010). Thus, Drosophila telomeres are made of rapidly evolving telomeric retrotransposons (Villasante et al. 2007) and telomere-capping proteins (Gao et al. 2010; Raffa et al. 2010). Moreover, as Verrocchio is a telomere-capping protein with one OB-fold domain and all telomeric proteins containing OB folds are 3′-overhang binding proteins, Drosophila telomeres also seem to have single-strand overhangs (Raffa et al. 2010).

It is noteworthy that, despite the complexity of telomeric sequences in the genera Drosophila and Chironomus, the Drosophila telomeric retrotransposon arrays and the Chironomus telomeric complex repeats also have the telomeric G/C strand bias (Nielsen and Edstrom 1993; Danilevskaya et al. 1998). The conservation of this G/C strand bias may indicate that telomere capping depends on the formation of noncanonical structures based on guanine–guanine interactions. In agreement with this idea, it has been shown that the 3′-untranslated region of the abundant D. melanogaster telomeric element HeT-A contains sequences with propensity to form G-quadruplexes (Abad and Villasante 1999).

The structural and phylogenetic analyses of all Drosophila telomeric-specific retrotransposons show that they had a common ancestor and indicate that non-LTR retrotransposons have been recruited to perform the cellular function of telomere maintenance. Therefore, we propose that the recruitment of Drosophila telomeric elements may resemble the ancestral mechanism that led to the maintenance of the “proto-telomeres” of the first eukaryotic chromosomes.

On the other hand, it has been found that yeast cells lacking telomerase can survive telomere sequence loss through the formation of terminal blocks of heterochromatin. This happens by amplifying and rearranging either subtelomeric sequences in S. cerevisiae and S. pombe or rDNA sequences in S. pombe (Lundblad and Blackburn 1993; Jain et al. 2010). Significantly, the S. cerevisiae subtelomeric Y’ repeats also have purine/pyrimidine strand bias (Nickles and McEachern 2004), and the S. pombe end-protection protein POT1 (protection of telomeres 1) binds, in a nonsequence-specific manner, to the 3′-overhangs of G-rich rDNA (Jain et al. 2010). Interestingly, adaptive recombination-based mechanisms of telomere maintenance (called ALT for alternative lengthening of telomeres) also occur in tumor cells that lack telomerase (Bryan et al. 1995; Cesare and Reddel 2010).

To summarize, in species that have lost telomerase either during evolution (order Diptera) or through experimental manipulation, the data available suggest a role of structural DNA features in telomere maintenance, reveal the importance of telomeric heterochromatin (regardless of the underlying primary sequence) in the recruitment of end-binding proteins, and show how easily backup mechanisms may have been used to maintain telomeres during evolution.

Difference Between VNTR and STR


VNTR: VNTR is a type of tandem repeat in which a short sequence of nucleotides (10-60 base pairs) is repeated a variable number of times at a particular locus.

STR: STR is a type of tandem repeat in which a short sequence of nucleotides (2-6 base pairs) is repeated a variable number of times at a particular locus.

Number of Repeating Nucleotides

VNTR: The repeating unit of a VNTR is 10-60 base pairs long.

STR: The repeating unit of an STR is 2-6 base pairs long.

Type of Repetitive DNA

VNTR: VNTR is a type of minisatellite DNA.

STR: STR is a type of microsatellite DNA.

Number of Repeats

VNTR: VNTR consists of 10-1,500 repeats in the array.

STR: STR consists of 5-200 repeats in the array.

Size of the Array

VNTR: VNTR forms an array of 0.5-15 kb.

STR: STR forms an array of 10-1000 bp.

Complexity of the Array

VNTR: VNTR produces heterogeneous arrays.

STR: STR produces homogenous arrays.


VNTR and STR are two types of tandem repeats that form arrays of adjacent repetitive units in the eukaryotic genome. VNTR consists of comparatively long repeating units of nucleotides (10-60 base pairs). STR consists of short repeating units of nucleotides (2-6 base pairs). The main difference between VNTR and STR is the length of the repeating unit.
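The distinction by repeat-unit length can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds are the 2-6 bp and 10-60 bp ranges given above, the array is assumed to be a perfect tandem repeat with no mutated copies, and the 16-bp unit in the example is an invented sequence.

```python
from typing import Optional


def find_unit_length(array: str) -> Optional[int]:
    """Return the shortest unit whose perfect repetition gives the array."""
    n = len(array)
    for k in range(1, n // 2 + 1):
        if n % k == 0 and array == array[:k] * (n // k):
            return k
    return None  # not a perfect tandem repeat


def classify_tandem_repeat(unit_length_bp: int) -> str:
    """Classify by unit length using the ranges quoted in the text."""
    if 2 <= unit_length_bp <= 6:
        return "STR"
    if 10 <= unit_length_bp <= 60:
        return "VNTR"
    return "unclassified"  # outside both ranges given in the text


str_array = "GATA" * 8                # 4-bp unit
vntr_array = "ACGTTGCAAGGCTTCA" * 5   # invented 16-bp unit

for array in (str_array, vntr_array):
    unit = find_unit_length(array)
    print(unit, classify_tandem_repeat(unit))
```

Real arrays, VNTRs in particular, are often heterogeneous, so a practical tool would need approximate matching rather than the exact comparison used here.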


About the Author: Lakna

Lakna, a graduate in Molecular Biology & Biochemistry, is a Molecular Biologist and has a broad and keen interest in the discovery of nature related things

Binding and interaction of α-synuclein with lipid membranes

Under normal conditions, α-synuclein exists as a randomly structured and natively unfolded protein and remains as a monomer within the cytoplasm. Under pathological conditions, however, α-synuclein undergoes structural/conformational changes causing the monomers to aggregate with each other and become insoluble. Much evidence suggests that changes to the α-synuclein structure and properties are initiated when the protein binds and interacts with lipid surfaces, such as lipid droplets, phospholipid bilayers or lipid membranes. When α-synuclein monomers, isolated from human neurons, were exposed to synthetic lipid membranes, they readily bound to the membrane surface and formed dimers and oligomers [56],[57]. Such an interaction is thought to induce a dramatic change in α-synuclein structure from its unfolded form to a folded α-helical secondary structure [57]. The imperfect repeats of 11 amino acids present in α-synuclein, similar to the amphipathic α-helical motif common to apolipoproteins and other lipid-binding proteins, appear to play an important role in the lipid membrane binding process [58]. What is significant about such a change is that the α-helical form of α-synuclein is prone to forming different types of oligomers, the species that are thought to be toxic to cells. The lipid composition of membranes has been shown to affect the binding/interaction of α-synuclein to the membrane and subsequent oligomerization [56],[59]. α-synuclein is thought to preferentially bind to regions of membranes that are enriched in lipids [60]. These regions are called lipid rafts and are characterized by high concentrations of cholesterol and sphingolipids and altered surface charge that may favor α-synuclein binding. The lipid rafts appear to serve as a platform that promotes α-synuclein binding and oligomerization.

Contrary to overwhelming evidence that α-synuclein exists as an unfolded monomer in the cytosol, Bartels and colleagues reported that endogenous α-synuclein exists predominantly as a folded tetramer (~58 kDa) [61]. The explanation provided by the authors for this apparent difference is that most studies supporting the unfolded monomer hypothesis commonly use sample heating and denaturing gels to analyze α-synuclein, whereas the authors used nondenaturing conditions. They have also provided evidence by other means - that is, scanning transmission electron microscopy and cell cross-linking - to confirm the prevalence of α-synuclein tetramer in neurons and human brain tissues [61]. Bartels and colleagues proposed that since α-synuclein tetramers are less likely to form aggregates, the tetramers first undergo destabilization prior to forming aggregates. The authors suggested that stabilizing the physiological tetramers could reduce α-synuclein pathogenicity in PD and other α-synucleinopathies.

Secondary Structure: α-Helices

An α-helix is a right-handed coil of amino-acid residues on a polypeptide chain, typically ranging between 4 and 40 residues. This coil is held together by hydrogen bonds between the oxygen of the C=O group on one turn and the hydrogen of the N-H group on the next turn. Each such hydrogen bond links two residues that are exactly 4 positions apart in the chain, and every complete turn of the helix spans only 3.6 amino acid residues. This regular pattern gives the α-helix very definite features with regard to the thickness of the coil and the length of each complete turn along the helix axis.
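The regular spacing described above can be made concrete with a small sketch. It assumes an idealized helix in which every residue from the first onward takes part in the pattern, uses 1-based residue numbering, and pairs the C=O of residue i with the N-H of residue i + 4; the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6  # value stated in the text


def helix_hbond_pairs(n_residues: int) -> list:
    """List (i, i + 4) backbone hydrogen-bond pairs, 1-based numbering."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


def helix_turns(n_residues: int) -> float:
    """Number of complete turns spanned by an idealized helix."""
    return n_residues / RESIDUES_PER_TURN


pairs = helix_hbond_pairs(10)
print(pairs)            # pairs from (1, 5) up to (6, 10)
print(helix_turns(18))  # an 18-residue helix spans about five turns
```

A helix of n residues therefore has n - 4 such hydrogen bonds in this idealized picture, which is why very short segments gain little stabilization.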

The structural integrity of an α-helix is in part dependent on correct steric configuration. Amino acids whose R-groups are too large (tryptophan, tyrosine) or too small (glycine) destabilize α-helices. Proline also destabilizes α-helices because of its irregular geometry: its R-group bonds back to the nitrogen of the amide group, which causes steric hindrance. In addition, the lack of a hydrogen on proline's nitrogen prevents it from participating in hydrogen bonding.

Another factor affecting α-helix stability is the total dipole moment of the entire helix due to individual dipoles of the C=O groups involved in hydrogen bonding. Stable α-helices typically end with a charged amino acid to neutralize the dipole moment.


In this study, we applied an integrated omics approach to understand dinoflagellate secondary metabolite biosynthesis. To this end, we sequenced the genome of A. gibbosum and identified key features that regulate secondary metabolite levels and structural diversity. We hypothesize that miRNA-mediated, post-transcriptional regulation in A. gibbosum, which targets primary pyruvate metabolism, subsequently affects secondary metabolism. This study represents a first step to illuminate key molecular events involved in dinoflagellate secondary metabolism, and it should facilitate studies of HAB formation and associated toxin production. Ongoing high-throughput sequencing of dinoflagellate genomes promises to be informative, not only for understanding toxin secondary metabolism genes, but also for better insights into their genome organization. The availability of this first basal dinoflagellate genome provides important clues about dinoflagellate evolution and extends the genome size limit that has been a challenge for several years.

Primer Based Approach for PCR Amplification of High GC Content Gene: Mycobacterium Gene as a Model

The genome of Mycobacterium is rich in GC content, which poses a problem for the amplification of some genes by standard PCR procedures, especially those with GC-rich terminal regions. Attempts have been made to amplify three GC-rich genes of Mycobacterium sp. (Rv0519c and Rv0774c from M. tuberculosis and ML0314c from M. leprae). Of these three genes, Rv0774c was amplified with normal primers under standard PCR conditions, while no amplification was observed for Rv0519c and ML0314c. In the present investigation, a modified primer-based approach was successfully used to amplify the GC-rich sequence of Rv0519c through codon optimization without changing the native amino acid sequence. The strategy was confirmed by redesigning the standard primers with similar modifications, followed by amplification of the ML0314c gene.

1. Introduction

Polymerase chain reaction (PCR) based cloning of a gene of interest with high GC content is a long-recognized problem. PCR is a highly sensitive tool, and various factors have to be optimized for amplification of the gene of interest. The primer is one of the precise control elements in this process, and primer design directly influences the result of standardized cloning procedures. High GC content in the gene complicates primer design through mismatches, high annealing temperature, self-dimer formation, and secondary structure. Sometimes, amplification of a gene is not routinely achieved by normal PCR techniques. The most prominent problem is the hairpin loop, which directly interferes with annealing of primers to a difficult DNA template and leads to no amplification. Different strategies have been proposed to solve this problem. Use of DMSO and glycerol was reported to reduce the annealing temperature and denaturation temperature, increase the chances of breakage of secondary structure, and increase the efficiency of amplification [1–5]. The whole genome sequence of Mycobacterium tuberculosis was deciphered by Cole et al. [6]. The genes of M. tuberculosis are being cloned and expressed in E. coli cells in order to identify their possible role in the life of Mycobacterium. The Mycobacterium genome has a very high GC content (66%), which raises the possibility of hairpin structures in the genomic sequence. From genome sequence analysis it was observed that the PPE, PE, and PGRS multigene families code for proteins of approximately 110–80 amino acids rich in proline and glutamic acid at the N-terminal position. Proline and glutamic acid residues are mainly coded by GC-rich triplets in the Mycobacterium genome. Most of the genes for membrane proteins of M. tuberculosis were rich in GC content at terminal regions. The presence of high GC content increased the annealing temperature beyond the extension temperature (72°C), and repeated stretches also generate hairpin structures. In such cases, effectiveness and reproducibility of PCR amplification depend on detailed analysis of the possible secondary structures of the oligonucleotide primers as well as formation of self-dimers and cross-dimers with other interacting oligonucleotides [7]. Though these problems have been considered by several investigators, no systematic details are available to approach this problem.
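The link between GC content and annealing temperature can be illustrated with a minimal sketch. It uses the Wallace rule of thumb, Tm = 2(A + T) + 4(G + C) in °C, which is a rough approximation for short oligonucleotides and is introduced here as an assumption; it is not the method used in the study, and dedicated tools apply more accurate nearest-neighbor models. The function names and the example primer are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Invented 20-nt GC-rich primer, for illustration only.
primer = "ATGCGGCCGTCGGCGCTGGC"
print(gc_content(primer))  # percent GC
print(wallace_tm(primer))  # estimated Tm in degrees C
```

Each G or C contributes twice as much to the estimate as an A or T, which is why a GC-rich primer of ordinary length can reach an estimated melting temperature close to or above the 72°C extension step.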

In an attempt to clone GC rich genes (Rv0519c and Rv0774c from M. tuberculosis and ML0314c from M. leprae) from Mycobacterium sp., we designed primers by using standard method for gene amplification. Rv0774c and Rv0519c genes demonstrated 100% nucleotide identity in M. tuberculosis H37Rv and M. tuberculosis H37Ra. Therefore, M. tuberculosis H37Ra chromosomal DNA was used as template for amplification of these two genes. We could amplify Rv0774c gene, but Rv0519c and ML0314c genes having high GC content at terminal region were not amplified by standard PCR procedures. Therefore, an attempt has been made in the present investigation to standardize the conditions and ingredients that favor the amplification of GC rich sequences.

2. Materials and Methods

2.1. Materials

E. coli DH5α cells and pET-28a were procured from Invitrogen. Taq polymerase and dNTPs were purchased from Fermentas, USA. Restriction enzymes were purchased from New England Biolabs. Kanamycin, Middlebrook media, OADC, and Tween 80 were purchased from Hi-Media, India. The strain Mycobacterium tuberculosis H37Ra was a kind gift from the Director of the National Institute of Leprosy and Other Mycobacterial Diseases, Agra, India. Genomic DNA of Mycobacterium leprae was a kind gift from Dr. Mallika Lavania of Stanley Browne Laboratory, the Leprosy Mission, Nandnagri, Shahdara, New Delhi.

2.2. Mycobacterium Genomic DNA Isolation

M. tuberculosis H37Ra strains were routinely cultured for one week in Middlebrook medium with 0.05% Tween 80, enriched with OADC. One-week-old M. tuberculosis H37Ra cells (1.5 mL) were harvested by centrifugation at 5000 ×g for 20 min. The harvested pellet was resuspended in 400 μL Tris-EDTA buffer (100 mM Tris/10 mM EDTA, pH 8). The cells were lysed by placing the sample in a boiling water bath for 5 min followed by cooling on ice for 5 min. Forty microliters of 20 mg/mL lysozyme and 5 μL of proteinase K were added. After 1 h of incubation at 37°C, solution A (56 μL of 10% SDS, 64 μL of CTAB-NaCl) was added to the reaction mixture. After 2 h of incubation at 65°C, proteins were removed by two to three washes with phenol : chloroform : isoamyl alcohol (25 : 24 : 1). The genomic DNA was precipitated with 0.6 volume of isopropanol at room temperature for 1 h. The precipitated DNA was dried, dissolved in sterile water, and stored at −20°C.

2.3. Modification of Primers by Codon Optimization

Codon degeneracy is normally used to overcome this problem, by changing the base at the wobble position of codons in the coding sequence of the Mycobacterium genome. The designed forward primer of Rv0519c has about 64% GC content, and stretches of GC led to the generation of a complicated hairpin structure with a high value of free energy change (ΔG). Careful examination of the hairpin structure showed that introducing a small base change distorted the whole secondary structure. The changes we made in the primer sequence were as follows: the guanine (G) base was changed to adenine (A) at the wobble position of the third codon, CGG, and thymine (T) was changed to adenine (A) in codon CGT (Table 1). Similarly, in the reverse primer of Rv0519c, the adenine (A) base was changed to thymine (T) at the wobble position of the last sixth codon, CGA. The Mycobacterium leprae genome sequence also has long guanine and cytosine stretches, so the reverse primer sequence of the M. leprae gene ML0314c was also modified: guanine was changed to cytosine at the wobble position of the TCG codon. The effect of each modification was analysed with the IDT OligoAnalyzer tool.
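The four wobble-position changes described above can be checked for synonymy with a short sketch. The codon-to-amino-acid assignments are from the standard genetic code, restricted to the arginine (CGN) and serine (TCN) codons that the text mentions; the function names are hypothetical, and the sketch only verifies that the encoded amino acid is unchanged, not the effect on hairpin stability.

```python
# Subset of the standard genetic code covering the codons in the text.
CODON_TO_AA = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
}


def wobble_swap(codon: str, new_base: str) -> str:
    """Replace the third (wobble) base of a codon."""
    return codon[:2] + new_base


def is_synonymous(old: str, new: str) -> bool:
    """True if both codons encode the same amino acid."""
    return CODON_TO_AA[old] == CODON_TO_AA[new]


# (original codon, new wobble base) pairs as described in the text.
changes = [("CGG", "A"), ("CGT", "A"), ("CGA", "T"), ("TCG", "C")]

for codon, base in changes:
    new = wobble_swap(codon, base)
    print(codon, "->", new, CODON_TO_AA[new], is_synonymous(codon, new))
```

All four substitutions stay within a four-fold degenerate codon family, so the native amino acid sequence is preserved while the primer sequence, and with it the predicted hairpin, changes.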