Methylation on gene bodies and 3' UTRs, if copied to mRNA, could potentially regulate post-transcriptional modification or expression. But I'm not sure whether these marks are maintained through transcription or whether RNA methylation is entirely de novo.
I don't believe anything should change in the majority of DNA->RNA transcription. DNA methylation typically occurs on the non-Watson-Crick side of cytosine, so it shouldn't affect base pairing.
However, there are a few hypothetical situations that would result in alterations of the transcribed RNA. Spontaneous deamination of the 4-amino group would convert the base into uracil. If the 5-methyl group is also present, the product is 5-methyluracil, which is thymine.
(edit) I've talked with several genomics folks about this topic and it turns out that M5Cytosine is very resistant to deamination due to the presence of the Methyl group. As a result, the instance that I have just described is actually very rare.
5mCytosine to Thymine deamination
The other situation would be errors in proofreading. Does 5-methylcytosine affect the fidelity of RNA polymerase? I honestly don't know, but it would be worth examining.
There is an article in PNAS, "Conservation and divergence in eukaryotic DNA methylation", whose authors used:
next-generation sequencing to investigate the DNA methylation patterns in eight divergent species, including green algae, flowering plants, insects, and vertebrates. Their data allowed a comprehensive comparison of whole-genome methylation profiles across the plant and animal kingdoms, revealing both conserved and divergent features of DNA methylation in eukaryotes.
So there is some conservation.
Three types of natural methylation have been reported in DNA. Cytosine can be modified either on the ring to form 5-methylcytosine or on the exocyclic amino group to form N4-methylcytosine. Adenine may be modified to form N6-methyladenine. N4-methylcytosine and N6-methyladenine are found only in bacteria and archaea, whereas 5-methylcytosine is widely distributed. Special enzymes called DNA methyltransferases are responsible for this methylation; they recognize specific sequences within the DNA molecule so that only a subset of the bases is modified. Other methylations of the bases or of the deoxyribose are sometimes induced by carcinogens. These usually lead to mispairing of the bases during replication and have to be removed if they are not to become mutagenic.
Natural methylation has many cellular functions. In bacteria and archaea, methylation forms an essential part of the immune system by protecting DNA molecules from fragmentation by restriction endonucleases. In some organisms, methylation helps to eliminate incorrect base sequences introduced during DNA replication. By marking the parental strand with a methyl group, a cellular mechanism known as the mismatch repair system distinguishes between the newly replicated strand where the errors occur and the correct sequence on the template strand. In higher eukaryotes, 5-methylcytosine controls many cellular phenomena by preventing DNA transcription. Methylation is also believed to signal imprinting, a process whereby some genes inherited from one parent are selectively inactivated. Correct methylation may also repress or activate key genes that control embryonic development. On the other hand, 5-methylcytosine is potentially mutagenic because thymine produced during the methylation process converts C:G pairs to T:A pairs. In mammals, methylation takes place selectively within the dinucleotide sequence CG—a rare sequence, presumably because it has been lost by mutation. In many cancers, mutations are found in key genes at CG dinucleotides.
The DNA methylation landscape of vertebrates is very particular compared to other organisms. In mammals, around 75% of CpG dinucleotides are methylated in somatic cells,  and DNA methylation appears as a default state that has to be specifically excluded from defined locations.   By contrast, the genome of most plants, invertebrates, fungi, or protists show “mosaic” methylation patterns, where only specific genomic elements are targeted, and they are characterized by the alternation of methylated and unmethylated domains.  
High CpG methylation in mammalian genomes has an evolutionary cost because it increases the frequency of spontaneous mutations. Loss of amino-groups occurs with a high frequency for cytosines, with different consequences depending on their methylation. Methylated C residues spontaneously deaminate to form T residues over time hence CpG dinucleotides steadily deaminate to TpG dinucleotides, which is evidenced by the under-representation of CpG dinucleotides in the human genome (they occur at only 21% of the expected frequency).  (On the other hand, spontaneous deamination of unmethylated C residues gives rise to U residues, a change that is quickly recognized and repaired by the cell.)
CpG islands
In mammals, the only exception for this global CpG depletion resides in a specific category of GC- and CpG-rich sequences termed CpG islands that are generally unmethylated and therefore retained the expected CpG content.  CpG islands are usually defined as regions with: 1) a length greater than 200bp, 2) a G+C content greater than 50%, 3) a ratio of observed to expected CpG greater than 0.6, although other definitions are sometimes used.  Excluding repeated sequences, there are around 25,000 CpG islands in the human genome, 75% of which being less than 850bp long.  They are major regulatory units and around 50% of CpG islands are located in gene promoter regions, while another 25% lie in gene bodies, often serving as alternative promoters. Reciprocally, around 60-70% of human genes have a CpG island in their promoter region.   The majority of CpG islands are constitutively unmethylated and enriched for permissive chromatin modification such as H3K4 methylation. In somatic tissues, only 10% of CpG islands are methylated, the majority of them being located in intergenic and intragenic regions.
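The three criteria above can be checked mechanically. The following is a minimal Python sketch, not a reference implementation: the function names are hypothetical, and the expected CpG count is taken to be (number of C × number of G) / length, a commonly used convention that the text itself does not spell out.

```python
def cpg_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Assumption: expected CpG count if C and G were distributed independently
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds quoted in the text (all strict inequalities)."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n > min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    print(cpg_stats("CG" * 150))
    print(looks_like_cpg_island("CG" * 150))
    print(looks_like_cpg_island("AT" * 150))
```

Real island finders scan a genome with a sliding window and merge adjacent hits; this sketch only scores one candidate region.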
Repression of CpG-dense promoters
DNA methylation was probably present at some extent in very early eukaryote ancestors. In virtually every organism analyzed, methylation in promoter regions correlates negatively with gene expression.   CpG-dense promoters of actively transcribed genes are never methylated, but, reciprocally, transcriptionally silent genes do not necessarily carry a methylated promoter. In mouse and human, around 60–70% of genes have a CpG island in their promoter region and most of these CpG islands remain unmethylated independently of the transcriptional activity of the gene, in both differentiated and undifferentiated cell types.   Of note, whereas DNA methylation of CpG islands is unambiguously linked with transcriptional repression, the function of DNA methylation in CG-poor promoters remains unclear albeit there is little evidence that it could be functionally relevant. 
DNA methylation may affect the transcription of genes in two ways. First, the methylation of DNA itself may physically impede the binding of transcriptional proteins to the gene,  and second, and likely more important, methylated DNA may be bound by proteins known as methyl-CpG-binding domain proteins (MBDs). MBD proteins then recruit additional proteins to the locus, such as histone deacetylases and other chromatin remodeling proteins that can modify histones, thereby forming compact, inactive chromatin, termed heterochromatin. This link between DNA methylation and chromatin structure is very important. In particular, loss of methyl-CpG-binding protein 2 (MeCP2) has been implicated in Rett syndrome and methyl-CpG-binding domain protein 2 (MBD2) mediates the transcriptional silencing of hypermethylated genes in "cancer".
Repression of transposable elements
DNA methylation is a powerful transcriptional repressor, at least in CpG dense contexts. Transcriptional repression of protein-coding genes appears essentially limited to very specific classes of genes that need to be silent permanently and in almost all tissues. While DNA methylation does not have the flexibility required for the fine-tuning of gene regulation, its stability is perfect to ensure the permanent silencing of transposable elements.  Transposon control is one of the most ancient functions of DNA methylation that is shared by animals, plants and multiple protists.  It is even suggested that DNA methylation evolved precisely for this purpose. 
Genome expansion
DNA methylation of transposable elements has been known to be related to genome expansion. However, the evolutionary driver for genome expansion remains unknown. There is a clear correlation between the size of the genome and CpG, suggesting that the DNA methylation of transposable elements led to a noticeable increase in the mass of DNA. 
Methylation of the gene body of highly transcribed genes
A function that appears even more conserved than transposon silencing is positively correlated with gene expression. In almost all species where DNA methylation is present, DNA methylation is especially enriched in the body of highly transcribed genes.   The function of gene body methylation is not well understood. A body of evidence suggests that it could regulate splicing  and suppress the activity of intragenic transcriptional units (cryptic promoters or transposable elements).  Gene-body methylation appears closely tied to H3K36 methylation. In yeast and mammals, H3K36 methylation is highly enriched in the body of highly transcribed genes. In yeast at least, H3K36me3 recruits enzymes such as histone deacetylases to condense chromatin and prevent the activation of cryptic start sites.  In mammals, DNMT3a and DNMT3b PWWP domain binds to H3K36me3 and the two enzymes are recruited to the body of actively transcribed genes.
During embryonic development
DNA methylation patterns are largely erased and then re-established between generations in mammals. Almost all of the methylations from the parents are erased, first during gametogenesis, and again in early embryogenesis, with demethylation and remethylation occurring each time. Demethylation in early embryogenesis occurs in the preimplantation period in two stages – initially in the zygote, then during the first few embryonic replication cycles of morula and blastula. A wave of methylation then takes place during the implantation stage of the embryo, with CpG islands protected from methylation. This results in global repression and allows housekeeping genes to be expressed in all cells. In the post-implantation stage, methylation patterns are stage- and tissue-specific, with changes that would define each individual cell type lasting stably over a long period. 
Whereas DNA methylation is not necessary per se for transcriptional silencing, it is nonetheless thought to represent a “locked” state that definitively inactivates transcription. In particular, DNA methylation appears critical for the maintenance of mono-allelic silencing in the context of genomic imprinting and X chromosome inactivation. In these cases, expressed and silent alleles differ by their methylation status, and loss of DNA methylation results in loss of imprinting and re-expression of Xist in somatic cells. During embryonic development, few genes change their methylation status, with the important exception of many genes specifically expressed in the germline. DNA methylation appears absolutely required in differentiated cells, as knockout of any of the three competent DNA methyltransferases results in embryonic or post-partum lethality. By contrast, DNA methylation is dispensable in undifferentiated cell types, such as the inner cell mass of the blastocyst, primordial germ cells or embryonic stem cells. Since DNA methylation appears to directly regulate only a limited number of genes, how precisely the absence of DNA methylation causes the death of differentiated cells remains an open question.
Due to the phenomenon of genomic imprinting, maternal and paternal genomes are differentially marked and must be properly reprogrammed every time they pass through the germline. Therefore, during gametogenesis, primordial germ cells must have their original biparental DNA methylation patterns erased and re-established based on the sex of the transmitting parent. After fertilization, the paternal and maternal genomes are once again demethylated and remethylated (except for differentially methylated regions associated with imprinted genes). This reprogramming is likely required for totipotency of the newly formed embryo and erasure of acquired epigenetic changes. 
In cancer
In many disease processes, such as cancer, gene promoter CpG islands acquire abnormal hypermethylation, which results in transcriptional silencing that can be inherited by daughter cells following cell division.  Alterations of DNA methylation have been recognized as an important component of cancer development. Hypomethylation, in general, arises earlier and is linked to chromosomal instability and loss of imprinting, whereas hypermethylation is associated with promoters and can arise secondary to gene (oncogene suppressor) silencing, but might be a target for epigenetic therapy. 
Global hypomethylation has also been implicated in the development and progression of cancer through different mechanisms.  Typically, there is hypermethylation of tumor suppressor genes and hypomethylation of oncogenes. 
Generally, in progression to cancer, hundreds of genes are silenced or activated. Although silencing of some genes in cancers occurs by mutation, a large proportion of carcinogenic gene silencing is a result of altered DNA methylation (see DNA methylation in cancer). DNA methylation causing silencing in cancer typically occurs at multiple CpG sites in the CpG islands that are present in the promoters of protein coding genes.
Altered expressions of microRNAs also silence or activate many genes in progression to cancer (see microRNAs in cancer). Altered microRNA expression occurs through hyper/hypo-methylation of CpG sites in CpG islands in promoters controlling transcription of the microRNAs.
Silencing of DNA repair genes through methylation of CpG islands in their promoters appears to be especially important in progression to cancer (see methylation of DNA repair genes in cancer).
In atherosclerosis
Epigenetic modifications such as DNA methylation have been implicated in cardiovascular disease, including atherosclerosis. In animal models of atherosclerosis, vascular tissue, as well as blood cells such as mononuclear blood cells, exhibit global hypomethylation with gene-specific areas of hypermethylation. DNA methylation polymorphisms may be used as an early biomarker of atherosclerosis since they are present before lesions are observed, which may provide an early tool for detection and risk prevention. 
Two of the cell types targeted for DNA methylation polymorphisms are monocytes and lymphocytes, which experience an overall hypomethylation. One proposed mechanism behind this global hypomethylation is elevated homocysteine levels causing hyperhomocysteinemia, a known risk factor for cardiovascular disease. High plasma levels of homocysteine inhibit DNA methyltransferases, which causes hypomethylation. Hypomethylation of DNA affects genes that alter smooth muscle cell proliferation, cause endothelial cell dysfunction, and increase inflammatory mediators, all of which are critical in forming atherosclerotic lesions.  High levels of homocysteine also result in hypermethylation of CpG islands in the promoter region of the estrogen receptor alpha (ERα) gene, causing its down regulation.  ERα protects against atherosclerosis due to its action as a growth suppressor, causing the smooth muscle cells to remain in a quiescent state.  Hypermethylation of the ERα promoter thus allows intimal smooth muscle cells to proliferate excessively and contribute to the development of the atherosclerotic lesion. 
Another gene that experiences a change in methylation status in atherosclerosis is the monocarboxylate transporter (MCT3), which produces a protein responsible for the transport of lactate and other ketone bodies out of many cell types, including vascular smooth muscle cells. In atherosclerosis patients, there is an increase in methylation of the CpG islands in exon 2, which decreases MCT3 protein expression. The downregulation of MCT3 impairs lactate transport and significantly increases smooth muscle cell proliferation, which further contributes to the atherosclerotic lesion. In an ex vivo experiment, the demethylating agent decitabine (5-aza-2'-deoxycytidine) was shown to induce MCT3 expression in a dose-dependent manner, as all hypermethylated sites in the exon 2 CpG island became demethylated after treatment. Decitabine may therefore serve as a novel therapeutic agent to treat atherosclerosis, although no human studies have been conducted thus far.
In heart failure
In addition to atherosclerosis described above, specific epigenetic changes have been identified in the failing human heart. This may vary by disease etiology. For example, in ischemic heart failure DNA methylation changes have been linked to changes in gene expression that may direct gene expression associated with the changes in heart metabolism known to occur.  Additional forms of heart failure (e.g. diabetic cardiomyopathy) and co-morbidities (e.g. obesity) must be explored to see how common these mechanisms are. Most strikingly, in failing human heart these changes in DNA methylation are associated with racial and socioeconomic status which further impact how gene expression is altered,  and may influence how the individual's heart failure should be treated.
In aging
In humans and other mammals, DNA methylation levels can be used to accurately estimate the age of tissues and cell types, forming an accurate epigenetic clock. 
A longitudinal study of twin children showed that, between the ages of 5 and 10, there was divergence of methylation patterns due to environmental rather than genetic influences.  There is a global loss of DNA methylation during aging. 
A study that analyzed the complete DNA methylomes of CD4+ T cells from a newborn, a 26-year-old individual and a 103-year-old individual observed that the loss of methylation is proportional to age. Hypomethylated CpGs observed in the centenarian DNA, compared with the neonate, covered all genomic compartments (promoters, intergenic, intronic and exonic regions). However, some genes become hypermethylated with age, including genes for the estrogen receptor, p16, and insulin-like growth factor 2.
In exercise
High intensity exercise has been shown to result in reduced DNA methylation in skeletal muscle.  Promoter methylation of PGC-1α and PDK4 were immediately reduced after high intensity exercise, whereas PPAR-γ methylation was not reduced until three hours after exercise.  At the same time, six months of exercise in previously sedentary middle-age men resulted in increased methylation in adipose tissue.  One study showed a possible increase in global genomic DNA methylation of white blood cells with more physical activity in non-Hispanics. 
In B-cell differentiation
A study that investigated the methylome of B cells along their differentiation cycle, using whole-genome bisulfite sequencing (WGBS), showed that there is a hypomethylation from the earliest stages to the most differentiated stages. The largest methylation difference is between the stages of germinal center B cells and memory B cells. Furthermore, this study showed that there is a similarity between B cell tumors and long-lived B cells in their DNA methylation signatures. 
In the brain
Two reviews summarize evidence that DNA methylation alterations in brain neurons are important in learning and memory. Contextual fear conditioning (a form of associative learning) in animals, such as mice and rats, is rapid and is extremely robust in creating memories. In mice and in rats, contextual fear conditioning is associated, within 1–24 hours, with altered methylation of several thousand DNA cytosines in genes of hippocampus neurons. Twenty-four hours after contextual fear conditioning, 9.2% of the genes in rat hippocampus neurons are differentially methylated. In mice, when examined at four weeks after conditioning, the hippocampus methylations and demethylations had been reset to the original naive conditions. The hippocampus is needed to form memories, but memories are not stored there. For such mice, at four weeks after contextual fear conditioning, substantial differential CpG methylations and demethylations occurred in cortical neurons during memory maintenance, and there were 1,223 differentially methylated genes in their anterior cingulate cortex. Active changes in neuronal DNA methylation and demethylation appear to act as controllers of synaptic scaling and glutamate receptor trafficking in learning and memory formation.
In mammalian cells, DNA methylation occurs mainly at the C5 position of CpG dinucleotides and is carried out by two general classes of enzymatic activities – maintenance methylation and de novo methylation. 
Maintenance methylation activity is necessary to preserve DNA methylation after every cellular DNA replication cycle. Without the DNA methyltransferase (DNMT), the replication machinery itself would produce daughter strands that are unmethylated and, over time, would lead to passive demethylation. DNMT1 is the proposed maintenance methyltransferase that is responsible for copying DNA methylation patterns to the daughter strands during DNA replication. Mouse models with both copies of DNMT1 deleted are embryonic lethal at approximately day 9, due to the requirement of DNMT1 activity for development in mammalian cells.
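The passive demethylation described above can be made concrete with a toy calculation. This is a deliberately simplified sketch under stated assumptions: after each replication the parental strand keeps its methylation, the new strand is methylated with probability equal to a hypothetical `maintenance_efficiency` parameter, and there is no de novo or active demethylation activity. Under those assumptions the average methylation level is multiplied by (1 + efficiency) / 2 per division.

```python
def methylation_after_divisions(initial_level, maintenance_efficiency, divisions):
    """Track average CpG methylation over successive replication cycles.

    initial_level: fraction of methylated CpG sites at the start (0..1)
    maintenance_efficiency: fraction of hemimethylated sites restored to full
        methylation after replication (1.0 = perfect maintenance, 0.0 = none)
    divisions: number of replication cycles to simulate
    """
    levels = [initial_level]
    for _ in range(divisions):
        # Old strand keeps its marks; the new strand copies them with the given efficiency
        levels.append(levels[-1] * (1 + maintenance_efficiency) / 2)
    return levels


if __name__ == "__main__":
    # No maintenance activity: the level halves with every division
    print(methylation_after_divisions(0.8, 0.0, 3))
    # Perfect maintenance: the level is stable
    print(methylation_after_divisions(0.8, 1.0, 3))
```

With no maintenance the level halves each cycle, which is the dilution the paragraph refers to; with perfect maintenance it stays constant.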
It is thought that DNMT3a and DNMT3b are the de novo methyltransferases that set up DNA methylation patterns early in development. DNMT3L is a protein that is homologous to the other DNMT3s but has no catalytic activity. Instead, DNMT3L assists the de novo methyltransferases by increasing their ability to bind to DNA and stimulating their activity. Mice and rats have a third functional de novo methyltransferase enzyme named DNMT3C, which evolved as a paralog of Dnmt3b by tandem duplication in the common ancestor of Muroidea rodents. DNMT3C catalyzes the methylation of promoters of transposable elements during early spermatogenesis, an activity shown to be essential for their epigenetic repression and male fertility. It is not yet clear whether other mammals that do not have DNMT3C (like humans) rely on DNMT3B or DNMT3A for de novo methylation of transposable elements in the germline. Finally, DNMT2 (TRDMT1) has been identified as a DNA methyltransferase homolog, containing all 10 sequence motifs common to all DNA methyltransferases; however, DNMT2 (TRDMT1) does not methylate DNA but instead methylates cytosine-38 in the anticodon loop of aspartic acid transfer RNA.
Since many tumor suppressor genes are silenced by DNA methylation during carcinogenesis, there have been attempts to re-express these genes by inhibiting the DNMTs. 5-Aza-2'-deoxycytidine (decitabine) is a nucleoside analog that inhibits DNMTs by trapping them in a covalent complex on DNA by preventing the β-elimination step of catalysis, thus resulting in the enzymes' degradation. However, for decitabine to be active, it must be incorporated into the genome of the cell, which can cause mutations in the daughter cells if the cell does not die. In addition, decitabine is toxic to the bone marrow, which limits the size of its therapeutic window. These pitfalls have led to the development of antisense RNA therapies that target the DNMTs by degrading their mRNAs and preventing their translation. However, it is currently unclear whether targeting DNMT1 alone is sufficient to reactivate tumor suppressor genes silenced by DNA methylation.
Significant progress has been made in understanding DNA methylation in the model plant Arabidopsis thaliana. DNA methylation in plants differs from that of mammals: while DNA methylation in mammals mainly occurs on the cytosine nucleotide in a CpG site, in plants the cytosine can be methylated at CpG, CpHpG, and CpHpH sites, where H represents any nucleotide other than guanine. Overall, Arabidopsis DNA is highly methylated; mass spectrometry analysis estimated 14% of cytosines to be modified.
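To illustrate the three plant sequence contexts, here is a small Python sketch that labels every cytosine on the given strand as CG, CHG or CHH from the two bases that follow it. The function name is hypothetical, only the strand passed in is examined (the opposite strand would need the reverse complement), and the input is assumed to contain only A, C, G and T.

```python
def cytosine_contexts(seq):
    """Label each C on the given strand as 'CG', 'CHG', 'CHH' or 'unknown'.

    'unknown' is used when the sequence ends before the context can be decided.
    Returns a list of (position, context) tuples with 0-based positions.
    """
    seq = seq.upper()
    contexts = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1:i + 3]  # up to two bases after the cytosine
        if downstream[:1] == "G":
            label = "CG"
        elif len(downstream) < 2:
            label = "unknown"
        elif downstream[1] == "G":
            label = "CHG"  # first downstream base is not G, second is G
        else:
            label = "CHH"  # neither downstream base is G
        contexts.append((i, label))
    return contexts


if __name__ == "__main__":
    for position, context in cytosine_contexts("CGACAGCTTC"):
        print(position, context)
```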
The principal Arabidopsis DNA methyltransferase enzymes, which transfer and covalently attach methyl groups onto DNA, are DRM2, MET1, and CMT3. Both the DRM2 and MET1 proteins share significant homology to the mammalian methyltransferases DNMT3 and DNMT1, respectively, whereas the CMT3 protein is unique to the plant kingdom. There are currently two classes of DNA methyltransferases: 1) the de novo class or enzymes that create new methylation marks on the DNA 2) a maintenance class that recognizes the methylation marks on the parental strand of DNA and transfers new methylation to the daughter strands after DNA replication. DRM2 is the only enzyme that has been implicated as a de novo DNA methyltransferase. DRM2 has also been shown, along with MET1 and CMT3 to be involved in maintaining methylation marks through DNA replication.  Other DNA methyltransferases are expressed in plants but have no known function (see the Chromatin Database).
It is not clear how the cell determines the locations of de novo DNA methylation, but evidence suggests that for many (though not all) locations, RNA-directed DNA methylation (RdDM) is involved. In RdDM, specific RNA transcripts are produced from a genomic DNA template, and this RNA forms secondary structures called double-stranded RNA molecules.  The double-stranded RNAs, through either the small interfering RNA (siRNA) or microRNA (miRNA) pathways direct de-novo DNA methylation of the original genomic location that produced the RNA.  This sort of mechanism is thought to be important in cellular defense against RNA viruses and/or transposons, both of which often form a double-stranded RNA that can be mutagenic to the host genome. By methylating their genomic locations, through an as yet poorly understood mechanism, they are shut off and are no longer active in the cell, protecting the genome from their mutagenic effect. Recently, it was described that methylation of the DNA is the main determinant of embryogenic cultures formation from explants in woody plants and is regarded the main mechanism that explains the poor response of mature explants to somatic embryogenesis in the plants (Isah 2016).
Diverse orders of insects show varied patterns of DNA methylation, from almost undetectable levels in flies to low levels in butterflies and higher in true bugs and some cockroaches (up to 14% of all CG sites in Blattella asahinai). 
Functional DNA methylation has been discovered in honey bees. DNA methylation marks are mainly on the gene body, and the current view is that DNA methylation functions in gene regulation via alternative splicing.
DNA methylation levels in Drosophila melanogaster are nearly undetectable. Sensitive methods applied to Drosophila DNA suggest levels in the range of 0.1–0.3% of total cytosine. This low level of methylation appears to reside in genomic sequence patterns that are very different from patterns seen in humans, or in other animal or plant species to date. Genomic methylation in D. melanogaster was found at specific short motifs (concentrated in specific 5-base sequence motifs that are CA- and CT-rich but depleted of guanine) and is independent of DNMT2 activity. Further, highly sensitive mass spectrometry approaches have now demonstrated the presence of low (0.07%) but significant levels of adenine methylation during the earliest stages of Drosophila embryogenesis.
Many fungi have low levels (0.1 to 0.5%) of cytosine methylation, whereas other fungi have as much as 5% of the genome methylated. This value seems to vary both among species and among isolates of the same species. There is also evidence that DNA methylation may be involved in state-specific control of gene expression in fungi. However, at a detection limit of 250 attomoles using ultra-sensitive mass spectrometry, DNA methylation was not confirmed in single-celled yeast species such as Saccharomyces cerevisiae or Schizosaccharomyces pombe, indicating that yeasts do not possess this DNA modification.
Although brewers' yeast (Saccharomyces), fission yeast (Schizosaccharomyces), and Aspergillus flavus  have no detectable DNA methylation, the model filamentous fungus Neurospora crassa has a well-characterized methylation system.  Several genes control methylation in Neurospora and mutation of the DNA methyl transferase, dim-2, eliminates all DNA methylation but does not affect growth or sexual reproduction. While the Neurospora genome has very little repeated DNA, half of the methylation occurs in repeated DNA including transposon relics and centromeric DNA. The ability to evaluate other important phenomena in a DNA methylase-deficient genetic background makes Neurospora an important system in which to study DNA methylation.
DNA methylation is largely absent from Dictyostelium discoideum, where it appears to occur at about 0.006% of cytosines. In contrast, DNA methylation is widely distributed in Physarum polycephalum, where 5-methylcytosine makes up as much as 8% of total cytosine.
Adenine or cytosine methylation is part of the restriction modification system of many bacteria, in which specific DNA sequences are methylated periodically throughout the genome. A methylase is the enzyme that recognizes a specific sequence and methylates one of the bases in or near that sequence. Foreign DNAs (which are not methylated in this manner) that are introduced into the cell are degraded by sequence-specific restriction enzymes and cleaved. Bacterial genomic DNA is not recognized by these restriction enzymes. The methylation of native DNA acts as a sort of primitive immune system, allowing the bacteria to protect themselves from infection by bacteriophage.
E. coli DNA adenine methyltransferase (Dam) is an enzyme of about 32 kDa that does not belong to a restriction/modification system. The target recognition sequence for E. coli Dam is GATC, as the methylation occurs at the N6 position of the adenine in this sequence (GmeATC). The three base pairs flanking each side of this site also influence DNA–Dam binding. Dam plays several key roles in bacterial processes, including mismatch repair, the timing of DNA replication, and gene expression. As a result of DNA replication, the status of GATC sites in the E. coli genome changes from fully methylated to hemimethylated. This is because adenine introduced into the new DNA strand is unmethylated. Re-methylation occurs within two to four seconds, during which time replication errors in the new strand are repaired. Methylation, or its absence, is the marker that allows the repair apparatus of the cell to differentiate between the template and nascent strands. It has been shown that altering Dam activity in bacteria results in an increased spontaneous mutation rate. Bacterial viability is compromised in dam mutants that also lack certain other DNA repair enzymes, providing further evidence for the role of Dam in DNA repair.
One region of the DNA that keeps its hemimethylated status for longer is the origin of replication, which has an abundance of GATC sites. This is central to the bacterial mechanism for timing DNA replication. SeqA binds to the origin of replication, sequestering it and thus preventing methylation. Because hemimethylated origins of replication are inactive, this mechanism limits DNA replication to once per cell cycle.
Expression of certain genes, for example, those coding for pilus expression in E. coli, is regulated by the methylation of GATC sites in the promoter region of the gene operon. The cells' environmental conditions just after DNA replication determine whether Dam is blocked from methylating a region proximal to or distal from the promoter region. Once the pattern of methylation has been created, the pilus gene transcription is locked in the on or off position until the DNA is again replicated. In E. coli, these pili operons have important roles in virulence in urinary tract infections. It has been proposed that inhibitors of Dam may function as antibiotics.
On the other hand, DNA cytosine methylase targets CCAGG and CCTGG sites to methylate cytosine at the C5 position (C meC(A/T) GG). The other methylase enzyme, EcoKI, causes methylation of adenines in the sequences AAC(N6)GTGC and GCAC(N6)GTT.
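The Dam and Dcm recognition motifs described above are simple enough to locate with a pattern scan. The following is a minimal sketch, assuming a plain uppercase DNA string on one strand; the example sequence and the helper name `find_sites` are made up for illustration, and the scan says nothing about whether a site is actually methylated in a cell.

```python
import re

# Recognition motifs as given in the text above.
DAM_MOTIF = "GATC"        # Dam methylates the adenine at N6
DCM_MOTIF = "CC[AT]GG"    # Dcm methylates the second cytosine at C5


def find_sites(sequence, motif):
    """Return 0-based start positions of every occurrence of motif."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile(f"(?={motif})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


example = "TTGATCAACCAGGTTCCTGGAGATC"  # hypothetical sequence
dam_positions = find_sites(example, DAM_MOTIF)
dcm_positions = find_sites(example, DCM_MOTIF)
print("Dam (GATC) sites:", dam_positions)
print("Dcm (CCWGG) sites:", dcm_positions)
```

Because GATC is its own reverse complement, scanning one strand is enough to find Dam sites on both strands; the same holds for CCAGG/CCTGG taken together.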
In Clostridioides difficile, DNA methylation at the target motif CAAAAA was shown to impact sporulation, a key step in disease transmission, as well as cell length, biofilm formation and host colonization. 
Molecular cloning
Most strains used by molecular biologists are derivatives of E. coli K-12, and possess both Dam and Dcm, but there are commercially available strains that are dam-/dcm- (lack of activity of either methylase). In fact, it is possible to unmethylate the DNA extracted from dam+/dcm+ strains by transforming it into dam-/dcm- strains. This would help digest sequences that are not being recognized by methylation-sensitive restriction enzymes.  
The restriction enzyme DpnI can recognize 5'-GmeATC-3' sites and digest the methylated DNA. Being such a short motif, it occurs frequently in sequences by chance, and as such its primary use for researchers is to degrade template DNA following PCRs (PCR products lack methylation, as no methylases are present in the reaction). Similarly, some commercially available restriction enzymes are sensitive to methylation at their cognate restriction sites and must as mentioned previously be used on DNA passed through a dam-/dcm- strain to allow cutting.
DNA methylation can be detected by the following assays currently used in scientific research: 
- Mass spectrometry (MS) is a very sensitive and reliable analytical method to detect DNA methylation. MS, in general, is however not informative about the sequence context of the methylation, and is thus limited in studying the function of this DNA modification.
- Methylation-specific PCR (MSP), which is based on a chemical reaction of sodium bisulfite with DNA that converts unmethylated cytosines of CpG dinucleotides to uracil (UpG), followed by traditional PCR. Methylated cytosines are not converted in this process, and primers are designed to overlap the CpG site of interest, which allows one to determine methylation status as methylated or unmethylated.
- Whole-genome bisulfite sequencing, also known as BS-Seq, which is a high-throughput genome-wide analysis of DNA methylation. It is based on the aforementioned sodium bisulfite conversion of genomic DNA, which is then sequenced on a next-generation sequencing platform. The sequences obtained are then re-aligned to the reference genome to determine the methylation status of CpG dinucleotides based on mismatches resulting from the conversion of unmethylated cytosines into uracil.
- Reduced representation bisulfite sequencing, also known as RRBS, exists in several working protocols. The first RRBS protocol was called RRBS and aims for around 10% of the methylome; a reference genome is needed. Later protocols were able to sequence a smaller portion of the genome with higher sample multiplexing. EpiGBS was the first protocol in which 96 samples could be multiplexed in one lane of Illumina sequencing and in which a reference genome was no longer needed. De novo reference construction from the Watson and Crick reads made it possible to screen populations for SNPs and SMPs simultaneously.
- The HELP assay, which is based on restriction enzymes' differential ability to recognize and cleave methylated and unmethylated CpG DNA sites.
- GLAD-PCR assay, which is based on a new type of enzyme: site-specific methyl-directed DNA endonucleases, which hydrolyze only methylated DNA.
- ChIP-on-chip assays, which are based on the ability of commercially prepared antibodies to bind to DNA methylation-associated proteins like MeCP2.
- Restriction landmark genomic scanning, a complicated and now rarely used assay based upon restriction enzymes' differential recognition of methylated and unmethylated CpG sites; the assay is similar in concept to the HELP assay.
- Methylated DNA immunoprecipitation (MeDIP), analogous to chromatin immunoprecipitation: immunoprecipitation is used to isolate methylated DNA fragments for input into DNA detection methods such as DNA microarrays (MeDIP-chip) or DNA sequencing (MeDIP-seq).
- Pyrosequencing of bisulfite-treated DNA. This is the sequencing of an amplicon made by a normal forward primer but a biotinylated reverse primer to PCR the gene of choice. The Pyrosequencer then analyses the sample by denaturing the DNA and adding one nucleotide at a time to the mix according to a sequence given by the user. If there is a mismatch, it is recorded and the percentage of DNA for which the mismatch is present is noted. This gives the user a percentage of methylation per CpG island.
- Molecular break light assay for DNA adenine methyltransferase activity – an assay that relies on the specificity of the restriction enzyme DpnI for fully methylated (adenine methylation) GATC sites in an oligonucleotide labeled with a fluorophore and quencher. The adenine methyltransferase methylates the oligonucleotide making it a substrate for DpnI. Cutting of the oligonucleotide by DpnI gives rise to a fluorescence increase. 
- Methyl Sensitive Southern Blotting is similar to the HELP assay, although it uses Southern blotting techniques to probe gene-specific differences in methylation using restriction digests. This technique is used to evaluate local methylation near the binding site for the probe.
- MethylCpG Binding Proteins (MBPs) and fusion proteins containing just the Methyl Binding Domain (MBD) are used to separate native DNA into methylated and unmethylated fractions. The percentage methylation of individual CpG islands can be determined by quantifying the amount of the target in each fraction. Extremely sensitive detection can be achieved in FFPE tissues with abscription-based detection.
- High Resolution Melt Analysis (HRM or HRMA) is a post-PCR analytical technique. The target DNA is treated with sodium bisulfite, which chemically converts unmethylated cytosines into uracils, while methylated cytosines are preserved. PCR amplification is then carried out with primers designed to amplify both methylated and unmethylated templates. After this amplification, highly methylated DNA sequences contain a higher number of CpG sites compared to unmethylated templates, which results in a different melting temperature that can be used in quantitative methylation detection.
- Ancient DNA methylation reconstruction, a method to reconstruct high-resolution DNA methylation from ancient DNA samples. The method is based on the natural degradation processes that occur in ancient DNA: with time, methylated cytosines are degraded into thymines, whereas unmethylated cytosines are degraded into uracils. This asymmetry in degradation signals was used to reconstruct the full methylation maps of the Neanderthal and the Denisovan.  In September 2019, researchers published a novel method to infer morphological traits from DNA methylation data. The authors were able to show that linking down-regulated genes to phenotypes of monogenic diseases, where one or two copies of a gene are perturbed, allows for
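Several of the assays listed above rest on the same bisulfite logic: unmethylated cytosines end up read as thymine, methylated cytosines still read as cytosine, and comparison with a reference reveals which is which. The following is a minimal sketch of that idea on a single strand, assuming perfect conversion and an error-free read; the sequence, the positions, and the function names are hypothetical, and real pipelines must also handle both strands, alignment, and incomplete conversion.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T (C -> U -> T after amplification);
    methylated C is protected and still reads as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Classify every reference cytosine by what the converted read shows."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), converted_read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = "methylated"
        elif read_base == "T":
            calls[i] = "unmethylated"
        else:
            calls[i] = "ambiguous"
    return calls


reference = "ACGTTCGACCGA"  # hypothetical reference fragment
read = bisulfite_convert(reference, methylated_positions=[1, 9])
calls = call_methylation(reference, read)
print(read)
print(calls)
```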
Differentially methylated regions (DMRs) are genomic regions with different methylation statuses among multiple samples (tissues, cells, individuals or others), and are regarded as possible functional regions involved in gene transcriptional regulation. The identification of DMRs among multiple tissues (T-DMRs) provides a comprehensive survey of epigenetic differences among human tissues. For example, methylated regions that are unique to a particular tissue allow investigators to differentiate between tissue types, such as semen and vaginal fluid. Research conducted by Lee et al. showed that DACT1 and USP49 positively identified semen by examining T-DMRs. The use of T-DMRs has proven useful in the identification of various body fluids found at crime scenes. Researchers in the forensic field are currently seeking novel T-DMRs in genes to use as markers in forensic DNA analysis. DMRs between cancer and normal samples (C-DMRs) demonstrate the aberrant methylation in cancers. It is well known that DNA methylation is associated with cell differentiation and proliferation. Many DMRs have been found in the development stages (D-DMRs) and in the reprogramming process (R-DMRs). In addition, there are intra-individual DMRs (Intra-DMRs) with longitudinal changes in global DNA methylation along with the increase of age in a given individual. There are also inter-individual DMRs (Inter-DMRs) with different methylation patterns among multiple individuals.
QDMR (Quantitative Differentially Methylated Regions) is a quantitative approach to quantify methylation difference and identify DMRs from genome-wide methylation profiles by adapting Shannon entropy.  The platform-free and species-free nature of QDMR makes it potentially applicable to various methylation data. This approach provides an effective tool for the high-throughput identification of the functional regions involved in epigenetic regulation. QDMR can be used as an effective tool for the quantification of methylation difference and identification of DMRs across multiple samples. 
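To give a feel for why entropy is a natural measure here, the sketch below computes plain Shannon entropy over a region's methylation levels across samples. It is an illustration only and is not QDMR's actual formulation, which adapts the entropy measure in ways not described in this text; the function name and the example values are assumptions. A region methylated equally in all samples has high entropy, while a region methylated in a single sample has low entropy, which is the pattern expected of a DMR.

```python
import math


def methylation_entropy(levels):
    """Shannon entropy (in bits) of methylation levels across samples.

    levels: methylation fractions in [0, 1], one value per sample.
    The values are first normalised so that they sum to 1.
    """
    total = sum(levels)
    if total == 0:
        # No methylation in any sample: treat the region as uniform.
        return math.log2(len(levels))
    probabilities = [level / total for level in levels]
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


uniform_region = [0.8, 0.8, 0.8, 0.8]    # same level in all four samples
specific_region = [0.9, 0.0, 0.0, 0.0]   # methylated in one sample only

uniform_entropy = methylation_entropy(uniform_region)
specific_entropy = methylation_entropy(specific_region)
print(uniform_entropy)    # close to the maximum, log2(4) = 2 bits
print(specific_entropy)   # the minimum, zero bits
```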
Gene-set analysis (a.k.a. pathway analysis; usually performed with tools such as DAVID, GoSeq or GSEA) has been shown to be severely biased when applied to high-throughput methylation data (e.g. MeDIP-seq, MeDIP-ChIP, HELP-seq etc.), and a wide range of studies have thus mistakenly reported hyper-methylation of genes related to development and differentiation. It has been suggested that this can be corrected using sample label permutations or using a statistical model to control for differences in the numbers of CpG probes / CpG sites that target each gene.
DNA methylation marks – genomic regions with specific methylation patterns in a specific biological state such as tissue, cell type, individual – are regarded as possible functional regions involved in gene transcriptional regulation. Although various human cell types may have the same genome, these cells have different methylomes. The systematic identification and characterization of methylation marks across cell types are crucial to understanding the complex regulatory network for cell fate determination. Hongbo Liu et al. proposed an entropy-based framework termed SMART to integrate the whole genome bisulfite sequencing methylomes across 42 human tissues/cells and identified 757,887 genome segments.  Nearly 75% of the segments showed uniform methylation across all cell types. From the remaining 25% of the segments, they identified cell type-specific hypo/hypermethylation marks that were specifically hypo/hypermethylated in a minority of cell types using a statistical approach and presented an atlas of the human methylation marks. Further analysis revealed that the cell type-specific hypomethylation marks were enriched through H3K27ac and transcription factor binding sites in a cell type-specific manner. In particular, they observed that the cell type-specific hypomethylation marks are associated with the cell type-specific super-enhancers that drive the expression of cell identity genes. This framework provides a complementary, functional annotation of the human genome and helps to elucidate the critical features and functions of cell type-specific hypomethylation.
The entropy-based Specific Methylation Analysis and Report Tool, termed "SMART", focuses on integrating a large number of DNA methylomes for the de novo identification of cell type-specific methylation marks. The latest version of SMART is focused on three main functions: de novo identification of differentially methylated regions (DMRs) by genome segmentation, identification of DMRs from predefined regions of interest, and identification of differentially methylated CpG sites.
In identification and detection of body fluids
DNA methylation allows for several tissues to be analyzed in one assay as well as for small amounts of body fluid to be identified with the use of extracted DNA. Usually, the two approaches to DNA methylation analysis are either methylation-sensitive restriction enzymes or treatment with sodium bisulphite. Methylation-sensitive restriction enzymes work by cleaving specific CpG recognition sites (a cytosine and a guanine separated by only one phosphate group) depending on the methylation status of the CpG. In contrast, sodium bisulphite treatment converts unmethylated cytosines to uracil, while methylated cytosines remain unchanged. In particular, methylation profiles can provide insight on when or how body fluids were left at crime scenes, identify the kind of body fluid, and approximate the age, gender, and phenotypic characteristics of perpetrators. Research indicates various markers that can be used for DNA methylation analysis. Deciding which marker to use for an assay is one of the first steps in the identification of body fluids. In general, markers are selected by examining prior research. Identification markers that are chosen should give a positive result for one type of cell. One area of focus when analyzing DNA methylation is tissue-specific differentially methylated regions (T-DMRs). The degree of methylation of the T-DMRs varies depending on the body fluid. A research team developed a two-fold marker system. The first marker is methylated only in the target fluid while the second is methylated in the rest of the fluids. For instance, if venous blood marker A is un-methylated and venous blood marker B is methylated in a fluid, it indicates the presence of only venous blood. In contrast, if venous blood marker A is methylated and venous blood marker B is un-methylated in some fluid, then that indicates venous blood is in a mixture of fluids.
Some examples of DNA methylation markers are Mens1 (menstrual blood), Spei1 (saliva), and Sperm2 (seminal fluid).
DNA methylation provides a relatively good means of sensitivity when identifying and detecting body fluids. In one study, only ten nanograms of a sample was necessary to ascertain successful results.  DNA methylation provides a good discernment of mixed samples since it involves markers that give “on or off” signals. DNA methylation is not impervious to external conditions. Even under degraded conditions using the DNA methylation techniques, the markers are stable enough that there are still noticeable differences between degraded samples and control samples. Specifically, in one study, it was found that there were not any noticeable changes in methylation patterns over an extensive period of time. 
DNA methylation can also be detected by computational models through sophisticated algorithms and methods. Computational models can facilitate the global profiling of DNA methylation across chromosomes, and often such models are faster and cheaper to perform than biological assays. Such computational models include those of Bhasin et al., Bock et al., and Zheng et al. Together with biological assays, these methods greatly facilitate DNA methylation analysis.
RNA modification discovery suggests new code for control of gene expression
A new cellular signal discovered by a team of scientists at the University of Chicago and Tel Aviv University provides a promising new lever in the control of gene expression.
The study, published online Feb. 10 in the journal Nature, describes a small chemical modification that can significantly boost the conversion of genes to proteins. Together with other recent findings, the discovery enriches a critical new dimension to the “Central Dogma” of molecular biology: the epitranscriptome.
“This discovery further opens the window on a whole new world of biology for us to explore,” said Chuan He, the John T. Wilson Distinguished Service Professor in Chemistry, Howard Hughes Medical Institute investigator and senior author of the study. “These modifications have a major impact on almost every biological process.”
The central dogma of molecular biology describes the cellular pathway where genetic information from DNA is copied into temporary RNA “transcripts,” which provide the recipe for the production of proteins. Since Francis Crick first postulated the theory in 1956, scientists have discovered a multitude of modifications to DNA and proteins that regulate this process.
Only recently, however, have scientists focused on investigating dynamic modifications that specifically target the RNA step. In 2011, He’s group discovered the first RNA demethylase that reverses the most prevalent mRNA methylation N6-methyladenosine, or m6A, implying that the addition and removal of the methyl group could dramatically affect these messengers and impact the outcome of gene expression, as also seen for DNA and histones. Subsequently, scientists discovered that the dynamic and reversible methylation of m6A dramatically controlled the metabolism and function of most cellular messenger RNA, and thus, the production of proteins.
In the new Nature study, researchers from UChicago and Tel Aviv University describe a second functional mRNA methylation, N1-methyladenosine, or m1A. Like m6A, the small modification is evolutionarily conserved and common, and present in humans, rodents and yeast, the authors found. But its location and effect on gene expression reflect a new form of epitranscriptome control and suggest an even larger cellular “control panel.”
“The discovery of m1A is extremely important, not only because of its own potential in affecting biological processes, but also because it validates the hypothesis that there is not just one functional modification,” He said. “There could be multiple modifications at different sites where each may carry a distinct message to control the fate and function of mRNA.”
The researchers estimated that m1A was present on transcripts of more than one out of three expressed human genes. Methylated genes exhibited enhanced translation compared to unmethylated genes, producing protein levels nearly twice as high in all cell types. This increase suggests that m1A, like m6A, may be a mechanism by which cells rapidly boost the expression of hundreds or thousands of specific genes, perhaps during important processes such as cell division, differentiation or under stress.
“mRNA is the perfect place to regulate gene expression, because they can code information from transcription and directly impact translation; you can add a consensus sequence to a group of genes and use a modification of the sequence to readily control several hundred transcripts simultaneously,” He said. “If you want to rapidly change the expression of several hundred or a thousand genes, this offers the best way.”
However, despite their complementary effects, m1A and m6A exert their influence on mRNA through different pathways. While studies have found that m6A localizes predominantly to the tail of messenger RNA molecules, increasing their translation and turnover rate, m1A was found largely near the start codon of mRNA transcripts, where protein translation begins. The different mechanisms could allow for finer tuning of post-transcriptional gene expression, or the selective activation of particular genes in different physiological situations.
“This study represents a breakthrough discovery in the exciting, nascent field of the ‘epitranscriptome,’ which is how RNAs are regulated, akin to the genome and the epigenome,” said Christopher Mason, associate professor at Weill Cornell Medicine, who was not affiliated with the study. “What is important about this work is that m6A was recently found to enrich at the ends of genes, and now we know that m1A is what is helping regulate the beginning of genes, and this opens up many questions about revealing the ‘epitranscriptome code’ just like the histone code or the genetic code.”
Future studies will examine the role of m1A methylation in human development, diseases such as diabetes and cancer, and its potential as a target for therapeutic uses.
Citation: “The dynamic N1-methyladenosine methylome in eukaryotic messenger RNA,” Nature, Feb. 10, 2016, by Chuan He, Dan Dominissini, Sigrid Nachtergaele, Qing Dai, Dali Han, Wesley Clark, Guanqun Zheng, Tao Pan and Louis Dore from the University of Chicago, and Sharon Moshitch-Moshkovitz, Eyal Peer, Nitkan Kol, Moshe Shay Ben-Haim, Ayelet Di Segni, Mali Salmon-Divon, Oz Solomon, Eran Eyal, Vera Hershkovitz, Ninette Amariglio and Gideon Rechavi from Tel Aviv University. DOI: 10.1038/nature16998
Funding: National Institutes of Health, Howard Hughes Medical Institute, Flight Attendant Medical Research Institute, Israel Science Foundation, Israeli Centers of Excellence Program, Ernest and Bonnie Beutler Research Program, Chicago Biomedical Consortium, Damon Runyon Cancer Research Foundation and Kahn Family Foundation.
18.1: Transcription—from DNA to RNA
Bacteria, archaea, and eukaryotes must all transcribe genes from their genomes. While the cellular location may be different (eukaryotes perform transcription in the nucleus; bacteria and archaea perform transcription in the cytoplasm), the mechanisms by which organisms from each of these clades carry out this process are fundamentally the same and can be characterized by three stages: initiation, elongation, and termination.
A short overview of transcription
Transcription is the process of creating an RNA copy of a segment of DNA. Since this is a process, we want to apply the Energy Story rubric to develop a functional understanding of transcription. What does the system of molecules look like before the start of the transcription? What does it look like at the end? What transformations of matter and transfers of energy happen during the transcription and what, if anything, catalyzes the process? We also want to think about the process from a Design Challenge standpoint. If the biological task is to create a copy of DNA in the chemical language of RNA, what challenges can we reasonably hypothesize or anticipate, given our knowledge about other nucleotide polymer processes, must be overcome? Is there evidence that Nature solved these problems in different ways? What seem to be the criteria for success of transcription? You get the idea.
Listing some of the basic requirements for transcription
Let us first consider the tasks at hand by using some of our foundational knowledge and imagining what might need to happen during transcription if the goal is to make an RNA copy of a piece of one strand of a double-stranded DNA molecule. We'll see that using some basic logic allows us to infer many of the important questions and things that we need to know in order to properly describe the process.
Let's imagine that we want to design a nanomachine/nanobot that would conduct transcription. We can use some Design Challenge thinking to identify problems and subproblems that need to be solved by our little robot.
- Where should the machine start? Along the millions to billions of base pairs, where should the machine be directed?
- Where should the machine stop?
- If we have start and stop sites, we will need ways of encoding that information so that our machine(s) can read this information. How will that be accomplished?
- How many RNA copies of the DNA will we need to make?
- How fast do the RNA copies need to be made?
- How accurately do the copies need to be made?
- How much energy will the process take and where is the energy going to come from?
These are, of course, only some of the core questions. One can dig deeper if they wish. However, these are already good enough for us to start getting a good feel for this process. Notice, too, that many of these questions are remarkably similar to those we inferred might be necessary to understand about DNA replication.
The building blocks of transcription
The building blocks of RNA
Recall from our discussion on the structure of nucleotides that the building blocks of RNA are very similar to those in DNA. In RNA, the building blocks consist of nucleoside triphosphates that are composed of a ribose sugar, a nitrogenous base, and three phosphate groups. The key differences between the building blocks of DNA and those of RNA are that RNA molecules are composed of nucleotides with ribose sugars (as opposed to deoxyribose sugars) and utilize uridine, a uracil-containing nucleotide (as opposed to thymidine in DNA). Note below that uracil and thymine are structurally very similar; uracil simply lacks the methyl (CH3) functional group found on thymine.
Figure 1. The basic chemical components of nucleotides.
Attribution: Marc T. Facciotti (original work)
Proteins responsible for creating an RNA copy of a specific piece of DNA (transcription) must first be able to recognize the beginning of the element to be copied. A promoter is a DNA sequence onto which various proteins, collectively known as the transcription machinery, bind and initiate transcription. In most cases, promoters exist upstream (5' to the coding region) of the genes they regulate. The specific sequence of a promoter is very important because it determines whether the corresponding coding portion of the gene is transcribed all the time, some of the time, or infrequently. Although promoters vary among species, a few elements of similar sequence are sometimes conserved. At the -10 and -35 regions upstream of the initiation site, there are two promoter consensus sequences, or regions that are similar across many promoters and across various species. Some promoters will have a sequence very similar to the consensus sequence (the sequence containing the most common sequence elements), and others will look very different. These sequence variations affect the strength with which the transcriptional machinery can bind to the promoter to initiate transcription. This helps to control the number of transcripts that are made and how often they get made.
Figure 2. (a) A general diagram of a gene. The gene includes the promoter sequence, an untranslated region (UTR), and the coding sequence. (b) A list of several strong E. coli promoter sequences. The -35 box and -10 box are highly conserved sequences throughout the strong promoter list. Weaker promoters will have more base pair differences when compared to these sequences.
Source: http://www.discoveryandinnovation.co. lecture12.html
What types of interactions are changed between the transcription machinery and the DNA when the nucleotide sequence of the promoter changes? Why would some sequences create a "strong" promoter and why do others create a "weak" promoter?
Bacterial vs. eukaryotic promoters
In bacterial cells, the -10 consensus sequence, called the -10 region, is AT rich, often TATAAT. The -35 sequence, TTGACA, is recognized and bound by the protein σ. Once this protein-DNA interaction is made, the subunits of the core RNA polymerase bind to the site. Due to the relatively lower stability of AT associations, the AT-rich -10 region facilitates unwinding of the DNA template, and several phosphodiester bonds are made.
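One rough way to express how close a promoter is to the consensus is to count the bases that differ from the -35 and -10 consensus sequences quoted above. The sketch below does exactly that and nothing more; the function names and the second example promoter are made up, and a mismatch count is only a crude proxy, since real promoter strength also depends on factors such as the spacing between the two boxes.

```python
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def mismatches(candidate, consensus):
    """Count positions where candidate differs from consensus."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have equal length")
    return sum(1 for a, b in zip(candidate.upper(), consensus) if a != b)


def promoter_mismatch_score(minus_35, minus_10):
    """Total mismatches across both boxes; lower means closer to consensus."""
    return (mismatches(minus_35, MINUS_35_CONSENSUS)
            + mismatches(minus_10, MINUS_10_CONSENSUS))


consensus_score = promoter_mismatch_score("TTGACA", "TATAAT")
variant_score = promoter_mismatch_score("TTTACA", "TATGTT")  # hypothetical promoter
print(consensus_score)
print(variant_score)
```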
Eukaryotic promoters are much larger and more complex than prokaryotic promoters, but both have an AT-rich region; in eukaryotes, it is typically called a TATA box. For example, in the mouse thymidine kinase gene, the TATA box is located at approximately -30. For this gene, the exact TATA box sequence is TATAAAA, as read in the 5' to 3' direction on the nontemplate strand. This sequence is not identical to the E. coli -10 region, but both share the quality of being an AT-rich element.
Instead of a single bacterial polymerase, the genomes of most eukaryotes encode three different RNA polymerases, each made up of ten protein subunits or more. Each eukaryotic polymerase also requires a distinct set of proteins known as transcription factors to recruit it to a promoter. In addition, an army of other transcription factors, including proteins that bind DNA elements known as enhancers and silencers, help to regulate the synthesis of RNA from each promoter. Enhancers and silencers affect the efficiency of transcription but are not necessary for the initiation of transcription or its procession. Basal transcription factors are crucial in the formation of a preinitiation complex on the DNA template that subsequently recruits RNA polymerase for transcription initiation.
Initiation of transcription begins with the binding of RNA polymerase to the promoter. Transcription requires the DNA double helix to partially unwind such that one strand can be used as the template for RNA synthesis. The region of unwinding is called a transcription bubble.
Transcription always proceeds from the template strand, one of the two strands of the double-stranded DNA. The RNA product is complementary to the template strand and is almost identical to the nontemplate strand, called the coding strand, with the exception that RNA contains a uracil (U) in place of the thymine (T) found in DNA. During elongation, an enzyme called RNA polymerase proceeds along the DNA template, adding nucleotides by base pairing with the DNA template in a manner similar to DNA replication, with the difference being that the RNA strand that is synthesized does not remain bound to the DNA template. As elongation proceeds, the DNA is continuously unwound ahead of the core enzyme and rewound behind it. Note that the direction of synthesis is identical to that of synthesis in DNA: 5' to 3'.
Figure 4. During elongation, RNA polymerase tracks along the DNA template, synthesizing mRNA in the 5' to 3' direction, unwinding and then rewinding the DNA as it is read.
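The relationship between the template strand, the coding strand, and the RNA product can be checked with a few lines of code. This is a minimal sketch, assuming the template is written 3' to 5' so that the RNA comes out 5' to 3' when read left to right; the example sequence and function names are hypothetical.

```python
# Base on the DNA template -> RNA base paired opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base on the DNA template -> DNA base on the coding (nontemplate) strand.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the RNA (written 5'->3') from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5):
    """Return the coding strand (written 5'->3') for the same template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
coding = coding_strand(template)
print(rna)
print(coding)
```

Replacing every T in the coding strand with U gives the RNA, which is the "almost identical" relationship described in the text.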
Compare and contrast the energy story for the addition of a nucleotide in DNA replication to the addition of a nucleotide in transcription.
Bacterial vs. eukaryotic elongation
In bacteria, elongation begins with the release of the σ subunit from the polymerase. The dissociation of σ allows the core enzyme to proceed along the DNA template, synthesizing mRNA in the 5' to 3' direction at a rate of approximately 40 nucleotides per second. As elongation proceeds, the DNA is continuously unwound ahead of the core enzyme and rewound behind it. The base pairing between DNA and RNA is not stable enough to maintain the stability of the mRNA synthesis components. Instead, the RNA polymerase acts as a stable linker between the DNA template and the nascent RNA strands to ensure that elongation is not interrupted prematurely.
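The quoted elongation rate makes it easy to estimate how long a bacterial polymerase spends on a gene. The sketch below is a back-of-the-envelope calculation only, assuming a constant rate and ignoring initiation, pausing, and termination; the gene length used is an arbitrary example.

```python
ELONGATION_RATE_NT_PER_S = 40  # approximate bacterial rate quoted above


def elongation_time_seconds(gene_length_nt, rate=ELONGATION_RATE_NT_PER_S):
    """Estimated elongation time for a transcript of the given length."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return gene_length_nt / rate


example_time = elongation_time_seconds(1200)  # hypothetical 1,200-nucleotide gene
print(example_time, "seconds")
```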
In eukaryotes, following the formation of the preinitiation complex, the polymerase is released from the other transcription factors, and elongation is allowed to proceed as it does in prokaryotes with the polymerase synthesizing pre-mRNA in the 5' to 3' direction. As discussed previously, RNA polymerase II transcribes the major share of eukaryotic genes, so this section will focus on how this polymerase accomplishes elongation and termination.
Once a gene is transcribed, the bacterial polymerase needs to be instructed to dissociate from the DNA template and liberate the newly made mRNA. Depending on the gene being transcribed, there are two kinds of termination signals. One is protein-based and the other is RNA-based. Rho-dependent termination is controlled by the rho protein, which tracks along behind the polymerase on the growing mRNA chain. Near the end of the gene, the polymerase encounters a run of G nucleotides on the DNA template and it stalls. As a result, the rho protein collides with the polymerase. The interaction with rho releases the mRNA from the transcription bubble.
Rho-independent termination is controlled by specific sequences in the DNA template strand. As the polymerase nears the end of the gene being transcribed, it encounters a region rich in CG nucleotides. The mRNA folds back on itself, and the complementary CG nucleotides bind together. The result is a stable hairpin that causes the polymerase to stall as soon as it begins to transcribe a region rich in AT nucleotides. The complementary UA region of the mRNA transcript forms only a weak interaction with the template DNA. This, coupled with the stalled polymerase, induces enough instability for the core enzyme to break away and liberate the new mRNA transcript.
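The rho-independent signal described above, a GC-rich stretch that can pair with itself followed by a run of U residues, can be searched for in a transcript with a simple scan. The sketch below is a toy illustration under stated assumptions: it requires a perfectly paired stem of fixed length, and the default parameters, the function names, and the example transcript are all made up. Real terminator predictors score folding energies rather than exact matches.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an RNA string (A-U and G-C pairing only)."""
    pair = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pair[base] for base in reversed(seq))


def find_terminator(rna, stem_len=5, min_loop=3, max_loop=8, min_u_run=4):
    """Return (stem_start, loop_length) of the first hairpin followed by a U-run.

    Returns None when no such structure is found.
    """
    rna = rna.upper()
    for i in range(len(rna) - 2 * stem_len):
        left = rna[i:i + stem_len]
        # Require a GC-rich stem: at most one A/U in the left arm.
        if sum(base in "GC" for base in left) < stem_len - 1:
            continue
        right_target = reverse_complement_rna(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if rna[j:j + stem_len] != right_target:
                continue
            tail_start = j + stem_len
            if rna[tail_start:tail_start + min_u_run] == "U" * min_u_run:
                return i, loop
    return None


transcript = "AUGGCCGCAAAAGCGGCUUUUUG"  # hypothetical transcript end
hit = find_terminator(transcript)
print(hit)
```

In the example, the arm GCCGC pairs with GCGGC across a four-base loop and is followed by a run of U residues, so the scan reports the stem start and loop length.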
The termination of transcription is different for the different polymerases. Unlike in prokaryotes, elongation by RNA polymerase II in eukaryotes takes place 1,000 to 2,000 nucleotides beyond the end of the gene being transcribed. This pre-mRNA tail is subsequently removed by cleavage during mRNA processing. On the other hand, RNA polymerases I and III require termination signals. Genes transcribed by RNA polymerase I contain a specific 18-nucleotide sequence that is recognized by a termination protein. The process of termination in RNA polymerase III involves an mRNA hairpin similar to rho-independent termination of transcription in prokaryotes.
Termination of transcription in the archaea is far less studied than in the other two domains of life and is still not well understood. While the functional details are likely to resemble mechanisms that have been seen in the other domains of life, the details are beyond the scope of this course.
In bacteria and archaea
In bacteria and archaea, transcription occurs in the cytoplasm, where the DNA is located. Because the location of the DNA, and thus the process of transcription, is not physically segregated from the rest of the cell, translation often starts before transcription has finished. This means that mRNA in bacteria and archaea is used as the template for a protein before the entire mRNA is produced. The lack of spatial segregation also means that there is very little temporal segregation for these processes. Figure 6 shows the processes of transcription and translation occurring simultaneously.
In eukaryotes, the process of transcription is physically segregated from the rest of the cell, sequestered inside of the nucleus. This results in two things: the mRNA is completed before translation can start, and there is time to "adjust" or "edit" the mRNA before translation starts. The physical separation of these processes gives eukaryotes a chance to alter the mRNA in such a way as to extend the lifespan of the mRNA or even alter the protein product that will be produced from the mRNA.
5' G-cap and 3' poly-A tail
When a eukaryotic gene is transcribed, the primary transcript is processed in the nucleus in several ways. Eukaryotic mRNAs are modified at the 3' end by the addition of a poly-A tail. This run of A residues is added by an enzyme that does not use genomic DNA as a template. Additionally, the mRNAs have a chemical modification of the 5' end, called a 5'-cap. Data suggest that these modifications both increase the lifespan of the mRNA (by preventing its premature degradation in the cytoplasm) and help the mRNA initiate translation.
Splicing occurs on most eukaryotic mRNAs in which introns are removed from the mRNA sequence and exons are ligated together. This can create a much shorter mRNA than initially transcribed. Splicing allows cells to mix and match which exons are incorporated into the final mRNA product. As shown in the figure below, this can lead to multiple proteins being coded for by a single gene.
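The "mix and match" idea can be made concrete with a small Python sketch. It treats a pre-mRNA as a string and exons as index intervals, then joins the chosen exons in order. The sequence, the exon coordinates, and the function names are made up for illustration, and the rule used here (always keep the first and last exon, optionally skip middle ones) is a simplifying assumption; real splicing depends on splice-site signals and regulatory proteins.

```python
from itertools import combinations


def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive) in order, dropping introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def alternative_isoforms(pre_mrna, exons):
    """Return isoforms that keep the first and last exon plus any subset of
    the middle exons, with exon order preserved (needs at least two exons)."""
    first, *middle, last = exons
    isoforms = []
    for k in range(len(middle), -1, -1):          # from all middle exons down to none
        for kept in combinations(middle, k):      # combinations preserve order
            isoforms.append(splice(pre_mrna, [first, *kept, last]))
    return isoforms


# Hypothetical pre-mRNA: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaag" + "UUUAAA" + "guacag" + "GGGUAA"
exons = [(0, 6), (11, 17), (23, 29)]

full_length = splice(pre_mrna, exons)
isoforms = alternative_isoforms(pre_mrna, exons)
```

With three exons this yields two mature mRNAs from the same gene: one containing all three exons and a shorter one that skips the middle exon.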
Regulation in cis
In contrast to trans-acting lncRNAs, which act via their RNA product, cis-acting lncRNAs have the possibility to act in two fundamentally different modes. The first mode depends on a lncRNA product. The major example of general cis-regulation is induction of X inactivation by the Xist lncRNA in female mammals. Xist is expressed from one of the two X chromosomes and induces silencing of the whole chromosome (Figure 1c). As an example of locus-specific regulation, it has been proposed that enhancer RNAs activate corresponding genes in cis via their product. A well-studied cis-acting lncRNA acting through its product is the human HOTTIP lncRNA, which is expressed in the HOXA cluster and activates transcription of flanking genes. HOTTIP was shown to act by binding WDR5 in the MLL histone modifier complex, thereby bringing histone H3 lysine-4 trimethylation (H3K4me3) to promoters of the flanking genes. Such a mechanism, in which a nascent lncRNA transcript binds and delivers epigenetic modifiers to its target genes while still attached to the elongating RNAPII, is generally termed ‘tethering’ and is often used to explain cis-regulation by lncRNAs [23, 27] (Figure 1e). It was also proposed to act in plants. In Arabidopsis thaliana, the COLDAIR lncRNA is initiated from an intron of the FLC protein-coding gene and silences it by targeting repressive chromatin marks to the locus to control flowering time.
In contrast, the second mode of cis regulation by lncRNAs involves the process of transcription itself, which is a priori cis-acting (Figure 1f). Several lines of evidence suggest that the mere process of lncRNA transcription can affect gene expression if RNAPII traverses a regulatory element or changes general chromatin organization of the locus. In this review we discuss this underestimated role for lncRNA transcription in inducing protein-coding gene silencing or activation in cis, and overview possible mechanisms for this action in mammalian and non-mammalian organisms. Finally, we describe experimental strategies to distinguish lncRNAs acting as a transcript from those acting through transcription.
DNA methylation promotes transcription
DNA methylation generally represses transcription, but in some instances, it has also been implicated in transcription activation. Harris et al. identified a protein complex in Arabidopsis that is recruited to chromatin by DNA methylation. This complex specifically activated the transcription of genes that are already mildly transcribed but had no effect on transcriptionally silent genes such as transposable elements. The complex thereby counteracts the repressive effect caused by transposon insertion in neighboring genes while leaving transposons silent. Thus, by balancing both repressive and activating transcriptional effects, DNA methylation can act to fine-tune gene expression.
Role of RNA polymerase in gene transcription demonstrated
In all organisms, RNA synthesis is carried out by proteins -- known as RNA polymerases (RNAPs) -- that transcribe the genetic information from DNA in a highly-regulated, multi-stage process. RNAP is the key enzyme involved in creating an equivalent RNA copy of a sequence of DNA. This transcription is the first step leading to gene expression. While the major steps in RNA synthesis have been known for several decades, scientists have only recently begun to decipher the detailed molecular steps of the complex transcription process.
In research published in the July 1, 2010 online Early Edition of the Proceedings of the National Academy of Sciences, University of Maryland biophysicists Devarajan (Dave) Thirumalai and Jie Chen, along with Rockefeller University collaborator Seth Darst, provide new insight into how the transcription process is initiated and the role that RNA polymerase plays in making this happen. Because the sequence, structure, and function of multi-subunit RNA polymerase are universally conserved in all organisms -- from bacteria to humans -- understanding the mechanisms of bacterial gene transcription is an important step in deciphering the role of genetics in disease.
"Previously, people didn't know the precise role of RNA polymerase in initiating transcription," explains Distinguished University Professor Dave Thirumalai (Department of Chemistry and Biochemistry and Institute for Physical Science and Technology), "but we showed that it plays an important role in forming the transcription bubble and in the process of bending the DNA to facilitate entry of DNA into the active site. That is the process we described computationally."
Their simulation of the initiation phase of transcription in bacterial RNA polymerase showed a three-step process. It begins when the RNA polymerase binds to transcription-promoting regions of DNA. Through interactions with the RNA polymerase, the DNA helix then unwinds, forming an open "bubble" that allows the polymerase access to the exposed DNA sequence to begin transcription. The DNA molecule then bends to relieve stress produced by the opening.
Dr. Jie Chen, who conducted this research while a graduate student in the Chemical Physics program, simulated the transcription bubble formation using a Brownian dynamics-based computer model developed by Dr. Thirumalai's laboratory. "By creating this molecular movie, we can look at the dynamics of RNAP and simulate how it shifts from one structure to another structure," explains Chen. "Our simulation confirms experimental observations, and goes further to establish a clear and active role for RNA polymerase."
Dr. Thirumalai's research group is continuing to study RNA polymerase by looking at the second phase of the transcription process in bacteria and also through models of human transcription.
Materials provided by University of Maryland. Note: Content may be edited for style and length.
RNA lifespan determination during transcription
Mettl3 (magenta) binds its targets co-transcriptionally. Credit: Friedrich Miescher Institute for Biomedical Research
Control of RNA lifespan is vital for the proper functioning of our cells. Marc Bühler's group at the Friedrich Miescher Institute for Biomedical Research (FMI) has discovered a novel mechanism determining the fate of RNA in mammalian cells: two proteins involved in RNA interference - Dgcr8 and Drosha - together with a methyltransferase, Mettl3, mark nascent RNAs for degradation as they are transcribed. This mechanism allows RNA transcripts to "remember" the conditions under which they were synthesized.
The life of an RNA is never easygoing. Its formation, processing, lifespan and degradation are all tightly regulated. This stringent control of RNA metabolism ensures that genes become active at the right time and place, safeguarding cell functions.
In this context, a control mechanism known as RNA interference (RNAi) has attracted a lot of attention. RNAi leads to the fragmentation and inactivation of RNAs in the cytoplasm. Interestingly, in yeast, the RNAi machinery is also active in the nucleus: during RNA synthesis, while the RNA molecules still associate with the DNA, it triggers the degradation of nascent RNAs. Whether the RNAi machinery plays a similar role in mammalian cells has remained unclear.
To address this question, Marc Bühler and his group at the FMI used mouse embryonic stem cells. Bühler comments: "This is a good example of how knowledge gained in a model organism – here in fission yeast – guides our hypotheses and informs our experiments in higher organisms."
In their study, published in Nature Structural & Molecular Biology, the scientists showed that two well-known mammalian RNAi factors interact with chromatin: both Dgcr8 (an RNA-binding protein) and Drosha (an RNase) bind to nascent transcripts, thereby silencing genes co-transcriptionally. First author Philip Knuckles, an NCCR-funded postdoctoral fellow in Bühler's lab, explains: "In multicellular organisms, Dgcr8 and Drosha form a complex called the microprocessor (MP). This complex does not exist in yeast, but our results suggest that it takes over the function of the yeast RNAi protein Dicer. The principle is conserved, but the players differ slightly."
The FMI scientists also showed that an enzyme known as Mettl3 is involved in the degradation of nascent RNAs. Mettl3 transfers methyl groups to adenosine residues in RNA, a mark that also influences RNA stability. Knuckles says: "In our experiments, we showed that Mettl3 binds to chromatin while RNA is being transcribed, and that this Mettl3 association stimulates Dgcr8 binding."
The MP/Mettl3 system allows the cell to react rapidly to changing growth conditions. Bühler explains: "During a stress situation induced by high temperature, the heat-shock RNA transcripts produced are concomitantly tagged by adenosine methylation. This marks these RNAs for subsequent degradation, enabling a swift but time-limited response to stress." According to Bühler, both the fast stress response and the rapid clearance of heat-shock transcripts and proteins are important for cells: "The accumulation of stress response proteins is detrimental to the cell and is often observed in cancer."
Epigenetics and Health
Epigenetic changes can affect your health in different ways:
Germs can change your epigenetics to weaken your immune system. This helps the germ survive.
Mycobacterium tuberculosis causes tuberculosis. Infections with these germs can cause changes to histones in some of your immune cells that result in turning "off" the IL-12B gene. Turning "off" the IL-12B gene weakens your immune system and improves the survival of Mycobacterium tuberculosis (3).
Certain mutations make you more likely to develop cancer. Likewise, some epigenetic changes increase your cancer risk. For example, having a mutation in the BRCA1 gene that prevents it from working properly makes you more likely to get breast and other cancers. Similarly, increased DNA methylation that results in decreased BRCA1 gene expression raises your risk for breast and other cancers (4). While cancer cells have increased DNA methylation at certain genes, overall DNA methylation levels are lower in cancer cells compared with normal cells. Different types of cancer that look alike can have different DNA methylation patterns. Epigenetics can be used to help determine which type of cancer a person has or can help to find hard-to-detect cancers earlier. Epigenetics alone cannot diagnose cancer, and cancers would need to be confirmed with further screening tests.
Colorectal cancers have increased methylation at the SEPT9 gene. Some commercial epigenetic-based tests for colorectal cancer look at DNA methylation levels at the SEPT9 gene. When used with other diagnostic screening tests, these epigenetic-based tests can help find cancer early (5)(6).
Nutrition During Pregnancy
A pregnant woman's environment and behavior during pregnancy, such as whether she eats healthy food, can change the baby's epigenetics. Some of these changes can remain for decades and might make the child more likely to get certain diseases.
People whose mothers were pregnant with them during the Dutch Hunger Winter famine (1944-1945) were more likely to develop certain diseases such as heart disease, schizophrenia, and type 2 diabetes (7). Around 60 years after the famine, researchers looked at methylation levels in these people. They had increased methylation at some genes and decreased methylation at other genes compared with their siblings who were not exposed to famine before their birth (8)(9)(10). These differences in methylation could help explain why these people had an increased likelihood for certain diseases later in life (7)(10)(11)(12).