Once it was proposed, the double-helical structure of DNA immediately suggested a simple mechanism for the accurate duplication of the information stored in DNA. We will consider how this is done only in general terms; in practice, it is a complex and highly regulated process involving a number of components.
The first two problems we have to address may seem arbitrary, but they turn out to be common (conserved) features of DNA synthesis. The enzymes that catalyze the synthesis of new DNA strands (DNA-dependent DNA polymerases) cannot start synthesis on their own; they have to add nucleotides to an existing nucleic acid polymer. In contrast, the catalysts that synthesize RNA (DNA-dependent RNA polymerases) do not require a pre-existing nucleic acid strand; they can start the synthesis of a new RNA strand de novo, based on the complementary DNA sequence. Both DNA and RNA polymerases link the 5’ end of a nucleotide triphosphate molecule to the pre-existing 3’ end of a nucleic acid molecule. This polymerization reaction is said to proceed in the 5’ to 3’ direction. As we will see later on, DNA replication and RNA synthesis rely on signals within the DNA that are recognized by proteins and that determine where synthesis starts and stops, and when nucleic acid replication occurs. For now, let us assume that some process has determined where replication starts. We begin our discussion with DNA replication.
The first step in DNA replication is to locally open up the dsDNA molecule. A specialized DNA-dependent RNA polymerase, known as primase, collides with and binds to the exposed single-stranded DNA and synthesizes a short RNA molecule, known as a primer. Because the two strands of the DNA molecule point in opposite directions (they are anti-parallel), one primase complex associates with each DNA strand, and two primers are generated, one on each strand. Once these RNA primers are in place, a DNA-dependent DNA polymerase replaces the primase and begins to catalyze the nucleotide-addition reaction; which nucleotide is added is determined by which nucleotide is present in the existing DNA strand. The nucleotide-addition reaction involves various nucleotides colliding with the DNA-primer-polymerase complex; only the appropriate nucleotide, complementary to the nucleotide residue in the existing DNA strand, is bound and used in the reaction.
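The selection rule described above (only the complementary nucleotide is accepted, and it is always joined to the free 3’ end) can be illustrated with a few lines of Python. This is only a toy sketch: the function name and the convention of passing the template as a string read 3’ to 5’ are assumptions made for illustration, and primers, proofreading and the physical collisions are ignored.

```python
# Watson-Crick pairing rules for DNA (A pairs with T, G pairs with C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_complement(template_3_to_5: str) -> str:
    """Return the new strand, written 5'->3', for a template read 3'->5'."""
    new_strand = []
    for base in template_3_to_5:
        # At each position only the complementary nucleotide is accepted;
        # it is appended to the free 3' end of the growing strand.
        new_strand.append(COMPLEMENT[base])
    return "".join(new_strand)


if __name__ == "__main__":
    # Template 3'-TACG-5' directs synthesis of 5'-ATGC-3'.
    print(synthesize_complement("TACG"))
```

Because each new base is appended at the end of the list, the sketch can only grow the strand in one direction, which mirrors the 5’ to 3’ constraint on the real polymerase.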
Nucleotides exist in various phosphorylated forms within the cell, including nucleotide monophosphate (NMP), nucleotide diphosphate (NDP), and nucleotide triphosphate (NTP). To make the nucleic acid polymerization reaction thermodynamically favorable, the reaction uses the NTP form of the nucleotide monomers and proceeds as follows:
(5’P)NTP(3’OH) + (5’P)NTP(3’OH) + H2O ⟷ (5’P)NTP-NMP(3’OH) + diphosphate.
During the reaction, the terminal diphosphate of the incoming NTP is released (a thermodynamically favorable reaction) and the nucleotide monophosphate is added to the existing polymer through the formation of a phosphodiester [-C-O-P-O-C] bond. This reaction creates a new 3' OH end for the polymer that can, in turn, react with another NTP. In theory, this process can continue until the newly synthesized strand reaches the end of the DNA molecule. For the process to continue, however, the double-stranded region of the original DNA has to open up further, exposing more single-stranded DNA. Keep in mind that this process is moving, through independent complexes, in both directions along the DNA molecule. Because the polymerization reaction only proceeds by 3’ addition, as new single-stranded regions are opened, new primers must be created (by primase) and then extended (by DNA polymerase). If you try drawing what this looks like, you will realize that i) the process is asymmetric in relation to the start site of replication; ii) the process generates RNA-DNA hybrid molecules; and iii) eventually an extending DNA polymerase will run into the RNA primer part of an “upstream” molecule. There is a further complication: RNA regions are not found in “mature” DNA molecules, so there must be a mechanism that removes them. Removal is possible because the DNA polymerase complex contains more than one catalytic activity. When the DNA polymerase complex reaches the upstream nucleic acid chain, it runs into this RNA-containing region; an RNA exonuclease activity removes the RNA nucleotides, and the polymerase replaces them with DNA nucleotides, using the existing DNA strand as the primer. Once the RNA portion is removed, a DNA ligase activity joins the two DNA molecules. These reactions, driven by nucleotide hydrolysis, end up producing a continuous DNA strand.
For a dynamic look at the process, check out this video, which is nice but “flat” (to reduce the complexity of the process) and fails to start at the beginning of the process.
Evolutionary considerations: At this point you might well ask yourself why (for heaven's sake) the process of DNA replication is so complex. Why not use a DNA polymerase that does not need an RNA primer, or any primer for that matter, since RNA polymerase does not need a primer? Why not have polymerases that add nucleotides equally well to either end of a polymer? That such a mechanism is possible is suggested by the presence of enzymes in eukaryotic cells that can add a nucleotide to the 5’ end of an RNA molecule (the 5’ capping reaction associated with mRNA synthesis), which we will briefly consider later on. But such activities are simply not used in DNA replication. The real answer is that we are not sure of the reasons. These features could be evolutionary relics: a process established within the last common ancestor of all organisms that is extremely difficult or impossible to change through evolutionary mechanisms, or simply not worth the effort to change (in terms of its effects on reproductive success). Alternatively, there could be strong selective advantages associated with the system that preclude such changes. What is clear is that this is how the system appears to function in all known organisms, so for practical purposes we have to remember some of the key details involved. These include the direction of polymer synthesis and the need (in the case of DNA) for an RNA primer.
Mutagenic Replication of the Major Oxidative Adenine Lesion 7,8-Dihydro-8-oxoadenine by Human DNA Polymerases
- Received 16 August 2018
- Published online 28 February 2019
- Published in issue 20 March 2019
7.8: DNA Replication - Biology
The replication fork consists of a group of proteins that influence the activity of DNA replication. In order for the replication fork to stall, the cell must possess a certain number of stalled forks and arrest length. The replication fork is specifically paused due to the stalling of helicase and polymerase activity, which are linked together. In this situation, the fork protection complex (FPC) is recruited to help maintain this linkage. 
In addition to stalling and maintaining the fork structure, protein phosphorylation can also create a signal cascade for replication restart. The protein Mrc1, which is part of the FPC, transmits the checkpoint signal by interacting with kinases throughout the cascade. When there is a loss of these kinases (from replication stress), an excess of ssDNA is produced, which is necessary for the restarting of replication. 
DNA interstrand cross-links (ICLs) cause replication stress by blocking replication fork progression. This blockage leads to failure of DNA strand separation and a stalled replication fork. Repair of ICLs can be accomplished by sequential incisions and homologous recombination. In vertebrate cells, replication of an ICL-containing chromatin template triggers recruitment of more than 90 DNA repair and genome maintenance factors. Analysis of the proteins recruited to stalled replication forks revealed a specific set of DNA repair factors involved in the replication stress response. Among these proteins, SLF1 and SLF2 were found to physically link the SMC5/6 DNA repair protein complex to RAD18. The SMC5/6 complex is employed in homologous recombination, and its linkage to RAD18 likely allows recruitment of SMC5/6 to ubiquitination products at sites of DNA damage.
Replication-coupled repair
Mechanisms that process damaged DNA in coordination with the replisome in order to maintain replication fork progression are considered to be examples of replication-coupled repair. In addition to the repair of DNA interstrand crosslinks, indicated above, multiple DNA repair processes operating in overlapping layers can be recruited to faulty sites depending on the nature and location of the damage. These repair processes include (1) removal of misincorporated bases, (2) removal of misincorporated ribonucleotides, (3) removal of damaged bases (e.g. oxidized or methylated bases) that block the replication polymerase, (4) removal of DNA-protein crosslinks, and (5) removal of double-strand breaks. Such repair pathways can function to protect stalled replication forks from degradation and allow restart of broken forks, but when deficient can cause replication stress.
Replication stress is induced by various endogenous and exogenous stresses, which are regularly introduced to the genome. These stresses include, but are not limited to, DNA damage, excessive compacting of chromatin (preventing replisome access), over-expression of oncogenes, or difficult-to-replicate genome structures. Replication stress can lead to genome instability, cancer, and ageing. Uncoordinated replication–transcription conflicts and unscheduled R-loop accumulation are significant contributors.
Specific events
The events that lead to genome instability occur in the cell cycle prior to mitosis, specifically in the S phase. Disturbance to this phase can generate negative effects, such as inaccurate chromosomal segregation, for the upcoming mitotic phase. The two processes that are responsible for damage to the S phase are oncogenic activation and tumor suppressor inactivation. They have both been shown to speed up the transition from the G1 phase to the S phase, leading to inadequate amounts of DNA replication components. These losses can contribute to the DNA damage response (DDR). Replication stress can be an indicative characteristic of carcinogenesis, which typically lacks DNA repair systems. A physiologically short duration of the G1 phase is also typical of fast-replicating progenitors during early embryonic development.
MATERIALS AND METHODS
Cell culture and treatments
MCF10A cells were cultured in a 1:1 mixture of DMEM-F12 supplemented with 5% horse serum, 10 μg/ml insulin, 0.5 μg/ml hydrocortisone, 100 ng/ml cholera enterotoxin and 20 ng/ml epidermal growth factor, and incubated at 37°C in a humidified atmosphere with 5% CO2 (33). Mouse embryonic fibroblasts, MEFs (3T9-MycER), were grown in DMEM medium supplemented with 10% serum, penicillin/streptomycin and 2 mM L-Gln. For UV treatment, exponentially growing cells were irradiated with 254-nm UV light at 40 J/m2. For NAC treatment, 1 mM N-acetyl cysteine (A7250, Sigma-Aldrich) was added to the medium for 2 h before the cells were collected as previously described.
8-oxodG enrichment from ssDNA and G4-containing oligomers
The IP was performed as described (28) with the following changes: oligomers with 8-oxodG were designed with flanking Renilla primers for qPCR quantification (ssDNA = 5′-GGAATTATAATGCTTATCTACGTGCGACGGCCAGTGTAGTTGGAGCTC/8oxodG/TGGCGTAGGCAAGAGTGTCATAGCTGGTAAAAGGTCTTCATTTTTCGCAAG and G-quadruplex DNA = 5′-GGAATTATAATGCTTATCTACGTGCCCCGCCCCCCGGGGCGGGCC/8oxodG/GGGGCGGGGTCCCGGCGGGGCGGAGCCATGTAAAAGGTCTTCATTTTTCGCAAG-3′). Each IP reaction was performed with 3 fmol of 8-oxodG-containing oligomer and a large excess (10^4-fold) of random 8-oxodG-free oligomers, with the following antibodies: 4 μl of polyclonal anti-8-oxodG antibody (AB5830, Millipore), 2 μg of monoclonal anti-8-oxodG antibody (Trevigen, 0.5 mg/ml), and 4 μl of IgG. The IP efficiency was calculated by qPCR as % of immuno-precipitated DNA over input. The following primers were used in qPCR: Oligo-Renilla-Fwd GGA ATT ATA ATG CTT ATC TAC GTG C and Oligo-Renilla-Rev CTT GCG AAA AAT GAA GAC CTT TTA C.
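The text reports IP efficiency as the percentage of immuno-precipitated DNA over input but does not spell out the calculation. One common way to obtain it from qPCR Ct values is the percent-input method, sketched below. The function name, the assumption of perfect two-fold amplification per cycle, and the example input fraction are illustrative assumptions, not details taken from the protocol.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent of input DNA recovered in the IP.

    Assumes 100% PCR efficiency (a factor of 2 per cycle). `input_fraction`
    is the fraction of the starting material that was saved and measured
    as input (hypothetical value, e.g. 0.1 for a 10% input sample).
    """
    # Shift the input Ct to the value expected if all starting material
    # had been measured instead of only a fraction of it.
    ct_input_adjusted = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (ct_input_adjusted - ct_ip)


if __name__ == "__main__":
    # IP sample crosses threshold 3 cycles later than a 10% input sample.
    print(percent_input(ct_ip=28.0, ct_input=25.0, input_fraction=0.1))
```

If the primer efficiency is known to differ from 2, the base of the exponent would need to be replaced accordingly.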
Spike-in experiment with 8-oxodG-containing oligomers
The spike-in experiment to test the specificity of the antibodies was performed as described (34) with the following changes: 64 pg of both 8-oxodG-containing (described above) and dG-containing oligonucleotides (ssDNA = GTGGTGTGCAGCGAGAATAGAAACGACGGCCAGTGTAGTTGGAGCTCGTGGCGTAGGCAAGAGTGTCATAGCTGTTTCCTAACGACATCTACAACGAGCG) were mixed together with 1 μg of genomic DNA with undetectable endogenous 8-oxodG levels, obtained from MCF10 cells treated with 1 mM NAC for 2 h. IP was performed as indicated for OxiDIP (see next paragraph). The relative enrichment of 8-oxodG was calculated by qPCR as % of immuno-precipitated DNA over input using the following primers: Oligo-Renilla (described above) for the 8-oxodG-containing oligonucleotide and Oligo-FireFly for the dG-containing oligonucleotide (Fwd-CGCTCGTTGTAGATGTCGTTAG and Rev-GTGGTGTGCAGCGAGAATAG).
The spike-in experiment to test the sensitivity of the antibodies was performed as described (35) with the following changes: 1 μg of genomic DNA with undetectable endogenous 8-oxodG levels, from NAC-treated MCF10 cells, was added to increasing amounts (from 0.5 to 64 pg) of 8-oxodG-containing oligonucleotides (described above). IP was performed as indicated for OxiDIP. The IP efficiency was calculated by qPCR as % of immuno-precipitated DNA over input.
Stock solutions (8-oxodG, dG and (15N5) 8-hydroxy-2-deoxyguanosine) were prepared in methanol at a concentration of 50 mg/l of each analyte. Final 1 mg/l individual analyte standard solutions were prepared from the stock solutions, and serial dilutions at 0.5, 1, 5, 25 and 50 pg/μl were used for calibration curves. All standards were kept in the dark, under nitrogen, at −20°C before LC–MS/MS analysis. Genomic DNA from growing MCF10A cells was extracted by using the DNeasy Blood & Tissue kit (Cat. no. 69504, QIAGEN). Furthermore, 50 μM N-tert-butyl-α-phenylnitrone (B7263, Sigma) was added to the DNeasy Blood & Tissue buffers to preserve the oxidized state of DNA (36). DNA samples were hydrolyzed in 0.1 M HCl for 30 min at 30°C until a clear solution was obtained. Samples were then dried in a SpeedVac and 10 μl of methanol was added into an LC vial for analysis. 4 μl of heavy 8-oxodG were added before any sample treatment. Samples (1 μl) were analysed by using an Agilent 6400 Series Triple Quadrupole LC/MS system with an HPLC 1100 series binary pump (Agilent, Waldbronn, Germany). The mobile phase was generated by mixing eluent A (0.1% formic acid) and eluent B (methanol) at a flow rate of 0.2 ml/min. The elution gradient was from 5% A to 95% B in 6 min. The tandem mass spectrometry analysis was performed in positive MRM mode. Standard solutions (500 pg/μl) of dG, 8-oxodG and (15N5) 8-oxodG were individually infused to establish the optimal instrument settings for each compound. Experimental automatic tuning using MassHunter Optimizer was employed to define ionization polarity, to select the best product ion (Q3 ion) and to optimize both the collision energy (CE) and the declustering potential (DP). Extracted mass chromatogram peaks of the analytes were integrated using Agilent MassHunter Quantitative Analysis software (B.05.00). Peak areas of the corresponding analytes were then used as quantitative measurements for assay performance assessments such as variation and linearity.
Linearity was determined using standard solutions and matrix-matched calibrations. Standard calibration curves were constructed by plotting peak areas against concentration (pg/μl), and linear functions were applied to the calibration curves. Data were integrated with MassHunter quantitative software and showed a linear trend in the calibration range, with coefficients of determination (R2) greater than 0.99 for all analytes. The limits of detection (LODs) for each species were determined by making 10 replicate measurements of blank samples spiked with low concentrations of each analyte and calculated as LOD = 3*SD. The LOQ was determined as the concentration at which the S/N ratio was 10. The MRM transitions and all the instrumental and analytical parameters are summarised in Supplementary Table S1. The possible effect of acid hydrolysis on the extent of oxidation was tested by submitting an aliquot of dG to acidic hydrolysis followed by LC–MRM mass spectrometry analysis. Supplementary Figure S1 shows that no evidence of 8-oxodG was recorded in the TIC chromatograms, which exhibited only the peaks corresponding to the dG transitions. For quantitative analyses, samples were spiked with a known amount of (15N5) 8-oxodG, submitted to acidic hydrolysis as previously described and directly analysed by tandem mass spectrometry in MRM scan mode. The analyte concentrations were calculated in pg/μl and then expressed in ppm (i.e. number of 8-oxodGs per million dGs). As an example, Supplementary Figure S1 shows the MRM transitions recorded for 8-oxodG before and after UV treatment, indicating an increase of 8-oxodG following UV exposure. Notably, direct comparison between HCl treatment and enzymatic degradation (data not shown) of the genomic DNA for LC–MS/MS quantification of 8-oxodG showed very similar results.
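The two calculations named above, LOD = 3*SD and the conversion from pg/μl to ppm (8-oxodG per million dG), can be sketched as follows. The function names are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and the molar masses are those of the free anhydrous nucleosides, which may differ from the exact species quantified.

```python
import statistics

# Molar masses in g/mol (assumed: free, anhydrous nucleosides).
MW_DG = 267.24       # 2'-deoxyguanosine
MW_8OXODG = 283.24   # 8-oxo-2'-deoxyguanosine


def limit_of_detection(replicates):
    """LOD = 3 * SD of replicate low-level measurements (sample SD assumed)."""
    return 3.0 * statistics.stdev(replicates)


def oxodg_ppm(oxodg_pg_per_ul, dg_pg_per_ul):
    """Number of 8-oxodG per million dG from two mass concentrations."""
    # Mass concentrations are divided by molar mass so that the ratio
    # counts molecules rather than mass.
    mol_oxo = oxodg_pg_per_ul / MW_8OXODG
    mol_dg = dg_pg_per_ul / MW_DG
    return 1e6 * mol_oxo / mol_dg


if __name__ == "__main__":
    print(limit_of_detection([1.0, 2.0, 3.0]))
    print(oxodg_ppm(0.5, 50000.0))
```

The molar-mass correction is small (about 6%) but keeps the ppm value consistent with its definition as a ratio of numbers of nucleosides.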
OxiDIP-sequencing and quantitative 8-oxodG immuno-precipitation assays
Genomic DNA from growing MCF10A cells or from growing MEFs was extracted by using the DNeasy Blood & Tissue kit (Cat. no. 69504, QIAGEN). 10 μg of genomic DNA per immuno-precipitation were sonicated in 100 μl TE buffer (100 mM Tris–HCl pH 8.0, 0.5 M EDTA pH 8.0) to generate random fragments ranging in size between 200 and 800 bp using a Bioruptor Plus UCD-300. 4 μg of fragmented DNA in 500 μl TE buffer were denatured for 5 min at 95°C and immuno-precipitated overnight at 4°C with 4 μl of polyclonal antibodies against 8-hydroxydeoxyguanosine (AB5830, Millipore) in a final volume of 500 μl IP buffer (110 mM NaH2PO4, 110 mM Na2HPO4 pH 7.4, 0.15 M NaCl, 0.05% Triton X-100, 100 mM Tris–HCl pH 8.0, 0.5 M EDTA pH 8.0) under constant rotation. The immuno-precipitated complex was incubated with 50 μl Dynabeads Protein G (Cat. No. 10003D, ThermoFisher Scientific; previously saturated with 0.5% bovine serum albumin diluted in PBS) for 3 h at 4°C, under constant rotation, and washed three times with 1 ml washing buffer (110 mM NaH2PO4, 110 mM Na2HPO4 pH 7.4, 0.15 M NaCl, 0.05% Triton X-100). The beads–antibody–DNA complexes were then disrupted by incubation in 200 μl lysis buffer (50 mM Tris–HCl pH 8, 10 mM EDTA pH 8, 1% SDS, 0.5 mg/ml Proteinase K) for 4 h at 37°C, and for 1 h at 52°C following addition of 100 μl lysis buffer. The immuno-precipitated DNA was purified by using the MinElute PCR Purification kit (Cat. No. 28004, QIAGEN) in a final volume of 72 μl EB buffer (provided in the kit). All the steps of the OxiDIP-Seq protocol, including the washes of the immunocomplexes, were carried out in low-light conditions. Furthermore, 50 μM N-tert-butyl-α-phenylnitrone (stock solution: 28 mM in H2O; B7263, Sigma) was added to each DNeasy Blood & Tissue buffer and to the IP and washing buffers, to preserve the oxidized DNA (36).
Conversion of ssDNA to dsDNA was obtained with the Random Primers DNA Labeling System (Cat. No. 18187-013, ThermoFisher Scientific). Library preparation was performed as described (37) using 2 ng of DIP or Input DNA. Prior to sequencing, libraries were quantified using Qubit (Invitrogen) and quality-controlled using an Agilent Bioanalyzer. 50 bp single-end sequencing was performed using the Illumina HiSeq 2000 platform according to standard operating procedures. Reads were quality checked and filtered with NGS-QC Toolkit (38). Alignments were performed with Bowtie (39) and BWA (40) to hg18 or mm9 using default parameters. SAMtools (41) and bedtools (42) were used for filtering steps and file format conversion. The peaks were identified from uniquely mapped reads without duplicates using MACS (43) (P < 1e–5 and fold enrichment >7). DNA Input was used as control. The UCSC genome browser was used for data visualization. For qPCR analysis, 3 μl of 8-oxodG immuno-precipitated DNA (antibody AB5830, Millipore) was analysed in duplicate by quantitative PCR, using SYBR Green 2X PCR Master Mix (Applied Biosystems). The primer sets used in OxiDIP-qPCR from two biological replicates are indicated in Supplementary Table S2.
Chromatin extracts from MCF10A cells were prepared as described (33). 10 ng of ChIP (or Input) DNA were used to prepare ChIP-Seq libraries with the TruSeq ChIP Sample Prep Kit (Illumina) according to the manufacturer's instructions. 50 bp single-end sequencing was performed using the Illumina HiSeq 2000 platform. Reads were quality checked and filtered with NGS-QC Toolkit. Alignments were performed with Bowtie and BWA to hg18 using default parameters. SAMtools and bedtools were used for filtering steps and file format conversion. The peaks were identified from uniquely mapped reads without duplicates using SICER (44), with an FDR of 0.01 as cutoff. DNA Input was used as control. The UCSC genome browser was used for data visualization. Two independent biological experiments were performed and tested for reproducibility with Pearson correlation coefficient analysis (0.91, P < 2.2 × 10^−15). γH2AX ChIP-Seq data in MEFs were from GSE63861. Fastq data were filtered as described above and aligned with BWA to mm9 using default parameters. HOMER (45) was used for peak detection and Input DNA was used as control.
Bioinformatic and statistical analyses
ChIP-Seq data were subjected to unbiased clustering using the SeqMINER 1.3.2 platform (46). The clustering was performed using a list of unique genes (hg18 or mm9) and the most expressed transcript (deriving from analysis of the GRO-Seq data) for each known gene symbol. All the gene loci, regardless of their length, were divided into 200 bins (20 from the 5 kb upstream of the TSS, 160 from the gene body, and 20 from the 5 kb downstream of the TTS), thus allowing direct comparisons. The length of all the bins upstream and downstream of the gene body was constant (i.e. 250 bp), while it changed for the 160 bins from the gene body, as it depends on gene size. The signal from each bin was expressed as the highest number of overlapping reads within the bin. The 200 density values measured at each gene locus resulted in one vector, representing the relative distribution of the ChIP signal over the whole region (i.e. 5 kb upstream of the TSS to 5 kb downstream of the TTS). SeqMINER k-means unbiased clustering was performed using distances computed from the sets of vectors defined above (one vector per gene) to identify genes showing similar read densities within the specified genomic window. Thus, each of the four clusters obtained by this procedure represents a group of genes having a similar distribution of 8-oxodG read densities over the gene locus. k = 4 was the lowest number of clusters providing the best separation of the 8-oxodG signals from the analysed genes (n ∼ 20 000). Results did not change when the gene body was divided into <160 bins (down to 16), to accommodate the bins of short genes to the length of the sequenced reads.
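The binning scheme described above (20 fixed 250 bp bins upstream, 160 gene-body bins whose width scales with gene length, 20 fixed bins downstream) can be sketched as below. This is not SeqMINER's implementation: the function name is hypothetical, the gene is assumed to lie on the plus strand with tss < tts, and gene-body boundaries are simply rounded to the nearest base.

```python
def gene_bins(tss: int, tts: int, flank: int = 5000,
              n_flank: int = 20, n_body: int = 160):
    """Return 200 (start, end) bins for a plus-strand gene locus."""
    flank_size = flank // n_flank       # 250 bp with the defaults
    body_size = (tts - tss) / n_body    # depends on gene length

    # Fixed-width bins covering the 5 kb upstream of the TSS.
    upstream = [(tss - flank + i * flank_size,
                 tss - flank + (i + 1) * flank_size)
                for i in range(n_flank)]
    # Variable-width bins covering the gene body from TSS to TTS.
    body = [(round(tss + i * body_size), round(tss + (i + 1) * body_size))
            for i in range(n_body)]
    # Fixed-width bins covering the 5 kb downstream of the TTS.
    downstream = [(tts + i * flank_size, tts + (i + 1) * flank_size)
                  for i in range(n_flank)]
    return upstream + body + downstream


if __name__ == "__main__":
    bins = gene_bins(tss=10000, tts=26000)
    print(len(bins), bins[0], bins[20], bins[-1])
```

Because every gene yields the same number of bins, the resulting vectors can be compared directly regardless of gene length, which is what makes the k-means clustering step possible.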
Statistical significance of the observed differences in expression levels and gene lengths among the gene clusters was evaluated by one-way ANOVA test followed by pairwise comparison of means (Bonferroni post hoc analysis). Statistical significance of the overlap between human and mouse Cluster #3 genes was evaluated by means of hypergeometric test followed by post hoc analysis.
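As an illustration of the hypergeometric test mentioned for the overlap between human and mouse Cluster #3 genes, the upper-tail probability can be computed directly from binomial coefficients. The function and parameter names are hypothetical, and the choice of background population (for example, all genes with orthologues in both species) is an assumption the text does not specify.

```python
from math import comb


def hypergeom_overlap_p(population: int, set_a: int, set_b: int,
                        overlap: int) -> float:
    """P(X >= overlap) when set_b genes are drawn at random, without
    replacement, from a population containing set_a marked genes."""
    upper = min(set_a, set_b)
    total = comb(population, set_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(comb(set_a, k) * comb(population - set_a, set_b - k)
               for k in range(overlap, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 genes in total, two sets of 5, 4 shared.
    print(hypergeom_overlap_p(10, 5, 5, 4))
```

Python's arbitrary-precision integers keep the sums exact until the final division, so the sketch also works for gene lists of realistic size.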
ChIP-Seq peaks were annotated using PAVIS (47). The hg18 genomic coordinates of peaks identified in MCF10A cells were converted to hg38 coordinates before annotation by using the UCSC tool liftOver, whereas the mm9 coordinates of peaks identified in MEF cells were used for the annotation. Relative peak enrichment was determined with the Fisher test of the bedtools suite. Linear correlations between γH2AX and 8-oxodG signals were tested by means of Pearson's correlation test on the list of unique genes.
RNA-Seq data were analysed with the RAP pipeline (48) with default parameters; transcript assembly and abundance estimation were performed with Cufflinks, and relative abundance was measured in FPKM. Read counting and differential expression analyses were performed with HTSeq and DESeq, respectively. Fastq data for MCF10A and MEF RNA-Seq were retrieved as reported in Supplementary Table S11.
Gene set enrichment analyses were performed using GSEA/MSigDB tool on the 1609 genes of the Cluster #3 that showed the highest 8-oxodG signals both in human and mouse cells.
GRO-Seq and Pol-II-Ser2 data in MCF10A were from ArrayExpress (E-MTAB-742) and NCBI GEO (GSE45715), respectively. FASTQ files were aligned using the Bowtie algorithm to identify uniquely mapping regions, allowing for a maximum of two mismatches. GRO-Seq read quantifications were performed using HTSeq (49); reads mapping from 2.5 kb upstream of the TSS to the end of the corresponding gene were considered, and transcription levels were expressed as RPKM. GRO-Seq data in MEFs were from NCBI GEO (GSE27037) and analysed as above.
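For reference, RPKM (reads per kilobase of region per million mapped reads) is a simple normalisation of a read count; a minimal sketch follows. The function name and the example numbers are illustrative assumptions, and the region length here would be the quantified window described above (from 2.5 kb upstream of the TSS to the gene end).

```python
def rpkm(read_count: int, region_length_bp: int,
         total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    kilobases = region_length_bp / 1e3
    millions = total_mapped_reads / 1e6
    return read_count / kilobases / millions


if __name__ == "__main__":
    # 500 reads over a 2 kb window in a library of 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))  # 25.0
```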
ORI density of each gene was expressed as the number of replication origins (i.e., ORC1 binding sites or Short Nascent Strands peaks identified in human and mouse cells, respectively) found within the body of the gene (from TSS to TTS), per kb. Statistical significance of the observed differences was evaluated by one-way ANOVA test followed by pairwise comparison of means (Bonferroni post hoc analysis).
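The ORI density defined above is a count divided by gene length in kb. A minimal sketch, assuming each origin is represented by a single coordinate (for example the midpoint of an ORC1 site or nascent-strand peak) on the same chromosome as the gene; the function name is hypothetical.

```python
def ori_density_per_kb(ori_positions, tss: int, tts: int) -> float:
    """Number of origins within the gene body (TSS to TTS) per kb."""
    # Sorting makes the function independent of strand orientation.
    start, end = sorted((tss, tts))
    n_in_body = sum(start <= pos <= end for pos in ori_positions)
    return n_in_body / ((end - start) / 1000)


if __name__ == "__main__":
    # Two of the three origins fall inside a 4 kb gene.
    print(ori_density_per_kb([1500, 3000, 9000], tss=1000, tts=5000))  # 0.5
```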
The numbers of genes, ORI-containing genes, ORIs within the gene body, ORIs overlapping with 8-oxodG peaks, and genes with ORIs overlapping with 8-oxodG peaks, reported in Supplementary Table S9, were computed with the bedtools suite. A Monte Carlo approach was devised to test the enrichment of the overlap between 8-oxodG peaks and ORIs within the body of Cluster #3 genes. The Monte Carlo procedure was built to compute empirical P values associated with the number of observed Cluster #3 genes containing at least one ORI overlapping with 8-oxodG peaks. In each realization, the number of ORIs overlapping random 8-oxodG peaks and the number of distinct genes containing these ORIs were drawn under the null hypothesis that 8-oxodG peaks were randomly distributed over the genome. In particular, the following resampling procedure was implemented: (i) select a random permutation of the genomic coordinates of 8-oxodG sites over the corresponding reference genome; (ii) take the subset of ORC binding sites containing at least one random 8-oxodG peak from step (i); (iii) count the number of genes from Cluster #3 containing ORC binding sites from step (ii). We repeated these steps 1000 times. We then compared the observed number of genes containing ORIs overlapping with 8-oxodG peaks with the corresponding series of 1000 random realizations from the Monte Carlo simulation, and considered the observed value statistically significant if it was greater than all the simulated values.
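The resampling steps (i) to (iii) can be sketched on a toy one-dimensional genome as follows. This is an illustrative reimplementation, not the authors' script: all names are hypothetical, intervals are half-open (start, end) tuples on a single chromosome, "containing" is approximated by any overlap, and the empirical P value uses the common (r + 1)/(n + 1) estimator, which the text does not specify.

```python
import random


def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def count_hit_genes(peaks, oris, genes):
    """Genes containing at least one ORI that overlaps a peak."""
    hit_oris = [o for o in oris if any(overlaps(o, p) for p in peaks)]
    return sum(any(overlaps(g, o) for o in hit_oris) for g in genes)


def empirical_p(peaks, oris, genes, genome_length, n_iter=1000, seed=0):
    """Observed count and empirical P value under random peak placement."""
    rng = random.Random(seed)
    observed = count_hit_genes(peaks, oris, genes)
    at_least = 0
    for _ in range(n_iter):
        # Step (i): relocate every peak at random, keeping its width.
        shuffled = []
        for start, end in peaks:
            width = end - start
            new_start = rng.randrange(0, genome_length - width)
            shuffled.append((new_start, new_start + width))
        # Steps (ii) and (iii): recount genes with ORIs hit by random peaks.
        if count_hit_genes(shuffled, oris, genes) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_iter + 1)


if __name__ == "__main__":
    genes = [(1000, 5000), (20000, 30000)]
    oris = [(2000, 2100), (25000, 25100)]
    peaks = [(2050, 2150), (25050, 25150)]
    print(empirical_p(peaks, oris, genes, genome_length=1_000_000))
```

With 1000 realizations the smallest attainable P value under this estimator is 1/1001, which corresponds to the criterion in the text that the observed value exceeds all simulated values. For real data, interval handling across chromosomes would more naturally be delegated to a tool such as bedtools.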
Bedtools was used to analyse the overlap between genes in each cluster from MCF10A cells and CFSs mapped at the molecular level (50), or cancer deletions (51 and CosmicStructExport v80.tsv).
The density of 8-oxodG peaks previously identified in MEFs (28) was determined using the bedtools suite and measured for each gene as the number of OG-peaks/100 kb. Statistical significance of enrichment of ORI density in Cluster #3 genes was evaluated by means of one-way ANOVA followed by pairwise comparison of means (Bonferroni post hoc analysis).
G4 analysis within the 8-oxodG peaks was carried out by applying the Quadron tool (52), a machine learning algorithm, using default options. Ten random permutations of the 52 298 8-oxodG peaks were obtained with the bedtools suite and analysed in Quadron.
This study was conducted using 0.05 as the significance threshold. All statistical analyses, except seqMINER, were performed with R (R Development Core Team, 2016).
The origin of DNA replication is one of the most enigmatic subjects in the reconstruction of the early stages in the evolution of life because the replicative DNAPs (as well as primases and the main helicases involved in replication) are not homologous among bacteria, archaea, and eukaryotes. Until recently, this lack of conservation of the key elements of the DNA replication machinery precluded reconstruction of the ancestral state, suggesting multiple origins for DNA replication and even the possibility that LUCA was an RNA-based cell [2, 5]. However, given the universal conservation of other components of the replication apparatus, such as PCNA (sliding clamp), clamp loader ATPase, and ssDNA-binding protein, along with the inferred relatively high complexity of LUCA, comparable to that of modern prokaryotes, such scenarios appear unlikely. The line of reasoning developed here, based primarily on the recently discovered evolutionary connection between PolD and the universally conserved RNAP, allows inference of the ancestral DNAP. Under this scenario, the first transcriptase (RNAP) and the first replicative DNAP evolved from a common ancestor that probably functioned as an RdRP. Thus, the replicative DNAP of the LUCA was the direct ancestor of the extant archaeal replicative DNAP, PolD. The proposed evolutionary scenario appears parsimonious in that the two key processes associated with the advent of DNA genomes, replication and transcription, derive from a common ancestor. An alternative candidate for the role of the replicative DNAP of LUCA could potentially be PolB. However, a PolB-centered scenario for the evolution of replication lacks the symmetry in the evolution of replication and transcription. Besides, PolB is the replicative DNAP only in Crenarchaeota, eukaryotes, and diverse viruses infecting hosts in all three cellular domains, a distribution that seems most compatible with an origin in viruses or mobile genetic elements.
The proposed scenario traces two lines of descent from primordial RNA-binding domains, DPBB and RRM, to RdRPs (RTs) to RNAPs and DNAPs (Fig. 2a). Of these evolutionary lineages, the DPBB lineage is associated with the evolution of cells and the RRM lineage with the evolution of viruses and mobile genetic elements. The causes of this asymmetry between hosts and parasites remain enigmatic. A notable aspect of the emerging picture of the evolution of replication and transcription is the switch between RNA and DNA templates and products, which clearly occurred on multiple occasions in evolution. Although highly challenging, validation of the current evolutionary scenario by experimental reconstruction of ancestral forms of RNA and DNA polymerases does not seem to be out of the question.
Materials and Methods
Culture of MEFs
Mouse embryonic fibroblasts (MEFs) were cultured in high glucose Dulbecco's modified Eagle's medium (DMEM; Sigma, Gillingham, UK) with 10% foetal bovine serum (FBS; Sigma), 2 mM L-glutamine (Invitrogen Life Technologies, Paisley, UK), 1% non-essential amino acids (NEAA; Invitrogen Life Technologies) and 1% penicillin/streptomycin solution (Sigma). Inactivation of MEFs for use as mESC feeders was carried out by γ irradiation (22.62 grays) or mitomycin C (Sigma) treatment. Confluent MEFs were treated with mitomycin C (10 μg/ml) for 2 hours at 37°C/5% CO2, washed three times with PBS (Sigma) and cultured overnight or frozen before use as feeder cells for undifferentiated R1 or D3 mESCs.
Culture of mESCs
R1 mESCs (Nagy et al., 1993) were cultured in knockout DMEM with 15% serum replacement, 1% penicillin/streptomycin solution, 2 mM L-glutamine, 1% NEAA (all Invitrogen Life Technologies), 0.1 mM β-mercaptoethanol (Sigma) and 1000 U/ml leukaemia inhibitory factor (LIF; Chemicon, Temecula, CA, USA). D3 (Doetschman et al., 1985) and CCE/R mESCs were cultured in high glucose DMEM (Sigma) with 15% ESC screened FBS (Hyclone, Utah, US), 1% penicillin/streptomycin solution (Invitrogen Life Technologies), 2 mM L-glutamine (Invitrogen Life Technologies), 1% NEAA (Invitrogen Life Technologies), 0.1 mM β-mercaptoethanol (Sigma) and 1000 U/ml LIF.
Differentiation of mESCs
Undifferentiated R1 and D3 mESCs were induced to differentiate using the hanging-drop method (Keller, 1995). Briefly, mESCs (day 0) were dissociated with 0.25% trypsin-EDTA (Sigma) for 2 minutes, resuspended in mESC medium (minus LIF) and plated as 20 μl droplets (approximately 450 cells per drop) on the lid of an inverted Petri dish (Sterilin, Staffordshire, UK) for 48 hours (days 1-2) to promote the formation of embryoid bodies (EBs). EBs were then placed into suspension for a further 5 (days 3-7, D3 mESCs) or 6 (days 3-8, R1 mESCs) days at 37°C/5%CO2. EBs were then plated onto gelatin (0.1%; Sigma)-coated six-well plates (Nunc, Roskilde, Denmark) and cultured up to day 20 of differentiation.
RA-induced differentiation of mESCs (D3)
mESCs were treated as for spontaneous differentiation, except that, during the hanging-drop stage (days 1-2), retinoic acid (RA; 10⁻⁷ M) was added to the medium.
The following FAM-labelled siRNAs (Ambion, Huntingdon, UK) were used: a negative control siRNA complex with no homology to mouse, rat or human genomes (supplied by Ambion) and an siRNA complex targeting exons 13 and 14 of Polg mRNA (siRNA sequence 5′-GGACGGUAACAACUACAAUtt-3′ and 5′-AUUGUAGUUGUUACCGUCCtt-3′). Undifferentiated CCE/R mESCs underwent transfection with either of the siRNA complexes (75 nM) using Lipofectamine 2000 (Invitrogen Life Technologies), or were not transfected (control), for 24 hours in R1 mESC media (15% knockout serum). After 24 hours, fresh CCE/R media (15% Hyclone FBS) was added and the cells were maintained in culture for a further 72 hours.
Fluorescence-activated cell sorting (FACS)
All cells were sorted on a MoFlo Cell Sorter (Dakocytomation, Cambridgeshire, UK) and analysed using the CellQuest/Summit software. Undifferentiated R1 and D3 mESCs were sorted for the stage-specific embryonic antigen 1 (SSEA-1). These cells were incubated with 0.25% trypsin-EDTA for 2 minutes, triturated into a single cell suspension and incubated with the SSEA-1 antibody (1:50; Developmental Studies Hybridoma Bank, University of Iowa, USA) for 30 minutes at 37°C in mESC media plus LIF. The cells were then washed three times by centrifugation, incubated with the Alexa-Fluor-488 anti-mouse antibody (Molecular Probes, Paisley, UK) for 30 minutes at 37°C and washed three times by centrifugation. The SSEA-1-positive R1 and D3 mESCs were then sorted. The undifferentiated CCE/R cells were washed three times with PBS, trypsinised and resuspended in CCE/R mESC media 96 hours after transfection. The FAM-labelled CCE/R and non-transfected cells were sorted.
Total RNA was extracted from undifferentiated, spontaneously differentiated and RA-induced (days 1-7) D3 mESCs, and from undifferentiated and spontaneously differentiated (days 1-13) R1 mESCs. Total RNA was extracted using the RNAqueous-4PCR kit (Ambion) according to the manufacturer's instructions. Samples were treated with DNase I (4 units, Ambion) for 2 hours at 37°C, after which it was inactivated with DNase inactivation reagent (Ambion). The RNA was reverse transcribed using the Reverse Transcription System (Promega, Southampton, UK). Reactions contained 800 ng/μl of RNA, 2 μl of 10× RT-Buffer, 5 mM of MgCl2, 1 mM of dNTP mixture, 0.25 μg Oligo (dT) primer, 1 U/μl of RNasin Ribonuclease Inhibitor, 15 U/μg of AMV Reverse Transcriptase and ultrapure H2O (Sigma) up to 20 μl. Reactions were incubated at 42°C for 2 hours followed by 95°C for 10 minutes to denature the AMV Reverse Transcriptase.
Total DNA was extracted from undifferentiated, spontaneously differentiated and RA-induced (days 1-7) D3 mESCs, and undifferentiated and spontaneously differentiated (days 1-13) R1 mESCs. mESCs (10.0 to 7.5×10⁴ cells/μl) were resuspended in ultrapure H2O, freeze-thawed twice and vigorously pipetted to release DNA into the solution (Lloyd et al., 2006).
Each real-time PCR reaction (15 μl total volume) contained 2 μl of template (cDNA or DNA), 7.5 μl of 2× Sensi Mix (Quantace, London, UK), 0.3 μl of 50× SYBR Green I Solution (Quantace), 0.33 μM of each of the forward and reverse primers (Invitrogen Life Technologies) and ultrapure H2O (Sigma). The following products were amplified: a 293 bp product of Dppa5 cDNA with Dppa5-F (5′-GCTTGATCTCGTCTTCCCTG-3′) and Dppa5-R (5′-TCCATTTAGCCCGAATCTTG-3′); a 336 bp product of Pramel7 cDNA with Pramel7-F (5′-AGAGAACCCACATGGCTTTG-3′) and Pramel7-R (5′-GGATTTGGCTTGGCATACAT-3′); a 475 bp product of Ndp52l1 cDNA with Ndp52l1-F (5′-TTGATGCTCTTGCACAGGAC-3′) and Ndp52l1-R (5′-TCACTGTTAGCACTGCCTG-3′); a 188 bp product of Polg cDNA with Polg-F (5′-GGACCTCCCTTAGAGAGGGA-3′) and Polg-R (5′-AGCATGCCAGCCAGAGTCACT-3′); a 214 bp product of Polg2 cDNA with Polg2-F (5′-ACAGTGCCTTCAGGTTAGTC-3′) and Polg2-R (5′-ACTCCAATCTGAGCAAGACC-3′); a 165 bp product of Tfam cDNA with Tfam-F (5′-GCATACAAAGAAGCTGTGAG-3′) and Tfam-R (5′-GTTATATGCTGAACGAGGTC-3′); a 366 bp product of the Gapdh gene or cDNA with Gapdh-F (5′-GGGAAGCCCATCACCATCTTC-3′) and Gapdh-R (5′-AGAGGGGCCATCCACAGTCT-3′); and a 211 bp product of the tRNA-Tyr and mt-Co1 genes of the mtDNA with tRNA-Tyr/mt-Co1-F (5′-CAGTCTAATGCTTACTCAGC-3′) and tRNA-Tyr/mt-Co1-R (5′-GGGCAGTTACGATAACATTG-3′). Reactions were performed in a Rotorgene-3000 real-time PCR machine (Corbett Research, Cambridge, UK). Initial denaturation was performed at 95°C for 10 minutes, followed by 50 cycles of: denaturation at 95°C for 10 seconds; annealing at 53°C (mtDNA and Tfam), 55°C (Polg2) or 60°C (Dppa5, Pramel7, Ndp52l1, Polg and Gapdh) for 15 seconds; and extension at 72°C for 15 (Polg, Tfam and mtDNA), 20 (Dppa5, Pramel7 and Polg2), 23 (Gapdh) or 30 (Ndp52l1) seconds. For the standards, a series of tenfold dilutions (2×10⁻² ng/μl to 2×10⁻⁹ ng/μl) of the target-specific PCR product were generated.
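A tenfold dilution series of this kind is typically used to build a standard curve, from which amplification efficiency and unknown concentrations can be read off. The following is a minimal sketch in Python, not the Rotor Gene software's own calculation: the function names and the Ct values are hypothetical, and the efficiency formula E = 10^(-1/slope) is the standard relationship for a fit of Ct against log10 concentration.

```python
import math

def tenfold_dilutions(start_ng_per_ul, n_points):
    """Concentrations of a tenfold dilution series, highest first."""
    return [start_ng_per_ul / (10 ** i) for i in range(n_points)]

def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amplification_efficiency(slope):
    """E = 10^(-1/slope); E = 2 corresponds to perfect doubling per cycle."""
    return 10 ** (-1.0 / slope)

def quantify(ct, slope, intercept):
    """Read a concentration (ng/ul) off the fitted standard curve."""
    return 10 ** ((ct - intercept) / slope)

# Eight standards from 2e-2 ng/ul down to 2e-9 ng/ul, as in the text.
standards = tenfold_dilutions(2e-2, 8)

# Hypothetical Ct values (illustration only, not measured data):
# each tenfold dilution delays the signal by 3.32 cycles.
hypothetical_cts = [10.0 + 3.32 * i for i in range(8)]

slope, intercept = fit_standard_curve(standards, hypothetical_cts)
efficiency = amplification_efficiency(slope)
print(f"slope={slope:.2f}, efficiency={efficiency:.3f}")
```

The fitted efficiency is the per-gene value that a relative-quantification method needs when amplification is not assumed to be a perfect doubling.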
Data were acquired in the FAM/SYBR channel during the extension phase. In order to eliminate primer dimerisation from the analysis, a fourth step with data acquisition was added. The temperature at which the specific product started to melt was determined using dissociation curves (76°C for mtDNA, 78°C for Polg2 and Tfam, 81°C for Ndp52l1 or 82°C for Polg and Gapdh) and a second acquisition phase of 15 seconds in the FAM/SYBR channel was programmed to allow measurements of fluorescence from specific product only. Melt-curve analysis was conducted by ramping from 62°C to 99°C (rising 1°C every 5 seconds) and data were acquired from the FAM/SYBR channel. Analyses were performed using the Rotor Gene software (version 7; Corbett Research). Each sample was run in triplicate in two separate reactions, generating six readings per gene. From those values, only the middle four were considered in order to diminish pipetting errors (Bustin, 2000). Data were expressed as mean ± s.e.m.
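The reduction from six readings to a mean ± s.e.m. can be sketched as follows. This is an illustrative Python sketch under two assumptions that the text does not spell out: that "the middle four" means discarding the single highest and single lowest of the six readings, and that the s.e.m. is the sample standard deviation (n − 1 denominator) divided by √n. The function names and the readings are hypothetical.

```python
import math

def middle_four(readings):
    """Drop the lowest and highest of six readings (assumed interpretation)."""
    if len(readings) != 6:
        raise ValueError("expected exactly six readings per gene")
    return sorted(readings)[1:5]

def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

# Hypothetical readings for one gene (two runs of triplicates).
readings = [20.1, 19.8, 20.0, 25.0, 20.2, 15.0]
kept = middle_four(readings)
mean, sem = mean_sem(kept)
print(kept, round(mean, 3), round(sem, 3))
```

Trimming one value from each end makes the summary robust to a single outlying well in either direction, at the cost of reducing n from six to four.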
The Pfaffl method of relative expression was used to compare the expression of Polg, Polg2, Tfam, Dppa5, Pramel7 and Ndp52l1 in differentiating mESCs to the undifferentiated mESCs (Pfaffl, 2001). This method provides real-time RT-PCR relative quantification of samples with an unknown number of cells against a separate reaction for the housekeeping gene (Gapdh).
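In the Pfaffl (2001) model, the expression ratio is the target gene's efficiency raised to its Ct shift (control minus sample), divided by the reference gene's efficiency raised to its own Ct shift. A minimal Python sketch of that formula follows; the function name, argument names and Ct values are hypothetical, and here "control" stands for undifferentiated mESCs and "sample" for differentiating mESCs.

```python
def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Relative expression ratio of a target gene, normalised to a reference gene.

    ratio = E_target ** (Ct_control - Ct_sample)_target
            / E_ref ** (Ct_control - Ct_sample)_ref
    """
    delta_ct_target = ct_target_control - ct_target_sample
    delta_ct_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_ct_target) / (e_ref ** delta_ct_ref)

# Hypothetical Ct values (illustration only): the target appears two cycles
# earlier in the sample while the reference gene (e.g. Gapdh) is unchanged.
ratio = pfaffl_ratio(e_target=2.0, ct_target_control=25.0, ct_target_sample=23.0,
                     e_ref=2.0, ct_ref_control=18.0, ct_ref_sample=18.0)
print(ratio)  # 4.0
```

A ratio above 1 indicates higher expression in the sample than in the control after normalisation; using per-gene efficiencies rather than a fixed value of 2 is what distinguishes this approach from the simpler 2^(−ΔΔCt) calculation.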
Undifferentiated, spontaneously differentiated and RA-induced (days 1-7 and day 20) D3 mESCs were incubated with 0.25% trypsin-EDTA for 2 minutes and triturated into a single cell suspension. D3 mESCs were then resuspended in D3 mESC differentiation media (D3 mESC media containing LIF was used for undifferentiated mESCs), plated onto gelatin-coated coverslips and incubated overnight at 37°C/5%CO2. D3 mESCs were fixed in 2% formaldehyde (Sigma) for 1 hour, permeabilised for 30 minutes in 1% (v/v) Triton X-100 (Sigma), placed in blocking solution for at least 30 minutes and incubated with the primary antibodies. MT-CO1 (1:100; Molecular Probes) and the neuroectoderm markers vimentin (1:50; Sigma), nestin (1:200; BD Pharmingen, Oxford, UK) and β-tubulin III (1:100; Sigma) were co-labelled with the mitochondrial replication factors POLG (1:100; Abcam, Cambridge, UK) and TFAM (1:200; Santa Cruz Biotechnology, Santa Cruz, California) by incubating for 2 hours at 37°C. The mesodermal marker brachyury (1:400; R&D Systems, Abingdon, UK) was co-labelled with POLG and TFAM by incubating overnight at 4°C. The cells were washed three times with 0.1% Triton X-100 for 5 minutes at room temperature then incubated with two of 459 anti-rabbit (POLG and TFAM), 488 anti-mouse (MT-CO1, vimentin, nestin and β-tubulin III) or 488 anti-goat (brachyury) secondary antibody (Molecular Probes) for 1 hour at 37°C. The cells were further washed three times with 0.1% Triton X-100 for 5 minutes and placed onto slides using mounting medium containing DAPI (Vectashield; Vector Labs, Peterborough, England).
For BrdU and MitoTracker labelling, D3 mESCs were plated onto gelatin-coated coverslips and incubated overnight in D3 mESC media containing 10 μM of BrdU (Roche Applied Sciences, Sussex, UK) at 37°C/5%CO2. D3 mESCs were then washed with pre-warmed PBS and incubated with 25 nM MitoTracker (Molecular Probes) in D3 mESC media for 45 minutes at 37°C/5%CO2. Cultures were fixed for 15 minutes in 4% formaldehyde and permeabilised in 1% (v/v) Triton X-100 for 15 minutes. D3 mESCs were incubated with recommended concentrations of the BrdU antibody (1:10, Roche Applied Sciences) and the anti-mouse antibody (1:10, Roche Applied Sciences) at 37°C for 30 minutes. A series of three washes using washing buffer (Roche Applied Sciences) was performed after the antibody incubations. D3 mESCs were placed onto slides using mounting medium containing DAPI.
Microscopy and imaging methods
Fluorescence microscopy was performed using a Nikon Eclipse E600 (Nikon, Surrey, UK). mESCs were visualised using a 40× magnification oil lens and captured on a Hamamatsu Digital Camera C4742-95. Data were collected using the IPlab 3.7 software (Nikon). DAPI was excited at 400 nm and detected at 420 nm, FITC was excited at 505 nm and detected between 515 and 555 nm, and rhodamine was excited at 595 nm and detected between 600 and 660 nm. Negative controls were used to determine the appropriate exposure time and prevent the generation of false positives in the samples.
Cell lysates were prepared from non-transfected and siRNA-transfected sorted CCE/R cells, human embryonic kidney cells and MEFs. To obtain sufficient protein, lysates were pooled from separate experiments. Equal amounts of protein (100 μg) were electrophoresed on a 9% SDS-PAGE gel then semi-dry blotted onto a polyvinylidene difluoride membrane by applying a current of approximately 0.8 mA/cm² of gel for 70 minutes. The membrane was blocked with 5% dried milk powder (Marvel, Premier Brands, UK) in Tris buffered saline solution (TBS) containing 0.1% Tween 20 (Sigma), for at least 1 hour at room temperature. The membrane was then probed with antibodies against either β-actin (1:10,000; Sigma), POLG (1:100; Abcam), OCT4 (1:10,000; Abcam) or brachyury (1:250; R&D Systems) overnight at 4°C. The membrane was washed with TBS/0.1% Tween 20 for 25 minutes with a change of wash solution every 5 minutes. The membrane was probed with secondary antibodies conjugated with horseradish peroxidase (Vector Labs), anti-rabbit (1:5000), anti-mouse (1:5000), anti-goat (1:5000) or anti-rabbit (1:5000), for 1 hour at room temperature. The membrane was washed with TBS/0.1% Tween 20 for 25 minutes with a change of wash solution every 5 minutes and the bands were detected by enhanced chemiluminescence reagent (Perbio Science, Cramlington, UK). Between probing with each antibody, the blot was stripped using Restore Western Blot Stripping solution (Perbio Science). Densitometry analysis was performed using ImageJ software (http://rsb.info.nih.gov/ij).
One-way ANOVA was used to determine differences in Polg, Polg2 and Tfam expression, and in mtDNA copy number, in spontaneously differentiated R1 and D3 mESCs and RA-induced D3 mESCs. Differences in the percentages of vimentin-, nestin-, β-tubulin III- and brachyury-labelled cells (vimentin+, nestin+, β-tubulin III+ and brachyury+ cells, respectively) in spontaneously differentiated and RA-induced D3 mESCs, and differences in Dppa5, Pramel7 and Ndp52l1 expression during spontaneous differentiation of R1 mESCs, were also determined by one-way ANOVA. Two-way ANOVA was used to determine differences in Polg, Polg2 and Tfam expression, mtDNA copy number, and the percentages of vimentin+, nestin+, β-tubulin III+ and brachyury+ cells between the spontaneous and RA-induced differentiation protocols. Data were transformed logarithmically where appropriate. Statistical differences were determined using unpaired t-tests and groups were considered significantly different when P<0.05.
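To make the one-way ANOVA step concrete, the sketch below computes the F statistic from first principles in Python. It is an illustration only, not the software used in the study: the function name and the data are hypothetical, and the P value is deliberately left out because it requires the F-distribution CDF, which a statistics package would normally supply (for example, `scipy.stats.f_oneway` returns both F and P).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for three time points (illustration only).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)  # 3.0 2 6
```

Where the text mentions logarithmic transformation, the same function could be applied to the log-transformed values of each group before computing F.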