Spotlight #33 – Cell Biology & Molecular Machines: DNA Replication, Protein Folding, ATP Synthase and CRISPR-Cas9

Life is chemistry running on nanometre-scale machines that evolution has refined for more than three billion years. ATP synthase spins at roughly 100–400 revolutions per second, converting a proton gradient into a universal energy currency. Ribosomes decode mRNA into polypeptides at ~20 amino acids per second. Kinesin walks hand-over-hand along microtubules, carrying cargo in precise 8 nm steps. This post maps the physics (thermodynamics, statistical mechanics, polymer dynamics) behind five types of molecular machinery that define what it means to be a living cell.

Molecular cell biology is biophysics at its most detailed: single molecules exerting forces of a few piconewtons, diffusing and reacting within a compartment barely a micron across, following thermodynamic laws that tolerate no exceptions. Single-molecule techniques — optical tweezers, FRET, cryo-EM, atomic force microscopy — have made it possible to watch individual molecular machines operate in real time, revealing mechanisms that are simultaneously elegant and astonishing.

1. DNA Replication and the Replisome

The double helix of DNA stores genetic information in the sequence of base pairs: adenine pairs with thymine (2 hydrogen bonds), guanine with cytosine (3 hydrogen bonds). Before a cell divides, the entire genome must be copied with extraordinary fidelity. In humans, ~6 billion base pairs are replicated in ~8 hours by a coordinated molecular assembly called the replisome, achieving an error rate of roughly 1 in 10^{10} after proofreading and mismatch repair.

DNA Replication Kinetics and Fidelity

Watson-Crick base pair stability (ΔG at 37°C, aqueous):
  AT pair: ΔG ≈ −1 kcal/mol (2 H-bonds)
  GC pair: ΔG ≈ −3 kcal/mol (3 H-bonds)
  Base-stacking of adjacent pairs contributes ~−1 to −3 kcal/mol

DNA duplex melting temperature T_m:
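  T_m = 81.5 + 16.6 log10[Na+] + 0.41(%GC) − 675/n  (salt-adjusted empirical formula; [Na+] in mol/L, n = length in bp; the Wallace rule proper is T_m ≈ 2(A+T) + 4(G+C) °C for short oligos)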
  Oligo T_m: often 55–65 °C for 20-mer primers in PCR
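
As a rough illustration, the salt-adjusted formula above can be evaluated directly. This is a minimal sketch: the function name, the example primer and the 50 mM Na+ default are assumptions chosen for illustration, and the result shifts noticeably with salt concentration and with the choice of formula.

```python
import math

def melting_temp_c(seq: str, na_molar: float = 0.05) -> float:
    """Salt-adjusted T_m estimate in deg C, using the empirical formula quoted above."""
    seq = seq.upper()
    n = len(seq)
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / n

primer = "AGCGTTCAGGCTAGCTTGCA"  # hypothetical 20-mer with 55% GC
print(f"T_m estimate: {melting_temp_c(primer):.1f} C at 50 mM Na+")
print(f"T_m estimate: {melting_temp_c(primer, na_molar=0.2):.1f} C at 200 mM Na+")
```

Raising the GC fraction or the salt concentration raises the estimate, which is the qualitative behaviour the formula is meant to capture.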

Replication machinery (E. coli replisome):
  Helicase (DnaB): unwinds dsDNA at ~1000 bp/s, uses ATP hydrolysis
  Primase (DnaG):  synthesises short RNA primers (~10 nt) for each Okazaki fragment
  DNA Pol III:     processive synthesis, ~1000 nt/s, 3'→5' proofreading exonuclease
  Leading strand (continuous): ~1000 nt/s; lagging strand (discontinuous): Okazaki fragments 1–2 kb
  Ligase (LigA): seals nicks after Okazaki fragment maturation

Replication fidelity:
  Bare polymerase insertion error rate: ~10^{-5} per base
  3'→5' proofreading exonuclease: reduces to ~10^{-7}
  Mismatch repair (MutS/MutL/MutH): reduces to ~10^{-10}
  Human haploid genome, 3×10^9 bp × 10^{-10} per base: ~0.3 errors per genome copy (~0.6 per diploid division); this residual error is the raw material of evolution
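
The three fidelity layers multiply, which a few lines of arithmetic make explicit. This is a sketch using the order-of-magnitude values listed above; the function and parameter names are illustrative.

```python
def replication_errors(genome_bp: float,
                       insertion_error: float = 1e-5,
                       proofreading_gain: float = 1e2,
                       mismatch_repair_gain: float = 1e3):
    """Return (error rate per base, expected errors per genome copy)."""
    # Each layer divides the error rate left over by the previous one.
    per_base = insertion_error / proofreading_gain / mismatch_repair_gain
    return per_base, per_base * genome_bp

per_base, per_copy = replication_errors(3e9)  # haploid human genome
print(f"{per_base:.1e} errors per base, {per_copy:.2f} errors per haploid copy")
```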

Human replication:
  ~30 000 replication origins per haploid genome
  Average interorigin distance: ~100 kb
  Fork speed: ~1 kb/min (vs 60 kb/min in bacteria)
  S-phase duration: ~8 h
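
A quick consistency check on these numbers, as a sketch: two forks converging from neighbouring origins close a 100 kb gap in well under an hour, far less than the ~8 h S phase. That gap is what one would expect if origins fire at different times rather than all at once. The function name is an illustrative choice.

```python
def min_replication_time_min(interorigin_kb: float, fork_kb_per_min: float) -> float:
    """Time for two converging forks to close the gap between neighbouring origins."""
    return interorigin_kb / (2.0 * fork_kb_per_min)

human = min_replication_time_min(100, 1)  # values from the list above
print(f"Human replicon: {human:.0f} min if every origin fired at t = 0")
print(f"Origins x spacing: {30_000 * 100e3:.1e} bp (compare the 3e9 bp haploid genome)")
```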

2. Protein Folding and the Energy Landscape

A polypeptide chain of N amino acids has ~3^N possible backbone conformations (assuming about three states per residue): for a 100-residue protein that is ~5×10^47, far too many to visit one at a time. Yet most proteins fold spontaneously to a unique native state in microseconds to seconds. Levinthal’s paradox (1969) pointed out that an exhaustive random search would take vastly longer than the age of the universe, so folding must follow a directed mechanism. The modern answer is the folding funnel: the free-energy landscape over conformational space is shaped like a rough funnel in which partially folded structures are, on average, lower in free energy than fully unfolded ones, so the chain is biased downhill towards the native state.
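
The scale of Levinthal’s argument is easy to reproduce. The sketch below assumes three states per residue and a sampling rate of 10^13 conformations per second; both are illustrative assumptions, not measured values.

```python
SECONDS_PER_YEAR = 3.156e7

def levinthal_search_years(n_residues: int,
                           states_per_residue: int = 3,
                           samples_per_second: float = 1e13):
    """Return (number of conformations, years needed to visit each one once)."""
    conformations = states_per_residue ** n_residues  # exact Python integer
    years = conformations / samples_per_second / SECONDS_PER_YEAR
    return conformations, years

n_conf, years = levinthal_search_years(100)
print(f"3^100 = {n_conf:.2e} conformations, exhaustive search ~ {years:.1e} years")
```

Even this generous sampling rate leaves the search many orders of magnitude longer than the ~1.4×10^10-year age of the universe.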

Protein Folding Thermodynamics and AlphaFold

Free energy of folding:
  ΔG_fold = ΔH_fold − TΔS_fold
  ΔH_fold: H-bonds, van der Waals, disulphide bonds (− a few kcal/mol each)
  ΔS_fold: chain entropy loss (unfavourable) + hydrophobic effect (favourable)
  Net ΔG_fold ≈ −5 to −15 kcal/mol for typical stable protein
  Thermal energy k_B T ≈ 0.6 kcal/mol at 300 K
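
To see what a folding free energy of a few kcal/mol means in practice, a two-state (folded/unfolded) Boltzmann population can be computed. This is a minimal sketch that assumes a strictly two-state protein; the function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def fraction_folded(dg_fold_kcal: float, temp_k: float = 300.0) -> float:
    """Two-state equilibrium: fraction of molecules in the native state."""
    return 1.0 / (1.0 + math.exp(dg_fold_kcal / (R_KCAL * temp_k)))

print(f"RT at 300 K = {R_KCAL * 300:.2f} kcal/mol")
for dg in (0.0, -1.0, -5.0, -15.0):
    print(f"dG_fold = {dg:6.1f} kcal/mol -> unfolded fraction {1 - fraction_folded(dg):.1e}")
```

A net stability of only ~10 RT is enough to keep the unfolded population very small, which is why the marginal ΔG_fold values above still give a well-defined native state.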

Hydrophobic effect (dominant driving force):
  Nonpolar residues buried in core → ordered water shell released → large entropy gain
  Transfer of nonpolar surface area: ΔG ≈ 25 cal/(mol Ų)
  Core ~50% hydrophobic residues (Leu, Val, Ile, Phe, Met, Ala)

Folding rates (Thirumalai 1995, Plaxco 1998 correlation):
  Two-state folders: ln k_f decreases roughly linearly with relative contact order (CO)
  Fast folders: μs–ms (e.g., villin headpiece, 35 residues, folds in ~5 μs)
  Slow folders: minutes–hours (prions, large β-sheet proteins)

Molecular chaperones:
  Hsp70 (DnaK): binds exposed hydrophobic segments, ATP-dependent
  GroEL/GroES (Hsp60/Hsp10): cage sequesters non-native protein; each folding cycle hydrolyses 7 ATP per GroEL ring
  ~30% of newly synthesised E. coli proteins interact with chaperones

AlphaFold2 (Jumper et al., Nature 2021):
  Transformer architecture trained on 170 000 PDB structures
  Predicted ~200 M structures in AlphaFold DB (covering most known protein families)
  Accuracy: median GDT_TS ≈ 92 (scale 0–100) on CASP14 targets
  Key innovations: Evoformer (coevolutionary attention), invariant point attention
  Remaining challenges: intrinsically disordered regions, protein-RNA complexes, membrane dynamics

Protein misfolding is the basis of amyloid diseases: Alzheimer’s (Aβ/tau), Parkinson’s (α-synuclein), type 2 diabetes (IAPP), and prion diseases (PrP). In each case a normally soluble protein adopts a β-sheet-rich fibrillar state that is thermodynamically more stable than the native fold but biochemically toxic.

3. ATP Synthase: A Proton-Driven Rotary Motor

ATP synthase (F_0F_1-ATPase) is arguably the most important enzyme in biology: it synthesises ~90% of the ATP used by eukaryotic cells and by most bacteria. It is driven by the proton-motive force, a combination of pH gradient and membrane potential across the inner mitochondrial membrane (or the bacterial plasma membrane), first described by Peter Mitchell in his chemiosmotic hypothesis (Nobel Prize 1978). The enzyme is a literal rotary motor: the rotor spins at roughly 100–400 revolutions per second, and Paul Boyer (Nobel Prize 1997, shared with John Walker and Jens Skou) proposed that the three catalytic sites in the F_1 head operate sequentially through a binding change mechanism, a picture later confirmed by Walker’s crystal structure and by direct observation of rotation.

ATP Synthase Thermodynamics and Rotation

Proton-motive force (PMF):
  ΔG_pmf = ΔG_p + ΔG_pH
  ΔG_p = F Δψ  (electrical component; Δψ ≈ −150 to −180 mV in mitochondria)
  ΔG_pH = −2.3 RT ΔpH  (ΔpH ≈ 0.5–1.0 unit ≈ 30–60 mV equivalent)
  Total PMF ≈ −200 mV ≈ −4.6 kcal/mol per proton
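
The two PMF components can be added numerically. The sketch below assumes the sign convention Δψ = ψ_matrix − ψ_outside and ΔpH = pH_matrix − pH_outside, a temperature of 310 K, and mid-range example values; these conventions and the function name are assumptions made for illustration.

```python
import math

FARADAY = 96485.0  # C/mol
R_GAS = 8.314      # J/(mol K)

def proton_motive_force(delta_psi_v: float, delta_ph: float, temp_k: float = 310.0):
    """Free energy of moving one mole of H+ into the matrix, as (mV, kcal/mol)."""
    dg_electrical = FARADAY * delta_psi_v                 # F * delta_psi
    dg_ph = -math.log(10) * R_GAS * temp_k * delta_ph     # -2.3 RT * delta_pH
    dg_total = dg_electrical + dg_ph                      # J/mol
    return dg_total / FARADAY * 1000.0, dg_total / 4184.0

mv, kcal = proton_motive_force(delta_psi_v=-0.16, delta_ph=0.75)
print(f"PMF = {mv:.0f} mV = {kcal:.2f} kcal/mol per proton")
```

With these inputs the electrical term contributes roughly three quarters of the total, consistent with the breakdown above.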

ATP synthesis thermodynamics:
  ADP + Pi → ATP + H_2O
  ΔG°' = +7.3 kcal/mol (standard), in vivo ΔG ≈ +12–14 kcal/mol
  Protons required per ATP: n = ΔG_ATP / |ΔG_pmf| ≈ (12–14)/4.6 ≈ 2.6–3.0 (mammalian c-ring has 8 subunits: 8/3 ≈ 2.7 H+/ATP)

F_0 rotor stoichiometry (c-ring subunit count, varies by organism):
  Bovine mitochondria: 8 c-subunits (2.67 H+/ATP)
  Chloroplast CF_1CF_0: 14 c-subunits (4.67 H+/ATP)
  E. coli: 10 c-subunits (3.33 H+/ATP)
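
Because one full turn of the c-ring makes three ATP, the H+/ATP ratio is simply the subunit count divided by three. The sketch below also multiplies by the ~4.6 kcal/mol per proton quoted earlier to give the free energy available per ATP; treating that PMF value as the same in all three systems is a simplifying assumption.

```python
C_RING_SUBUNITS = {"bovine mitochondria": 8, "E. coli": 10, "chloroplast": 14}
PMF_KCAL_PER_PROTON = 4.6  # magnitude quoted above, assumed equal for all three

def protons_per_atp(c_subunits: int) -> float:
    """One 360 degree turn moves c_subunits protons and makes 3 ATP."""
    return c_subunits / 3.0

for name, c in C_RING_SUBUNITS.items():
    n = protons_per_atp(c)
    print(f"{name}: {n:.2f} H+/ATP, up to {n * PMF_KCAL_PER_PROTON:.1f} kcal/mol per ATP")
```

A larger ring costs more protons per ATP but lets synthesis proceed against a weaker proton-motive force.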

Step-wise rotation (single-molecule experiments, Noji 1997):
  F_1 rotates in 120° steps (3-fold symmetry of α_3β_3 ring)
  Each 120° step = one ATP synthesised (or hydrolysed in reverse)
  Substep: 80° (ATP binding) + 40° (Pi release)
  Stall torque: ~40 pN·nm; efficiency ~100% (near-thermodynamic reversibility)
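
The near-100% efficiency follows from comparing the mechanical work of one 120° step with the free energy of one ATP. The sketch below uses the ~40 pN·nm torque above and the lower end of the in vivo ΔG range (12 kcal/mol) as an assumed input.

```python
import math

AVOGADRO = 6.022e23

def work_per_step_pn_nm(torque_pn_nm: float, step_deg: float = 120.0) -> float:
    """Mechanical work = torque x angle (angle in radians)."""
    return torque_pn_nm * math.radians(step_deg)

def kcal_per_mol_to_pn_nm(dg_kcal_per_mol: float) -> float:
    """Convert a molar free energy to energy per molecule in pN nm (1 pN nm = 1e-21 J)."""
    return dg_kcal_per_mol * 4184.0 / AVOGADRO * 1e21

work = work_per_step_pn_nm(40.0)
atp = kcal_per_mol_to_pn_nm(12.0)
print(f"Work per 120 degree step: {work:.1f} pN nm")
print(f"Free energy per ATP:      {atp:.1f} pN nm (ratio {work / atp:.2f})")
```

With these round numbers the two energies agree to within about a percent, so the ratio should be read as "close to 1" rather than as a precise efficiency.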

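ATP production per glucose (complete oxidation):
  Glycolysis:       2 ATP (net) + 2 NADH
  Pyruvate decarboxylation + Krebs cycle: 2 ATP (as GTP) + 8 NADH + 2 FADH_2
  Oxidative phosphorylation: 32–34 ATP with classical P/O ratios (3 per NADH, 2 per FADH_2; the range depends on how cytosolic NADH enters the mitochondrion); ~26–28 with measured ratios (~2.5 and ~1.5)
  Total: 36–38 ATP per glucose (classical textbook value); ~30–32 with measured P/O ratios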
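
The tally is simple bookkeeping once a P/O ratio (ATP made per reduced cofactor) is chosen. The sketch below counts 10 NADH (2 from glycolysis plus 8 mitochondrial), 2 FADH_2 and 4 substrate-level ATP; the P/O ratios passed in are assumptions, with 3 and 2 being the classical textbook values and ~2.5 and ~1.5 commonly quoted modern estimates.

```python
def atp_per_glucose(atp_per_nadh: float, atp_per_fadh2: float,
                    nadh: int = 10, fadh2: int = 2, substrate_level: int = 4) -> float:
    """Total ATP per glucose for a given pair of P/O ratios."""
    return substrate_level + nadh * atp_per_nadh + fadh2 * atp_per_fadh2

print("Classical P/O (3, 2):    ", atp_per_glucose(3, 2))
print("Measured P/O (2.5, 1.5): ", atp_per_glucose(2.5, 1.5))
```

Both figures drop by up to two ATP when cytosolic NADH enters the mitochondrion through a shuttle that delivers its electrons at the FADH_2 level.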

4. Cytoskeletal Dynamics: Actin and Microtubules

The cytoskeleton is the cell’s structural and mechanical scaffold. It consists of three types of filaments: actin microfilaments (~7 nm diameter), intermediate filaments (~10 nm, largely structural), and microtubules (~25 nm). Actin filaments and microtubules are never at true equilibrium: nucleotide hydrolysis keeps them in a dynamic steady state. Actin filaments can treadmill, polymerising at one end while depolymerising at the other, and microtubules additionally show dynamic instability, switching stochastically between growth and rapid shrinkage.

Polymerisation Kinetics and Dynamic Instability

Actin polymerisation kinetics:
  k_on (barbed end) = 11.6 μM^-1s^-1;   k_off = 1.4 s^-1
  Critical concentration (barbed): c_crit = k_off/k_on ≈ 0.12 μM (ATP-actin)
  Treadmilling: net addition at barbed (+) end, net loss at pointed (−) end
  ATP hydrolysis after polymerisation: t_½ ≈ 2 s → ADP-actin weaker
  Profilin accelerates barbed-end addition; cofilin severs ADP-actin
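
The barbed-end rate constants above fix both the critical concentration and the elongation rate at any monomer concentration. This is a minimal sketch; the 2.7 nm length gain per subunit is the δ value quoted in the force-generation list below, and the example concentrations are arbitrary.

```python
K_ON = 11.6     # per uM per s, barbed end, ATP-actin
K_OFF = 1.4     # per s
DELTA_NM = 2.7  # filament length gained per added subunit

def critical_concentration_um() -> float:
    """Monomer concentration at which addition and loss balance."""
    return K_OFF / K_ON

def barbed_end_growth(c_um: float):
    """Return net elongation as (subunits/s, nm/s) at monomer concentration c_um."""
    subunits_per_s = K_ON * c_um - K_OFF
    return subunits_per_s, subunits_per_s * DELTA_NM

print(f"c_crit = {critical_concentration_um():.2f} uM")
for c in (0.05, 0.12, 1.0, 10.0):
    rate, nm_s = barbed_end_growth(c)
    print(f"[G-actin] = {c:5.2f} uM -> {rate:7.1f} subunits/s ({nm_s:7.1f} nm/s)")
```

Below the critical concentration the rate is negative, meaning the barbed end shrinks.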

Microtubule dynamic instability (Mitchison & Kirschner 1984):
  GTP-tubulin polymerises at plus end, forms GTP cap
  GTP hydrolysis to GDP: delay ~100 s (stochastic)
  Loss of GTP cap → catastrophe (rapid depolymerisation at ~30 nm/s)
  Rescue: switch back to growth (~1 nm/s)
  Catastrophe rate: ~0.05 s^-1;  rescue rate: ~0.15 s^-1
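
These four numbers can be combined in the simplest two-state picture of dynamic instability, where a plus end is either growing or shrinking and switches at constant rates. The sketch below is an idealisation (no GTP-cap chemistry, no dependence on tubulin concentration), and the function names are illustrative.

```python
def mean_velocity(v_grow: float, v_shrink: float, f_cat: float, f_res: float) -> float:
    """Long-time average plus-end velocity (nm/s) in a two-state model."""
    p_grow = f_res / (f_cat + f_res)    # fraction of time spent growing
    p_shrink = f_cat / (f_cat + f_res)  # fraction of time spent shrinking
    return v_grow * p_grow - v_shrink * p_shrink

def mean_excursions(v_grow: float, v_shrink: float, f_cat: float, f_res: float):
    """Average length gained per growth phase and lost per shrinkage phase (nm)."""
    return v_grow / f_cat, v_shrink / f_res

v = mean_velocity(v_grow=1.0, v_shrink=30.0, f_cat=0.05, f_res=0.15)
gain, loss = mean_excursions(1.0, 30.0, 0.05, 0.15)
print(f"mean velocity {v:.2f} nm/s, gain {gain:.0f} nm per growth phase, loss {loss:.0f} nm per shrinkage")
```

A negative mean velocity corresponds to bounded growth: on average the filament shrinks back to its nucleation site before regrowing.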

Force generation by polymerisation (Brownian ratchet model):
  Maximum (stall) force per filament: F_stall = (k_B T / δ) · ln(k_on [G-actin] / k_off)
  δ = filament length gained per added monomer (~2.7 nm for actin)
  Stall force:  ~1–2 pN     (actin leading edge, lamellipodia)
  Microtubule:  up to ~5 pN (against kinetochore during chromosome segregation)
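
The Brownian ratchet expression above is straightforward to evaluate. The sketch below uses k_B T ≈ 4.14 pN·nm (about 300 K) and the barbed-end rate constants quoted earlier; the monomer concentrations are arbitrary examples, and the formula gives an idealised thermodynamic limit for a single filament rather than a measured value.

```python
import math

KBT_PN_NM = 4.14  # thermal energy at ~300 K

def stall_force_pn(c_um: float, k_on: float = 11.6, k_off: float = 1.4,
                   delta_nm: float = 2.7) -> float:
    """Force at which polymerisation stalls: (kT/delta) * ln(k_on * c / k_off)."""
    return KBT_PN_NM / delta_nm * math.log(k_on * c_um / k_off)

for c in (0.2, 0.5, 1.0, 10.0):
    print(f"[G-actin] = {c:5.1f} uM -> F_stall = {stall_force_pn(c):.2f} pN")
```

The logarithm makes the force only weakly sensitive to concentration, and it vanishes at the critical concentration k_off/k_on.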

Kinesin step size and force:
  Kinesin-1 walks hand-over-hand: 8 nm steps along microtubule (one tubulin dimer)
  Speed: ~800 nm/s consuming 1 ATP per step
  Stall force: ~7 pN (measured by optical tweezers, Svoboda et al. 1993)
  Processivity: run length ~1 μm (on the order of 100 consecutive 8 nm steps before detachment)
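
The kinesin figures above imply an ATP turnover rate and a mechanical efficiency at stall. The sketch below assumes an ATP free energy of 12 kcal/mol (the lower end of the in vivo range quoted in the ATP synthase section); the function name is illustrative.

```python
AVOGADRO = 6.022e23

def kinesin_summary(speed_nm_s: float = 800.0, step_nm: float = 8.0,
                    stall_pn: float = 7.0, dg_atp_kcal: float = 12.0):
    """Return (steps per second, work per step at stall in pN nm, efficiency at stall)."""
    steps_per_s = speed_nm_s / step_nm                        # one ATP per step
    work_at_stall = stall_pn * step_nm                        # force x distance
    atp_energy = dg_atp_kcal * 4184.0 / AVOGADRO * 1e21       # pN nm per ATP
    return steps_per_s, work_at_stall, work_at_stall / atp_energy

steps, work, eff = kinesin_summary()
print(f"{steps:.0f} steps/s, {work:.0f} pN nm per step at stall, efficiency ~{eff:.0%}")
```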

5. CRISPR-Cas9: Precision Genome Editing

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats), together with the Cas9 nuclease, was discovered as a bacterial adaptive immune system that stores fragments of viral DNA and uses them to recognise and cleave returning viral invaders. Jennifer Doudna and Emmanuelle Charpentier adapted it into a programmable genome editor (Nobel Prize in Chemistry 2020) that needs only a guide RNA (gRNA) to direct the Cas9 endonuclease to any 20-nucleotide target sequence (protospacer) lying immediately next to a protospacer-adjacent motif (PAM).

CRISPR-Cas9 Mechanism and Search Kinetics

Guide RNA structure:
  crRNA (CRISPR RNA): 20nt spacer + repeat scaffold
  tracrRNA: activating scaffold fused to crRNA in sgRNA design
  sgRNA (single guide RNA): fused crRNA+tracrRNA, ~100 nt total

PAM recognition (NGG for SpCas9):
  3D diffusion: Cas9-sgRNA samples PAM sites genome-wide; the human genome contains tens of millions of NGG sites (~6×10^7)
  1D sliding: Cas9 interrogates 2–3 bp adjacent to each PAM
  Dwell time at off-target: ~0.1 s;  on-target: ~hours (stable R-loop formed)
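  Total search time: ~6 h for a single Cas9 to locate one target in a live E. coli cell (Jones et al. 2017); the PAM-first search mode was shown by single-molecule imaging (Sternberg et al. 2014)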

R-loop formation (strand invasion):
  sgRNA spacer base-pairs with the complementary target strand (read 3'→5'), forming an RNA–DNA hybrid
  Displaced non-target strand completes the R-loop; HNH domain cleaves the target strand
  RuvC domain cleaves the non-target strand
  Cut produces blunt-ended DSB (double strand break) 3 bp upstream of PAM
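
The PAM and cut-site geometry described above can be sketched as a simple sequence scan. The function name and the example sequence are hypothetical, and the sketch only scans the strand it is given: a full search would also scan the reverse complement, and real guide design additionally scores off-target matches.

```python
import re

def find_spcas9_targets(dna: str):
    """Yield (protospacer, pam, cut_index) for each 20-nt site followed by NGG.

    cut_index is the 0-based position of the first base after the blunt cut,
    i.e. 3 bp upstream of the PAM on the strand supplied.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", dna):
        pam_start = match.start() + 20
        yield match.group(1), match.group(2), pam_start - 3

example = "ACTACTACTACTACTACTAC" + "TGG" + "ACTA"  # hypothetical sequence
for protospacer, pam, cut in find_spcas9_targets(example):
    print(protospacer, pam, "cut before index", cut)
```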

DNA repair outcomes:
  NHEJ (Non-Homologous End Joining): rapid but error-prone; insertions/deletions (indels) → gene knockout
  HDR (Homology-Directed Repair): requires donor template; precise edits down to single nucleotides
  HDR efficiency: 0.1–5% in human cells (favoured in S/G2 phase)

Next-generation CRISPR tools:
  Base editors (Komor 2016; Gaudelli 2017): deaminase fused to dCas9 or Cas9 nickase; C→T or A→G without DSB
  Prime editing (Anzalone 2019): Cas9 nickase fused to reverse transcriptase + pegRNA; insertions, deletions, and all 12 base substitutions (transitions and transversions)
  CRISPRi / CRISPRa: dCas9 fused to repressor/activator; modulate expression without editing
  Therapeutic approvals: exagamglogene autotemcel (exa-cel, Casgevy): FDA approval for sickle cell disease (December 2023) and transfusion-dependent β-thalassemia (January 2024)

The first CRISPR-based therapy approved by the FDA (Casgevy, December 2023 for sickle cell disease, followed in January 2024 by transfusion-dependent β-thalassemia) edits haematopoietic stem cells to reactivate foetal haemoglobin. The 2023 treatment data showed that 28 of 29 sickle cell patients were free of severe vaso-occlusive crises for at least 12 consecutive months after treatment, a landmark in the history of medicine.

Related Simulations