ThisIsHowItWorks.in

Where complex ideas unfold at human pace

Molecular Biology: When Life Became Information


Cambridge, England, 1953. James Watson and Francis Crick have just published their DNA structure paper in Nature.

The double helix. Complementary base pairing. The elegant mechanism for copying genetic information.

But they buried the most revolutionary claim in a single, understated sentence:

"It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."

This might be the greatest understatement in scientific literature.

What they actually meant: "We've figured out how life copies itself. Heredity is just chemistry. The secret of life is a code written in molecules."

Within 20 years, this insight—that life is information stored in chemicals—transformed biology from natural history into an information science.

DNA is a text. Genes are instructions. Proteins are machines assembled according to those instructions. Cells are factories running programs written in a four-letter chemical alphabet (A, T, G, C).

Life became software.

Not metaphorically. Literally. DNA stores information digitally (discrete bases, not continuous). That information gets copied, transmitted, translated, and executed—exactly like computer code.

This was the molecular biology revolution: Reducing life to chemistry, chemistry to information, information to code that could be read, written, edited, and—eventually—synthesized from scratch.

But here's the paradox: We can now read the entire human genome (3 billion letters). We can edit genes with precision (CRISPR). We can synthesize DNA from scratch.

And we still don't fully understand how a fertilized egg becomes a human.

Molecular biology explained mechanisms (how DNA copies, how proteins fold, how cells divide).

But it struggled with emergence (how cells become organs, how genomes become organisms, how networks create behavior). Emergence is when a system shows properties that cannot be reduced to any single part. It is not magic; it is a mismatch between local rules and global behavior.

Let's examine how biology became information science, what this revealed about life's machinery, why the information metaphor was both powerful and limiting, and what remains irreducible even after the molecular revolution.


BEFORE MOLECULAR BIOLOGY: Life Was a Black Box

CLASSICAL BIOLOGY (Pre-1950s)

WHAT THEY KNEW:
  • Organisms inherit traits
  • Cells are basic units of life
  • Chromosomes carry heredity
  • Proteins are important
    ↓
  But: HOW? Mechanisms unknown

THE GENE CONCEPT:
  Mendel (1866): Discrete factors ("genes")
    ↓
  Morgan (1910s): Genes on chromosomes
    ↓
  But: What IS a gene physically?
    ↓
  Nobody knew

THE PROTEIN HYPOTHESIS:
  Many thought genes were proteins
    ↓
  Reasoning: Proteins complex enough to encode information
    ↓
  DNA: Too simple (only 4 bases)
    ↓
  Wrong guess

THE QUESTION:
  How does genetic material:
  1. Store information?
  2. Copy itself?
  3. Direct organism development?
    ↓
  Complete mystery

Biology knew WHAT happened (inheritance, development). Not HOW.


THE BREAKTHROUGH: DNA Structure Reveals Function

WATSON & CRICK (1953)

THE DISCOVERY:
  DNA = Double helix
    ↓
  Two strands, antiparallel
    ↓
  Sugar-phosphate backbones outside
    ↓
  Bases paired inside:
  • A with T (2 hydrogen bonds)
  • G with C (3 hydrogen bonds)
    ↓
  Complementary base pairing

WHY STRUCTURE MATTERS:
  Shape immediately suggests function:
    ↓
  REPLICATION:
  • Strands separate
  • Each serves as template
  • New bases added (A→T, T→A, G→C, C→G)
  • Result: Two identical double helices
    ↓
  INFORMATION STORAGE:
  • Sequence of bases = code
  • Linear information (like text)
  • Can be arbitrarily long
    ↓
  Structure explains both copying and coding

THE INSIGHT:
  Life is INFORMATION
    ↓
  Information stored in chemical sequence
    ↓
  Chemistry enables biology

One discovery explained the physical basis of heredity.
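The copying mechanism described above is simple enough to sketch in a few lines of Python. This is a toy model, not a simulation of real replication: the function names are made up for illustration, and enzymes, errors, and proofreading are ignored. It only shows why complementary pairing is enough to rebuild a full double helix from either strand.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template: str) -> str:
    """Build the strand that pairs with `template`.

    The two strands are antiparallel, so the partner strand is the
    complement read in the opposite direction (the reverse complement).
    """
    return "".join(PAIRING[base] for base in reversed(template))


def replicate(strand: str) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate a double helix and rebuild a partner for each old strand.

    Returns two (strand, partner) pairs. Each daughter helix keeps one
    old strand and gains one newly built strand.
    """
    old_partner = complement_strand(strand)
    daughter_1 = (strand, complement_strand(strand))
    daughter_2 = (complement_strand(old_partner), old_partner)
    return daughter_1, daughter_2


first, second = replicate("ATGCGT")
print(first)   # ('ATGCGT', 'ACGCAT')
print(second)  # ('ATGCGT', 'ACGCAT')
```

Both daughter helices come out identical to the parent, which is the point of the "two identical double helices" line in the diagram.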


THE CENTRAL DOGMA: Information Flow in Life

FRANCIS CRICK (1958)

THE DOGMA:
  DNA → RNA → Protein
    ↓
  Information flows in one direction
    ↓
  (Mostly; exceptions discovered later)

DNA: THE MASTER COPY
  • Stores genetic information
  • Stable (in nucleus)
  • Copied during cell division
  • Permanent record

RNA: THE WORKING COPY
  • Messenger RNA (mRNA): Carries instructions from DNA
  • Transfer RNA (tRNA): Delivers amino acids
  • Ribosomal RNA (rRNA): Part of ribosome
    ↓
  Temporary, disposable

PROTEIN: THE WORKER
  • Does cell's work
  • Enzymes (catalyze reactions)
  • Structure (cytoskeleton, muscle)
  • Signaling (hormones, receptors)
    ↓
  Information becomes function

THE FLOW:
  TRANSCRIPTION (DNA → RNA):
  • RNA polymerase reads DNA
  • Makes complementary RNA strand
  • U replaces T (RNA uses U not T)
    ↓
  TRANSLATION (RNA → Protein):
  • Ribosome reads mRNA
  • tRNA brings amino acids
  • Amino acids linked into protein chain
    ↓
  Genetic information → Molecular machine

Life is information processing.

Not metaphor. Actual information—stored, copied, transmitted, executed.
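The two steps of the flow, transcription and translation, can be sketched as two small Python functions. This is a minimal illustration under loud assumptions: the function names are hypothetical, the codon table is deliberately partial (only a handful of entries from the standard genetic code), and real cells add promoters, splicing, and much else that is left out here.

```python
# Pairing used when RNA polymerase copies a DNA template into RNA.
# RNA uses U where DNA would use T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Deliberately partial codon table: a few entries of the standard code.
PARTIAL_CODE = {
    "AUG": "Met",  # also the START codon
    "UUU": "Phe", "UUC": "Phe",
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(template_dna: str) -> str:
    """Make the complementary RNA strand from a DNA template strand.

    The template is read in reverse because the strands are antiparallel.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna))


def translate(mrna: str) -> list[str]:
    """Read codons from the first AUG until a STOP codon.

    Raises KeyError for codons missing from the partial table above.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = PARTIAL_CODE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein


mrna = transcribe("TTATGAAAACAT")
print(mrna)             # AUGUUUUCAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Ser']
```

The sequence passes through three representations (DNA, RNA, amino acids) and only the last one does any work, which mirrors the master copy, working copy, worker division in the diagrams.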


THE GENETIC CODE: Cracking Life's Language

THE CODE (1961-1966)

THE PROBLEM:
  DNA: 4 bases (A, T, G, C)
  Proteins: 20 amino acids
    ↓
  How does 4-letter code specify 20 amino acids?

THE SOLUTION: TRIPLET CODE
  Three bases = One codon
    ↓
  4³ = 64 possible codons
    ↓
  64 codons → 20 amino acids
    ↓
  Code is REDUNDANT
  (multiple codons for same amino acid)
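Why three bases and not two? The counting argument behind the triplet code fits in a short Python sketch. The function name is made up for illustration; the only inputs are the two numbers from the diagram (4 bases, 20 amino acids).

```python
def smallest_codon_length(alphabet_size: int, meanings_needed: int) -> int:
    """Return the shortest word length whose combinations cover all meanings."""
    length = 1
    while alphabet_size ** length < meanings_needed:
        length += 1
    return length


for length in (1, 2, 3):
    print(length, 4 ** length)       # 1 4, then 2 16, then 3 64

print(smallest_codon_length(4, 20))  # 3
```

Two-base words give only 16 combinations, too few for 20 amino acids. Three-base words give 64, which is more than enough, and the surplus is where the redundancy comes from.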

EXAMPLES:
  UUU, UUC → Phenylalanine
  UCU, UCC, UCA, UCG → Serine
  AUG → Methionine (also START codon)
  UAA, UAG, UGA → STOP codons
    ↓
  61 codons for amino acids
  3 STOP codons

UNIVERSALITY:
  Code is (nearly) UNIVERSAL
    ↓
  Same in bacteria, plants, animals, humans
    ↓
  Strong evidence for common ancestry
    ↓
  All life uses same programming language

THE IMPLICATIONS:
  • Can read DNA sequence
  • Can predict protein sequence
  • Can engineer new proteins
  • Can transfer genes between organisms
    ↓
  Life became programmable

Once you crack the code, you can read—and write—genetic programs.
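The counts quoted in this section (64 codons, 61 for amino acids, 3 STOP) can be checked by building the whole standard code in Python. The compact string below lists one-letter amino acid codes for all 64 codons in U, C, A, G order, with "*" marking STOP. The variable names are illustrative, and the sketch covers only the standard code, not the variant codes behind the word "nearly".

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, ordered UUU, UUC, UUA, UUG,
# UCU, ... GGG. "*" marks a STOP codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons map to each amino acid (the redundancy of the code).
degeneracy = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = len(CODON_TABLE) - len(stop_codons)

print(len(CODON_TABLE))     # 64
print(sense_codons)         # 61
print(stop_codons)          # ['UAA', 'UAG', 'UGA']
print(len(degeneracy) - 1)  # 20 amino acids (excluding the STOP marker)
print(degeneracy["M"])      # 1
print(degeneracy["S"])      # 6
```

Methionine has a single codon while serine has six (the four listed in the examples plus AGU and AGC), so the redundancy is uneven across amino acids.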


READING THE GENOME: DNA Sequencing

SANGER SEQUENCING (1977)

THE METHOD:
  Chain termination method
    ↓
  DNA synthesis stops at specific bases
    ↓
  Fragments separated by length
    ↓
  Read sequence from fragment pattern
    ↓
  Slow (hundreds of bases/day)
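The logic of chain termination can be sketched in Python. This is a toy model with hypothetical function names: it assumes every possible terminated fragment is produced exactly once and that lengths are measured perfectly, which real gels and capillaries only approximate.

```python
def chain_terminated_fragments(copied_strand: str) -> dict[str, list[int]]:
    """For each base, list the lengths of fragments that stopped at that base.

    In the real method, synthesis halts wherever a chain-terminating
    version of a base is incorporated, so the length of a fragment tells
    you the position of its last base.
    """
    lengths: dict[str, list[int]] = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(copied_strand, start=1):
        lengths[base].append(position)
    return lengths


def read_sequence(fragments: dict[str, list[int]]) -> str:
    """Sort fragments by length and read off the terminating base of each."""
    base_at_length = {
        length: base
        for base, lengths in fragments.items()
        for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


fragments = chain_terminated_fragments("GATTACA")
print(fragments)                 # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_sequence(fragments))  # GATTACA
```

The sequence is never read directly. It is reconstructed from which base ended each fragment and how long that fragment was.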

HUMAN GENOME PROJECT (1990-2003):
  Goal: Sequence entire human genome (3 billion base pairs)
    ↓
  Cost: $3 billion
  Time: 13 years
  Institutions: 20+ worldwide
    ↓
  Completed 2003
    ↓
  First (nearly) complete human genome sequence

NEXT-GENERATION SEQUENCING (2005+):
  Massive parallelization
    ↓
  Millions of sequences simultaneously
    ↓
  Speed: Billions of bases/day
    ↓
  Cost: $1,000 per genome (2020s)
    ↓
  3,000,000x cheaper than HGP

WHAT WE LEARNED:
  Human genome:
  • ~20,000 protein-coding genes
  • Only ~1.5% codes for proteins
  • Rest: Regulatory, structural, "junk"?
    ↓
  Surprise: Humans have fewer genes than expected
    ↓
  Complexity ≠ gene number
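The headline numbers in this section are easy to check with a few lines of Python. All inputs are the rounded figures quoted in the diagrams above; the variable names are illustrative.

```python
genome_length_bp = 3_000_000_000  # ~3 billion base pairs
hgp_cost_usd = 3_000_000_000      # Human Genome Project, rounded
ngs_cost_usd = 1_000              # per genome, 2020s figure quoted above

# How many times cheaper a genome became.
fold_cheaper = hgp_cost_usd // ngs_cost_usd
print(fold_cheaper)  # 3000000

# Cost per base pair, then and now.
print(hgp_cost_usd / genome_length_bp)  # 1.0 (about a dollar per base)
print(ngs_cost_usd / genome_length_bp)

# Roughly 1.5% of the genome codes for protein (15 parts per 1,000).
protein_coding_bp = genome_length_bp * 15 // 1000
print(protein_coding_bp)  # 45000000
```

So the protein-coding part of the genome is on the order of 45 million bases, leaving almost 3 billion bases whose role is regulatory, structural, or still debated.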

THE PARADIGM SHIFT:
  Can now read genetic blueprints
    ↓
  Personalized medicine possible
    ↓
  Ancestry testing
    ↓
  Disease risk prediction
    ↓
  But: Reading ≠ Understanding

We can read the code. But reading isn't understanding.


WRITING THE GENOME: Genetic Engineering

RECOMBINANT DNA (1970s)

THE TECHNOLOGY:
  Cut DNA (restriction enzymes)
    ↓
  Paste DNA (ligase)
    ↓
  Insert into organisms (vectors/viruses)
    ↓
  Can move genes between species
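The cut-and-paste logic can be sketched in Python using EcoRI, a widely used restriction enzyme that recognizes GAATTC and cuts between the G and the first A. Everything else here is an assumption made for illustration: the function names, the example sequences, and the simplification to a single strand (real DNA is double-stranded and the cut leaves overhanging "sticky" ends).

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1   # EcoRI cuts after the G: G^AATTC


def cut(dna: str, site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET) -> list[str]:
    """Cut a (single-strand, simplified) DNA string at every recognition site."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        fragments.append(dna[start:position + offset])
        start = position + offset
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])
    return fragments


def ligate(*fragments: str) -> str:
    """Join fragments end to end, as ligase would."""
    return "".join(fragments)


def insert_gene(vector: str, gene_fragment: str) -> str:
    """Open the vector at its single EcoRI site and paste the fragment in."""
    pieces = cut(vector)
    if len(pieces) != 2:
        raise ValueError("this sketch assumes the vector has exactly one EcoRI site")
    left, right = pieces
    return ligate(left, gene_fragment, right)


# Hypothetical donor DNA with a short "gene" flanked by two EcoRI sites.
donor = "TTGAATTCATGAAAGAATTCAA"
_, gene_fragment, _ = cut(donor)
print(gene_fragment)  # AATTCATGAAAG

recombinant = insert_gene("CCGAATTCGG", gene_fragment)
print(recombinant)    # CCGAATTCATGAAAGAATTCGG
```

Because the fragment and the opened vector were cut by the same enzyme, their ends match and both EcoRI sites are rebuilt in the recombinant sequence.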

APPLICATIONS:
  MEDICINE:
  • Human insulin from bacteria (1982)
  • Growth hormone
  • Vaccines
    ↓
  AGRICULTURE:
  • Bt corn (insect resistance)
  • Golden rice (vitamin A)
  • Herbicide-resistant crops
    ↓
  RESEARCH:
  • Knockout mice (delete genes)
  • Fluorescent proteins (GFP)
  • Model organisms

PCR - POLYMERASE CHAIN REACTION (1983):
  Kary Mullis invention
    ↓
  Amplify DNA exponentially
    ↓
  1 molecule → Millions in hours
    ↓
  Applications:
  • Forensics (DNA fingerprinting)
  • Diagnostics (detect pathogens)
  • Research (amplify rare sequences)
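"Exponentially" has a concrete meaning here: in the ideal case each cycle doubles the number of copies. A short Python sketch makes the growth visible. The function name and the `efficiency` parameter are illustrative assumptions; real reactions fall short of perfect doubling and eventually plateau.

```python
import math


def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies after a number of cycles; efficiency 1.0 means perfect doubling."""
    return starting_copies * (1 + efficiency) ** cycles


print(copies_after(10))  # 1024.0
print(copies_after(20))  # 1048576.0
print(copies_after(30))  # 1073741824.0

# Cycles needed to pass one million copies from a single molecule.
print(math.ceil(math.log2(1_000_000)))  # 20
```

Under ideal doubling, twenty cycles pass a million copies and thirty pass a billion, which is why a single molecule becomes a detectable sample within hours.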

CRISPR (2012-present):
  Clustered Regularly Interspaced Short Palindromic Repeats
    ↓
  Bacterial immune system → Gene editing tool
    ↓
  Guide RNA directs Cas9 enzyme to specific DNA sequence
    ↓
  Cuts DNA precisely
    ↓
  Cell repairs cut (can insert new sequence)
    ↓
  PRECISION GENE EDITING

CRISPR REVOLUTION:
  • Fast (weeks vs. years)
  • Cheap ($75 vs. thousands)
  • Precise (specific gene targeting)
    ↓
  Applications:
  • Disease treatment (sickle cell, cancer)
  • Agriculture (drought resistance)
  • Basic research (any gene, any organism)
    ↓
  Ethical concerns:
  • Germline editing (heritable changes)
  • Designer babies?
  • Unintended consequences?

We can now edit genomes like text documents.

Find → Replace → Save.

Life is software we can debug.


THE INFORMATION METAPHOR: Powerful but Limited

WHAT THE METAPHOR EXPLAINS:

DNA AS CODE:
  ✓ Discrete information (A, T, G, C)
  ✓ Linear sequence (like text)
  ✓ Copied with high fidelity
  ✓ Can be read, written, edited
  ✓ Information → Function (code → program)
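The "discrete information" point can be made quantitative. With four possible letters, each base carries at most two bits, so a sequence can be packed into an integer and recovered exactly. The Python sketch below is an illustration with made-up names and an arbitrary bit assignment; it says nothing about how cells themselves handle DNA.

```python
import math

BASES = "ACGT"
ENCODING = {base: index for index, base in enumerate(BASES)}  # A=0, C=1, G=2, T=3


def pack(sequence: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in sequence:
        value = (value << 2) | ENCODING[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of known length from its packed integer."""
    return "".join(
        BASES[(value >> (2 * (length - 1 - i))) & 0b11] for i in range(length)
    )


print(math.log2(4))              # 2.0 bits per base
print(unpack(pack("GATTACA"), 7))  # GATTACA

# Upper bound on the raw information in ~3 billion bases.
genome_bases = 3_000_000_000
print(genome_bases * 2 / 8 / 1_000_000)  # 750.0 (megabytes)
```

At two bits per base, the raw sequence of a human genome fits in roughly 750 megabytes. How little that number says about the organism is the subject of the "what it misses" list that follows.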

CELLS AS COMPUTERS:
  ✓ Process information (gene expression)
  ✓ Make decisions (signaling pathways)
  ✓ Execute programs (development)
  ✓ Respond to inputs (environment)

WHAT IT MISSES:

3D STRUCTURE MATTERS:
  ✗ Proteins fold into complex 3D shapes
  ✗ Shape determines function
  ✗ Can't predict folding from sequence alone
    ↓
  (AlphaFold made progress 2020, but not perfect)

CONTEXT MATTERS:
  ✗ Same gene → Different effects in different cells
  ✗ Depends on: Epigenetics, environment, developmental stage
  ✗ Gene networks, not individual genes

EMERGENCE:
  ✗ Genome doesn't "encode" organism directly
  ✗ Development is emergent process
  ✗ Can't predict phenotype from genotype alone
    ↓
  One genome → Many cell types (how?)

REGULATION:
  ✗ When/where genes expressed matters more than sequence
  ✗ Regulatory networks incredibly complex
  ✗ "Junk DNA" isn't junk (regulatory elements)

The information metaphor was revolutionary.

But biology isn't just information. It's information + context + structure + networks + emergence.


WHAT WE STILL DON'T UNDERSTAND

THE PROTEIN FOLDING PROBLEM:
  Amino acid sequence determines 3D structure
    ↓
  But: Can't reliably predict structure from sequence
    ↓
  Levinthal's paradox: Too many possible configurations to search
    ↓
  How do proteins fold so fast?
    ↓
  AlphaFold (2020): AI made huge progress
    ↓
  But: Still not perfect, still don't understand mechanism
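Levinthal's paradox is at heart a piece of arithmetic, and a rough version fits in a few lines of Python. The inputs below are illustrative assumptions of the kind usually used to state the paradox, not measured values: a 100-residue protein, three possible conformations per residue, and ten trillion conformations sampled per second.

```python
# Illustrative assumptions, not measurements.
residues = 100
states_per_residue = 3
samples_per_second = 1e13

conformations = states_per_residue ** residues
seconds_to_search = conformations / samples_per_second
years_to_search = seconds_to_search / (365.25 * 24 * 3600)

print(f"{conformations:.2e} conformations")
print(f"{years_to_search:.2e} years to try them all")
```

Under these assumptions an exhaustive search would take on the order of 10^27 years, vastly longer than the age of the universe, while real proteins fold in a fraction of a second to minutes. Folding therefore cannot be a random search, which is exactly the open question the diagram points at.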

THE DEVELOPMENT PROBLEM:
  One cell (fertilized egg) → Organism
    ↓
  Same genome in every cell
    ↓
  But: Hundreds of cell types
    ↓
  How does position determine cell fate?
    ↓
  How do organs form?
    ↓
  Molecular biology explains mechanisms (gene regulation)
    ↓
  But: Can't predict development from genome
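How can the same genome give different cell types? A toy model hints at the answer. The Python sketch below is a two-gene Boolean network in which each gene represses the other. It is a deliberately minimal illustration with made-up names, not a model of any real pathway: the "genome" (the update rule) is identical in every run, and only the starting state differs.

```python
State = tuple[bool, bool]  # (gene A is on, gene B is on)


def step(state: State) -> State:
    """Apply the regulatory rule once: each gene is on only if its repressor is off."""
    a_on, b_on = state
    return (not b_on, not a_on)


def settle(state: State, max_steps: int = 10) -> State:
    """Apply the rule until the state stops changing (or give up after max_steps)."""
    for _ in range(max_steps):
        next_state = step(state)
        if next_state == state:
            return state
        state = next_state
    return state  # did not settle; this toy rule can also oscillate


# Same rule, different starting conditions, different stable "cell types".
print(settle((True, False)))  # (True, False)
print(settle((False, True)))  # (False, True)
```

Two stable outcomes from one rule set is the smallest possible version of "one genome, many cell types". Scaling this up to thousands of interacting genes, with position and timing as inputs, is where prediction becomes hard.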

GENOTYPE → PHENOTYPE GAP:
  Can read genome completely
    ↓
  But: Can't predict what organism will look/behave like
    ↓
  Too many factors:
  • Gene interactions
  • Epigenetics
  • Development
  • Environment
    ↓
  Complexity irreducible

CONSCIOUSNESS:
  Neurons = cells (molecular biology applies)
    ↓
  Neurotransmitters = molecules
    ↓
  But: How does this create consciousness?
    ↓
  Molecular biology explains mechanisms
    ↓
  Doesn't explain subjective experience

AGING:
  Why do organisms age?
    ↓
  Theories:
  • DNA damage accumulation
  • Telomere shortening
  • Cellular senescence
    ↓
  But: No unified understanding
    ↓
  Is aging programmed or damage accumulation?

We understand mechanisms. We struggle with systems.


CONCLUSION: Life Is Information—And More

Molecular biology reduced life to chemistry, chemistry to information.

THE TRANSFORMATION:
  Before molecular biology:
  • Life was mysterious
  • Heredity unknown mechanism
  • Vitalism still debated
    ↓
  After molecular biology:
  • Life is chemistry
  • DNA = code
  • Heredity = molecular copying
  • No vitalism needed

THE SUCCESS:
  Explained:
  ✓ How genes store information (DNA sequence)
  ✓ How genes copy (base pairing)
  ✓ How genes make proteins (genetic code)
  ✓ How mutations occur (DNA errors)
  ✓ How evolution works (molecular level)
    ↓
  Enabled:
  ✓ Genome sequencing
  ✓ Genetic engineering
  ✓ Gene therapy
  ✓ Personalized medicine
  ✓ CRISPR

THE LIMITS:
  Still don't fully understand:
  ✗ Protein folding
  ✗ Development (genome → organism)
  ✗ Genotype → Phenotype
  ✗ Consciousness
  ✗ Aging
  ✗ Complex diseases (cancer, etc.)
    ↓
  Mechanisms understood
  Systems less so

What molecular biology reveals about science:

1. Reductionism works, to a point. (Reductionism is the practice of explaining a system solely in terms of its parts: useful for isolated domains, misleading when interactions produce emergent effects.) Can reduce life to chemistry, chemistry to information. This enabled huge progress.

2. But emergence is real. Genome doesn't "encode" organism the way software encodes program output. Development is emergent.

3. Understanding parts ≠ Understanding whole. Know every gene, every protein. Still can't predict organism.

4. The information metaphor was powerful. Treating DNA as code enabled genetic engineering. But metaphors have limits.

5. New tools create new science. DNA sequencing, CRISPR, gene synthesis—technology drives discoveries.

The paradox:

Can read entire genomes ($1,000, one day).

Can edit genes with precision (CRISPR).

Can synthesize DNA from scratch.

But:

Can't predict what organism genome will produce.

Can't design organisms from scratch (too complex).

Can't fix most genetic diseases (too many genes involved).

Life is information.

But it's also:

  • Chemistry (proteins fold, reactions occur)
  • Physics (forces, energies, constraints)
  • Networks (genes interact, regulate each other)
  • Development (context-dependent processes)
  • Evolution (historical contingency)
  • Emergence (properties not in individual parts)

Molecular biology explained life's mechanisms.

But life is more than mechanisms.

Reading the code was revolutionary.

Understanding what the code does—in context, over development, in evolution—remains incomplete.

We can debug individual genes.

We can't yet write life from scratch.

Because life isn't just code.

It's code + context + history + emergence.

And emergence, as always, resists reduction.


[Cross-references: For DNA structure discovery details, see Biology Companion #96-98. For Central Dogma and genetic code, see Biology Companion #97-99. For Human Genome Project, see Biology Companion #104. For CRISPR and gene editing, see Biology Companion #105 and "CRISPR and Synthetic Biology: Engineering Life" (Core #47). For protein folding problem and AlphaFold, see Biology Companion #115. For how chemistry invaded biology, see "When Chemistry Invaded Biology: Molecular Biology" (Core #29). For limits of reduction, see "The Limits of Reduction: What Physics Can't Explain" (Core #30). For systems biology approaches, see Biology Companion #114.]
