Molecular biology studies how DNA, RNA and proteins control the behaviour of cells. It asks how genetic information is stored, copied, transcribed and translated, then traces how those molecular instructions shape health and disease. Where biochemistry surveys all classes of biomolecules and their reactions, molecular biology narrows its focus to the informational molecules and the machinery that translates code into function. Genetics charts inheritance patterns at organism or population level; molecular biology supplies the chemical script that makes those patterns possible. In pharmaceutical research, the three fields operate as a single engine, with molecular tools answering genetic questions through biochemical methods.
From observation to engineering
The story of modern drug discovery is inextricably linked to key molecular breakthroughs.
Seeds of the idea 1860s–1940s
Mendel’s 1865 pea experiments introduced discrete hereditary “units”. Miescher isolated “nuclein” in 1869. Mitosis was described in 1879, the Chromosome Theory of Inheritance was proposed in 1902, and in 1941, Beadle and Tatum linked one gene to one enzyme, neatly defining a drug target as a protein encoded by a specific gene.
DNA takes centre stage 1940s–1950s
Avery’s 1944 work and the Hershey–Chase experiment in 1952 proved DNA carries genetic information. Watson and Crick’s 1953 double helix model, enabled by Rosalind Franklin’s and Maurice Wilkins’ X‑ray data, showed how sequence encodes and replicates information. Crick soon articulated the “central dogma”: DNA to RNA to protein. Physics and chemistry techniques, such as X-ray crystallography, were essential, foreshadowing today’s integration of computing and AI.
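The flow from DNA to RNA to protein can be illustrated with a minimal sketch. It assumes the standard genetic code and a coding-strand DNA input. The function names and the example sequence are hypothetical, chosen only for illustration.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching the coding (sense) strand: T becomes U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first codon until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical 12-base open reading frame: Met-Ala-Trp-stop
mrna = transcribe("ATGGCCTGGTAA")
print(mrna, "->", translate(mrna))
```

Real transcripts also carry untranslated regions and are spliced, which this sketch deliberately ignores.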
Writing the code 1970s–1980s
Recombinant DNA emerged in the early 1970s: Paul Berg assembled the first recombinant DNA molecules in 1972, and in 1973 Boyer and Cohen cut and pasted genes into plasmids that replicated in bacteria, creating engineered organisms. By 1982, recombinant human insulin produced in E. coli had reached the market, a commercial milestone for the young biotech industry.
Copying DNA on demand 1980s
Kary Mullis devised PCR in 1983, turning trace DNA into amplifiable material through temperature cycling. Biology split into “before PCR” and “after PCR” as diagnostics, forensics and cloning were transformed.
Precision editing 2012 to now
CRISPR-Cas9, adapted by Emmanuelle Charpentier and Jennifer Doudna in 2012, became a programmable gene editor. With a guide RNA and the Cas9 nuclease, scientists cut specific sequences, disable faulty genes or correct them via homology-directed repair. The pair received the 2020 Nobel Prize in Chemistry. CRISPR placed true gene correction within clinical reach.
Fun Fact: The first approved recombinant drug, human insulin made in E. coli in 1982, replaced insulin extracted from pigs and cows, dramatically improving purity and supply.
PCR and its quantitative offspring
Detecting and measuring nucleic acids is now routine.
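Conventional PCR cycles through denaturation (about 95 °C), annealing (about 55–65 °C) and extension (about 72 °C). Primers flank the target and Taq polymerase extends new strands. Because each cycle can at most double the target, 25–35 cycles give a theoretical amplification of tens of millions-fold to tens of billions-fold, with roughly a billion-fold at 30 cycles.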
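The arithmetic behind exponential amplification can be sketched as follows. This is an idealised model that ignores reagent depletion and the plateau phase. The function name and the 90% efficiency figure are illustrative assumptions.

```python
def pcr_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Theoretical fold amplification after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling). Plateau effects are ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


for n in (25, 30, 35):
    ideal = pcr_fold_amplification(n)
    realistic = pcr_fold_amplification(n, efficiency=0.9)  # assumed value
    print(f"{n} cycles: {ideal:.2e}-fold (ideal), {realistic:.2e}-fold (90% efficiency)")
```

Even a modest drop in per-cycle efficiency compounds over 30 cycles, which is one reason quantitative work relies on measured rather than assumed efficiencies.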
RT‑PCR converts RNA to cDNA with reverse transcriptase before amplification, vital for gene expression studies, RNA virus detection and cDNA library construction.
qPCR (real-time PCR) tracks amplification with fluorescent dyes such as SYBR Green or sequence-specific probes such as TaqMan. The Cq value (quantification cycle) falls linearly with the logarithm of the starting template concentration: the more template, the lower the Cq, and at perfect efficiency a tenfold dilution adds about 3.3 cycles. Pharma uses qPCR to validate targets, quantify viral load and build diagnostics.
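Two common qPCR calculations are sketched below: fitting a standard curve of Cq against log10 of template quantity, and relative quantification by the delta-delta-Cq method. The dilution series and Cq values are made-up illustrative numbers, the function names are hypothetical, and the delta-delta-Cq formula assumes close to 100% efficiency for both target and reference assays.

```python
import math


def fit_standard_curve(quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def quantity_from_cq(cq, slope, intercept):
    """Interpolate an unknown sample on the standard curve."""
    return 10 ** ((cq - intercept) / slope)


def fold_change_ddcq(cq_target_treated, cq_ref_treated,
                     cq_target_control, cq_ref_control):
    """Relative expression by delta-delta-Cq, assuming ~100% efficiency."""
    ddcq = (cq_target_treated - cq_ref_treated) - (cq_target_control - cq_ref_control)
    return 2 ** (-ddcq)


# Illustrative tenfold dilution series (copies per reaction) and Cq readings
standards = [1e6, 1e5, 1e4, 1e3]
cqs = [15.00, 18.32, 21.64, 24.96]
slope, intercept = fit_standard_curve(standards, cqs)
print(f"slope={slope:.2f}, efficiency={amplification_efficiency(slope):.1%}")
print(f"Cq 20.0 corresponds to about {quantity_from_cq(20.0, slope, intercept):.0f} copies")
print(f"fold change: {fold_change_ddcq(22, 18, 24, 18)}")
```

A slope near -3.32 corresponds to doubling every cycle; slopes far from that value usually prompt assay re-optimisation before the data are trusted.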
Seeing and reading nucleic acids
Gel electrophoresis separates DNA by size through an agarose matrix under an electric field. The phosphate backbone gives DNA a uniform negative charge per unit length, so every fragment is pulled towards the positive electrode and the sieving effect of the gel, not charge, sets the speed: small fragments run farther. Fluorescent dyes reveal bands for downstream cloning or QC.
Sanger sequencing uses fluorescently labelled ddNTPs to terminate extension, generating a ladder of fragments that differ in length by a single base. Capillary electrophoresis and laser detection read the sequence with around 99.99% accuracy. It remains the gold standard for plasmid verification and clinical variant confirmation.
Next-generation sequencing (NGS) performs massively parallel sequencing-by-synthesis. Fragmented DNA anchors to a flow cell, clusters are amplified, and cyclic fluorescent incorporation is imaged. Short reads (50–300 bp) enable whole-genome sequencing (WGS), whole-exome sequencing (WES) and RNA‑Seq. NGS offers discovery breadth; Sanger provides validation depth.
| Feature | Sanger sequencing | Next-generation sequencing |
| Principle | Chain termination with ddNTPs | Massively parallel synthesis |
| Read length | >500 bp | 50–300 bp |
| Accuracy | Very high (~99.99%) | High, platform dependent |
| Throughput | Low | Ultra high |
| Cost per base | High | Very low |
| Main pharma uses | Targeted genes, plasmid checks, variant validation | WGS, WES, RNA‑Seq, biomarker discovery, metagenomics |
Table: Comparison of sequencing technologies in pharmaceutical R&D.
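The short-read figures in the comparison above translate into sequencing depth through a simple relationship: mean coverage equals number of reads times read length divided by genome size. The sketch below applies it, together with the Poisson approximation for uncovered bases. The genome size constant is an approximate assumption and the function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3_100_000_000  # approximate haploid size, assumed for illustration


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average number of reads covering each base."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, read_length_bp, genome_size_bp):
    """Number of reads required to hit a target mean coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


def fraction_uncovered(coverage):
    """Poisson approximation: probability that a base receives zero reads."""
    return math.exp(-coverage)


n = reads_needed(30, 150, HUMAN_GENOME_BP)
print(f"30x WGS with 150 bp reads needs about {n:,} reads")
print(f"Expected uncovered fraction at 30x: {fraction_uncovered(30):.1e}")
```

The Poisson estimate assumes reads land uniformly at random; real libraries show GC and mappability biases, so practical designs add a margin.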
Engineering DNA for function
Recombinant DNA cloning uses restriction enzymes to cut DNA, ligase to join inserts to vectors, and hosts such as E. coli or CHO cells to propagate and express proteins. Therapeutic proteins including insulin, erythropoietin and monoclonal antibodies depend on this workflow.
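The cutting step can be pictured with a short in-silico digest. This sketch assumes a linear molecule and uses EcoRI, which recognises GAATTC and cuts after the first G. The function names and the example sequence are hypothetical, and only the top strand is modelled.

```python
def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest_linear(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at each site; cut_offset=1 models EcoRI (G^AATTC)."""
    cuts = [p + cut_offset for p in find_sites(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


insert = "AAAGAATTCTTTTGAATTCGG"  # made-up sequence with two EcoRI sites
fragments = digest_linear(insert)
print(fragments)
print([len(f) for f in fragments])
```

The fragment lengths are what a gel would resolve; the AATT overhangs left at each cut are what ligase later uses to join insert and vector.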
CRISPR-Cas9 improves precision. A guide RNA directs Cas9; double-strand breaks trigger error-prone NHEJ (creating indels for knockouts) or HDR if a repair template is supplied. The method speeds disease model creation, target validation and gene therapy development.
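Choosing where a guide RNA can direct Cas9 starts with finding suitable target sites. The sketch below assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG PAM immediately 3' of a 20-nucleotide protospacer. The function names and test sequence are hypothetical, and no off-target scoring is attempted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_spcas9_targets(sequence, protospacer_len=20):
    """List (strand, start, protospacer, pam) for each candidate NGG site.

    For the "-" strand, start is measured on the reverse-complemented sequence.
    """
    # The lookahead lets overlapping candidates all be reported
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    hits = []
    for strand, seq in (("+", sequence.upper()), ("-", reverse_complement(sequence))):
        for m in pattern.finditer(seq):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


locus = "TTTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TTTT"  # made-up locus
for strand, start, protospacer, pam in find_spcas9_targets(locus):
    print(strand, start, protospacer, pam)
```

In practice candidate guides are then ranked for predicted on-target activity and genome-wide off-target matches before any are ordered.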
Reading expression: RNA and proteins
RNA interference (RNAi) uses siRNAs loaded into RISC to degrade complementary mRNA, reducing protein output. It is a staple of functional genomics and itself a therapeutic platform.
Transcriptomics (RNA‑Seq) profiles all RNA species, quantifies expression and detects isoforms and non-coding RNAs, highlighting dysregulated networks and drug responses.
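Quantifying expression from read counts requires normalisation for transcript length and sequencing depth. One widely used unit is transcripts per million (TPM), sketched here. The gene names, counts and lengths are invented for illustration, and the pseudocount in the fold-change helper is an assumed convention.

```python
import math


def tpm(counts, lengths_bp):
    """Transcripts per million from raw read counts and transcript lengths."""
    # Reads per kilobase corrects for longer transcripts attracting more reads
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    scale = sum(rates.values())
    return {gene: rate / scale * 1_000_000 for gene, rate in rates.items()}


def log2_fold_change(treated, control, pseudocount=1.0):
    """Log2 ratio with a pseudocount to avoid division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical genes: equal expression despite different raw counts
counts = {"GENE_A": 100, "GENE_B": 200}
lengths = {"GENE_A": 1000, "GENE_B": 2000}
print(tpm(counts, lengths))
print(log2_fold_change(treated=399, control=99))
```

TPM values always sum to one million within a sample, which makes proportions comparable across samples but does not replace the statistical models used for differential expression testing.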
Proteomics maps protein complements via 2D gels and mass spectrometry. Because proteins are the primary drug targets, proteomics guides target discovery, evaluates drug effects and tracks post-translational modifications that control activity.
Molecular biology across the R&D pipeline
Drug discovery now starts with a molecular hypothesis rather than serendipitous screening.
Target identification and validation
The Human Genome Project delivered a catalogue of potential targets. GWAS compare genomes of large cohorts to find disease-linked variants. RNA‑Seq and proteomics contrast diseased and healthy tissues to reveal dysregulated pathways. Functional genomics screens using RNAi or CRISPR knockouts in disease-relevant cells confirm which genes are essential for the phenotype of interest.
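At its simplest, a GWAS test at one variant compares allele counts in cases and controls. The sketch below computes an odds ratio and a one-degree-of-freedom chi-square test from a 2x2 table. The counts are invented, the function name is hypothetical, and the 5e-8 threshold is the conventional genome-wide significance level rather than anything stated in this article.

```python
import math

GENOME_WIDE_SIGNIFICANCE = 5e-8  # conventional threshold, assumed here


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Odds ratio, chi-square statistic and p-value for a 2x2 allele count table."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Made-up allele counts for a single variant
odds_ratio, chi2, p = allelic_association(60, 40, 40, 60)
print(f"OR={odds_ratio:.2f}, chi2={chi2:.1f}, p={p:.4f}")
print("genome-wide significant:", p < GENOME_WIDE_SIGNIFICANCE)
```

Real analyses use regression models with covariates for ancestry and other confounders; the stringent threshold reflects the millions of variants tested in parallel.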
Biomarkers as decision tools
Biomarkers can be genomic (e.g. BRCA1/2 mutations predicting response to PARP inhibitors), transcriptomic (expression signatures measured by qPCR or RNA‑Seq) or proteomic (PSA in blood). They classify patients (diagnostic), predict outcomes (prognostic) and forecast treatment benefit (predictive), de-risking trials and focusing therapy.
Biologics reshape the therapeutic arsenal
Large-molecule drugs engineered with molecular tools now dominate many pipelines.
| Modality | Core technology | Mechanism | Key examples (target, indication) |
| Recombinant proteins | Recombinant DNA | Replace or supplement missing protein | Human insulin (diabetes), erythropoietin (anaemia) |
| Monoclonal antibodies | Recombinant DNA, hybridoma | Bind antigens to block signals or recruit immunity | Trastuzumab (HER2, breast cancer), Adalimumab (TNF‑α, autoimmune disease) |
| RNA therapeutics (ASO/siRNA) | Antisense oligonucleotides, RNA interference | Silence gene expression or modulate splicing | Patisiran (siRNA; TTR, hATTR amyloidosis), Nusinersen (ASO; SMN2 splicing, SMA) |
| RNA therapeutics (mRNA) | In vitro transcription, LNP delivery | Provide template for antigen or protein | Comirnaty (SARS‑CoV‑2 spike, COVID‑19) |
| Gene therapy | Viral vectors, CRISPR-Cas9 | Deliver or correct genes | Zolgensma (SMN1, SMA), Casgevy (BCL11A, sickle cell disease) |
Table: Major therapeutic modalities enabled by molecular biology.
Monoclonal antibodies
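Mouse antibodies are “humanised” by grafting antigen-binding regions onto human frameworks to cut immunogenicity; fully human antibodies go a step further and are derived entirely from human sequences, for example by phage display. Produced at scale in CHO cells, mAbs such as Herceptin (humanised) and Humira (fully human) altered oncology and immunology care.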
Gene therapy
Gene transfer uses vectors like AAV to deliver functional genes. Genome editing therapy uses CRISPR to correct mutations. Delivery can be ex vivo (cells edited outside the body then reinfused) or in vivo.
RNA medicines
siRNAs guide RISC to cleave complementary mRNA, while ASOs bind RNA either to trigger RNase H-mediated degradation or to alter splicing. mRNA therapies deliver instructions for protein production, as demonstrated by COVID‑19 vaccines.
Small molecules and antibodies often manage disease chronically. Gene therapies promise one‑time cures, creating scientific opportunity and challenging conventional commercial models.


Precision medicine comes of age
NGS-based genomic profiling guides pharmacogenomics and targeted therapy selection. Companion diagnostics (CDx), frequently immunohistochemistry, PCR, FISH or NGS assays, are now required before many targeted drugs can be prescribed. For Herceptin, a test must confirm HER2 overexpression or gene amplification before treatment.
Case studies that changed practice
mRNA vaccines against COVID‑19
Comirnaty and Spikevax reached patients in under a year. Synthetic mRNA encoding the SARS‑CoV‑2 spike is packaged in lipid nanoparticles, enters cells, and is translated into spike protein, provoking antibody and T cell responses. The mRNA never reaches the nucleus. Karikó and Weissman’s nucleoside modifications (pseudouridine) reduced innate immune activation and boosted translation, while LNP delivery enabled stability and uptake. Clinical data showed strong protection against severe disease, validating mRNA as a broad therapeutic platform and establishing a rapid manufacturing paradigm from plasmid DNA to LNP formulation.
Herceptin and HER2-positive breast cancer
Approved in 1998, Herceptin targets the HER2 receptor, overexpressed in 20–30% of breast cancers. The antibody blocks signalling, flags cells for ADCC and may curb receptor shedding. Adding Herceptin to chemotherapy improved disease-free and overall survival, cementing personalised oncology and the drug–diagnostic co-development model.
Casgevy and CRISPR-based gene editing
Casgevy (exagamglogene autotemcel), approved in the UK and US in late 2023 and in the EU in early 2024, is the first CRISPR therapy on the market. Patients’ haematopoietic stem cells are collected and edited ex vivo to disrupt an erythroid-specific enhancer of BCL11A, the gene that normally suppresses fetal haemoglobin (HbF) after birth. After conditioning chemotherapy, the reinfused cells repopulate the marrow and produce HbF, preventing sickling. In the CLIMB‑121 trial, 96.7% of evaluable sickle cell patients avoided severe vaso‑occlusive crises (VOCs) for at least 12 months, and none were hospitalised for VOCs in that period. Effects have lasted more than five years in follow-up. The therapy validates CRISPR clinically but raises challenges related to delivery, conditioning and pricing, with costs exceeding $2 million per patient.
These examples trace a continuum: transient instruction (mRNA), pathway modulation (mAbs) and permanent correction (CRISPR). Each step reflects increasing control over biology.
Editing without cutting: base and prime editors
CRISPR-Cas9’s double‑strand breaks trigger error-prone repair and possible off-target changes. Next-generation editors aim for precision without breaks.
Base editing uses a Cas9 nickase fused to a deaminase to convert single bases (C•G to T•A or A•T to G•C) within a narrow window. By some estimates, transition mutations of this kind account for roughly 30% of annotated pathogenic human variants, making base editing attractive. Trials are underway for disorders such as familial hypercholesterolaemia.
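The idea of a narrow editing window can be sketched in a few lines. The example assumes a cytosine base editor that converts every C at protospacer positions 4 to 8, counted from the PAM-distal end. That window is a commonly quoted approximation for early editors, not a fixed property; the function names and sequence are hypothetical.

```python
def cytosine_base_edit(protospacer, window=(4, 8)):
    """Return the protospacer with each C inside the editing window changed to T.

    Positions are 1-based and counted from the PAM-distal end.
    The default window is an assumed approximation.
    """
    first, last = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        in_window = first <= pos <= last
        edited.append("T" if base == "C" and in_window else base)
    return "".join(edited)


def editable_positions(protospacer, window=(4, 8)):
    """List the 1-based positions of cytosines that fall inside the window."""
    first, last = window
    return [pos for pos, base in enumerate(protospacer.upper(), start=1)
            if base == "C" and first <= pos <= last]


target = "ACGTCCGTACGTACGTACGT"  # made-up 20-nt protospacer
print(editable_positions(target))
print(cytosine_base_edit(target))
```

When more than one C sits in the window, as here, the extra conversions are bystander edits, which is why guide placement matters as much as editor choice.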
Prime editing couples a Cas9 nickase to reverse transcriptase and employs a pegRNA that encodes both targeting and the desired edit. It can perform all 12 base substitutions and small indels without donor templates or double-strand breaks.
| Feature | CRISPR-Cas9 | Base editing | Prime editing |
| Core mechanism | Double-strand break with NHEJ or HDR | Deamination with nickase, no DSB | Reverse transcription with nickase, no DSB |
| Edit types | Indels, large insertions/deletions, precise fixes via HDR | Transition point mutations | All point mutations, small insertions/deletions |
| Precision | High, but indels/off-target risk | Very high for single bases | Highest versatility |
| HDR reliance | Yes for corrections | No | No |
| Key advantage | Robust knockouts, broad use | Clean correction of common single-base changes | Search-and-replace flexibility |
| Main use | Gene disruption, ex vivo repair | Correcting point mutations | Broad mutation correction in vivo |
Table: Advanced gene-editing technologies compared.
Single-cell sequencing reveals hidden heterogeneity
Bulk data conceal cellular diversity. Single-cell RNA‑Seq profiles thousands of individual cells to map heterogeneity.
Applications in pharma include:
- Pinpointing which cell subtype expresses a target
- Building atlases of healthy and diseased tissues to find rare drivers or resistance cells
- Dissecting mechanism of action and off-target effects across cell types
- Discovering cell-type-specific biomarkers and detecting minimal residual disease
Synthetic biology as an engineering discipline
Borrowing modular design ideas from engineering, synthetic biology builds new circuits and pathways.
- Biomanufacturing: Rewired yeast or bacteria can produce complex natural products such as artemisinin precursors or opioids, with the aim of beating crop extraction or chemical synthesis on cost and supply reliability.
- Discovery platforms: Engineered mammalian cells act as pathway biosensors for high-throughput compound screens.
- Living medicines: Designed microbes could sense inflammation in the gut and secrete anti-inflammatory drugs on site. Programmable cells may one day patrol tissues and respond to disease signals.
Data at scale demands AI and bioinformatics
Genomics, transcriptomics, proteomics and other omics platforms flood labs with data. Machine learning integrates patterns across datasets to spot targets and pathways humans would miss. AlphaFold-style deep learning predicts protein structures, accelerating structure-based design. Generative models propose new small molecules or biologics optimised for affinity and safety.
Ethics, regulation and intellectual property
Powerful tools bring tough questions.
- Ethics: Off-target edits, long-term effects, and the line between therapy and enhancement demand careful governance.
- Regulation: Agencies such as the FDA and EMA are crafting frameworks for cell and gene therapies (Advanced Therapy Medicinal Products in EU terminology), accounting for patient-specific manufacture and permanent effects.
- IP: Foundational technologies like CRISPR have triggered prolonged patent disputes, shaping investment and commercial strategies.
Technical and analytical bottlenecks
Delivery remains the hardest problem
Nucleic acid drugs are large, charged and fragile. LNPs accumulate preferentially in the liver, which limits delivery to other organs. Novel polymers, exosomes and improved viral vectors are under development to reach the brain, muscle or marrow while minimising immune reactions. For in vivo editing, very high precision is essential to avoid oncogenic off-target events, and bacterial proteins such as Cas9 can provoke immunity.
Making sense of data
Single-cell and spatial datasets are large, sparse and noisy. Integrating modalities while controlling batch effects requires sophisticated algorithms. Correlation is not causation: functional experiments must confirm that dysregulated genes drive disease.
Society, privacy and equity
Genomic data are intensely personal. Strong legal and technical protections are needed to stop misuse by insurers or employers and to ensure informed consent. With gene and cell therapies priced in the millions, equitable access is a pressing concern. Precision editing also raises fears of non-therapeutic enhancement, challenging notions of fairness and identity.
What the next decade may bring
Base and prime editing should mature clinically, enabling precise in vivo repair of many mutations. AI will be embedded throughout R&D, from generative molecule design to adaptive trial analysis. Spatial and multi‑omics assays are likely to become diagnostic staples, allowing systems-level patient profiling.
Economically, the cost to launch a drug now often exceeds $2 billion. The blockbuster era of broad small molecules is giving way to high-value biologics and curative therapies for smaller populations. The more effective the therapy, the more it strains annuity-based revenue models. Pricing, reimbursement and care delivery must evolve to handle high upfront costs.
Strategy, skills and collaboration
Invest in platforms and data infrastructure
Modern labs are also data centres. Firms must balance spending on NGS, CRISPR and mass spectrometry with cloud storage, high-performance computing and LIMS. Platform technologies that work across diseases offer better return than single-use tools.
Build cross-disciplinary teams
Molecular biologists now share benches with data scientists and bioengineers. Python skills, bioinformatics, regulatory fluency and awareness of commercial pathways complement wet-lab expertise. Continuous education through universities, Coursera and supplier-run courses is essential.
Open innovation wins
Internal-only R&D is inefficient. Partnered assets are often reported to outperform purely internal ones. Academic discoveries such as BCR‑ABL, mRNA modification and CRISPR reached patients through industry alliances. Biotech firms take early risks; big pharma brings scale, regulatory experience and capital. Success depends on rigorous evaluation and seamless integration of external assets alongside internal programmes.
Conclusion
Molecular biology has turned pharmaceuticals from empirical tinkering into precise code editing. PCR, NGS, recombinant DNA and CRISPR are not just laboratory tricks; they underpin the production lines for monoclonal antibodies, RNA therapeutics and curative gene therapies. The mRNA vaccines that blunted a pandemic, the antibody that tamed aggressive breast cancer, and the CRISPR therapy that freed sickle cell patients from crises show what targeted molecular control can deliver.
Yet every breakthrough exposes new obstacles. Delivery to the right cell, interpretation of vast datasets and fair access to million‑pound cures are the defining problems of the next decade. Winning companies will be those that treat biology and computation as one craft, invest in adaptable platforms and talent, and cultivate robust external partnerships. As an old saying goes, a stitch in time saves nine: by repairing faults at their molecular source, the industry can prevent years of suffering downstream. The task now is to stitch with speed, accuracy and fairness.






