CASE STUDY — ADMET TOXICITY SCREENING

30 of 34 clinical failures detected. Zero false positives. 10.5 seconds.

A blind retrospective screen of 50 compounds — 34 drugs withdrawn from market or failed in clinical trials, plus 16 safe blockbuster controls — using FluxMateria’s physics-based ADMET profiler. No training on these outcomes. No post-hoc tuning.

  • 88.2% detection rate (30/34)
  • 100% specificity (0/16 false positives)
  • 50 compounds screened
  • 10.5s total computation time
  • 9 toxicity categories tested

The challenge

Drug attrition due to safety failures remains one of the most expensive problems in pharmaceutical R&D. Between 1980 and 2020, dozens of drugs were withdrawn from global markets or terminated in late-stage clinical trials because of toxicities (hepatotoxicity, cardiotoxicity, cardiovascular events, nephrotoxicity) that were not predicted during preclinical screening. The human cost is measured in patient deaths. The financial cost runs into billions per failed program.

Current ADMET prediction tools rely heavily on machine-learning models trained on historical data, which means they are constrained by the chemical space of their training sets and can produce confident but wrong predictions on novel scaffolds. They also struggle to provide mechanistic explanations for their flags, making it difficult for medicinal chemists to design around liabilities.

The question

Can a physics-based ADMET screener — with no machine learning and no training on clinical outcome data — retrospectively detect the drugs that actually failed, while avoiding false alarms on drugs with established safety profiles?

Study design

The screen was designed as a blinded retrospective validation. 50 compounds were assembled into a single batch and run through FluxMateria’s ADMET profiler in fast mode — no parameter tuning, no outcome-aware adjustments, no special handling for any compound.

1. Assemble

34 drugs with documented clinical failures or market withdrawals (1982–2020) plus 16 FDA-approved blockbuster controls with established safety profiles.

2. Classify

Failed drugs were categorized into 9 toxicity categories: hepatotoxicity, cardiotoxicity/hERG, cardiovascular events, cardiovascular thrombotic, cardiovascular valvulopathy, proarrhythmic, CYP inhibition/DDI, nephrotoxicity, and other toxicity.

3. Screen

All 50 compounds were screened through the FluxMateria ADMET profiler in fast mode. 7 endpoints were evaluated: hepatotoxicity, hERG, metabolic stability, PPB, permeability, P-gp efflux, druglikeness.

4. Evaluate

Compounds were flagged if any endpoint crossed its safety threshold. Multi-signal detections were recorded, and results were compared against known clinical outcomes.
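As a minimal sketch of the flag rule in step 4, the Python below marks a compound as flagged when any endpoint crosses its threshold and lists which endpoints fired. The `Profile` class, the function names, and the threshold constants are illustrative assumptions, not FluxMateria's API. The cutoffs are read off the flag readouts quoted in this case study, which cite slightly different hepatotoxicity cutoffs in places (> 4.0 and > 3.5). Thresholds for metabolic stability, permeability, and P-gp efflux are not given here, so those endpoints are left out.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed cutoffs, taken from the readouts quoted in this case study.
HEPATOTOX_FLAG_ABOVE = 4.0       # hepatotoxicity score
HERG_IC50_FLAG_BELOW_UM = 15.0   # hERG IC50 in µM (lower means stronger blockade)
LOGP_FLAG_ABOVE = 4.5            # lipophilicity
PPB_FLAG_ABOVE_PCT = 92.0        # plasma protein binding, percent


@dataclass
class Profile:
    """Hypothetical container for one compound's predicted endpoints."""
    name: str
    hepatotox: Optional[float] = None
    herg_ic50_um: Optional[float] = None
    logp: Optional[float] = None
    ppb_pct: Optional[float] = None


def triggered_endpoints(p: Profile) -> List[str]:
    """Return the endpoints whose value crosses its threshold."""
    signals: List[str] = []
    if p.hepatotox is not None and p.hepatotox > HEPATOTOX_FLAG_ABOVE:
        signals.append("hepatotox")
    if p.herg_ic50_um is not None and p.herg_ic50_um < HERG_IC50_FLAG_BELOW_UM:
        signals.append("hERG")
    if p.logp is not None and p.logp > LOGP_FLAG_ABOVE:
        signals.append("logP")
    if p.ppb_pct is not None and p.ppb_pct > PPB_FLAG_ABOVE_PCT:
        signals.append("PPB")
    return signals


def is_flagged(p: Profile) -> bool:
    """A compound is flagged if ANY endpoint fires."""
    return bool(triggered_endpoints(p))


# Values as reported in this case study.
sibutramine = Profile("Sibutramine", hepatotox=4.58, herg_ic50_um=14.16, logp=4.74)
fenfluramine = Profile("Fenfluramine", hepatotox=2.98, herg_ic50_um=103.8)

print(sibutramine.name, triggered_endpoints(sibutramine))
print(fenfluramine.name, is_flagged(fenfluramine))
```

Returning the list of triggered endpoints rather than a bare boolean keeps the multi-signal count available for each compound.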

Failed drug dataset (34 compounds)

  • FDA market withdrawals (1982–2020)
  • Phase II/III failures (1993–2020)
  • Black-box restricted drugs (1998–2005)
  • 9 distinct toxicity mechanisms
  • Spanning NSAIDs, antipsychotics, antihistamines, statins, antidepressants, and more

Safe control dataset (16 compounds)

  • FDA-approved blockbusters with long safety track records
  • Covering 7 therapeutic areas: cardiovascular, endocrine, GI, psychiatry, pain, respiratory, infectious disease
  • Range of chemical scaffolds and mechanisms
  • No post-marketing safety withdrawals

Results overview

FluxMateria correctly identified 30 of 34 drugs with known clinical safety failures while generating zero false positives across 16 safe controls. Total screening time for all 50 compounds: 10.5 seconds (209ms average per compound).

  • Sensitivity: 88.2% (30 of 34 failures detected)
  • Specificity: 100% (0 of 16 safe drugs falsely flagged)
  • Per compound: 209ms (10.5 seconds for 50 compounds)

Total compounds screened: 50
Known clinical failures: 34
Correctly flagged: 30
Missed (honest gaps): 4
False positives: 0

All results from FluxMateria ADMET profiler (fast mode). No post-hoc threshold adjustments. Multi-signal detections counted once per compound.
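The headline metrics follow directly from the counts above. The short Python sketch below recomputes them; the variable and function names are ours, and the only inputs are the figures reported in this study. Dividing the rounded 10.5-second total by 50 gives about 210 ms, in line with the reported 209ms average to within rounding of the total.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known failures that were flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of safe controls that were left unflagged."""
    return true_neg / (true_neg + false_pos)


# Counts reported in this case study.
detected, missed = 30, 4
controls_cleared, controls_flagged = 16, 0

sens = sensitivity(detected, missed)
spec = specificity(controls_cleared, controls_flagged)

# Average time per compound from the rounded total runtime.
total_ms = 10.5 * 1000
avg_ms = total_ms / (detected + missed + controls_cleared + controls_flagged)

print(f"Sensitivity: {sens:.1%}")
print(f"Specificity: {spec:.1%}")
print(f"Average per compound: {avg_ms:.0f} ms")
```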

Hepatotoxicity detection: 11/11 (100%)

Every drug in the hepatotoxicity category was correctly identified. All 11 compounds exceeded the hepatotoxicity score threshold (>4.0), with scores ranging from 7.45 to 7.50. These drugs collectively caused more than 200 liver failures and more than 100 patient deaths before regulatory action.

Drug | Class | Hepatotox score | Clinical outcome
Bromfenac | NSAID | 7.50 | Severe hepatic failure, withdrawn 1998
Trovafloxacin | Fluoroquinolone | 7.50 | Acute liver failure, restricted 1999
Sitaxentan | Endothelin receptor antagonist | 7.50 | Fatal liver injury, withdrawn 2010
Ximelagatran | Direct thrombin inhibitor | 7.48 | ALT >3x ULN in 7.9%, never FDA-approved, 2006
Lumiracoxib | COX-2 NSAID | 7.48 | 8 serious liver cases (2 fatal, 2 transplants), withdrawn 2007
Tolcapone | COMT inhibitor (Parkinson’s) | 7.48 | 3 fatal hepatic failures, restricted 1998
Pemoline | CNS stimulant | 7.48 | 21 liver failures (13 fatal), withdrawn 2005
Fialuridine | Nucleoside analogue | 7.48 | Fatal lactic acidosis in Phase II at NIH, 5 of 15 patients died, 1993
Troglitazone | Thiazolidinedione | 7.45 | 94 liver failure cases, 63 deaths, withdrawn 2000
Nefazodone | Antidepressant | 7.45 | 55 liver failures, 11 deaths, withdrawn 2004
Benoxaprofen | NSAID | 7.45 | 3,500 adverse reactions (76 fatal), withdrawn 1982

100% hepatotoxicity detection

Every drug in this category was flagged with hepatotoxicity scores well above the 4.0 threshold. These drugs collectively caused over 200 liver failures and more than 100 patient deaths before regulatory action. FluxMateria’s physics-based DILI scoring — which evaluates reactive metabolite formation potential, mitochondrial liability, and biliary transport interference from molecular structure alone — detected all of them.

Cardiotoxicity detection: 7/9 (78%)

FluxMateria’s hERG channel affinity predictor correctly identified 7 of 9 drugs withdrawn or restricted due to QT prolongation and fatal arrhythmias. All 7 detected drugs had predicted hERG IC50 values below 10 µM, consistent with significant channel blockade.

Drug | Class | Predicted hERG IC50 | Status | Clinical outcome
Astemizole | Antihistamine | 0.01 µM | FLAGGED | Fatal torsades de pointes, withdrawn 1999
Sertindole | Antipsychotic | 0.01 µM | FLAGGED | 27 unexplained deaths, suspended 1998
Droperidol | Antiemetic | 0.03 µM | FLAGGED | Black box warning for QT, effectively withdrawn 2001
Cisapride | Prokinetic | 0.08 µM | FLAGGED | 341 arrhythmia reports (80 deaths), withdrawn 2000
Terfenadine | Antihistamine | 0.21 µM | FLAGGED | Fatal torsades de pointes, withdrawn 1998
Thioridazine | Antipsychotic | 0.23 µM | FLAGGED | Dose-dependent QT prolongation, restricted 2005
Terodiline | Anticholinergic | 5.75 µM | FLAGGED | Fatal torsades de pointes, withdrawn 1991
Sparfloxacin | Fluoroquinolone | 18.2 µM | MISSED | QT issues at therapeutic concentrations despite moderate IC50
Grepafloxacin | Fluoroquinolone | 50.1 µM | MISSED | QT prolongation at therapeutic levels despite moderate channel affinity

Why the fluoroquinolones were missed

Both grepafloxacin and sparfloxacin cause clinically significant QT prolongation, but their hERG channel affinity is moderate (IC50 > 10 µM). These drugs appear to cause arrhythmia through mechanisms beyond simple hERG blockade — likely involving multi-channel effects at therapeutic concentrations. This represents a genuine gap in any hERG-focused screening approach, not a platform-specific limitation.
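To make the threshold logic concrete, the sketch below applies a single IC50 cutoff to the nine predicted values reported in this section. The 10 µM constant is the cutoff cited above; the flag readouts elsewhere in this case study show "< 15", and either value produces the same 7/2 split for these compounds. The dictionary and variable names are illustrative only.

```python
# Cutoff cited in this section; a 15 µM cutoff gives the same split here.
HERG_FLAG_BELOW_UM = 10.0

# Predicted hERG IC50 values (µM) as reported in this case study.
predicted_ic50_um = {
    "Astemizole": 0.01,
    "Sertindole": 0.01,
    "Droperidol": 0.03,
    "Cisapride": 0.08,
    "Terfenadine": 0.21,
    "Thioridazine": 0.23,
    "Terodiline": 5.75,
    "Sparfloxacin": 18.2,
    "Grepafloxacin": 50.1,
}

flagged = [name for name, ic50 in predicted_ic50_um.items() if ic50 < HERG_FLAG_BELOW_UM]
missed = [name for name, ic50 in predicted_ic50_um.items() if ic50 >= HERG_FLAG_BELOW_UM]

print(f"Flagged {len(flagged)}/{len(predicted_ic50_um)}: {flagged}")
print(f"Missed: {missed}")
```

Because both fluoroquinolones sit above either cutoff, lowering the miss rate would require additional signal (for example other cardiac ion channels) rather than a different hERG threshold.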

Multi-signal cardiovascular detection

Several of the most consequential drug failures were detected not by a single endpoint, but by multiple independent ADMET signals firing simultaneously. This multi-signal pattern — where hepatotoxicity, hERG, lipophilicity, and plasma protein binding flags converge — is a hallmark of drugs with systemic safety liabilities.

Cardiovascular events: 3/3 (100%)

Sibutramine (3 signals)

Weight-loss drug withdrawn after SCOUT trial showed increased cardiovascular events (heart attacks, strokes).

Signals: hepatotox 4.58; hERG 14.16 µM (< 15); logP 4.74 (> 4.5)

Torcetrapib (2 signals)

CETP inhibitor. Phase III ILLUMINATE trial terminated after 60% increase in mortality. Pfizer’s largest-ever clinical failure.

Signals: hERG 6.99 µM (< 15); logP 8.20 (> 4.5)

Tegaserod (2 signals)

IBS drug withdrawn after increased cardiovascular ischemic events.

Signals: hepatotox 4.43; hERG 13.97 µM (< 15)

Cardiovascular thrombotic: 1/1 (100%)

Rofecoxib (Vioxx) (2 signals)

COX-2 selective NSAID. One of the largest drug safety disasters in modern history: an estimated 88,000–140,000 excess cardiac events in the US alone. Merck withdrew it in 2004 after the APPROVe trial confirmed prothrombotic risk.

Signals: hepatotox 4.25 (> 3.5); PPB 93.1% (> 92%)

Cardiovascular valvulopathy: 1/3 (33%)

Pergolide (flagged)

Ergot dopamine agonist. Cardiac valve fibrosis via 5-HT2B receptor agonism.

Signals: hERG 2.00 µM (< 15)

Fenfluramine & Dexfenfluramine — MISSED

The fen-phen disaster drugs. Both cause cardiac valve fibrosis through 5-HT2B receptor agonism, a target-mediated tissue remodeling mechanism that is outside the scope of standard ADMET endpoints. Fenfluramine: hepatotox 2.98, hERG 103.8 µM. Dexfenfluramine: identical scores (it is the d-enantiomer of racemic fenfluramine). To our knowledge, standard ADMET endpoint panels, whether ML-based or physics-based, do not detect this mechanism.

Full results by category

Complete breakdown across all 9 toxicity categories. Seven categories achieved 100% detection. Two categories had partial or low detection, with clearly explained physics-based reasons for each miss.

Category | Detected | Rate | Key drugs | Primary signal
Hepatotoxicity | 11/11 | 100% | Troglitazone, Bromfenac, Nefazodone, Fialuridine | hepatotox_score > 4.0
Cardiotoxicity/hERG | 7/9 | 78% | Terfenadine, Cisapride, Astemizole, Sertindole | hERG IC50 < 10 µM
Cardiovascular events | 3/3 | 100% | Sibutramine, Torcetrapib, Tegaserod | Multi-signal (hepatotox + hERG + logP)
Cardiovascular thrombotic | 1/1 | 100% | Rofecoxib (Vioxx) | hepatotox + PPB
Cardiovascular valvulopathy | 1/3 | 33% | Pergolide (detected); Fenfluramine, Dexfenfluramine (missed) | hERG (Pergolide); 5-HT2B not covered
CYP inhibition/DDI | 1/1 | 100% | Mibefradil | hERG (1.35 µM) + logP (5.27)
Proarrhythmic | 1/1 | 100% | Encainide | hERG (2.96 µM); CAST trial 2.5x mortality
Nephrotoxicity | 1/1 | 100% | Phenacetin | hepatotox (7.45); analgesic nephropathy
Other toxicity | 4/4 | 100% | Cerivastatin, Valdecoxib, Lorcaserin, BIA 10-2474 | hepatotox scores 4.20–5.62
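As a quick consistency check on the table above, the sketch below sums the per-category counts and confirms they reproduce the 30/34 headline figure, with seven categories fully detected and two partially detected. The counts are copied from the table; the dictionary layout and names are ours.

```python
# (detected, total) per toxicity category, as reported in the table above.
category_results = {
    "Hepatotoxicity": (11, 11),
    "Cardiotoxicity/hERG": (7, 9),
    "Cardiovascular events": (3, 3),
    "Cardiovascular thrombotic": (1, 1),
    "Cardiovascular valvulopathy": (1, 3),
    "CYP inhibition/DDI": (1, 1),
    "Proarrhythmic": (1, 1),
    "Nephrotoxicity": (1, 1),
    "Other toxicity": (4, 4),
}

total_detected = sum(detected for detected, _ in category_results.values())
total_failures = sum(total for _, total in category_results.values())
fully_detected = [c for c, (d, t) in category_results.items() if d == t]
partially_detected = [c for c, (d, t) in category_results.items() if d < t]

print(f"Overall: {total_detected}/{total_failures} ({total_detected / total_failures:.1%})")
print(f"Categories at 100%: {len(fully_detected)}")
for category in partially_detected:
    d, t = category_results[category]
    print(f"Partial: {category} {d}/{t} ({d / t:.0%})")
```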

Other toxicity: 4/4 (100%)

Drug | Hepatotox score | Clinical outcome
Lorcaserin | 5.62 | Withdrawn 2020 after cancer risk signal in long-term trial
Valdecoxib | 4.58 | Stevens-Johnson syndrome and cardiovascular risk, withdrawn 2005
BIA 10-2474 | 4.37 | Fatal Phase I neurotoxicity in Rennes (France), 2016. One death, five hospitalized.
Cerivastatin | 4.20 | Fatal rhabdomyolysis, 52 deaths worldwide, withdrawn 2001

Safe controls: 16/16 clean (100% specificity)

None of the 16 FDA-approved blockbuster control drugs triggered any safety flag. This is critical: a screening tool that catches failures is only useful if it does not simultaneously flag safe drugs, which would erode trust and stall medicinal chemistry programs with false leads.

Cardiovascular: Amlodipine, Lisinopril, Metoprolol, Losartan, Simvastatin, Rosuvastatin
GI & Endocrine: Omeprazole, Pantoprazole, Levothyroxine, Metformin
Psychiatry & Pain: Sertraline, Escitalopram, Ibuprofen
Respiratory: Montelukast, Albuterol
Infectious disease: Amoxicillin

Zero false positives

In drug safety screening, specificity is as important as sensitivity. A tool that flags safe drugs as toxic wastes resources, delays programs, and erodes chemist confidence. FluxMateria correctly cleared all 16 control drugs across 7 therapeutic areas, with no borderline cases. This is not luck — it reflects the physics-based approach: predictions are driven by molecular structure and ADMET-relevant physics, not statistical patterns that can hallucinate signal in safe compounds.

Honest assessment

FluxMateria detected 30 of 34 clinical failures with zero false positives. But 4 drugs were missed, and we believe understanding why they were missed is as important as the detections.

What worked (30/34 detected)

  • 100% hepatotoxicity detection (11/11)
  • 78% hERG cardiotoxicity detection (7/9)
  • 100% on cardiovascular events via multi-signal convergence
  • Rofecoxib (Vioxx) correctly flagged via hepatotox + PPB
  • 100% on CYP inhibition, proarrhythmic, nephrotoxicity, other toxicity
  • Zero false positives across 16 safe controls
  • Multi-signal detection provided independent corroboration for the most dangerous drugs

What was missed (4/34 not detected)

  • Grepafloxacin: hERG IC50 50.1 µM. Genuine borderline: causes QT prolongation at therapeutic concentrations despite moderate channel affinity.
  • Sparfloxacin: hERG IC50 18.2 µM. Same pattern: moderate IC50 but clinically significant QT effects.
  • Fenfluramine: 5-HT2B receptor agonism causing cardiac valve tissue remodeling. This is target-mediated organ toxicity, not a standard ADMET liability. To our knowledge, standard ADMET endpoint panels do not detect it.
  • Dexfenfluramine: the d-enantiomer of racemic fenfluramine, with the identical mechanism and identical ADMET scores.

Understanding the miss categories

The 4 misses fall into exactly two mechanistic categories, each with a clear explanation:

Multi-channel QT (2 drugs)

Grepafloxacin and sparfloxacin cause QT prolongation at therapeutic concentrations despite moderate hERG affinity. This suggests multi-channel cardiac ion channel effects (INa, ICa, IKs) that amplify the arrhythmia risk beyond what hERG IC50 alone predicts. A comprehensive cardiac ion channel panel would address this gap.

Target-mediated organ toxicity (2 drugs)

Fenfluramine and dexfenfluramine cause cardiac valve fibrosis through 5-HT2B receptor agonism: chronic serotonergic stimulation drives fibroblast proliferation and extracellular matrix deposition. This is not an ADMET liability; it is a target-mediated pharmacological toxicity. To our knowledge, standard ADMET endpoint panels, whether ML-based or physics-based, do not detect this mechanism.

We report these misses transparently because honest benchmarking is the foundation of trust. A tool that claims 100% on everything is either overfitting to its training data or not disclosing its false negatives. FluxMateria’s 88.2% sensitivity with 100% specificity reflects genuine predictive performance, not statistical optimization.

Technical details

Engine and methodology

  • Engine: FluxMateria ADMET profiler (fast mode)
  • Physics basis: All predictions from molecular structure via first-principles FLUX physics
  • ML components: None. Zero trained parameters.
  • Computation: 10.5 seconds total, 209ms average per compound
  • Input: SMILES strings only

Endpoints evaluated

  • Hepatotoxicity (DILI score)
  • hERG cardiotoxicity (IC50)
  • Metabolic stability (CLint)
  • Plasma protein binding (PPB %)
  • Permeability (Caco-2 logPapp)
  • P-gp efflux liability
  • Druglikeness / lipophilicity (logP)

Failure categories (9)

Hepatotoxicity, Cardiotoxicity/hERG, Cardiovascular events, Cardiovascular thrombotic, Cardiovascular valvulopathy, Proarrhythmic, CYP inhibition/DDI, Nephrotoxicity, Other toxicity

Multi-signal detection

A key strength of the ADMET profiler is that compounds are evaluated across all endpoints simultaneously. When multiple independent signals fire on the same compound, it provides higher confidence and mechanistic corroboration. In this study, several drugs were flagged by 2–3 independent endpoints:

Drug | Signals | Endpoints triggered
Sibutramine | 3 | Hepatotox (4.58), hERG (14.16 µM), logP (4.74)
Torcetrapib | 2 | hERG (6.99 µM), logP (8.20)
Tegaserod | 2 | Hepatotox (4.43), hERG (13.97 µM)
Rofecoxib | 2 | Hepatotox (4.25), PPB (93.1%)
Mibefradil | 2 | hERG (1.35 µM), logP (5.27)

Dataset provenance

The failed drug dataset comprises 34 compounds drawn from FDA market withdrawals and documented Phase II/III clinical failures between 1982 and 2020. The 16 safe control drugs are FDA-approved blockbusters with long post-marketing track records and no safety-related withdrawals or restrictions. No parameter tuning, threshold adjustment, or outcome-aware processing was performed at any stage.

What this means for drug development

Preclinical
Screen compound libraries before committing to in-vivo studies

At 209ms per compound, FluxMateria can screen a 10,000-compound library in under 35 minutes. Flag hepatotoxicity and hERG liabilities before expensive animal studies begin.
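The library-screening estimate is simple arithmetic, sketched below. It assumes compounds are processed one after another at a constant per-compound time, which is our simplification; real runs may vary with molecule size and batching. The function name is illustrative.

```python
def screening_minutes(n_compounds: int, seconds_per_compound: float) -> float:
    """Wall-clock estimate for a sequential screen at a constant per-compound time."""
    return n_compounds * seconds_per_compound / 60.0


# 209 ms per compound is the average reported in this case study.
minutes = screening_minutes(10_000, 0.209)
print(f"10,000 compounds: about {minutes:.1f} minutes")
```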

Lead opt.
Guide medicinal chemistry away from toxic scaffolds

Multi-signal detection identifies compounds with systemic liability profiles (multiple independent red flags), not just single-endpoint violations. This helps chemists prioritize which liabilities to design around.

IND-enabling
De-risk the portfolio before committing to clinical investment

With 88.2% sensitivity and 100% specificity on historical failures, FluxMateria provides a rapid second opinion on safety. Each late-stage clinical failure avoided can save an estimated $800M–$2.6B in development costs.

Regulatory
Physics-based predictions are interpretable and auditable

Unlike ML black-box models, FluxMateria’s predictions trace back to molecular physics. Every flag has a mechanistic explanation (reactive metabolite potential, hERG channel geometry, lipophilicity-driven tissue accumulation) that can be reviewed by toxicologists and regulators.

Without early ADMET screening

  • $800M–$2.6B lost per late-stage failure
  • 5–10 years of development before toxicity discovered
  • Patient safety events in Phase II/III or post-marketing
  • Regulatory withdrawals and liability lawsuits
  • 34 drugs in this study alone: hundreds of patient deaths

With FluxMateria ADMET screening

  • 88.2% of these failures caught in 10.5 seconds
  • Zero safe drugs falsely flagged
  • Multi-signal detection for systemic risk profiles
  • Physics-based explanations for every flag
  • 209ms per compound — screen thousands in minutes

Beyond 50 compounds: validated at scale

This case study screened 50 compounds. The same ADMET engine has also been validated via leave-one-out testing on 178,000+ compound-endpoint predictions, with every compound predicted using only the remaining data. Three endpoints rank #1 SOTA on public leaderboards. DILI reaches AUROC 0.9597 on the comparable TDC binary benchmark while returning mechanism-level output. Caco-2 permeability now reaches MAE 0.277, essentially matching the public TDC reference SOTA of 0.276, from pure physics with zero training labels.

Endpoint | Validation set | Metric | FluxMateria | TDC #1 | Status
Solubility | 9,982 | MAE ↓ | 0.06 | 0.741 | #1 SOTA
PPB | 14,288 | MAE ↓ | 2.24% | 7.44% | #1 SOTA
Metabolism | 38,576 | Spearman ↑ | 0.692 | 0.536 | #1 SOTA
BBB | 7,807 | Accuracy | 93.3% | 0.924 | Near-SOTA
Permeability | 41,175 | Pearson r | 0.837 | (not listed) | Competitive
hERG | 8,879 | AUROC | 0.850 | 0.880 | Near-SOTA
DILI | TDC binary | AUROC | 0.9597 | 0.956 | SOTA mechanistic
CYP Panel | 62,794 | AUPRC | 0.798 | ~0.86 | Near-SOTA

TDC = Therapeutics Data Commons ADMET leaderboard. LOO = leave-one-out (each compound predicted using only the remaining data). All FluxMateria predictions use zero fitted parameters.
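For readers unfamiliar with the protocol, the sketch below shows the leave-one-out loop in its simplest form: each item is held out in turn and predicted from the rest, and the errors are averaged. The predictor used here (the mean of the remaining values) is a deliberately trivial stand-in chosen so the example runs on its own; it is not FluxMateria's predictor, and the toy values are made up.

```python
from statistics import mean
from typing import Callable, List, Sequence


def leave_one_out_mae(
    values: Sequence[float],
    predict: Callable[[List[float]], float],
) -> float:
    """Hold out each value in turn, predict it from the rest, and average the absolute errors."""
    errors = []
    for i, truth in enumerate(values):
        remaining = list(values[:i]) + list(values[i + 1:])
        errors.append(abs(predict(remaining) - truth))
    return mean(errors)


# Made-up measurements, for illustration only.
toy_values = [1.0, 2.0, 3.0, 4.0]

# Stand-in predictor: the mean of the remaining values.
mae = leave_one_out_mae(toy_values, mean)
print(f"LOO MAE on toy data: {mae:.3f}")
```

The point of the protocol is that the held-out value never informs its own prediction, so the reported error reflects performance on unseen items.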

  • ~350 molecules/sec (full panel)
  • 178K+ LOO-validated predictions
  • 1 CPU, single-threaded, no GPU

At ~350 molecules per second, a full 8-endpoint ADMET panel on a 1-million compound library takes under 48 minutes on a single CPU core. No GPU. No cloud compute queue. No training run. The same physics engine that caught 30/34 clinical failures in this study runs on every molecule in your library at the same accuracy.

Screen your compound library

FluxMateria’s ADMET profiler is available for pilot access. Upload SMILES, get multi-endpoint safety profiles in seconds. No ML training required — it works on any chemical scaffold from day one.

Try the demo

Run ADMET profiles on sample compounds. See hepatotoxicity, hERG, metabolism, and PPB predictions in real time.


Pilot access

Full ADMET profiler with batch screening, multi-signal analysis, and exportable safety reports for your pipeline.
