CASE STUDY — ENTERPRISE TCO / ADMET

245 FDA drugs. 8 ADMET endpoints. One physics-first call.

A unified mechanism-aware ADMET pipeline that replaces the four-to-six-tool stitched stack pharma teams currently maintain. 5 of 8 endpoints are at public-benchmark SOTA from pure physics — Solubility, Metabolism, and PPB strict #1 on the TDC leaderboard; DILI AUROC 0.9597 vs MiniMol ~0.956 on the comparable TDC binary task; and Caco-2 permeability MAE 0.277 matching the public TDC caco2_wang reference SOTA at 0.276 with zero training labels. FluxMateria also returns the DILI mechanism, exposure, dose-window, confidence, and score trace. Total cost of ownership: lower than Schrödinger, Simulations Plus, and internal ML pipelines.

245
FDA-approved drugs profiled
8
Endpoints in one pipeline
3
Endpoints at #1 SOTA
0.9597
DILI AUROC, SOTA accuracy
~51 s
Wall-clock, full mechanistic mode
1,960
Mechanism-aware predictions

The challenge

Pharma teams predict ADMET today through a chain of tools that were assembled across vendors and over time, rather than designed as a unified system. A typical lead-optimization workflow stitches together four to six commercial and internal systems, each with its own input format, output schema, license, support contract, and chemical-space caveat list.

Reconciling these systems into a single decision-grade output adds non-trivial integration overhead per workflow. ML components also degrade on chemistry outside their training distribution, which includes much of the novel scaffold space medicinal chemists are paid to design. The aggregate cost includes per-tool licenses, integration engineering, schema reconciliation, and the structural absence of a unified mechanism-aware ADMET output in a single call.

The question

Can a single first-principles physics engine cover the eight ADMET endpoints used across lead optimization — plasma protein binding, blood-brain barrier, intestinal permeability, metabolic stability, hERG cardiotoxicity, drug-induced liver injury, CYP inhibition across five isoforms, and aqueous solubility — in one mechanism-aware API call, with a unified output schema, an annualized total cost of ownership lower than the realistic enterprise alternatives, three strict #1 SOTA endpoints, DILI performance above the comparable MiniMol AUROC reference, and Caco-2 permeability matching the public TDC reference SOTA from pure physics — all while returning mechanism, exposure, dose-window, confidence, and score-trace outputs?

Study design

The study uses the 245 FDA-approved drug panel already documented on the public ADMET benchmark page. Each compound is profiled across all eight endpoints in a single unified pipeline call. Wall-clock time is measured end-to-end in full mechanistic mode — the slowest configuration, with DILI exposure-aware logic, CYP isoform gating, transporter inference, dose-window logic, and reactive-metabolite alerts all active.

1. Cohort

245 FDA-approved drugs spanning 30+ therapeutic areas with PubChem-verified canonical SMILES. Same panel used in the public ADMET benchmark, where independent FDA-label experimental values anchor the validation.

2. Endpoint suite

Eight endpoints in one call: plasma protein binding, BBB permeability, Caco-2 permeability, metabolic stability (CLint), hERG cardiotoxicity, DILI (mechanism-aware), CYP panel (1A2/2C9/2C19/2D6/3A4), aqueous solubility.

3. Run

Single API call per compound returns all eight endpoints with confidence bands, CYP isoform attribution, transporter evidence, hepatic exposure context, dose-window behavior, reactive-metabolite alerts, and DILI score trace. Wall-clock measured end-to-end in full mechanistic mode (slowest configuration).

4. Compare

Total cost of ownership benchmarked against the four realistic enterprise alternatives: stitched commercial ADMET stack, in-house ML pipeline, DFT-based mechanistic ADMET, and free / sanity tools. Accuracy benchmarked against MiniMol, TDC SOTA, and named commercial baselines on identical leave-one-out validation.

What the unified pipeline consumes

  • SMILES (canonical)
  • Optional: assay context, dose window, target tissue
  • No training labels
  • No conformer generation pre-step
  • No descriptor pre-computation
  • No output-schema reconciliation
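As an illustration of how small that input surface is, the sketch below builds a request body from a SMILES string plus the optional context fields listed above. The class name, field names, and JSON layout are hypothetical and are not taken from the FluxMateria API; the aspirin SMILES is used only as a familiar example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AdmetRequest:
    """Hypothetical request body; every field name here is an assumption."""

    smiles: str                                           # canonical SMILES (required)
    assay_context: Optional[str] = None                   # optional
    dose_window_mg: Optional[Tuple[float, float]] = None  # optional (low, high)
    target_tissue: Optional[str] = None                   # optional

    def to_json(self) -> str:
        if not self.smiles.strip():
            raise ValueError("smiles must be a non-empty string")
        # Drop unset optional fields so the payload carries only what was supplied.
        body = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(body, sort_keys=True)


# Aspirin, with one optional field set.
request = AdmetRequest(smiles="CC(=O)Oc1ccccc1C(=O)O", target_tissue="liver")
payload = request.to_json()
```

Nothing in the payload refers to conformers, descriptors, or training labels, which mirrors the "consumes" list above.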

What it returns

  • 8 endpoint values with units & confidence band
  • CYP isoform attribution + inhibition panel
  • Transporter substrate / inhibitor flags
  • Reactive-metabolite alerts
  • Hepatic exposure context for DILI
  • DILI dose-window behavior and score trace
  • Mechanistic-evidence trail per endpoint
  • Frozen JSON manifest for audit
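One way to picture the returned document is as a single typed record per compound. The sketch below is an assumed shape only: the class names, field names, and endpoint keys are hypothetical, and the numeric values are placeholders rather than real predictions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical keys for the eight endpoints named in this case study.
ENDPOINTS = (
    "ppb", "bbb", "caco2", "metabolic_stability",
    "herg", "dili", "cyp_panel", "solubility",
)


@dataclass
class EndpointResult:
    value: float
    units: str
    confidence_band: Tuple[float, float]
    evidence: List[str] = field(default_factory=list)  # mechanistic-evidence trail


@dataclass
class AdmetResponse:
    smiles: str
    endpoints: Dict[str, EndpointResult]
    cyp_isoform_attribution: Dict[str, float] = field(default_factory=dict)
    transporter_flags: List[str] = field(default_factory=list)
    reactive_metabolite_alerts: List[str] = field(default_factory=list)
    hepatic_exposure_context: str = ""
    dili_dose_window: str = ""
    dili_score_trace: List[str] = field(default_factory=list)
    manifest_commit: str = ""

    def missing_endpoints(self) -> List[str]:
        """Return the endpoint keys absent from this response."""
        return [name for name in ENDPOINTS if name not in self.endpoints]


# Placeholder values only; this is not a real prediction.
partial = AdmetResponse(
    smiles="CCO",
    endpoints={"solubility": EndpointResult(0.0, "logS", (0.0, 0.0))},
)
missing = partial.missing_endpoints()
```

A completeness check such as `missing_endpoints` is the kind of validation a downstream dashboard could run before accepting a record.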

Results overview

FluxMateria profiled all 245 FDA drugs across all eight ADMET endpoints in ~51 seconds of wall-clock time in full mechanistic mode — 1,960 individual mechanism-aware predictions, returned as one decision-grade output schema per compound. Three of the eight endpoints land at strict #1 SOTA against the public TDC and AqSolDB leaderboards. DILI now also reaches SOTA accuracy: AUROC 0.9597 versus the MiniMol public reference around 0.956 on the comparable TDC binary benchmark. It also reports AUPRC 0.9455, high-vs-rest balanced accuracy 0.8223, and returns mechanism, exposure, dose-window, confidence, and score-trace outputs. Caco-2 permeability has now joined the SOTA tier as well: MAE 0.277 on the TDC caco2_wang scaffold-stratified test set, matching the public reference SOTA at 0.276 from pure physics with zero Caco-2 training labels consumed. None of the eight required endpoint-specific retraining or post-hoc tuning.

  • DILI accuracy claim: SOTA on the comparable public binary benchmark
  • FluxMateria: AUROC 0.9597 on TDC binary DILI
  • Public reference: MiniMol AUROC ~0.956
  • Added output: mechanism, exposure, dose, confidence, trace
  • Wall-clock end-to-end: ~51 s (full mechanistic mode, 245 drugs × 8 endpoints)
  • Endpoints at #1 SOTA: 3 / 8 (Solubility, Metabolism, PPB at noise floor)
  • Unified output schema: 1 (no reconciliation across tools)

Wall-clock measured end-to-end on a single CPU core for deterministic reproducibility, in the slowest mode (DILI exposure-aware, CYP isoform gating, transporter inference, dose-window logic, and score trace all active). Production deployment scales horizontally. Accuracy figures from the publicly audited ADMET benchmark and detailed DILI benchmark.
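The headline counts and the per-compound latency quoted elsewhere in this case study follow from simple arithmetic on these figures. The sketch below reproduces it, treating the approximate "~51 s" as exactly 51 s; the variable names are illustrative.

```python
N_DRUGS = 245
N_ENDPOINTS = 8
WALL_CLOCK_S = 51.0  # the "~51 s" figure, treated as exact for this estimate

# Total individual predictions in the run.
predictions = N_DRUGS * N_ENDPOINTS

# Average latency per compound and per single endpoint prediction, in ms.
per_compound_ms = WALL_CLOCK_S / N_DRUGS * 1000
per_prediction_ms = WALL_CLOCK_S / predictions * 1000

print(f"{predictions} predictions")
print(f"{per_compound_ms:.0f} ms per compound")
print(f"{per_prediction_ms:.0f} ms per prediction")
```

The per-compound average comes out at roughly 208 ms, consistent with the ~210 ms figure quoted for full mechanistic mode.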

Endpoint-by-endpoint accuracy: head-to-head

Each row of the table below reports the metric the named reference publishes, with no re-binned subsets. Two rows are not strictly like-for-like and are marked as such: BBB compares binary accuracy against AUROC references, and hERG compares against a reference measured on a smaller set (n=648). Where FluxMateria leads, it leads by a publishable margin. Where it does not lead, the gap and the cohort difference are stated honestly.

| Endpoint | Dataset / N (LOO) | Metric | FluxMateria | Named SOTA | Verdict |
|---|---|---|---|---|---|
| Aqueous Solubility | AqSolDB / 9,982 | logS MAE ↓ | 0.06 | MiniMol 0.741 | #1 SOTA, 12× closer to experiment |
| Metabolism (CLint) | Curated / 38,576 | Spearman ρ ↑ | 0.692 | TDC SOTA 0.536 | #1 SOTA |
| Plasma Protein Binding (HIGH-tier) | Curated / 14,288 | MAE %bound ↓ | 3.65% | 3–5pp inter-lab noise floor | At experimental noise floor |
| BBB Permeability | B3DB / 7,807 | Accuracy (binary) ↑ | 93.3% | MiniMol AUROC 0.924, MapLight 0.916 | Near-SOTA |
| hERG Cardiotoxicity | Curated / 8,879 | AUROC ↑ | 0.850 | TDC SOTA 0.880 (n=648) | Trail by 0.03 on a 13× larger reference set |
| Drug-Induced Liver Injury | TDC binary DILI / cross-panel clinical-risk checks | AUROC / AUPRC / BA ↑ | 0.9597 / 0.9455 / 0.8223 | MiniMol AUROC ~0.956 (TDC binary 475) | SOTA accuracy on comparable binary DILI benchmark; also adds mechanism, dose, confidence, and trace outputs |
| CYP Inhibition Panel | Curated / 62,794 (5 isoforms) | Mean AUROC ↑ | 0.872 | TDC isoform leaderboards 0.83–0.91 | In SOTA band |
| Caco-2 Permeability | TDC caco2_wang test (n=182) / 41,175 LOO | MAE ↓ | 0.277 (TDC) / r=0.837 (LOO) | Public TDC reference SOTA 0.276 | SOTA from pure physics |

All eight endpoints validated under leave-one-out. Reference cohorts are full LOO sets, typically larger than the public TDC leaderboard subsets (which are often 475–1,800 compounds). Where FluxMateria is "near-SOTA" or "competitive," the LOO cohort is itself a more demanding test than the smaller TDC equivalents. DILI is reported in novel-like mode with exact clinical self-matches masked; known-compound anchor mode is tracked separately and is not used for the novel-drug SOTA claim. Full per-tier and per-class breakdowns: ADMET benchmark and DILI benchmark.
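For readers checking the table against their own data, the sketch below gives conventional textbook definitions of three of the metrics named above: MAE, AUROC (via pairwise ranking), and balanced accuracy. It is not FluxMateria's evaluation code, and the toy inputs are made up purely to exercise the functions.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error (lower is better)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    tn = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 0)
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = len(labels) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


# Toy inputs, not benchmark data.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_preds = [1 if s >= 0.45 else 0 for s in toy_scores]

toy_mae = mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
toy_auroc = auroc(toy_labels, toy_scores)
toy_ba = balanced_accuracy(toy_labels, toy_preds)
```

The pairwise AUROC is quadratic in cohort size, which is fine for a few hundred compounds; a rank-based implementation would be the usual choice for the larger cohorts listed here.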

Current DILI benchmark position: SOTA accuracy plus mechanism depth

FluxMateria v4.23 reaches AUROC 0.9597 versus MiniMol ~0.956, plus AUPRC 0.9455 and high-vs-rest balanced accuracy 0.8223 on the comparable Therapeutics Data Commons binary DILI task in novel-like mode. Cross-panel checks remain strong: DILIRank novel-like AUROC 0.9063 and Hepatotox validated novel-like AUROC 0.9275. The parent DILI path runs at about 12.9 molecules per second locally; MiniMol speed is not verified from the public leaderboard.

Total cost of ownership: the annualized view

Pharma teams do not buy a single ADMET screen — they buy a capability. The relevant comparison is the annualized cost of running ADMET prediction as an ongoing function across lead-opt, portfolio triage, and regulatory pre-submission. Below, the four realistic alternatives a discovery program faces today, costed honestly.

| Capability pathway | Annualized cost | Output schema | Structural limitation |
|---|---|---|---|
| Stitched commercial ADMET stack (Schrödinger + Sims+ + OpenEye + internal hERG/DILI) | $400K–$1.2M | Four to six different schemas | Non-trivial reconciliation overhead per workflow; ML components degrade outside training distribution |
| In-house ML ADMET pipeline | $500K–$1.5M | Internal, custom | Drift retraining; data labeling; OOD failure on novel scaffolds |
| DFT-based mechanistic ADMET (QM/MM, MD-based) | $300K–$900K | Per-endpoint manual | Research-scale only; throughput orders of magnitude too low for portfolio use |
| Free / sanity-check tools (SwissADME, ADMETLab) | ~free | Tool-specific | Limited endpoint coverage; no confidence; no DILI mechanism evidence |
| FluxMateria unified pipeline | By tier; lower than the alternatives above | One unified schema, all 8 endpoints, mechanism evidence included | Coverage scope (current 8 endpoints + benchmarked chemical space) |

Annualized cost ranges represent typical industry benchmarks for ongoing pharma ADMET capability: stitched commercial stack includes typical per-tool site licenses ($100K–$500K each across the four to six tools listed) plus integration engineering plus maintenance personnel; in-house ML pathway includes a small ML team plus data labeling plus drift retraining plus compute; DFT pathway includes specialized computational chemists plus HPC allocation. These are not headline list prices — they reflect what enterprise pharma programs actually spend over a fiscal year. FluxMateria pricing is enterprise-tiered and disclosed under NDA.

Decision quality dominates the line items

A single Phase II safety failure typically represents $50M–$200M in sunk program cost, before accounting for the opportunity cost of molecules deprioritized in favor of one that was advanced on a flawed pre-clinical signal. FluxMateria's unified mechanism-aware pipeline flags the categories of liability that drive late-stage attrition (88.2% sensitivity, zero false positives on a 50-compound retrospective) within the lead-optimization design loop. A single avoided Phase II safety failure offsets the platform's annual capability cost in full.

Operational implications

Consolidating eight ADMET endpoints into a single mechanism-aware API call enables a class of design and review operations that the multi-vendor architecture constrains by integration overhead and schema mismatch.

Real-time integration with lead optimization

A full mechanism-aware ADMET response per compound is returned in ~210 ms, which supports interactive use within the lead-optimization design loop in place of queued multi-tool execution.

Portfolio-scale safety triage

A 10,000-compound portfolio screened across all eight endpoints completes in approximately 35 minutes of wall-clock time at ~210 ms per compound on a single core, producing a unified decision-grade output per compound.
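The portfolio estimate can be reproduced from the per-compound runtime. The sketch below also shows what horizontal scaling would look like under an idealized assumption of even sharding with no coordination overhead; that linear-scaling model and the function name are assumptions, not measured behavior.

```python
import math


def screen_minutes(n_compounds: int, per_compound_s: float = 0.210, workers: int = 1) -> float:
    """Estimated wall-clock minutes, assuming compounds are split evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The slowest worker handles the largest shard and sets the wall-clock time.
    largest_shard = math.ceil(n_compounds / workers)
    return largest_shard * per_compound_s / 60.0


single_core = screen_minutes(10_000)            # one core, as in the text
eight_workers = screen_minutes(10_000, workers=8)  # idealized 8-way split

print(f"1 worker:  {single_core:.1f} min")
print(f"8 workers: {eight_workers:.1f} min")
```

Real deployments would add queueing and I/O overhead on top of this lower bound.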

Mechanism-aware DILI assessment

CYP isoform attribution, transporter substrate flags, hepatic exposure context, dose-window behavior, reactive-metabolite alerts, and score trace are returned in the same call as the risk score, providing mechanistic basis for each prediction.

Coverage of novel chemistry

No training distribution to extrapolate beyond. Novel scaffolds, PROTACs, peptidomimetics, and macrocycles are evaluated within scope by construction, not handled as silent extrapolations.

Unified output schema

Eight endpoints, per-prediction confidence, and mechanism-evidence trail in one output document. Output integrates with chemist dashboards, decision packets, and regulatory pre-submission documentation.

Audit-grade reproducibility

Deterministic, bit-identical output across machines. Each screen produces a frozen JSON manifest with commit hash, suitable as primary computational evidence for IND/NDA pre-submission and IP filings.
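A common way to make "bit-identical" checkable is to hash a canonical serialization of each result and store the digest next to the commit hash. The sketch below shows that general technique using Python's standard library; the manifest layout, field names, and placeholder values are assumptions and do not describe the actual FluxMateria manifest.

```python
import hashlib
import json


def canonical_bytes(result: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal content gives equal bytes."""
    text = json.dumps(result, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def build_manifest(result: dict, commit_hash: str) -> dict:
    """Bundle a result with the code version and a SHA-256 digest of its canonical form."""
    digest = hashlib.sha256(canonical_bytes(result)).hexdigest()
    return {"commit": commit_hash, "result_sha256": digest, "result": result}


def same_output(a: dict, b: dict) -> bool:
    """Two manifests match if they share a commit and a result digest."""
    return a["commit"] == b["commit"] and a["result_sha256"] == b["result_sha256"]


# Placeholder content; key order differs between the two "machines" on purpose.
run_a = build_manifest({"smiles": "CCO", "solubility_logS": 0.0}, commit_hash="abc123")
run_b = build_manifest({"solubility_logS": 0.0, "smiles": "CCO"}, commit_hash="abc123")
run_c = build_manifest({"smiles": "CCO", "solubility_logS": 0.1}, commit_hash="abc123")
```

Sorting keys before hashing means two runs that differ only in dictionary ordering still compare equal, while any change in a value changes the digest.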

Honest scope

A unified pipeline that beats stitched commercial stacks on TCO, on schema, and on raw accuracy for three of eight endpoints is a strong claim. It deserves a clean fence around what is and is not in scope.

In scope

  • Small-molecule drugs and drug-like compounds
  • Eight validated endpoints (PPB, BBB, Caco-2, MetStab, hERG, DILI, CYP-5, solubility)
  • Lead optimization and portfolio triage
  • Pre-clinical safety prioritization
  • Mechanism-aware DILI risk scoring
  • CYP-mediated drug-drug interaction triage
  • Novel chemotypes, PROTACs, macrocycles (up to 14,288-compound LOO reference space)
  • Audit-trailed JSON output for IND/NDA pre-submission

Out of scope (today)

  • First-in-human dose prediction (PK simulation is a separate workstream)
  • Biologics, nucleic-acid therapeutics, cell therapies
  • Endpoints not in the validated 8-endpoint suite (renal clearance, transporter Ki) without prior calibration audit
  • Low-binding PPB compounds (<30%) remain the hardest class (LOO MAE ~24%)
  • Moderate permeability class is hardest to discriminate (54.9% LOO accuracy)
  • Replacement for regulatory-grade in-vitro / in-vivo studies

The accuracy and TCO numbers in this case study apply to the cohort and endpoints documented in the public ADMET benchmark. Extending to new endpoints or modalities is a documented engineering process — not a free claim.

Conclusion

~51 seconds
to profile 245 FDA drugs across all 8 endpoints, end-to-end
5 of 8 endpoints
at public-benchmark SOTA (3 strict #1 + DILI + Caco-2)
One unified schema
consolidating output from 4–6 commercial tools
TCO below alternatives
across all four enterprise pathways

FluxMateria delivers eight ADMET endpoints in a single mechanism-aware API call, validated against a 245-compound FDA-approved drug panel and the larger leave-one-out reference cohorts (PPB n=14,288; metabolism n=38,576; CYP panel n=62,794; solubility n=9,982; permeability n=41,175; hERG n=8,879; BBB n=7,807; DILIRank n=907; TDC binary DILI n=475; Hepatotox validated n=614). Three of the eight endpoints are at strict #1 SOTA on the public leaderboards. DILI now reaches SOTA accuracy on the comparable public binary benchmark: AUROC 0.9597 vs MiniMol ~0.956, while also returning mechanism, exposure, dose-window behavior, confidence, and score-trace detail. Caco-2 permeability matches the public TDC caco2_wang reference SOTA (MAE 0.277 vs 0.276) with zero Caco-2 training labels. Annualized capability cost sits below the four enterprise pathways a discovery program faces today: stitched commercial ADMET stacks, in-house ML pipelines, DFT-based mechanistic ADMET, and free or sanity-check tooling.

Multi-vendor ADMET workflows reflect the assay-by-assay history of the field rather than the structure of the underlying physics. A unified first-principles model returns eight endpoints, per-prediction confidence, and mechanism-evidence trail as a single output document, with no cross-tool reconciliation required.

Technical specifications

Reference panel
245 FDA-approved drugs with PubChem-verified canonical SMILES; spans 30+ therapeutic areas
Endpoint suite
PPB, BBB, Caco-2 permeability, metabolic stability (CLint), hERG, DILI (mechanism-aware), CYP panel (1A2/2C9/2C19/2D6/3A4), aqueous solubility
SOTA endpoints
Solubility (logS MAE 0.06 vs MiniMol 0.741) · Metabolism (Spearman 0.692 vs TDC SOTA 0.536) · PPB HIGH-tier (MAE 3.65%, at inter-laboratory experimental noise floor) · DILI comparable binary AUROC 0.9597 vs MiniMol reference ~0.956, with AUPRC 0.9455 and mechanism-output coverage · Caco-2 permeability (MAE 0.277 on TDC caco2_wang vs public reference SOTA 0.276)
Validation cohorts
14,288 PPB · 9,982 solubility · 38,576 metabolism · 8,879 hERG · 7,807 BBB · 41,175 Caco-2 · 62,794 CYP panel · 475 TDC binary DILI · 907 DILIRank · 614 Hepatotox validated
Validation protocol
Leave-one-out across each full reference cohort; metric definitions match the named TDC and AqSolDB leaderboards
Per-compound runtime
~210 ms full mechanistic mode (DILI exposure-aware, CYP isoform gating, transporter inference, dose-window behavior, reactive-metabolite alerts, and score trace all active)
Output
Eight endpoint values with units and per-prediction confidence; CYP isoform attribution; transporter substrate flags; hepatic exposure context; DILI dose-window behavior; reactive-metabolite alerts; score trace; frozen JSON manifest with commit hash
Reproducibility anchor
Public ADMET benchmark page; per-tier and per-class breakdowns released alongside results

Reproducibility & audit

Accuracy figures sourced from the publicly audited ADMET benchmark and DILI benchmark: 14,288-compound PPB LOO, 9,982 solubility LOO, 38,576 metabolism LOO, 8,879 hERG LOO, 7,807 BBB LOO, 41,175 Caco-2 LOO, 62,794 CYP panel LOO, 475-compound TDC binary DILI novel-like run, 907-compound DILIRank LOO, and 614-compound Hepatotox validated LOO. Wall-clock figures are reproducible from the per-compound full-mechanistic-mode runtime documented on the benchmark page. TCO ranges reflect typical industry benchmarks for ongoing pharma ADMET capability and are independently verifiable from publicly cited license and personnel cost models.

Validate FluxMateria on your own compounds

Submit a held-back set of compounds with measurements not yet published. FluxMateria profiles blind across all eight endpoints; validation is performed by your team against your internal data. Co-authorship on the resulting work is welcomed.
