ADMET Predictions Benchmark

Absorption, Distribution, Metabolism, Excretion, and Toxicity predictions validated against experimental datasets and commercial tools.

  • Compounds validated: 178K (LOO across 8 endpoints)
  • BBB accuracy: 93.3% (7,807 compounds, v8 Hybrid)
  • Solubility MAE: 0.06 (AqSolDB, 9,982+ LOO)
  • Hybrid engine: deterministic (physics-guided + reference-grounded)

Approach

How FluxMateria frames ADMET prediction

Deterministic Hybrid ADMET Engine

FluxMateria predicts ADMET properties with a deterministic hybrid engine that combines physics-grounded molecular reasoning with evidence from large validated reference sets. Each prediction includes a confidence tier. The system has been evaluated across 178K compound-endpoint leave-one-out validations, and 3 endpoints (solubility, metabolism, and PPB) are currently #1 SOTA.

  • No endpoint-specific retraining: Predictions come from a fixed inference framework rather than a black-box model rebuilt for each assay
  • Novelty-aware: The system distinguishes well-covered chemistry from true extrapolation cases
  • Interpretable: Outputs track chemically meaningful factors instead of opaque latent features
  • Fast: ~350 mol/sec throughput for full ADMET panel
  • Broad coverage: 8 validated endpoints spanning BBB, solubility, PPB, metabolism, permeability, hERG, DILI, and CYP inhibition
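
As a rough illustration of what "novelty-aware" could mean in practice, the sketch below maps a query's best similarity to a reference set onto a confidence tier. This is not FluxMateria's implementation: the bit-set fingerprints, the `tanimoto` and `confidence_tier` helpers, and every cutoff value are hypothetical, since the page publishes only qualitative tier criteria.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical cutoffs for illustration only; the source does not publish any.
TIER_CUTOFFS = [("EXACT", 0.95), ("HIGH", 0.70), ("MEDIUM", 0.50)]

def confidence_tier(query: frozenset, references: list) -> str:
    """Map the best similarity to any reference compound onto a tier label."""
    best = max((tanimoto(query, ref) for ref in references), default=0.0)
    for tier, cutoff in TIER_CUTOFFS:
        if best >= cutoff:
            return tier
    return "LOW"

# Toy reference set: each frozenset stands in for a molecular fingerprint.
refs = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
print(confidence_tier(frozenset({1, 2, 3, 4}), refs))  # identical to a reference
print(confidence_tier(frozenset({1, 2, 3, 9}), refs))  # similarity 3/5 = 0.6
print(confidence_tier(frozenset({20, 21}), refs))      # no shared bits
```

With these assumed cutoffs, the three queries fall into EXACT, MEDIUM, and LOW respectively, which mirrors the distinction between well-covered chemistry and extrapolation.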

Results by Endpoint

Large-scale leave-one-out validation across diverse ADMET endpoints

BBB Permeability v8 Hybrid — Near-SOTA

B3DB Benchmark (7,807 LOO)
  • Accuracy: 93.3%
  • Dataset: Blood-Brain Barrier Database
  • Validation: Leave-one-out
TDC Comparison
  • TDC #1 (MiniMol): AUROC 0.924
  • TDC #2 (MapLight): AUROC 0.916
  • TDC #3 (ContextPred): AUROC 0.898

Aqueous Solubility v14 Hybrid — #1 SOTA

AqSolDB Benchmark (9,982+ LOO)
  • logS MAE: 0.06
  • Pass rate: 98.7%
  • Validation: Leave-one-out
TDC Comparison
  • TDC #1 (MiniMol): MAE 0.741
  • FluxChem: MAE 0.06 — #1 SOTA

Plasma Protein Binding v49.2 Hybrid — #1 SOTA

Deterministic hybrid inference with confidence-aware estimates and explicit handling for compounds far from validated reference space. 14,288-compound leave-one-out validation. ~153 mol/sec.

  • LOO MAE: 2.24%
  • LOO MAE (novel): 8.50%
  • Reference compounds: 14.3K
  • Throughput: 153 mol/sec
Confidence Tiers (14,288-compound leave-one-out validation)

| Tier | Criteria | n | LOO MAE |
|---|---|---|---|
| EXACT | Near-identical reference support | 758 (5.4%) | 4.64% |
| HIGH | Strong analog support | 8,177 (58.8%) | 6.44% |
| MEDIUM | Partial analog support | 3,080 (22.1%) | 10.49% |
| LOW | Sparse analog support / extrapolative regime | 1,903 (13.7%) | 15.63% |
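
Per-tier errors can be pooled into a single figure by weighting each tier's MAE by its compound count. The sketch below does that with the numbers from the tier table; `weighted_mae` is an illustrative helper name, not part of any FluxMateria API.

```python
# (tier, n, LOO MAE in percentage points), copied from the tier table
tiers = [
    ("EXACT", 758, 4.64),
    ("HIGH", 8177, 6.44),
    ("MEDIUM", 3080, 10.49),
    ("LOW", 1903, 15.63),
]

def weighted_mae(rows):
    """Count-weighted mean of per-group MAE values."""
    total_n = sum(n for _, n, _ in rows)
    return sum(n * mae for _, n, mae in rows) / total_n

overall = weighted_mae(tiers)
print(f"{overall:.2f}")  # 8.49
```
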
Per-Class Breakdown (14,288 LOO validation)

| PPB Class | n | LOO MAE |
|---|---|---|
| X: Very High (≥95%) | 7,038 | 4.51% |
| H: High (80–95%) | 3,631 | 7.59% |
| M: Moderate (50–80%) | 1,870 | 14.32% |
| L: Low (30–50%) | 530 | 22.31% |
| Z: Minimal (<30%) | 849 | 23.97% |

Dataset: 14,288 curated PPB measurements aggregated from public experimental, regulatory, and literature sources. PPB is expressed as percent bound (0–100%). Leave-one-out (LOO) validation excludes each compound from its own prediction, approximating unseen-compound performance.
Approach: Deterministic hybrid inference combining physics-grounded molecular features with reference-based evidence. Confidence tiers indicate how strongly each query is supported by validated nearby chemistry.
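
The leave-one-out protocol described above can be sketched generically: each compound is removed from the reference set, predicted from what remains, and the absolute errors are averaged. The nearest-neighbour predictor and the toy data below are stand-ins chosen for brevity; they are assumptions and do not represent the hybrid engine or real PPB measurements.

```python
def nearest_neighbor_predict(query_x, references):
    """Toy predictor: return the label of the closest reference point."""
    closest = min(references, key=lambda ref: abs(ref[0] - query_x))
    return closest[1]

def leave_one_out_mae(dataset, predict):
    """Hold out each item in turn, predict it from the rest, average the error."""
    errors = []
    for i, (x, y_true) in enumerate(dataset):
        held_out_refs = dataset[:i] + dataset[i + 1:]  # exclude the query itself
        errors.append(abs(predict(x, held_out_refs) - y_true))
    return sum(errors) / len(errors)

# Hypothetical (descriptor, percent bound) pairs; not real PPB data.
toy = [(1.0, 90.0), (1.2, 92.0), (3.0, 40.0), (3.1, 45.0)]
print(leave_one_out_mae(toy, nearest_neighbor_predict))  # 3.5
```

Because the held-out compound never sees its own measurement, the resulting error is an estimate of performance on compounds absent from the reference set.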

Metabolism (Intrinsic Clearance) v1 Hybrid — #1 SOTA

Deterministic hybrid inference benchmarked on 38,576 curated compounds. Spearman rho = 0.692 (TDC SOTA: 0.536). Leave-one-out validation.

  • MAE (log CLint): 0.367
  • Pearson r: +0.730
  • Spearman ρ: 0.692 (#1 SOTA)
  • 3-class accuracy: 82.8%
Per-Class Breakdown (38,576 LOO)

| Class (CLint, µL/min/mg) | n | MAE (log) | Accuracy |
|---|---|---|---|
| High Stability (<18) | 2,175 | 0.449 | 67.0% |
| Moderate (18–102) | 4,859 | 0.373 | 53.9% |
| Low Stability (>102) | 31,542 | 0.360 | 88.4% |
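
Since this endpoint is ranked by Spearman ρ, a small worked example may help: Spearman correlation is the Pearson correlation of the ranks, so it rewards correct ordering of compounds rather than exact values. The sketch below is a plain-Python version with average ranks for ties; the measured and predicted values are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(xs, ys):
    """Spearman rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical log CLint values, not taken from the benchmark.
measured = [0.5, 1.0, 1.5, 2.0, 2.5]
predicted = [0.4, 1.2, 1.1, 2.4, 2.2]
print(round(spearman(measured, predicted), 3))  # 0.8
```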

Permeability (Caco-2) v1 Hybrid — Competitive

Deterministic hybrid inference benchmarked on 41,175 curated compounds. Leave-one-out validation.

  • MAE (log Papp): 0.502
  • Pearson r: 0.837
  • 3-class accuracy: 73.1%
  • LOO compounds: 41,175
Per-Class Breakdown (41,175 LOO)

| Class (log Papp) | n | Accuracy |
|---|---|---|
| High (>-5.4) | 18,272 | 82.3% |
| Moderate (-6 to -5.4) | 8,995 | 54.9% |
| Low (≤-6.0) | 13,908 | 72.8% |
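
One way to read the 3-class accuracy is to bin both the measured and the predicted log Papp with the class boundaries from the breakdown and count agreements. The sketch below assumes that reading; the function names and sample values are hypothetical, and the treatment of the exact boundary at -5.4 (assigned to Moderate) is inferred from the class labels.

```python
def permeability_class(log_papp: float) -> str:
    """Bin log Papp using the class boundaries in the breakdown."""
    if log_papp > -5.4:
        return "High"
    if log_papp > -6.0:
        return "Moderate"
    return "Low"  # log Papp <= -6.0

def three_class_accuracy(measured, predicted):
    """Fraction of compounds whose predicted class matches the measured class."""
    hits = sum(
        permeability_class(m) == permeability_class(p)
        for m, p in zip(measured, predicted)
    )
    return hits / len(measured)

# Hypothetical log Papp values, not taken from the benchmark.
measured = [-4.8, -5.7, -6.3, -5.2]
predicted = [-5.0, -5.3, -6.1, -5.1]
print(three_class_accuracy(measured, predicted))  # 0.75
```

The second compound shows why the moderate band is hard: a prediction only 0.4 log units off crosses the -5.4 boundary and counts as a miss.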

hERG Cardiotoxicity v1 Hybrid — Near-SOTA

Deterministic hybrid inference benchmarked on 8,879 compounds. AUROC 0.850 (TDC SOTA: 0.880 on 648 compounds). Leave-one-out validation.

  • AUROC (binary): 0.850
  • pIC50 MAE: 0.420
  • Pearson r: +0.770
  • Binary accuracy: 76.6%
Per-Class Breakdown (8,879 LOO)

| Class | MAE (pIC50) | Accuracy |
|---|---|---|
| High Risk (pIC50 >6) | 0.672 | 60.6% |
| Moderate (pIC50 5-6) | 0.361 | 69.2% |
| Low Risk (pIC50 <5) | 0.349 | 66.9% |
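
For readers who work with IC50 values rather than pIC50, the conversion is pIC50 = -log10(IC50 in mol/L), which for micromolar input becomes 6 - log10(IC50 in µM). The sketch below applies that standard conversion and then the risk classes from the breakdown; the helper names are illustrative, and assigning the exact boundary values 5 and 6 to Moderate is an assumption.

```python
import math

def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return 6.0 - math.log10(ic50_um)

def herg_risk_class(pic50: float) -> str:
    """Map pIC50 onto the three risk classes used in the breakdown."""
    if pic50 > 6.0:
        return "High Risk"
    if pic50 >= 5.0:
        return "Moderate"
    return "Low Risk"

for ic50_um in (0.1, 3.0, 100.0):
    pic50 = pic50_from_ic50_um(ic50_um)
    print(f"{ic50_um} uM -> pIC50 {pic50:.2f} -> {herg_risk_class(pic50)}")
```

A sub-micromolar blocker (0.1 µM) lands in High Risk, 3 µM in Moderate, and 100 µM in Low Risk.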

Hepatotoxicity (DILI) v1 Hybrid — Near-SOTA

Drug-induced liver injury risk assessment using deterministic hybrid inference. 907-compound leave-one-out validation. AUROC 0.878 (high vs rest). DILI-concern AUROC 0.924.

  • AUROC (High vs rest): 0.878
  • 3-class accuracy: 74.0%
  • LOO compounds: 907
Per-Class Breakdown (907 LOO)

| Class | n | Accuracy |
|---|---|---|
| Low Risk | 365 | 84.7% |
| Moderate Risk | 336 | 69.6% |
| High Risk | 206 | 62.1% |
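
A "high vs rest" AUROC collapses the three risk classes into a binary label (High = 1, everything else = 0) and asks how often a High compound receives a larger score than a non-High one. The sketch below computes it by direct pair counting; the class labels and scores are invented, and how the engine produces a High-risk score is not described in the source.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical 3-class labels and High-risk scores for five compounds.
classes = ["High", "Low", "Moderate", "High", "Low"]
high_scores = [0.9, 0.2, 0.6, 0.5, 0.1]

# One-vs-rest binarization: High is the positive class.
labels = [1 if c == "High" else 0 for c in classes]
print(round(auroc(labels, high_scores), 3))  # 0.833
```

Pair counting is quadratic in the number of compounds, which is fine at the scale of a 907-compound set.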

CYP Inhibition Panel v5 Hybrid — Near-SOTA

Five CYP isoform inhibition predictions validated via leave-one-out on 62,794 compounds

Mean across the five isoforms: AUPRC 0.798 | AUROC 0.872 | Accuracy 80.9%

Deterministic hybrid inference across the five major CYP enzymes responsible for most small-molecule metabolism. Confidence-calibrated predictions support drug-drug interaction risk assessment.

| CYP Isoform | N (LOO) | Accuracy | AUPRC | AUROC | Role |
|---|---|---|---|---|---|
| CYP1A2 | 12,579 | 83.9% | 0.894 | 0.913 | Caffeine, theophylline metabolism |
| CYP2C9 | 12,092 | 79.9% | 0.764 | 0.867 | Warfarin, NSAIDs metabolism |
| CYP2C19 | 12,665 | 79.6% | 0.839 | 0.872 | PPIs, clopidogrel metabolism |
| CYP2D6 | 13,130 | 81.9% | 0.660 | 0.836 | Antidepressants, beta-blockers |
| CYP3A4 | 12,328 | 79.2% | 0.832 | 0.872 | Largest fraction of drug metabolism |
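
The panel-level figures are consistent with an unweighted (macro) average over the five isoforms, which can be checked directly from the table. The helper name `macro_average` is illustrative only.

```python
# (accuracy %, AUPRC, AUROC) per isoform, copied from the table above
cyp_panel = {
    "CYP1A2": (83.9, 0.894, 0.913),
    "CYP2C9": (79.9, 0.764, 0.867),
    "CYP2C19": (79.6, 0.839, 0.872),
    "CYP2D6": (81.9, 0.660, 0.836),
    "CYP3A4": (79.2, 0.832, 0.872),
}

def macro_average(panel):
    """Unweighted mean of each metric column across isoforms."""
    columns = zip(*panel.values())
    return tuple(sum(col) / len(panel) for col in columns)

accuracy, auprc, auroc_mean = macro_average(cyp_panel)
print(f"accuracy={accuracy:.1f}%  AUPRC={auprc:.3f}  AUROC={auroc_mean:.3f}")
# accuracy=80.9%  AUPRC=0.798  AUROC=0.872
```

A macro average gives each isoform equal weight regardless of its compound count, so CYP2D6's lower AUPRC pulls the mean down as much as any other isoform would.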

Throughput & Commercial Comparison

Performance benchmarks against commercial ADMET tools

  • Throughput: ~350 mol/sec
  • Compounds validated (LOO): 178K
  • Validated endpoints: 8
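
To put the throughput figure in practical terms, the arithmetic below estimates wall-clock time for a library of a given size. It assumes the ~350 mol/sec rate holds steadily and ignores I/O and startup overhead; the function name and the one-million-compound library are hypothetical.

```python
def screening_minutes(n_molecules: int, mol_per_sec: float = 350.0) -> float:
    """Estimated minutes to process n_molecules at a constant rate."""
    if mol_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return n_molecules / mol_per_sec / 60.0

print(round(screening_minutes(1_000_000), 1))  # 47.6
```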

Scope & Limitations

Where predictions are most and least reliable

Strengths

  • 3 endpoints are #1 SOTA: Solubility (MAE 0.06), Metabolism (Spearman 0.692), PPB (MAE 2.24%)
  • BBB 93.3% accuracy on 7,807 compounds (LOO)
  • 178K total compounds validated via leave-one-out across all 8 endpoints
  • hERG AUROC 0.850 on 8,879 compounds; CYP AUPRC 0.798 on 62,794 compounds
  • Deterministic, confidence-aware inference rather than endpoint-by-endpoint black-box retraining
  • Graceful handling of novel scaffolds beyond well-covered reference chemistry
  • Full interpretability — predictions map to chemically meaningful drivers
  • High throughput: PPB at ~153 mol/sec; permeability at ~500 mol/sec

Known Limitations

  • Low-binding PPB compounds (<30%) remain the hardest class (LOO MAE ~24%)
  • Predictions are strongest when compounds fall near well-validated chemical neighborhoods
  • Moderate permeability class (−6 to −5.4 log Papp) is hardest to distinguish (54.9% LOO accuracy)
  • DILI has limited gold-standard clinical data (907 LOO compounds)
  • Predictions are for screening prioritization, not regulatory submission
