Benchmark May 1, 2026

New SOTA on Drug-Induced Liver Injury Prediction: AUROC 0.9597 With Full Mechanism Trace

FluxMateria now reports a state-of-the-art result for mechanistic drug-induced liver injury (DILI) prediction: an area under the receiver operating characteristic curve (AUROC) of 0.9597 on the comparable Therapeutics Data Commons (TDC) binary benchmark, plus exposure, mechanism, dose, confidence, and score-trace outputs that binary classifiers do not provide.

The claim

FluxMateria is positioned as a state-of-the-art (SOTA) mechanistic DILI prediction engine: it clears the comparable public binary benchmark reference while returning the evidence a safety team needs to review hepatic exposure, transport, enzyme context, injury chemistry, dose sensitivity, and confidence.

Drug-induced liver injury is one of the hardest safety problems in discovery because the final clinical label is rarely caused by a single property. A molecule can look acceptable on a simple structural alert and still become risky when high liver exposure, transporter retention, metabolic activation, chronic dosing, and patient context align. A one-bit classifier can help triage, but it does not explain the mechanism that should drive the next experiment.

That is the reason we rebuilt DILI inside FluxMateria as a mechanism-resolved safety engine rather than a standalone yes/no endpoint. The engine still competes on the public binary benchmark, but the output is designed for nomination review: score, risk class, confidence, mechanism attribution, dose-window behavior, and a trace of why the final score moved.

0.9597: AUROC on the comparable TDC binary DILI benchmark

0.9455: area under the precision-recall curve (AUPRC) on the same comparable task

0.9275: hepatotoxicity cross-panel AUROC in novel-like mode

12.95: molecules per second on the parent DILI path, measured locally

Why the binary benchmark still matters

The comparable public reference point is the TDC binary DILI benchmark. FluxMateria reaches AUROC 0.9597 on that task, compared with the MiniMol public reference around AUROC 0.956. That establishes competitive benchmark position on the same kind of yes/no problem that most public leaderboards measure.

But that is not the whole claim. MiniMol and similar public models are primarily evaluated as binary classifiers. FluxMateria is evaluated there too, then goes further: it returns mechanistic evidence that a binary benchmark does not ask for. The comparison is therefore deliberately not a like-for-like product comparison on the binary score alone. The binary metric establishes that FluxMateria is not trading away accuracy for interpretability; the mechanism surface is the added capability.
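As a reminder of what the headline metric measures, the sketch below computes AUROC from its pairwise definition: the fraction of (positive, negative) pairs in which the positive compound receives the higher score, with ties counted as half. The labels and scores are made-up toy values, not benchmark data, and the function name `auroc` is illustrative rather than part of FluxMateria. The last lines simply subtract the two figures quoted above; because the MiniMol reference is stated as approximate, the margin is approximate too.

```python
def auroc(labels, scores):
    """AUROC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy data for illustration only (1 = DILI concern, 0 = no concern).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly

# Figures quoted in the article; the reference value is approximate.
FLUXMATERIA_AUROC = 0.9597
MINIMOL_REFERENCE_AUROC = 0.956
margin = round(FLUXMATERIA_AUROC - MINIMOL_REFERENCE_AUROC, 4)

print(f"toy AUROC = {toy_auroc:.3f}, quoted margin = {margin}")
```

The pairwise loop is quadratic and written for clarity, not speed. The quoted margin of about 0.004 is why the plain-language summary describes the result as matching or slightly exceeding the public reference.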

Where FluxMateria wins

We reviewed the public landscape before writing this claim. The competitive field falls into four practical categories: public benchmark models; broad absorption, distribution, metabolism, excretion, and toxicity (ADMET) software suites; quantitative systems toxicology (QST) and physiologically based pharmacokinetic (PBPK) liver-simulation tools; and expert-alert or web-screening tools.

Category	Public examples	Best public interpretation
Public benchmark machine learning (ML) models	MiniMol	Direct public benchmark comparison: FluxMateria AUROC 0.9597 on the comparable TDC binary DILI task versus MiniMol around AUROC 0.956, plus richer output.
Broad ADMET suites	ADMET Predictor, related commercial ADMET platforms	Not a public same-dataset accuracy claim. FluxMateria's advantage is one high-throughput DILI output that combines score, hepatic exposure, transporter context, enzyme context, injury chemistry, dose behavior, confidence, and trace.
QST / PBPK liver simulation	DILIsym and related liver-safety simulation workflows	Complementary. These tools are valuable for detailed study design and simulation once exposure, dose, species, and protocol information are available. FluxMateria is earlier, faster compound-level triage.
Expert-alert and web-screening tools	Derek Nexus, ADMETlab, ProTox, pkCSM, vNN-ADMET	FluxMateria's advantage is integrated mechanism depth and auditability: not just an alert or endpoint label, but the reasoning path behind liver-risk movement.

Against public machine learning (ML) benchmark models such as MiniMol, the clean win is measurable: FluxMateria reaches AUROC 0.9597 on the comparable TDC binary DILI task, above the MiniMol public reference around AUROC 0.956, while returning a richer safety review document. That is the strongest direct benchmark statement because the metric and task are public.

Against broad ADMET software suites, the win is workflow shape. Mature platforms can cover many absorption, distribution, metabolism, excretion, and toxicity endpoints, and some include liver-safety modules. FluxMateria's DILI advantage is that the binary risk score, hepatic exposure, transporter context, enzyme context, injury chemistry, dose-window behavior, confidence, and audit trace are returned together inside the same high-throughput screening output. For a discovery team, that means fewer stitched outputs and less manual reconciliation before a safety meeting.

Against QST and PBPK liver-simulation tools, the comparison is complementary rather than adversarial. Those tools remain valuable when a program already has dose, exposure, species, and protocol detail and needs deep simulation. FluxMateria is positioned earlier: fast compound-level triage, novel-molecule review, and mechanism prioritization before teams decide which expensive assays or simulations deserve attention.

Against structural-alert and web-screening tools, the win is decision depth. A simple alert can flag concern, but it usually cannot tell whether the concern is coming from uptake, retention, enzyme-linked activation, chronic exposure pressure, or weak historical similarity. FluxMateria is designed to preserve that chain of reasoning.

Public sources reviewed for this section include MiniMol's TDC benchmark table, Simulations Plus material for ADMET Predictor and DILIsym, Lhasa material for Derek Nexus, and peer-reviewed or public documentation for ADMETlab 3.0, ProTox 3.0, pkCSM, and vNN-ADMET.

Figure: FluxMateria DILI mechanism coverage, showing molecular structure input, hepatic exposure, efflux and retention, cytochrome P450 enzyme context, injury chemistry, and parent DILI risk output. FluxMateria DILI coverage is organized as a reviewable safety circuit: exposure in, clearance out, enzyme context, injury chemistry, dose-window sensitivity, confidence, and score trace.

What the engine actually returns

The current DILI workflow combines the parent risk score with a mechanism circuit. That circuit includes hepatic exposure and liver-entry context; organic anion transporting polypeptide (OATP) uptake; efflux or retention context for the bile salt export pump (BSEP), breast cancer resistance protein (BCRP), and multidrug resistance-associated protein 2 (MRP2); cytochrome P450 (CYP) enzyme induction and inhibition signals; reactive-metabolite and mitochondrial-stress chemistry; chronic-duration pressure; and optional dose-window behavior.

For a reviewer, the important point is not that every layer is present in every molecule. The important point is that the final score is not opaque. If the score rises because liver entry and retention align with metabolic activation, that appears in the output. If a compound has a binary DILI concern but no strong mechanistic support, that uncertainty is surfaced rather than hidden.
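One way to picture such an output is as a structured record rather than a single number. The sketch below is a hypothetical illustration only: the class name `DiliResult`, every field name, the mechanism keys, and the cutoffs in `review_note` are assumptions made for this example and do not describe FluxMateria's actual schema or thresholds. It shows how a reviewer-facing note could separate a score backed by mechanism signals from a score that lacks them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DiliResult:
    """Hypothetical record; all field names are illustrative assumptions."""
    score: float                      # parent DILI risk score in [0, 1]
    risk_class: str                   # e.g. "high", "moderate", "low"
    confidence: float                 # confidence in [0, 1]
    mechanisms: Dict[str, float] = field(default_factory=dict)  # layer -> contribution
    trace: List[str] = field(default_factory=list)              # ordered score-movement notes


def review_note(result: DiliResult, concern_cutoff: float = 0.5,
                support_cutoff: float = 0.2) -> str:
    """Say whether a concern is backed by mechanism signals (cutoffs are assumed)."""
    if result.score < concern_cutoff:
        return "no binary concern"
    # Keep mechanism layers at or above the cutoff, strongest first.
    supported = sorted(
        (name for name, weight in result.mechanisms.items() if weight >= support_cutoff),
        key=lambda name: -result.mechanisms[name],
    )
    if not supported:
        return "binary concern without strong mechanistic support"
    return "binary concern supported by: " + ", ".join(supported)


# Made-up example values, not real predictions.
example = DiliResult(
    score=0.82, risk_class="high", confidence=0.7,
    mechanisms={"hepatic_exposure": 0.35, "bsep_retention": 0.30, "cyp_activation": 0.10},
    trace=["liver entry raised score", "retention raised score"],
)
unsupported = DiliResult(score=0.7, risk_class="moderate", confidence=0.4)

print(review_note(example))
print(review_note(unsupported))
```

Keeping the mechanism contributions and the trace on the same record as the score is what lets the second case be reported as an explicit uncertainty instead of an unexplained number.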

Why this matters for enterprise safety review

Discovery teams do not only need to rank compounds. They need to decide which molecule to synthesize next, which assay to run next, and which safety issue should be escalated before a nomination meeting. A binary DILI label cannot answer those questions. It can say "concern" or "no concern," but it cannot tell whether the concern is driven by liver exposure, canalicular retention, enzyme-linked bioactivation, chronic dosing pressure, or weak historical similarity.

FluxMateria is built for that decision context. DILI sits beside the ADMET endpoints in the same profile, with shared confidence handling and decision-packet export. The result is a safety review artifact, not just a scalar prediction.

How to read the SOTA statement

The precise claim is: FluxMateria is state-of-the-art for mechanistic DILI prediction because it reaches AUROC 0.9597 on the comparable public binary benchmark, above the MiniMol public reference around AUROC 0.956, while also returning mechanism, exposure, dose, confidence, and trace outputs. MiniMol speed is not verified from public leaderboard material, so we do not claim a speed comparison against MiniMol. FluxMateria's measured parent DILI path runs at about 12.95 molecules per second locally.
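To put the throughput figure in practical terms, the arithmetic below converts 12.95 molecules per second into wall-clock time for a library. The library sizes are arbitrary examples, and the calculation assumes the measured local rate stays constant with no batching or parallel speedup, which the source does not state either way.

```python
MOLECULES_PER_SECOND = 12.95  # measured parent DILI path rate quoted in the article


def screening_seconds(n_molecules: int, rate: float = MOLECULES_PER_SECOND) -> float:
    """Estimated wall-clock seconds at a constant rate (simplifying assumption)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return n_molecules / rate


for n in (1_000, 10_000, 1_000_000):  # arbitrary example library sizes
    seconds = screening_seconds(n)
    print(f"{n:>9,} molecules: {seconds / 60:8.1f} min ({seconds / 3600:.2f} h)")
```

At that rate, a 10,000-molecule library takes roughly 13 minutes in a single local run.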

The known-compound production mode can score even higher when exact clinical anchors are allowed, but that is not the novel-drug claim. For novel chemistry, the more relevant evidence is the novel-like DILIRank transfer result and the hepatotoxicity cross-panel result, where the engine still preserves strong discrimination while maintaining reviewable mechanism output.

Plain-language summary

FluxMateria now matches or slightly exceeds the strongest public binary DILI benchmark reference, but the bigger advance is that it explains the risk. Instead of only saying "toxic" or "not toxic," it shows which liver-exposure, transporter, enzyme, injury-chemistry, dose, and confidence signals led to the score.

Where to verify the numbers

The detailed DILI benchmark page includes the public comparison, cross-panel validation, speed measurement, mechanism-output coverage, and downloadable benchmark package. The ADMET benchmark page shows the DILI result in the context of the broader FluxMateria ADMET suite.


This article describes a computational screening and prioritization system. It is not a regulatory toxicology determination, clinical safety claim, or replacement for in-vitro or in-vivo safety studies. Public benchmark comparisons are limited to the stated datasets, metrics, and operating modes.

Review DILI risk with mechanism context

FluxMateria exposes DILI risk inside the full ADMET panel, with benchmark evidence, mechanism attribution, confidence, and exportable decision packets for reproducible safety review.
