# FluxMateria DILI Benchmark Methodology

Date: 2026-05-01

## Scope

This package documents the public-facing benchmark evidence for the FluxMateria drug-induced liver injury (DILI) mechanistic risk engine.

The benchmark claim is:

- FluxMateria reaches an area under the receiver operating characteristic curve (AUROC) of 0.9597 on the comparable Therapeutics Data Commons (TDC) binary DILI benchmark.
- The MiniMol public reference is an AUROC of approximately 0.956 on the same binary task family.
- FluxMateria additionally returns mechanism attribution, hepatic exposure context, optional dose-window behavior, confidence, and an auditable score trace.

## Evaluation Modes

Two modes are reported:

- Novel-like leave-one-out: exact clinical self-matches are masked. This is the relevant mode for new drug-candidate behavior and the public benchmark claim.
- Known-compound production: exact clinical anchors are allowed. This is useful for reference-drug reproducibility and product behavior, but it is not the novel-drug state-of-the-art claim.
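
The difference between the two modes can be sketched as a filter on which clinical anchors are visible for a given query. This is a minimal illustration, not the FluxMateria implementation: the function `visible_anchors`, the mode strings, the compound identifiers, and the assumption that an exact self-match is detected by comparing a canonical compound identifier are all hypothetical.

```python
from typing import Dict


def visible_anchors(query_id: str, anchors: Dict[str, int], mode: str) -> Dict[str, int]:
    """Return the clinical anchors a prediction for `query_id` may consult.

    `anchors` maps a canonical compound identifier to a binary DILI label.
    Both the identifier scheme and the mode names are assumptions made
    for this sketch.
    """
    if mode == "known_compound":
        # Production behavior: exact clinical anchors are allowed.
        return dict(anchors)
    if mode == "novel_like":
        # Leave-one-out behavior: mask the query's own exact match so the
        # compound is scored as if it had no clinical record.
        return {cid: label for cid, label in anchors.items() if cid != query_id}
    raise ValueError(f"unknown evaluation mode: {mode!r}")


if __name__ == "__main__":
    # Placeholder identifiers and labels, not real benchmark rows.
    reference = {"cmpd-A": 1, "cmpd-B": 0, "cmpd-C": 1}
    print(visible_anchors("cmpd-A", reference, "novel_like"))
    print(visible_anchors("cmpd-A", reference, "known_compound"))
```

Under these assumptions, every row of a panel is scored with its own anchor removed in novel-like mode, so the reported metric reflects behavior on compounds that have no exact clinical record.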

## Panels

The package includes:

- Therapeutics Data Commons DILI public panel: 475 rows.
- Therapeutics Data Commons DILI raw panel: 475 rows.
- DILIRank clinical-risk transfer panel: 907 rows.
- Hepatotox validated panel: 614 rows.

## Metrics

Reported metrics include:

- AUROC: area under the receiver operating characteristic curve.
- AUPRC: area under the precision-recall curve.
- Balanced accuracy: average of sensitivity and specificity.
- Three-class accuracy: low, moderate, and high clinical-risk stratification when available.
- Molecules per second: local runtime throughput for the parent DILI path.
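
For readers who want to check the binary metric definitions, the following is a minimal standard-library sketch. It is not the package's evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration, and AUPRC is estimated here as average precision, which is one common estimator among several.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUPRC estimate: mean of the precision measured at each positive's rank.

    Tied scores keep their input order here, so the result can depend on
    row order when ties are present.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return total / hits


def balanced_accuracy(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> float:
    """Average of sensitivity and specificity at an assumed decision threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("balanced accuracy needs both classes")
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


if __name__ == "__main__":
    # Toy data, not benchmark rows.
    y = [1, 0, 1, 1, 0, 0]
    p = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
    print(f"AUROC={auroc(y, p):.4f}")
    print(f"AUPRC={average_precision(y, p):.4f}")
    print(f"BalAcc={balanced_accuracy(y, p):.4f}")
```

The pairwise AUROC loop is quadratic in the number of rows, which is acceptable for panels of a few hundred compounds. A rank-based implementation would be preferable for larger sets.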

## Interpretation

The primary public claim is not limited to a like-for-like binary comparison. The binary benchmark establishes FluxMateria's state-of-the-art position on the comparable public task. The additional FluxMateria outputs establish the broader enterprise value: scientific interpretability, mechanism attribution, exposure reasoning, dose-window sensitivity, confidence, and review trace.

## Use Boundary

FluxMateria DILI predictions are designed for screening, prioritization, and scientific review. They are not a substitute for regulated toxicology studies.
