Technical March 25, 2026

How to Read Confidence Indicators Without Fooling Yourself

A guide to interpreting confidence signals and using them to make better screening decisions.

A prediction without a confidence indicator is an assertion. It might be right. It might be wildly wrong. You have no way to know without running the experiment. A prediction with a confidence indicator is a statement about how much the tool trusts its own output — and that information should change how you act on the result.

But confidence indicators are easy to misread. This article covers the common pitfalls and how to avoid them.

What confidence indicators are

In FluxMateria, every prediction comes with a confidence level: high, medium, or low. These levels are derived from the physics of the computation itself — how well the input chemistry maps to the regions where the physics kernel has strong coverage, and how sensitive the output is to structural assumptions.

They are not statistical confidence intervals derived from a training set. That distinction matters. A statistical confidence interval makes an empirical claim: "historically, intervals built this way have contained the measured value X% of the time." A physics-based confidence signal makes a claim about the model itself: "the physical model is well-constrained for this input" or "the physical model is extrapolating."

The five pitfalls

Pitfall 1: Treating "high confidence" as "certainly correct"

High confidence means the physics kernel is well-suited to the input and the prediction is internally consistent. It does not mean the prediction is guaranteed to match experiment. All computational predictions are approximations. High confidence narrows the expected error range — it does not eliminate it.

The right response: treat high-confidence predictions as reliable for ranking and shortlisting. Use them to prioritize experimental resources. Do not skip experimental validation for critical decisions.

Pitfall 2: Discarding "low confidence" results entirely

Low confidence does not mean "wrong." It means "uncertain." A low-confidence prediction of 90% protein binding might turn out to be accurate — the tool simply cannot confirm that from the physics alone. Discarding all low-confidence results throws away potentially valuable candidates.

The right response: flag low-confidence predictions for experimental follow-up rather than eliminating them. In a triage workflow, they go into the "verify experimentally" bucket, not the "reject" bucket.
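The triage rule above can be sketched in a few lines of Python. This is a minimal illustration, not FluxMateria's API: the Prediction class, the triage function, and the passes threshold callback are all hypothetical names, and medium confidence is treated like high here purely to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    # Hypothetical record; field names are assumptions, not a real schema.
    candidate: str
    value: float
    confidence: str  # "high", "medium", or "low"


def triage(predictions, passes):
    """Split predictions into shortlist / verify / reject buckets.

    Low-confidence predictions are never rejected on the predicted value
    alone; they are routed to experimental verification instead.
    """
    buckets = {"shortlist": [], "verify": [], "reject": []}
    for p in predictions:
        if p.confidence == "low":
            buckets["verify"].append(p.candidate)
        elif passes(p.value):
            buckets["shortlist"].append(p.candidate)
        else:
            buckets["reject"].append(p.candidate)
    return buckets


if __name__ == "__main__":
    preds = [
        Prediction("A", 95.0, "high"),
        Prediction("B", 40.0, "high"),
        Prediction("C", 90.0, "low"),
    ]
    # Illustrative cutoff: keep candidates predicted at 80 or above.
    print(triage(preds, lambda v: v >= 80.0))
```

Candidate C would clear the cutoff on its number alone, but it lands in "verify" rather than "shortlist" because the number cannot be trusted for that decision.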

Pitfall 3: Ignoring confidence when results look plausible

A prediction that says "band gap = 1.5 eV, confidence: low" looks plausible for a semiconductor. It falls in the expected range. The temptation is to trust it because the number looks reasonable. But the confidence indicator is telling you something the number alone cannot: the physics kernel is not confident in this specific prediction, even though the output happens to fall in a normal range.

The right response: always check confidence, even when the number looks right. A plausible-looking number with low confidence is less reliable than a surprising number with high confidence.

Pitfall 4: Comparing across confidence levels

Suppose candidate A has a predicted solubility of 85 mg/L (high confidence) and candidate B has a predicted solubility of 120 mg/L (low confidence). Is B more soluble than A? You cannot say: the uncertainty on B's value may be larger than the 35 mg/L gap between the two. The high-confidence prediction is reliable for ranking. The low-confidence prediction is not. Comparing them as if they were equally reliable leads to bad shortlisting decisions.

The right response: rank candidates within the same confidence tier. Compare high-confidence predictions to each other, and treat low-confidence predictions as unranked with respect to high-confidence ones.
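As a rough sketch of tier-wise ranking, the function below groups predictions by confidence and sorts only within the high and medium groups. The function name, the tuple layout, and the assumption that a higher value is better are all illustrative choices, not part of any FluxMateria interface.

```python
from collections import defaultdict


def rank_within_tiers(predictions):
    """Rank candidates inside each confidence tier, never across tiers.

    `predictions` is an iterable of (name, value, confidence) tuples.
    Assumes a higher predicted value is better. Low-confidence candidates
    are returned in an "unranked" list (alphabetical, not by value).
    """
    tiers = defaultdict(list)
    for name, value, confidence in predictions:
        tiers[confidence].append((name, value))

    ranked = {}
    for confidence in ("high", "medium"):
        ordered = sorted(tiers[confidence], key=lambda item: item[1], reverse=True)
        ranked[confidence] = [name for name, _ in ordered]
    ranked["unranked"] = sorted(name for name, _ in tiers["low"])
    return ranked


if __name__ == "__main__":
    # A and B come from the example above; C is an extra hypothetical candidate.
    data = [("A", 85.0, "high"), ("B", 120.0, "low"), ("C", 60.0, "high")]
    print(rank_within_tiers(data))
```

B has the largest predicted value, yet it never appears above A in any list, which is the behavior the pitfall calls for.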

Pitfall 5: Assuming confidence is uniform across properties

A molecule might have high confidence for solubility, medium confidence for permeability, and low confidence for hepatotoxicity. The confidence level is per-property, not per-molecule. A candidate that looks excellent on its high-confidence properties might have unresolved risks on its low-confidence properties.

The right response: evaluate confidence at the property level, not the molecule level. Build your experimental plan around the low-confidence properties, not the high-confidence ones — the high-confidence predictions are already informative.
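One way to turn per-property confidence into an experimental plan is to sort a molecule's properties so the least certain ones come first. The sketch below assumes a plain dictionary from property name to confidence level; that shape and the function name are hypothetical.

```python
# Lower number = test sooner.
FOLLOW_UP_PRIORITY = {"low": 0, "medium": 1, "high": 2}


def experimental_plan(property_confidence):
    """Order one molecule's properties for experimental follow-up.

    `property_confidence` maps property name -> "high" | "medium" | "low".
    Low-confidence properties come first; ties are broken alphabetically.
    """
    return sorted(
        property_confidence,
        key=lambda prop: (FOLLOW_UP_PRIORITY[property_confidence[prop]], prop),
    )


if __name__ == "__main__":
    molecule = {
        "solubility": "high",
        "permeability": "medium",
        "hepatotoxicity": "low",
    }
    print(experimental_plan(molecule))
    # ['hepatotoxicity', 'permeability', 'solubility']
```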

A practical framework

Confidence	Use for ranking?	Use for decisions?	Experimental follow-up?
High	Yes	Yes, for triage and shortlisting	Confirmatory (lower priority)
Medium	Yes, with caution	Yes, but weight lower in composite scores	Recommended
Low	No	Only to flag for experimental testing	Required before advancing
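The "weight lower in composite scores" row can be made concrete with a confidence-weighted mean. The weights below (1.0 for high, 0.5 for medium) are illustrative assumptions, as is the idea that each property has already been normalized to a 0 to 1 score; neither comes from FluxMateria.

```python
# Illustrative weights only; no numeric values are prescribed by the framework.
CONFIDENCE_WEIGHT = {"high": 1.0, "medium": 0.5}


def composite_score(scores):
    """Confidence-weighted mean of normalized per-property scores.

    `scores` maps property name -> (score between 0 and 1, confidence).
    Low-confidence properties are left out of the mean and returned
    separately so they can be sent for experimental testing.
    Returns (composite or None, sorted list of properties needing tests).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    needs_testing = []
    for prop, (score, confidence) in scores.items():
        if confidence == "low":
            needs_testing.append(prop)
            continue
        weight = CONFIDENCE_WEIGHT[confidence]
        weighted_sum += weight * score
        total_weight += weight
    composite = weighted_sum / total_weight if total_weight else None
    return composite, sorted(needs_testing)


if __name__ == "__main__":
    candidate = {
        "solubility": (0.8, "high"),
        "permeability": (0.6, "medium"),
        "hepatotoxicity": (0.9, "low"),
    }
    print(composite_score(candidate))
```

Excluding low-confidence properties from the score, rather than giving them a small weight, keeps a favorable but unreliable number from lifting a candidate's rank before it has been tested.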

The bigger point

Confidence indicators are not decorative. They are the mechanism by which a computational tool communicates the limits of its own knowledge. A tool that gives you numbers without confidence is a tool that lies by omission — it hides its uncertainty behind a veneer of precision.

Using confidence indicators well means accepting that not every prediction is equally reliable, and allocating your experimental resources accordingly. The goal is not to trust the computer blindly. It is to know where to trust it and where to verify.

Every FluxMateria prediction includes a per-property confidence indicator. Try the ADMET demo to see confidence signals in action.
