# FluxMateria Battery Electrochemistry Benchmark Summary

## What was benchmarked

The FluxMateria battery electrochemistry layer was benchmarked in two complementary ways:

- against a calibrated holdout set of known battery-material families
- across multiple battery decision scenarios that asked different engineering questions

This matters because a serious battery workflow should not return the same winner for every problem framing. The best material for high energy, fast transport, long cycle life, or immediate prototyping can differ.

## Holdout benchmark snapshot

Latest calibrated holdout summary:

- Holdout references: `6`
- Family accuracy: `1.0` (all `6` holdout references assigned to the correct family)
- Capacity MAE: `0.812 mAh/g`
- Voltage MAE: `0.149 V`
- Transport MAE: `0.1427`
- Cycle-life MAE: `0.0642`
- Electrolyte-stability MAE: `0.09`
- Interface-stability MAE: `0.0883`
- Cost MAE: `0.0372`
- Manufacturing MAE: `0.0765`
- Energy-rank Spearman: `0.9429`
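
For readers who want to reproduce metrics of this kind, here is a minimal sketch of the two formulas involved: mean absolute error and Spearman rank correlation for untied ranks. The function names and the example ranks are illustrative assumptions, not FluxMateria's actual code or data. With `6` items, a single swap of two adjacent ranks gives a Spearman value of `0.9429`, so the reported figure is consistent with one adjacent swap in the energy ordering.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired predictions and references."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def spearman_from_ranks(predicted_ranks, true_ranks):
    """Spearman rho for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(predicted_ranks)
    if n != len(true_ranks) or n < 2:
        raise ValueError("need two rank lists of equal length >= 2")
    d_squared = sum((p - t) ** 2 for p, t in zip(predicted_ranks, true_ranks))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ranks for six holdout materials (not the benchmark's data):
# the predicted order swaps the third and fourth items.
true_ranks = [1, 2, 3, 4, 5, 6]
predicted_ranks = [1, 2, 4, 3, 5, 6]

rho = spearman_from_ranks(predicted_ranks, true_ranks)
print(round(rho, 4))  # 0.9429
```

The closed-form Spearman expression above only holds when there are no tied ranks; a library routine such as `scipy.stats.spearmanr` handles ties.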

## Scenario benchmark snapshot

The same layer was then stress-tested across five decision scenarios. All five aligned with the intended outcome.

- `Energy-dense cobalt-free screen`
  - Top result: `LiMnO2`
- `High-voltage frontier screen`
  - Top result: `LiNiPO4`
- `Fast-charge transport screen`
  - Top result: `LiMn2O4`
  - Near-tied transport leader: `Li4Ti5O12`
- `Cycle-life conservative screen`
  - Top result: `Li4Ti5O12`
- `Immediate build handoff`
  - Top result: `Li4Ti5O12`

Scenario alignment:

- Scenario count: `5`
- Aligned scenarios: `5`
- Alignment rate: `1.0`
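
The alignment rate is the fraction of scenarios whose result matched the intended outcome. The sketch below shows one way to compute it. The record layout and the `intended` field are assumptions: this summary does not list the intended outcomes separately, so they are filled in with the reported top results, which is consistent with all five scenarios aligning.

```python
# Scenario names and top results are taken from the list above.
# The "intended" values are an assumption (see the note above the code).
scenarios = [
    {"name": "Energy-dense cobalt-free screen", "top": "LiMnO2", "intended": "LiMnO2"},
    {"name": "High-voltage frontier screen", "top": "LiNiPO4", "intended": "LiNiPO4"},
    {"name": "Fast-charge transport screen", "top": "LiMn2O4", "intended": "LiMn2O4"},
    {"name": "Cycle-life conservative screen", "top": "Li4Ti5O12", "intended": "Li4Ti5O12"},
    {"name": "Immediate build handoff", "top": "Li4Ti5O12", "intended": "Li4Ti5O12"},
]

scenario_count = len(scenarios)
aligned_count = sum(1 for s in scenarios if s["top"] == s["intended"])
alignment_rate = aligned_count / scenario_count

print(scenario_count, aligned_count, alignment_rate)  # 5 5 1.0
```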

## What this shows

The battery layer is not behaving like a single-score ranker. It changes its answer when the engineering question changes: the five scenarios produced four distinct top results.

- When energy is the priority, layered manganese oxide (`LiMnO2`) rises.
- When voltage is the question, very-high-voltage phosphate chemistry (`LiNiPO4`) rises.
- When transport and long life matter more, the spinel (`LiMn2O4`) and titanate (`Li4Ti5O12`) families rise: the spinel leads the fast-charge screen with the titanate near-tied, and the titanate leads the cycle-life screen.
- When the goal is to hand a design to a build team now, the prototype-handoff layer prefers the lowest-risk immediate package (`Li4Ti5O12`) over the highest-upside chemistry.

That is the behavior expected from a battery decision engine, as opposed to a generic materials screen that returns one ordering regardless of the question.
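
One simple way to get this kind of scenario-dependent ranking is a weighted sum of normalized property scores, with a different weight vector per scenario. The sketch below illustrates the idea only. It is not FluxMateria's scoring method, and the candidate names, property scores, and weights are all made-up placeholders rather than benchmark data.

```python
# Hypothetical normalized property scores in [0, 1] for three placeholder
# candidates. These are NOT measured or benchmarked values.
candidates = {
    "candidate_A": {"energy": 0.90, "transport": 0.40, "cycle_life": 0.50},
    "candidate_B": {"energy": 0.50, "transport": 0.90, "cycle_life": 0.60},
    "candidate_C": {"energy": 0.40, "transport": 0.70, "cycle_life": 0.95},
}

# Hypothetical scenario weights; each set sums to 1.
scenario_weights = {
    "energy_first": {"energy": 0.70, "transport": 0.15, "cycle_life": 0.15},
    "fast_charge": {"energy": 0.15, "transport": 0.70, "cycle_life": 0.15},
    "long_life": {"energy": 0.15, "transport": 0.15, "cycle_life": 0.70},
}


def weighted_score(properties, weights):
    """Weighted sum of the property scores named in `weights`."""
    return sum(weights[name] * properties[name] for name in weights)


def top_candidate(weights):
    """Name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: weighted_score(candidates[name], weights))


winners = {scenario: top_candidate(w) for scenario, w in scenario_weights.items()}
for scenario, winner in winners.items():
    print(f"{scenario}: {winner}")
```

With these placeholder numbers each scenario selects a different candidate, whereas a single fixed weight vector would return the same winner every time.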
