
Battery Electrochemistry Benchmark

This page reports the public benchmark scope for FluxMateria's battery-native materials layer: calibrated holdout accuracy, scenario alignment across distinct battery questions, and the end-to-end workflow outcome from the latest local battery study.

  • Family accuracy: 1.0 (6 calibrated holdout references)
  • Voltage MAE: 0.149 V (calibrated holdout)
  • Scenario alignment: 5 / 5 (public battery decision cases)
  • End-to-end workflow: 26.8 s (latest local battery study)

Methodology

Two benchmark layers were used because a battery screening engine must be accurate against known references and must also change its answer when the engineering question changes.

1. Calibrated holdout benchmark

The battery layer was evaluated against known cathode-family references using calibration rows plus an untouched holdout slice.

  • 16 total reference materials
  • 10 calibration rows
  • 6 holdout rows
  • Errors tracked for capacity, voltage, transport, cycle life, electrolyte, interface, cost, and manufacturing
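
As a rough illustration of the holdout evaluation described above, the sketch below splits a reference set into calibration and holdout rows and computes a mean absolute error and a family accuracy. The record type, field names, and the positional split are assumptions made for this example; the page does not describe how FluxMateria selects its holdout rows or stores its references.

```python
from dataclasses import dataclass


@dataclass
class ReferenceRow:
    """Hypothetical reference record; field names are illustrative only."""
    material: str
    family: str
    voltage_v: float


def split_calibration_holdout(rows, n_calibration):
    """Keep the first n_calibration rows for calibration, the rest as holdout.

    A positional split is an assumption; the real selection rule is not published.
    """
    if not 0 < n_calibration < len(rows):
        raise ValueError("n_calibration must leave at least one row on each side")
    return rows[:n_calibration], rows[n_calibration:]


def mean_absolute_error(predicted, reference):
    """Average absolute gap between predicted and reference values."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def family_accuracy(predicted, reference):
    """Fraction of rows whose predicted family label matches the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)
```

With 16 reference rows and `n_calibration=10`, the split yields the 10 calibration rows and 6 holdout rows listed above. The error metrics would then be computed on the holdout rows only.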

2. Scenario stress test

The same layer was then tested across different battery decision frames rather than judged on a single blended score.

  • Energy-dense cobalt-free screen
  • High-voltage frontier screen
  • Fast-charge transport screen
  • Cycle-life conservative screen
  • Immediate build handoff

Important scope note: this benchmark validates FluxMateria's battery-native decision layer as a screening, triage, and prototype-handoff engine. It does not claim complete electrochemical or wet-lab validation. The point is to reduce the search space to better-supported build candidates and make the next physical experiment sharper.

Calibrated Holdout Summary

The holdout result checks whether the battery layer is directionally and quantitatively aligned with known battery families before it is used for novel ranking.

| Metric | Result | Interpretation |
|---|---|---|
| Family accuracy | 1.0 | Every holdout reference was assigned to the correct high-level battery family. |
| Capacity MAE | 0.812 mAh/g | Specific-capacity estimates stayed close to nominal literature-aligned reference values. |
| Voltage MAE | 0.149 V | Average-voltage prediction remained within a tight screening-grade range on holdout materials. |
| Transport MAE | 0.1427 | Transport and rate-readiness signals were directionally consistent with known family behavior. |
| Cycle-life MAE | 0.0642 | Cycle/degradation heuristics stayed close to the nominal reference scoring used for calibration. |
| Electrolyte MAE | 0.09 | Electrolyte compatibility stayed well aligned with the known chemistry tradeoffs in the holdout slice. |
| Interface MAE | 0.0883 | Interface-readiness scoring tracked the reference set closely enough for shortlist triage. |
| Cost / manufacturing MAE | 0.0372 / 0.0765 | Cost and practical build signals stayed stable enough to support the handoff layer. |
| Energy-rank Spearman | 0.9429 | The model preserved the energy-ordering structure of the holdout references. |
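
To give a sense of scale for the energy-rank Spearman value, the sketch below computes the coefficient for two rankings without ties. It assumes the coefficient was computed over the six holdout rows with no tied ranks, which the page does not state; the rank lists are illustrative and do not identify any real materials.

```python
def spearman_no_ties(rank_a, rank_b):
    """Spearman rank correlation for two rankings that contain no ties."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have the same length, at least 2")
    # Sum of squared rank differences between the two orderings.
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Illustrative ranks only: six items, with one adjacent pair swapped.
reference_rank = [1, 2, 3, 4, 5, 6]
predicted_rank = [1, 2, 4, 3, 5, 6]

rho = spearman_no_ties(reference_rank, predicted_rank)
print(round(rho, 4))  # 0.9429
```

Under those assumptions, 0.9429 is the value obtained when exactly one adjacent pair out of six is swapped and every other item keeps its position.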

Scenario Alignment

The battery layer was then stressed across different engineering questions. All five public scenarios aligned with the intended family or material outcome.

| Scenario | Primary metric | Observed top result | Margin | Why it matters |
|---|---|---|---|---|
| Energy-dense cobalt-free screen | battery_readiness_score | LiMnO2 | 5.2 | The energy-focused screen lifted layered manganese oxide instead of collapsing back to cobalt-heavy chemistry. |
| High-voltage frontier | voltage_surrogate_V | LiNiPO4 | 0.2 V | The voltage layer correctly pushed very-high-voltage phosphate chemistry to the top when voltage itself was the question. |
| Fast-charge transport | rate_capability_proxy | LiMn2O4 | 0.006 | The screen surfaced a 3D spinel transport leader, with Li4Ti5O12 essentially tied as the other valid 3D transport winner. |
| Cycle-life conservative | cycle_life_proxy | Li4Ti5O12 | 0.124 | The long-life framing favored the most stability-oriented chemistry rather than the highest-energy candidate. |
| Immediate build handoff | prototype_handoff_priority_score | Li4Ti5O12 | 4.7 | The handoff layer separated the best immediate prototype package from the highest-upside chemistry. |
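
The Margin column can be read as the gap between the top-ranked candidate and the runner-up on the scenario's primary metric. That reading is an assumption based on the fast-charge row, where a small margin is described as a near tie. A minimal sketch of that calculation, using hypothetical candidate names and scores rather than benchmark data:

```python
def leader_and_margin(scores):
    """Return the top-scoring candidate and its lead over the runner-up.

    `scores` maps candidate name to the scenario's primary metric,
    where a higher value is assumed to be better.
    """
    if len(scores) < 2:
        raise ValueError("need at least two candidates to compute a margin")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_name, top_score), (_, runner_up_score) = ranked[0], ranked[1]
    return top_name, top_score - runner_up_score


# Hypothetical scores for illustration; these are not benchmark results.
toy_scores = {"candidate_a": 0.81, "candidate_b": 0.80, "candidate_c": 0.55}
leader, margin = leader_and_margin(toy_scores)
print(leader, round(margin, 3))
```

A small margin, as in the fast-charge row, means the leader and the runner-up should be treated as effectively tied for screening purposes.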

Pipeline Outcome

The same battery workflow produced a different leader at each stage as more battery-native logic was added. This is the intended behavior, not an inconsistency.

  • Bulk: LiNiO2
  • Interface: LiMnPO4
  • Battery-native: LiMnO2
  • Build: Li4Ti5O12

What this means: FluxMateria is not behaving like a single-score materials ranker. The battery layer changes its answer when the engineering question changes. That is exactly what a usable battery decision engine should do. Bulk energy density, interface readiness, balanced electrochemistry, and immediate prototyping are related questions, but they are not the same question.

Download Benchmark Package

Public benchmark materials for independent review and reuse.

  • Battery benchmark summary JSON: headline holdout metrics, scenario alignment, and pipeline winner summary.
  • Public benchmark summary: reader-facing markdown summary of the holdout and scenario benchmark.
  • Battery case study: end-to-end application of the same battery layer inside a real candidate-ranking workflow.

Scope and Limitations

What this benchmark supports

  • Battery-family-aware triage instead of generic materials ranking
  • Directionally useful screening across energy, voltage, transport, cycle, and build-readiness questions
  • Fast shortlist compression for prototype planning
  • Decision-layer validation before lab work

What this benchmark does not claim

  • It is not a substitute for real electrochemical testing.
  • It does not prove commercial superiority of any one candidate.
  • It does not replace cell build, cycling, safety, or manufacturing validation.
  • The public benchmark is intentionally narrower than the internal workflow details.

Explore the battery layer in context

Review the full battery case study, then see how the battery layer fits inside the broader FluxMateria materials stack.
