
Band Gap Benchmark

State of the art — experimental band-gap set, no training data

0.237 eV band-gap MAE across 1,048 experimentally-measured materials. FluxMateria evaluates band gaps from chemical composition alone, with no training data, no fitted parameters, and no crystal-structure prerequisite. On the same 1,048-material cohort, modern graph-neural-network ML reports ~0.31–0.33 eV MAE; DFT-PBE screening reports ~0.5–1.0 eV. FluxMateria matches the ML accuracy band without ever seeing a training example, and runs roughly four orders of magnitude faster than DFT.

0.237

Band Gap MAE

eV across 1,048 experimental materials

0

Fitted Parameters

no training on band-gap data

~1 ms

Per-material wall time

single CPU thread; composition input only

1,048

Benchmark Size

single fixed evaluation setup

Benchmark Segment Breakdown

Performance across the full cohort and key physical subsets

Segment	N	MAE (eV)	Notes
All materials	1,048	0.237	Primary benchmark metric
Metallic systems (exp = 0)	461	0.130	Exact-zero handling benchmark
Non-metallic systems (exp > 0)	587	0.320	Semiconductors and insulators
Chalcogenide	352	0.332	Sulfides, selenides, tellurides
Intermetallic / Other	260	0.072	No-anion intermetallics
Complex Oxide	257	0.294	Ternary & higher oxides
Pnictide	86	0.195	Nitrides, phosphides, arsenides
Halide	74	0.222	Fluorides, chlorides, bromides, iodides
Binary Oxide	19	0.102	Simple A_xO_y oxides

Metrics are reported in eV as mean absolute error against experimental band gaps.
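Because each segment MAE is a mean over that segment's rows, the cohort-level figure can be cross-checked as a count-weighted average of the segment values. The following is a minimal sketch that uses only the N and MAE values from the table above; the function and variable names are illustrative, and small deviations from 0.237 are expected because the segment values are rounded to three decimals.

```python
# Segment values copied from the table above: (name, N, MAE in eV).
FAMILY_SEGMENTS = [
    ("Chalcogenide", 352, 0.332),
    ("Intermetallic / Other", 260, 0.072),
    ("Complex Oxide", 257, 0.294),
    ("Pnictide", 86, 0.195),
    ("Halide", 74, 0.222),
    ("Binary Oxide", 19, 0.102),
]
METAL_SPLIT = [
    ("Metallic (exp = 0)", 461, 0.130),
    ("Non-metallic (exp > 0)", 587, 0.320),
]


def pooled_mae(segments):
    """Return (total N, count-weighted mean of the per-segment MAEs)."""
    total_n = sum(n for _, n, _ in segments)
    weighted_sum = sum(n * mae for _, n, mae in segments)
    return total_n, weighted_sum / total_n


for label, segments in [("by family", FAMILY_SEGMENTS), ("metal split", METAL_SPLIT)]:
    n, mae = pooled_mae(segments)
    print(f"{label}: N = {n}, pooled MAE = {mae:.3f} eV")
```

Both partitions sum to 1,048 rows and recombine to within a few thousandths of an eV of the headline value, which is the agreement expected from rounded inputs.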

What 0.237 eV MAE Actually Means

How to read this result in the context of materials screening

The point

A pure-physics predictor that reaches ML-equivalent accuracy at millisecond speed, with zero training data, changes what is possible in materials discovery.

Where this sits among other methods

Method	Typical band-gap error	Speed / query	Training data	Fitted parameters
DFT (PBE / LDA)	40–50% underestimation	Hours–days	None	Functional choice
Hybrid DFT and GW (HSE06, GW)	10–20% typical error band	Days	None	Mixing parameter
ML state-of-the-art (MEGNet / ALIGNN class)	~0.31–0.33 eV MAE	~1 second	60,000+ materials	Millions of weights
FluxMateria (this benchmark)	0.237 eV MAE	~1 ms	None	Zero fitted

This pipeline reaches modern graph-neural-network accuracy on the same 1,048-material cohort without any training data and without any fitted parameters. Predictions are deterministic and reproduce exactly across runs.

How 0.237 eV maps to common screening use cases

Use case	Target gap range	Required accuracy	Coverage here
Metal vs. semiconductor classification	binary	~0.05 eV cutoff	77% correct on 2,624 metals
Wide-gap power electronics	> 3 eV	~0.5 eV	Within band
Transparent conductors	> 3.3 eV	~0.5 eV	Within band
Photovoltaic absorbers	1.0–1.8 eV	~0.3 eV	At edge of band — case-by-case
Thermoelectric narrow-gap	0.2–0.5 eV	~0.1 eV	Tighter validation recommended
Strongly correlated Mott candidates	varies	Hubbard-U regime	Out of scope; flagged on output

The point of this benchmark

The combinatorial space of plausible inorganic compositions is on the order of 10⁸–10¹². You cannot screen that with DFT, and you cannot screen it with machine learning either — an ML predictor first needs a training set, which itself requires DFT.

A predictor that hits ML-equivalent accuracy at millisecond speed, on zero training data, with zero fitted parameters, lets you scan every plausible chemistry before any heavyweight calculation runs. The output is interpretable, reproducible, and the same physics applies whether the composition has been synthesised before or not — no domain-of-validity edge to fall off.

How to read this number

Read this benchmark as a first-pass filter that clears ~99% of bad candidates in milliseconds, before DFT, hybrid DFT, or wet-lab synthesis costs are incurred.
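One way to picture the first-pass filter is a window test that is widened by the benchmark MAE, so that borderline candidates are passed on to heavier methods rather than discarded. This is a sketch under stated assumptions, not the FluxMateria API: `first_pass_filter` and the dictionary of predictions are hypothetical, the predicted gaps are copied from the validation table on this page, and using the cohort MAE as a symmetric tolerance is an illustrative choice.

```python
# Assumption: the cohort-level MAE is used as a symmetric tolerance on each prediction.
MAE_EV = 0.237

# Predicted gaps (eV) copied from the validation table on this page. In a real
# screening loop these would come from the predictor for each candidate formula.
PREDICTED_GAPS = {
    "Si": 1.12,
    "GaAs": 1.43,
    "GaN": 3.39,
    "ZnO": 3.35,
    "InP": 1.35,
    "CdTe": 1.50,
    "MoS2": 1.82,
    "Diamond": 5.46,
}


def first_pass_filter(predicted, lo_ev, hi_ev, tol_ev=MAE_EV):
    """Keep candidates whose predicted gap lies in [lo_ev - tol_ev, hi_ev + tol_ev]."""
    return [
        name
        for name, gap in predicted.items()
        if lo_ev - tol_ev <= gap <= hi_ev + tol_ev
    ]


# Photovoltaic-absorber window from the use-case table: 1.0 to 1.8 eV.
shortlist = first_pass_filter(PREDICTED_GAPS, 1.0, 1.8)
print(shortlist)
```

Widening the window trades a few extra false positives for fewer missed candidates. The MAE is an average rather than a bound, so a tighter or looser tolerance may suit a given campaign.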

Experimental Band Gap Validation

Representative experimental comparisons from the benchmark cohort

Material	Exp. E_g (eV)	FluxMateria E_g (eV)	Relative error	Status
Si	1.12	1.12	0.0%	PASS
GaAs	1.42	1.43	0.7%	PASS
GaN	3.40	3.39	0.3%	PASS
ZnO	3.37	3.35	0.6%	PASS
InP	1.34	1.35	0.7%	PASS
CdTe	1.49	1.50	0.7%	PASS
MoS₂	1.80	1.82	1.1%	PASS
Diamond	5.47	5.46	0.2%	PASS

Showing 8 representative materials from the broader 1,048-material benchmark set.
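The percentage error column in this table is a relative error, whereas the headline metric is an absolute error in eV. The sketch below recomputes both from the eight rows shown; the function name is illustrative and the values are copied from the table.

```python
# Rows copied from the table above: (material, experimental gap, predicted gap), in eV.
ROWS = [
    ("Si", 1.12, 1.12),
    ("GaAs", 1.42, 1.43),
    ("GaN", 3.40, 3.39),
    ("ZnO", 3.37, 3.35),
    ("InP", 1.34, 1.35),
    ("CdTe", 1.49, 1.50),
    ("MoS2", 1.80, 1.82),
    ("Diamond", 5.47, 5.46),
]


def relative_error_pct(exp_gap, pred_gap):
    """Absolute relative error in percent; undefined when exp_gap is 0 (metals)."""
    if exp_gap == 0:
        raise ValueError("relative error is undefined for a zero experimental gap")
    return abs(pred_gap - exp_gap) / exp_gap * 100.0


for name, exp_gap, pred_gap in ROWS:
    abs_err = abs(pred_gap - exp_gap)
    rel_err = relative_error_pct(exp_gap, pred_gap)
    print(f"{name:8s} abs = {abs_err:.2f} eV   rel = {rel_err:.1f}%")

# Mean absolute error over just these eight rows, in eV.
subset_mae = sum(abs(p - e) for _, e, p in ROWS) / len(ROWS)
print(f"subset MAE = {subset_mae:.3f} eV")
```

Relative error cannot be defined for the 461 metallic rows, where the experimental gap is zero, which is one reason the cohort-level metric is reported as absolute error. The mean absolute error over these eight rows is about 0.01 eV, well below the cohort-wide 0.237 eV, so they are best read as illustrative rows rather than a sample of typical error.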

Comparison with DFT and ML

Band gap prediction trade-offs: accuracy, speed, and data dependence

Metric	FluxMateria	DFT (PBE)	DFT (HSE06)	ML (CGCNN/MEGNet)
Band gap error	0.237 eV MAE	40–50% underestimation tendency	10–20% typical error band	~0.31–0.33 eV MAE
Speed per query	~1 ms	Hours to days	Days	~1 second
Training data required	None	None	None	60K+ materials
Fitted parameters	0 fitted	XC functional choice	Mixing parameter	Millions
Out-of-domain behavior	Physics-grounded extrapolation	Recompute required	Recompute required	Can degrade beyond training domain

Key takeaway

FluxMateria delivers benchmarked band-gap performance at interactive speed without training a benchmark-specific ML surrogate. DFT and ML remain strong references but carry either high compute cost (DFT) or high data dependence (ML), depending on use case.

Scope of the SOTA claim

Exactly what the “state-of-the-art” phrasing covers, and what it does not.

We claim state-of-the-art accuracy on independent experimental band gaps, among methods that take composition alone as input. On the same fixed 1,048-material public cohort, FluxMateria delivers 0.237 eV MAE without any training data or fitted parameters. Modern graph-neural-network band-gap predictors trained on Materials Project / OQMD / AFLOW report ~0.31–0.33 eV MAE on cohorts of comparable size and breadth — FluxMateria reaches the same accuracy band without ever seeing a training example, and accepts composition input where most ML models require a crystal structure.

What this claim does not cover:

Hybrid DFT (HSE06 / PBE0) and GW many-body calculations. These reach lower per-material MAE than FluxMateria, but at orders-of-magnitude higher compute cost (CPU-hours to CPU-days per material) and with a hard crystal-structure prerequisite.
Composition-input ML surrogates that have not published a 1,000+ material independent benchmark. These are not in the head-to-head.
Polymorph-specific gaps. Most (e.g. rutile vs anatase TiO₂) collapse to the dominant family prediction; a few well-tagged polytype prefixes (4H-SiC, 6H-SiC, 3C-SiC) are distinguished.
Excited-state and exciton-corrected gaps.
Doping-induced changes to the gap. Dilutely doped compositions are now supported via the doping pipeline (activation energy, ionization fraction, Fermi-level offset, and carrier concentrations directly from a doped formula), but the band gap reported is still the bulk host value.
Surface-state-dominated gaps and explicit defect-level energetics.

Cohort note: The 1,048-material cohort is sourced from Materials Project and includes 461 metallic compositions (exp = 0) and 587 semiconductors / insulators (exp > 0). The same fixed predictor is evaluated on every composition; no per-row tuning, no per-family parameter swap, no train/test split.

Methodology

How FluxMateria predicts materials properties

Benchmark Method Summary

Band gaps are computed by the production universal physics engine and evaluated against experimental values using absolute error in eV. Results are reported for the full cohort and key physical subsets.

1,048 materials in the benchmark cohort
461 metallic systems (exp = 0)
587 non-metallic systems (exp > 0)
Metric: Mean Absolute Error (MAE), eV
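For reference, the metric and its relation to the two headline segments can be written out explicitly. The symbols are notation introduced here, not taken from the source: N is the number of materials, the superscripts pred and exp mark predicted and experimental gaps, and the subscripts m and nm mark the metallic and non-metallic segments.

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N}
  \left| E_{g,i}^{\mathrm{pred}} - E_{g,i}^{\mathrm{exp}} \right|,
\qquad
\mathrm{MAE}_{\mathrm{all}} =
  \frac{N_{\mathrm{m}}\,\mathrm{MAE}_{\mathrm{m}} + N_{\mathrm{nm}}\,\mathrm{MAE}_{\mathrm{nm}}}
       {N_{\mathrm{m}} + N_{\mathrm{nm}}},
\qquad
N_{\mathrm{m}} = 461,\; N_{\mathrm{nm}} = 587 .
```

The second identity holds for any partition of the cohort into disjoint segments, which is why the per-segment values can be recombined into the full-cohort figure.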

Material Families

III-V Semiconductors

Representative compounds: GaAs, GaN, InP

II-VI Semiconductors

Representative compounds: ZnO, CdTe

TMDs

Representative compounds: MoS₂, WS₂

Perovskites

Representative compounds: CsPbBr₃, SrTiO₃

Oxides

Representative compounds: TiO₂, SiO₂, MgO

Group IV (elemental and compound)

Representative compounds: Si, Ge, C, SiC

Scope & Limitations

Strengths

1,048-material benchmark with 0.237 eV overall MAE
Segment transparency: metallic (exp = 0) and non-metallic (exp > 0) reporting
Band gap, effective mass, and dielectric constant predictions
Blind validation (v8.1) confirms generalization
Fully reproducible — no retraining required

Known Limitations

Novel compositions outside current validated formula coverage may require additional derivation and validation
Thermal and mechanical properties are not covered by this benchmark; they are handled by a separate lattice simulation module
Strongly correlated electron systems (Mott-like proxy slice) should be validated case-by-case
Alloy compositions with continuous band gap variation require interpolation

Download benchmark package

Machine-readable benchmark values for independent review and reproducible analysis, using the same 1,048-material cohort reported on this page.

Originator: FluxMateria

Materials Band Gap Benchmark

Benchmark summary JSON

Headline metrics, segment MAE values, and representative lowest/highest absolute-error examples.

Download JSON

Row-level benchmark CSV

All benchmark rows with formula, experimental value, prediction, and absolute error fields.

Download CSV
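A reviewer could recompute the headline and segment metrics from the row-level CSV along the following lines. This is a sketch only: the column names `formula`, `exp_gap_ev`, and `pred_gap_ev` are assumptions and should be adjusted to the header of the downloaded file, and the in-memory sample stands in for the real file using three rows from the validation table on this page.

```python
import csv
import io

# Assumed column names; adjust to match the header of the downloaded CSV.
FORMULA, EXP, PRED = "formula", "exp_gap_ev", "pred_gap_ev"


def load_rows(handle):
    """Read (formula, experimental gap, predicted gap) tuples from a CSV handle."""
    return [(r[FORMULA], float(r[EXP]), float(r[PRED])) for r in csv.DictReader(handle)]


def mae(rows):
    """Mean absolute error in eV, or None for an empty segment."""
    if not rows:
        return None
    return sum(abs(pred - exp) for _, exp, pred in rows) / len(rows)


def summarize(rows):
    """MAE for the full set and for the exp = 0 / exp > 0 segments."""
    metallic = [r for r in rows if r[1] == 0.0]
    non_metallic = [r for r in rows if r[1] > 0.0]
    return {"all": mae(rows), "metallic": mae(metallic), "non_metallic": mae(non_metallic)}


# Stand-in for the real file: three rows from the validation table on this page.
SAMPLE = "formula,exp_gap_ev,pred_gap_ev\nSi,1.12,1.12\nGaAs,1.42,1.43\nGaN,3.40,3.39\n"
summary = summarize(load_rows(io.StringIO(SAMPLE)))
print(summary)
```

With the downloaded file, the `io.StringIO(SAMPLE)` handle would be replaced by a handle from `open(path, newline="")`, where `path` is wherever the CSV was saved.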

References

Primary data sources for experimental validation

I. Vurgaftman, J.R. Meyer, L.R. Ram-Mohan, "Band parameters for III-V compound semiconductors," J. Appl. Phys., 2001, 89, 5815.
O. Madelung, Semiconductors: Data Handbook, 3rd ed., Springer, 2004.
Materials Project Database, materialsproject.org (accessed 2026).
T. Xie, J.C. Grossman, "Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties," Phys. Rev. Lett., 2018, 120, 145301.
C. Chen et al., "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals (MEGNet)," Chem. Mater., 2019, 31, 3564-3572.

Benchmark basis

The band-gap benchmark spans multiple materials families and reporting modes. Read the aggregate result with the source notes in the published data package.

Mixed basis

Read the case study

Full long-form write-up of the methodology, the head-to-head against DFT and modern GNN ML, and how 0.237 eV MAE at millisecond speed maps to real semiconductor, photovoltaic, photocatalyst, and wide-gap-insulator screening loops.
