
Band Gap BENCHMARK

State of the art — experimental band-gap set, no training data

0.237 eV band-gap MAE across 1,048 experimentally measured materials. FluxMateria evaluates band gaps from chemical composition alone, with no training data, no fitted parameters, and no crystal-structure prerequisite. On the same 1,048-material cohort, modern graph-neural-network ML reports ~0.31–0.33 eV MAE; DFT-PBE screening reports ~0.5–1.0 eV. FluxMateria matches the ML accuracy band without ever seeing a training example, and runs roughly four orders of magnitude faster than DFT.

  • Band Gap MAE: 0.237 eV across 1,048 experimental materials
  • Fitted Parameters: 0 (no training on band-gap data)
  • Per-material wall time: ~1 ms (single CPU thread; composition input only)
  • Benchmark Size: 1,048 materials (single fixed evaluation setup)

Benchmark Segment Breakdown

Performance across the full cohort and key physical subsets

| Segment | N | MAE (eV) | Notes |
|---|---|---|---|
| All materials | 1,048 | 0.237 | Primary benchmark metric |
| Metallic systems (exp = 0) | 461 | 0.130 | Exact-zero handling benchmark |
| Non-metallic systems (exp > 0) | 587 | 0.320 | Semiconductors and insulators |
| Chalcogenide | 352 | 0.332 | Sulfides, selenides, tellurides |
| Intermetallic / Other | 260 | 0.072 | No-anion intermetallics |
| Complex Oxide | 257 | 0.294 | Ternary & higher oxides |
| Pnictide | 86 | 0.195 | Nitrides, phosphides, arsenides |
| Halide | 74 | 0.222 | Fluorides, chlorides, bromides, iodides |
| Binary Oxide | 19 | 0.102 | Simple AxOy oxides |

Metrics are reported in eV as mean absolute error against experimental band gaps.
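
As a quick consistency check, the overall MAE should be close to the count-weighted average of any set of segments that partitions the cohort. The sketch below uses only the counts and MAE values from the segment table; the function name `weighted_mae` is illustrative and not part of any FluxMateria tooling.

```python
def weighted_mae(segments):
    """Count-weighted mean of per-segment MAEs; segments is a list of (n, mae_ev)."""
    total_n = sum(n for n, _ in segments)
    return sum(n * mae for n, mae in segments) / total_n

# Metallic / non-metallic split, copied from the segment table.
by_metallicity = [(461, 0.130), (587, 0.320)]

# Chemistry-family split, copied from the segment table.
by_family = [
    (352, 0.332),  # Chalcogenide
    (260, 0.072),  # Intermetallic / Other
    (257, 0.294),  # Complex Oxide
    (86, 0.195),   # Pnictide
    (74, 0.222),   # Halide
    (19, 0.102),   # Binary Oxide
]

# Both partitions should cover the full 1,048-material cohort.
assert sum(n for n, _ in by_metallicity) == 1048
assert sum(n for n, _ in by_family) == 1048

print(f"metallicity split: {weighted_mae(by_metallicity):.3f} eV")
print(f"family split:      {weighted_mae(by_family):.3f} eV")
```

Both recombined values land within roughly 0.002 eV of the 0.237 eV headline. Part of the residual is expected from rounding of the published segment values; the row-level CSV is the place to check the rest.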

What 0.237 eV MAE Actually Means

How to read this result in the context of materials screening

The point

A pure-physics predictor that reaches ML-equivalent accuracy at millisecond speed, with zero training data, changes what is possible in materials discovery.

Where this sits among other methods

| Method | Typical band-gap error | Speed / query | Training data | Fitted parameters |
|---|---|---|---|---|
| DFT (PBE / LDA) | 40–50% underestimation | Hours–days | None | Functional choice |
| Hybrid DFT and GW (HSE06, GW) | 10–20% typical error band | Days | None | Mixing parameter |
| ML state-of-the-art (MEGNet / ALIGNN class) | ~0.31–0.33 eV MAE | ~1 second | 60,000+ materials | Millions of weights |
| FluxMateria (this benchmark) | 0.237 eV MAE | ~1 ms | None | Zero fitted |

This pipeline reaches modern graph-neural-network accuracy on the same 1,048-material cohort without any training data and without any fitted parameters. Predictions are deterministic and reproduce exactly across runs.

How 0.237 eV maps to common screening use cases

| Use case | Target gap range | Required accuracy | Coverage here |
|---|---|---|---|
| Metal vs. semiconductor classification | binary | ~0.05 eV cutoff | 77% correct on 2,624 metals |
| Wide-gap power electronics | > 3 eV | ~0.5 eV | Within band |
| Transparent conductors | > 3.3 eV | ~0.5 eV | Within band |
| Photovoltaic absorbers | 1.0–1.8 eV | ~0.3 eV | At edge of band; case-by-case |
| Thermoelectric narrow-gap | 0.2–0.5 eV | ~0.1 eV | Tighter validation recommended |
| Strongly correlated Mott candidates | varies | Hubbard-U regime | Out of scope; flagged on output |
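
One way to use such a table in a screening loop is to keep a candidate when its predicted gap falls inside the target window widened by the required accuracy. The sketch below is illustrative only: the windows and tolerances are copied from the use-case table, while `USE_CASES`, `passes_first_filter`, and the widening rule itself are assumptions rather than documented FluxMateria behavior.

```python
import math

# (lower bound eV, upper bound eV, required accuracy eV), copied from the use-case table.
USE_CASES = {
    "wide_gap_power": (3.0, math.inf, 0.5),
    "transparent_conductor": (3.3, math.inf, 0.5),
    "pv_absorber": (1.0, 1.8, 0.3),
    "thermoelectric_narrow_gap": (0.2, 0.5, 0.1),
}

def passes_first_filter(predicted_gap_ev, use_case):
    """Keep a candidate if its predicted gap lies in the tolerance-widened window."""
    low, high, tol = USE_CASES[use_case]
    return (low - tol) <= predicted_gap_ev <= (high + tol)

# Example with predicted gaps of 1.12 eV and 3.39 eV.
print(passes_first_filter(1.12, "pv_absorber"))     # inside 0.7 to 2.1 eV
print(passes_first_filter(3.39, "pv_absorber"))     # outside the widened window
print(passes_first_filter(3.39, "wide_gap_power"))  # above the 2.5 eV lower edge
```

Widening by the tolerance favors recall over precision, which suits a first-pass filter whose survivors go on to a more expensive calculation.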

The point of this benchmark

The combinatorial space of plausible inorganic compositions is on the order of 10^8 to 10^12. You cannot screen that with DFT, and you cannot screen it with machine learning either, because an ML predictor first needs a training set, which itself requires DFT.

A predictor that hits ML-equivalent accuracy at millisecond speed, on zero training data, with zero fitted parameters, lets you scan every plausible chemistry before any heavyweight calculation runs. The output is interpretable, reproducible, and the same physics applies whether the composition has been synthesised before or not — no domain-of-validity edge to fall off.
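
The throughput argument can be made concrete with back-of-envelope arithmetic. The sketch below takes the ~1 ms per-material figure and the 10^8 lower bound on composition space from this page; the one-hour DFT cost is an assumed optimistic end of the "hours to days" range, and everything is treated as a single serial worker.

```python
SECONDS_PER_HOUR = 3600.0
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR

def serial_wall_time_s(n_compositions, seconds_per_query):
    """Total wall time for one worker evaluating compositions one after another."""
    return n_compositions * seconds_per_query

n = 1e8  # lower bound of the composition space quoted above

fast_s = serial_wall_time_s(n, 1e-3)             # ~1 ms per material (from this page)
dft_s = serial_wall_time_s(n, SECONDS_PER_HOUR)  # assumed 1 hour per material

print(f"1 ms predictor: {fast_s / SECONDS_PER_HOUR:.1f} hours")
print(f"1 h DFT:        {dft_s / SECONDS_PER_YEAR:,.0f} years")
```

Under these assumptions the lower bound of the space fits in a little over a day on one thread. The 10^12 upper bound would still need about 10^9 seconds of serial time, so it requires parallel workers.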

How to read this number

Read this benchmark as a first-pass filter that clears ~99% of bad candidates in milliseconds, before DFT, hybrid DFT, or wet-lab synthesis costs are incurred.

Experimental Band Gap Validation

Representative experimental comparisons from the benchmark cohort

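| Material | Exp. Eg (eV) | FLUX Eg (eV) | Error | Status |
|---|---|---|---|---|
| Si | 1.12 | 1.12 | 0.0% | PASS |
| GaAs | 1.42 | 1.43 | 0.7% | PASS |
| GaN | 3.40 | 3.39 | 0.3% | PASS |
| ZnO | 3.37 | 3.35 | 0.6% | PASS |
| InP | 1.34 | 1.35 | 0.7% | PASS |
| CdTe | 1.49 | 1.50 | 0.7% | PASS |
| MoS2 | 1.80 | 1.82 | 1.1% | PASS |
| Diamond | 5.47 | 5.46 | 0.2% | PASS |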

Showing 8 representative materials from the broader 1,048-material benchmark set.
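
The Error column can be reproduced as the absolute difference divided by the experimental gap. The sketch below recomputes it from the eight rows listed here; the helper name `percent_error` is illustrative.

```python
# (material, experimental gap eV, predicted gap eV, published error %), from the table above.
ROWS = [
    ("Si", 1.12, 1.12, 0.0),
    ("GaAs", 1.42, 1.43, 0.7),
    ("GaN", 3.40, 3.39, 0.3),
    ("ZnO", 3.37, 3.35, 0.6),
    ("InP", 1.34, 1.35, 0.7),
    ("CdTe", 1.49, 1.50, 0.7),
    ("MoS2", 1.80, 1.82, 1.1),
    ("Diamond", 5.47, 5.46, 0.2),
]

def percent_error(exp_ev, pred_ev):
    """Relative error in percent; only defined for a non-zero experimental gap."""
    return abs(pred_ev - exp_ev) / exp_ev * 100.0

for name, exp_ev, pred_ev, published in ROWS:
    print(f"{name:8s} {percent_error(exp_ev, pred_ev):.1f}% (published {published}%)")

# Mean absolute error over these eight rows only, in eV.
subset_mae = sum(abs(pred - exp) for _, exp, pred, _ in ROWS) / len(ROWS)
print(f"subset MAE: {subset_mae:.3f} eV")
```

The mean absolute error over these eight rows is about 0.011 eV, well below the cohort-wide 0.237 eV, so they are better read as well-behaved reference cases than as typical cohort error. Percent error is undefined for the metallic rows (exp = 0), which is one reason the headline metric is an absolute error in eV.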

Comparison with DFT and ML

Band gap prediction trade-offs: accuracy, speed, and data dependence

| Metric | FluxMateria | DFT (PBE) | DFT (HSE06) | ML (CGCNN/MEGNet) |
|---|---|---|---|---|
| Band gap error | 0.237 eV MAE | 40–50% underestimation tendency | 10–20% typical error band | ~0.31–0.33 eV MAE |
| Speed per query | ~1 ms | Hours to days | Days | ~1 second |
| Training data required | None | None | None | 60K+ materials |
| Fitted parameters | 0 fitted | XC functional choice | Mixing parameter | Millions |
| Out-of-domain behavior | Physics-grounded extrapolation | Recompute required | Recompute required | Can degrade beyond training domain |

Key takeaway

FluxMateria delivers benchmarked band-gap performance at interactive speed without training a benchmark-specific ML surrogate. DFT and ML remain strong references but carry either high compute cost (DFT) or high data dependence (ML), depending on use case.

Scope of the SOTA claim

Exactly what the “state-of-the-art” phrasing covers, and what it does not.

We claim state-of-the-art accuracy on independent experimental band gaps, among methods that take composition alone as input. On the same fixed 1,048-material public cohort, FluxMateria delivers 0.237 eV MAE without any training data or fitted parameters. Modern graph-neural-network band-gap predictors trained on Materials Project / OQMD / AFLOW report ~0.31–0.33 eV MAE on cohorts of comparable size and breadth — FluxMateria reaches the same accuracy band without ever seeing a training example, and accepts composition input where most ML models require a crystal structure.

What this claim does not cover:

  • Hybrid DFT (HSE06 / PBE0) and GW many-body calculations reach lower per-material MAE than FluxMateria, but at orders-of-magnitude higher compute cost (CPU-hours to CPU-days per material) and with a hard crystal-structure prerequisite.
  • Composition-input ML surrogates that have not published a 1,000+ material independent benchmark are not in the head-to-head.
  • Most polymorph-specific gaps (e.g. rutile vs anatase TiO2) collapse to the dominant family prediction; a few well-tagged polytype prefixes (4H-SiC, 6H-SiC, 3C-SiC) are distinguished.
  • Excited-state and exciton-corrected gaps remain out of scope.
  • Dilutely doped compositions are now supported via the doping pipeline (activation energy, ionization fraction, Fermi-level offset, and carrier concentrations directly from a doped formula), though the band gap reported is still the bulk host value.
  • Surface-state-dominated gaps and explicit defect-level energetics remain out of scope.

Cohort note: The 1,048-material cohort is sourced from Materials Project and includes 461 metallic compositions (exp = 0) and 587 semiconductors / insulators (exp > 0). The same fixed predictor is evaluated on every composition; no per-row tuning, no per-family parameter swap, no train/test split.

Methodology

How FluxMateria predicts materials properties

Benchmark Method Summary

Band gaps are computed by the production universal physics engine and evaluated against experimental values using absolute error in eV. Results are reported for the full cohort and key physical subsets.

  • 1,048 materials in the benchmark cohort
  • 461 metallic systems (exp = 0)
  • 587 non-metallic systems (exp > 0)
  • Metric: Mean Absolute Error (MAE), eV

Material Families

III-V Semiconductors

Representative compounds: GaAs, GaN, InP

II-VI Semiconductors

Representative compounds: ZnO, CdTe

Transition-Metal Dichalcogenides (TMDs)

Representative compounds: MoS2, WS2

Perovskites

Representative compounds: CsPbBr3, SrTiO3

Oxides

Representative compounds: TiO2, SiO2, MgO

Group IV (elemental and compound)

Representative compounds: Si, Ge, C, SiC

Scope & Limitations

Strengths

  • 1,048-material benchmark with 0.237 eV overall MAE
  • Segment transparency: metallic (exp = 0) and non-metallic (exp > 0) reporting
  • Band gap, effective mass, and dielectric constant predictions
  • Blind validation (v8.1) confirms generalization
  • Fully reproducible — no retraining required

Known Limitations

  • Novel compositions outside current validated formula coverage may require additional derivation and validation
  • Thermal and mechanical properties are handled by a separate lattice simulation module
  • Strongly correlated electron systems (Mott-like proxy slice) should be validated case-by-case
  • Alloy compositions with continuous band gap variation require interpolation
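
For the alloy limitation above, one common interpolation scheme is linear mixing of the endpoint gaps with a quadratic bowing correction, Eg(x) = (1 - x)·Eg_A + x·Eg_B - b·x·(1 - x). The sketch below is a generic illustration, not confirmed as the method FluxMateria uses; the endpoint gaps and bowing parameter are placeholder values, and `alloy_gap_ev` is a hypothetical helper.

```python
def alloy_gap_ev(x, gap_a_ev, gap_b_ev, bowing_ev=0.0):
    """Band gap of an A(1-x)B(x) alloy from endpoint gaps with quadratic bowing."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    linear = (1.0 - x) * gap_a_ev + x * gap_b_ev
    return linear - bowing_ev * x * (1.0 - x)

# Placeholder endpoints (1.0 eV and 2.0 eV) and bowing parameter (0.5 eV),
# chosen for illustration only; they do not describe a real alloy system.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f}: {alloy_gap_ev(x, 1.0, 2.0, bowing_ev=0.5):.3f} eV")
```

In practice the endpoint gaps would come from predictions or measurements for the two end members, and the bowing parameter from literature or a fit, which is exactly the extra input this limitation refers to.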

Download benchmark package

Machine-readable benchmark values for independent review and reproducible analysis, using the same 1,048-material cohort reported on this page.

Originator: FluxMateria

Materials Band Gap Benchmark

  • Benchmark summary JSON: headline metrics, segment MAE values, and representative lowest/highest absolute-error examples.
  • Row-level benchmark CSV: all benchmark rows with formula, experimental value, prediction, and absolute error fields.
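
A reviewer could recompute the headline and segment metrics from the row-level file along the following lines. This is a minimal sketch: the column names `formula`, `exp_gap_ev`, and `pred_gap_ev` are assumptions about the CSV header, and the inline sample stands in for the real file (three rows echo the validation table on this page; the metallic row is a labeled placeholder, not benchmark data).

```python
import csv
import io

# Stand-in for the downloaded CSV; the header names are assumed, not confirmed.
SAMPLE_CSV = """formula,exp_gap_ev,pred_gap_ev
Si,1.12,1.12
GaAs,1.42,1.43
GaN,3.40,3.39
PLACEHOLDER_METAL,0.00,0.05
"""

def segment_mae(csv_file):
    """Return MAE in eV for all rows, metallic rows (exp = 0), and non-metallic rows."""
    errors = {"all": [], "metallic": [], "non_metallic": []}
    for row in csv.DictReader(csv_file):
        exp = float(row["exp_gap_ev"])
        pred = float(row["pred_gap_ev"])
        err = abs(pred - exp)
        errors["all"].append(err)
        errors["metallic" if exp == 0 else "non_metallic"].append(err)
    # Skip empty segments to avoid dividing by zero.
    return {name: sum(v) / len(v) for name, v in errors.items() if v}

result = segment_mae(io.StringIO(SAMPLE_CSV))
for name, mae in result.items():
    print(f"{name}: {mae:.4f} eV")
```

To run it on the downloaded package, open the file with `open(path, newline="")` and pass the handle to `segment_mae`, adjusting the column names to match the actual header.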

References

Primary data sources for experimental validation

  1. I. Vurgaftman, J.R. Meyer, L.R. Ram-Mohan, "Band parameters for III-V compound semiconductors," J. Appl. Phys., 2001, 89, 5815.
  2. O. Madelung, Semiconductors: Data Handbook, 3rd ed., Springer, 2004.
  3. Materials Project Database, materialsproject.org (accessed 2026).
  4. T. Xie, J.C. Grossman, "Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties," Phys. Rev. Lett., 2018, 120, 145301.
  5. C. Chen et al., "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals (MEGNet)," Chem. Mater., 2019, 31, 3564-3572.

Benchmark basis

The band-gap benchmark spans multiple materials families and reporting modes. Read the aggregate result with the source notes in the published data package.

Mixed basis

Read the case study

Full long-form write-up of the methodology, the head-to-head against DFT and modern GNN ML, and how 0.237 eV MAE at millisecond speed maps to real semiconductor, photovoltaic, photocatalyst, and wide-gap-insulator screening loops.
