
DFT Cross-Check BENCHMARK

Head-to-head against a standard PBE screening baseline (GPAW, plane-wave 200 eV, 6³ k-points) on 15 canonical materials. Three-layer scoring — engine vs experiment, DFT vs experiment, engine vs DFT — on the same fixed set. The engine takes only the chemical formula; the DFT side requires explicit crystal structure, pseudopotentials, and SCF settings.

  • Lattice constant MAPE: 0.2% (median 0.1% across 15 materials)
  • Band gap MAPE: 7.6% (vs DFT-PBE 45.1% on the same set)
  • Magnetic moment MAPE: 3.6% (Fe and Ni; PBE 9.0% on the same)
  • Speedup vs DFT (measured): ~10⁴× (measured here on fast PBE; up to ~10⁹× contextual at higher DFT quality, not measured)

Tier 1 — Single-Point Lattice / Band Gap / Magnetic Moment

Each property scored against both DFT and experiment on the same 15 materials, lattice fixed at experimental input

| Property | N | Engine vs experiment | DFT (PBE) vs experiment | Verdict |
| --- | --- | --- | --- | --- |
| Lattice constant a (Å) | 15 | MAPE 0.2% · median 0.1% | 0.0% by construction (Tier 1 fixes DFT lattice to experiment; only Tier 2 relaxes) | Engine predicts from composition only; DFT is given the experimental lattice |
| Band gap (eV) | 10 | MAPE 7.6% · median 1.2% | MAPE 45.1% · median 50.7% | Engine beats PBE |
| Magnetic moment (μB), Fe and Ni only | 2 | MAPE 3.6% · median 3.6% | MAPE 9.0% · median 9.0% | Engine beats PBE on Fe/Ni |

DFT runs use plane-wave PBE (200 eV cutoff, 6³ k-points, no relaxation). The full settings are in the methodology section below.

Per-Material Results

All 15 materials, raw numbers, sorted by family

| Material | Family | Exp. a (Å) | FLUX a (Å) | a error | Exp. Eg (eV) | FLUX Eg (eV) | DFT Eg (eV) | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Si | Group IV (diamond) | 5.431 | 5.431 | −0.0% | 1.12 | 1.12 | 0.92 | PASS |
| Ge | Group IV (diamond) | 5.658 | 5.655 | −0.1% | 0.66 | 0.66 | 0.43 | PASS |
| GaAs | III–V (zincblende) | 5.653 | 5.647 | −0.1% | 1.43 | 1.43 | 1.22 | PASS |
| GaN | III–V (wurtzite) | 3.189 | 3.183 | −0.2% | 3.40 | 3.44 | 1.83 | PASS |
| ZnO | II–VI (wurtzite) | 3.249 | 3.230 | −0.6% | 3.37 | 3.40 | 0.93 | PASS |
| MgO | Oxide (rocksalt) | 4.211 | 4.208 | −0.1% | 7.83 | 6.16 | 3.13 | FAIR |
| TiO2 | Oxide (rutile) | 4.594 | 4.545 | −1.1% | 3.05 | 3.39 | 1.78 | PASS |
| NaCl | Ionic (rocksalt) | 5.640 | 5.630 | −0.2% | 8.60 | 9.17 | 5.38 | PASS |
| Al | Metal (fcc) | 4.046 | 4.049 | +0.1% | 0 | 0 | 0 | PASS |
| Cu | Metal (fcc) | 3.615 | 3.621 | +0.2% | 0 | 0 | 0 | PASS |
| Fe | Magnetic metal (bcc) | 2.866 | 2.864 | −0.1% | moment: exp 2.22 μB | engine 2.26 μB (+1.9%) | DFT 2.20 μB (−1.0%) | PASS |
| Ni | Magnetic metal (fcc) | 3.524 | 3.521 | −0.1% | moment: exp 0.62 μB | engine 0.65 μB (+5.4%) | DFT 0.72 μB (+16.8%) | PASS |
| Graphite (C) | Layered | 2.461 | 2.461 | +0.0% | 0 | 5.73 | 1.28 | FAIR |
| h-BN | Layered | 2.504 | 2.507 | +0.1% | 5.96 | 3.97 | 3.84 | FAIR |
| MoS2 (2H) | Layered | 3.160 | 3.156 | −0.1% | 1.29 | 1.27 | 1.12 | PASS |

14 of 15 materials at ≤1% lattice error (all except TiO2-rutile, which sits just outside at 1.1%). Layered systems (graphite, h-BN, MoS2) and wurtzite in-plane lattice (GaN, ZnO) all converged to under-1% after the structural-geometry refinements landed in this iteration.
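The headline lattice and band-gap figures can be re-derived from the rounded values in the per-material table. The sketch below is illustrative only: the variable and function names are hypothetical, and the choice of the 10 band-gap materials (every row with a nonzero experimental gap, so graphite and the metals are left out because a percent error against 0 eV is undefined) is an assumption inferred from N = 10 in the Tier 1 table.

```python
from statistics import mean, median

# (experimental a, engine a) in Å, copied from the per-material table.
LATTICE = {
    "Si": (5.431, 5.431), "Ge": (5.658, 5.655), "GaAs": (5.653, 5.647),
    "GaN": (3.189, 3.183), "ZnO": (3.249, 3.230), "MgO": (4.211, 4.208),
    "TiO2": (4.594, 4.545), "NaCl": (5.640, 5.630), "Al": (4.046, 4.049),
    "Cu": (3.615, 3.621), "Fe": (2.866, 2.864), "Ni": (3.524, 3.521),
    "Graphite": (2.461, 2.461), "h-BN": (2.504, 2.507), "MoS2": (3.160, 3.156),
}

# (experimental Eg, engine Eg) in eV; assumed 10-material band-gap set.
GAP = {
    "Si": (1.12, 1.12), "Ge": (0.66, 0.66), "GaAs": (1.43, 1.43),
    "GaN": (3.40, 3.44), "ZnO": (3.37, 3.40), "MgO": (7.83, 6.16),
    "TiO2": (3.05, 3.39), "NaCl": (8.60, 9.17), "h-BN": (5.96, 3.97),
    "MoS2": (1.29, 1.27),
}


def abs_pct_errors(pairs):
    """Absolute percent error of each (reference, predicted) pair."""
    return [abs(pred - ref) / ref * 100.0 for ref, pred in pairs]


lattice_err = abs_pct_errors(LATTICE.values())
gap_err = abs_pct_errors(GAP.values())

lattice_mape = mean(lattice_err)
lattice_median = median(lattice_err)
within_1pct = sum(e <= 1.0 for e in lattice_err)
gap_mape = mean(gap_err)

print(f"lattice MAPE {lattice_mape:.1f}%, median {lattice_median:.1f}%")
print(f"materials at <=1% lattice error: {within_1pct} of {len(lattice_err)}")
print(f"band gap MAPE {gap_mape:.1f}%")
```

Because the inputs are the rounded table values, aggregates computed this way can differ in the last digit from figures computed on unrounded data.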

Tier 2 — Relaxed Lattice + Bulk Modulus

7-point Birch–Murnaghan equation-of-state per material (strain −6%, −4%, −2%, 0, +2%, +4%, +6%) adds two new comparisons: relaxed lattice constant and bulk modulus B
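As a rough illustration of how an equation-of-state scan turns into V0 and B, here is a minimal pure-Python sketch. It is not the benchmark's own fitting code. It relies on the fact that the third-order Birch–Murnaghan energy is a cubic polynomial in x = V^(−2/3), so a linear least-squares fit suffices. The function names, the synthetic test values, and the unit convention (Å³ and eV in, GPa out) are assumptions made for the example; the rejection of minima outside the scan range mirrors the safety check described in the methodology.

```python
import math

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Å^3 expressed in GPa


def bm_energy(v, e0, v0, b0, b0p):
    """Third-order Birch-Murnaghan E(V); b0 in eV/Å^3."""
    eta = (v0 / v) ** (2.0 / 3.0)
    return e0 + 9.0 * v0 * b0 / 16.0 * (
        (eta - 1.0) ** 3 * b0p + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_birch_murnaghan(volumes, energies):
    """Return (V0 in Å^3, B0 in GPa); raise ValueError if no minimum in range."""
    xs = [v ** (-2.0 / 3.0) for v in volumes]
    mid = (max(xs) + min(xs)) / 2.0
    half = (max(xs) - min(xs)) / 2.0
    ts = [(x - mid) / half for x in xs]  # rescale to [-1, 1] for conditioning
    normal = [[sum(t ** (i + j) for t in ts) for j in range(4)] for i in range(4)]
    rhs = [sum(e * t ** i for t, e in zip(ts, energies)) for i in range(4)]
    _, c1, c2, c3 = solve_linear(normal, rhs)
    disc = c2 * c2 - 3.0 * c3 * c1
    if disc <= 0.0 or c2 + math.sqrt(disc) <= 0.0:
        raise ValueError("fitted curve has no energy minimum")
    t0 = -c1 / (c2 + math.sqrt(disc))  # stationary point with positive curvature
    if abs(t0) > 1.0:
        raise ValueError("energy minimum lies outside the scan range")
    v0 = (mid + half * t0) ** -1.5
    d2e_dx2 = 2.0 * math.sqrt(disc) / half ** 2
    b0 = (4.0 / 9.0) * d2e_dx2 * v0 ** (-7.0 / 3.0)  # B = V d2E/dV2 at V0
    return v0, b0 * EV_PER_A3_TO_GPA


# Synthetic 7-point scan around a reference volume slightly off the true V0.
strains = (-0.06, -0.04, -0.02, 0.0, 0.02, 0.04, 0.06)
scan_v = [20.4 * (1.0 + s) for s in strains]
scan_e = [bm_energy(v, -5.0, 20.0, 100.0 / EV_PER_A3_TO_GPA, 4.5) for v in scan_v]
fit_v0, fit_b0 = fit_birch_murnaghan(scan_v, scan_e)
print(f"V0 = {fit_v0:.3f} Å^3, B0 = {fit_b0:.1f} GPa")
```

For a cubic conventional cell the relaxed lattice constant then follows as a = V0^(1/3). With noisy SCF energies the same fit can return a poorly determined curvature, which is the kind of behaviour the † rows in the per-material table illustrate.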

| Same task, same 15 materials | Engine | DFT (PBE, fast quality) |
| --- | --- | --- |
| Bulk modulus B (GPa) accuracy vs experiment | All 15 materials: median 0.7% off exp; MAPE 6.0% (max 32.3% on ZnO). Layered systems (graphite, h-BN, MoS2) are now within 21% after the structural-geometry refinements landed. | Median 13.7% off exp; MAPE 176% (7 EOS-converged out of 15; Cu and MgO fits noisy at this DFT cost; layered + Fe-bcc EOS did not converge) |
| Relaxed lattice a (Å) accuracy vs experiment | Median 0.1% off exp; MAPE 0.2% (15 / 15 materials) | Median 0.6% off exp (7 EOS-converged out of 15) |
| Wall time per material for the same EOS / B + relaxed-a calculation | ~3 milliseconds (one composition-only call) | 11 s (Al) – 294 s (TiO2); median 50 s; mean 88 s; 7-point SCF EOS scan |
| Total wall time, all 15 materials | 1.4 seconds | 22.1 minutes |
| Mean speedup, engine vs DFT EOS | ~25,000× (per material; ~950× including the engine's one-time 600 ms first-call import) | |
| Inputs required | Chemical formula only | Full crystal structure + EOS scan window + SCF settings |

Measured in this benchmark

| DFT setting we ran | DFT wall time per material | Engine wall time per material | Measured speedup |
| --- | --- | --- | --- |
| Fast PBE EOS (PW 200 eV, 6³ k-points, 7 SCFs; a screening-grade setup) | ~50 s median (11 s – 294 s) | ~3 ms | ~25,000× (per material, measured) |

Accuracy on this run, vs experiment: engine band gap median 1.2% (DFT 50.7%); engine B median 0.7%, MAPE 6.0% across all 15 materials (DFT 13.7% on the 7 EOS-converged). Engine returns 40+ properties from a single composition-only call.

Contextual: more expensive DFT settings

The numbers in this section are not measured in this benchmark. They are based on standard literature timings for the same kinds of calculations at higher quality, that is, the DFT settings a real research lab or industrial materials team would normally run. They are included as context for why our measured ~25,000× sits at the conservative end of the speedup range.

| DFT setting (not run here) | Typical per-material wall time | Engine speedup (estimated) | What it's used for |
| --- | --- | --- | --- |
| Production PBE (PW 500–600 eV, 12³ k-points, BFGS pre-relax + EOS) | ~30 min – 2 hr | ~500,000 – 2,000,000× | Standard for a materials publication; DFT B converges to ~5–15% MAPE typical, band gap stays at PBE-functional level (~37%). |
| Hybrid functional (HSE06 EOS) | ~hours per material | ~10–100 million× | Required for accurate band gaps (MAPE typically ~10–15% on a clean reference set). |
| GW corrections (G0W0 or self-consistent GW) | ~hours – days | ~10⁸–10⁹× | The reference for band gaps in hard-matter physics (~5% MAPE typical). |
| Full ~40-property characterization (separate SCF / EOS / DFPT / BoltzTraP / magnetic-MC workflows) | ~hours – days | ~10⁶–10⁹× | Architectural comparison: the engine returns the suite from one composition-only call. Per-property accuracy is reported on dedicated benchmark pages, not in this DFT cross-check. |

Reading guide: the engine's 3 ms per material doesn't change with DFT quality. The DFT side scales with cutoff, k-mesh density, exchange-correlation cost, and workflow count, so the speedup grows as you push DFT toward higher quality. We did not run any of the DFT settings in this section — the timings are literature-standard.

Honest note on DFT B at this quality: even on the 7 materials where the EOS curve fit converged, fast-quality DFT (PW 200 eV, 7-point ±6% strain) produces noisy B values. Cu comes out at 1036 GPa against an experimental 140 GPa, and MgO at 922 GPa against 160 GPa. These are EOS-fit artifacts at affordable compute cost, not PBE failures. Production-quality DFT recovers B to ~5–15% on the same materials, but at the wall times listed in the table above. The engine returns B from a single composition-only call and matches experiment to median 0.7%, MAPE 6.0% across all 15 materials.

Per-material apples-to-apples (B + relaxed lattice + wall time)

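| Material | B engine (GPa) | B DFT (GPa) | B exp (GPa) | a engine (Å) | a DFT (Å) | a exp (Å) | DFT time | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Si | 98.0 | 88.6 | 98.0 | 5.431 | 5.477 | 5.431 | 31.0 s | ~10,000× |
| Ge | 75.3 | 65.4 | 75.8 | 5.655 | 5.769 | 5.658 | 31.0 s | ~10,000× |
| GaAs | 75.2 | — | 75.5 | 5.647 | — | 5.653 | 32.6 s | ~10,000× |
| GaN | 210.4 | — | 210.0 | 3.183 | — | 3.189 | 104.3 s | ~40,000× |
| ZnO | 187.8 | — | 142.0 | 3.230 | — | 3.249 | 212.1 s | ~77,000× |
| MgO | 136.1 | 922† | 160.0 | 4.208 | 4.235 | 4.211 | 39.8 s | ~15,000× |
| TiO2 | 191.4 | — | 210.0 | 4.545 | — | 4.594 | 293.9 s | ~133,000× |
| NaCl | 24.0 | 44.3 | 24.5 | 5.630 | 5.693 | 5.640 | 50.9 s | ~19,000× |
| Al | 76.0 | 80.2 | 76.0 | 4.049 | 4.042 | 4.046 | 10.8 s | ~3,300× |
| Cu | 140.1 | 1036† | 140.0 | 3.621 | 3.592 | 3.615 | 24.4 s | ~5,200× |
| Fe | 170.1 | — | 170.0 | 2.864 | — | 2.866 | 21.7 s | ~11,000× |
| Ni | 185.7 | 194.3 | 180.0 | 3.521 | 3.541 | 3.524 | 77.6 s | ~19,000× |
| Graphite | 33.2 | — | 33.0 | 2.461 | — | 2.461 | 44.4 s | ~15,000× |
| h-BN | 28.5 | — | 36.0 | 2.507 | — | 2.504 | 61.1 s | ~20,000× |
| MoS2 | 56.6 | — | 53.0 | 3.156 | — | 3.160 | 292.5 s | ~97,000× |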

Engine wall time per material: ~3 ms typical (one composition-only call). DFT wall time is for the 7-point Birch–Murnaghan equation-of-state scan.

  • A dash (—) marks a DFT EOS fit that failed at this quality (V0 outside the ±6% strain window, or fit non-convergent for wurtzite c/a anisotropy / spin–volume coupling on Fe).
  • A dagger (†) marks a DFT EOS fit that converged but whose bulk-modulus extraction is noisy at fast quality; production DFT recovers these. The engine value is unaffected.

Earlier passes flagged graphite / h-BN B as out-of-scope due to a c-axis projection issue inflating the engine's isotropic B; that issue is now resolved (graphite +0.6%, h-BN -20.9%, MoS2 -6.0%) and the layered rows are scored alongside the rest.
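The aggregate timing figures quoted for Tier 2 can be cross-checked against the DFT time column of the per-material table. The snippet below is only a bookkeeping sketch with hypothetical variable names; it takes the 1.4 s engine total (which the page states includes the one-time first-call import) as given rather than re-measuring it.

```python
# DFT EOS wall time per material in seconds, copied from the table above.
DFT_SECONDS = {
    "Si": 31.0, "Ge": 31.0, "GaAs": 32.6, "GaN": 104.3, "ZnO": 212.1,
    "MgO": 39.8, "TiO2": 293.9, "NaCl": 50.9, "Al": 10.8, "Cu": 24.4,
    "Fe": 21.7, "Ni": 77.6, "Graphite": 44.4, "h-BN": 61.1, "MoS2": 292.5,
}

# Engine total for all 15 materials as quoted on this page (import included).
ENGINE_TOTAL_SECONDS = 1.4

dft_total = sum(DFT_SECONDS.values())
dft_mean = dft_total / len(DFT_SECONDS)
end_to_end_ratio = dft_total / ENGINE_TOTAL_SECONDS

print(f"DFT total: {dft_total / 60:.1f} min")          # 22.1 min
print(f"DFT mean per material: {dft_mean:.1f} s")       # 88.5 s
print(f"End-to-end ratio: ~{end_to_end_ratio:.0f}x")    # ~949x
```

The end-to-end ratio lands near the ~950× figure because the one-time import dominates the engine total; the much larger per-material ratios in the Speedup column compare each DFT scan with a single engine call instead.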

Tier 3 — Workflow Breadth in One Call

This section is a workflow-compression comparison, not a per-property accuracy benchmark

A single composition-only call returns 40+ material properties in ~3 milliseconds. Reproducing the same property breadth in DFT requires a stack of 6+ separate workflows running for ~1–3 weeks of CPU time per material.

Note: this Tier compares workflow architecture (one engine call vs N DFT workflows), not per-property accuracy. Per-property engine accuracy against experiment is benchmarked on dedicated pages — band gap, mobility, Curie temperature, melting, elastic moduli, etc. The DFT cross-check accuracy claims are in Tiers 1 and 2 above.

  • FluxMateria: 40+ properties per call
  • FluxMateria: ~3 ms wall time per material
  • DFT: ~1–3 weeks to reproduce, per material
  • FluxMateria vs DFT: ~10⁸–10⁹× speedup per material

| Property class | Count | Properties returned by one engine call | DFT workflow needed to match | DFT wall time |
| --- | --- | --- | --- | --- |
| Structural | 6 | Lattice constant, bond length, density, atomic volume, coordination, primitive cell | SCF + cell-shape relaxation (BFGS) or EOS scan | minutes – hours |
| Mechanical | 8 | Bulk modulus B, shear G, Young's E, Poisson's ν, elastic constants C11/C12/C44, Cauchy ratio | DFPT or strain-matrix elastic-constants run; EOS for B | hours – 1 day |
| Thermal | 9 | Cohesive energy, formation enthalpy ΔHf, Debye θD, sound velocity vs, thermal expansion α, heat capacity Cv, thermal conductivity κ, Grüneisen γ, melting Tm | DFPT phonon dispersion + Boltzmann transport for κ; quasi-harmonic phonons across volumes for α / γ | 2 – 5 days |
| Electronic | 7 | Band gap Eg, effective mass m*, mobilities μe / μh, work function W, ionization energy, electron affinity | SCF + band structure + BoltzTraP (mobility); slab calc (work function); HSE06 / GW for accurate gap | 1 – 3 days |
| Optical | 6 | Refractive index n, static ε(0), optical ε(∞), absorption coefficient α(ω), reflectivity, color | LR-TDDFT or BSE for ε(ω); DFPT for static dielectric tensors | 1 – 3 days |
| Magnetic | 4 | Magnetic moment μ, saturation magnetization Ms, Curie temperature TC, susceptibility χ | Spin-polarised SCF + magnetic exchange parameters + Heisenberg Monte Carlo for TC | 2 – 5 days |
| TOTAL per material | 40+ | One composition-only engine call returns the full set | 6+ separate DFT workflows, each with its own setup, convergence, and post-processing | ~1–3 weeks |

Workflow-time ratio, 15-material full property suite: ~10¹⁰× (estimated, not measured here)

FluxMateria returns the full ~40-property profile for every material on the page in under 50 milliseconds total — measured. Reproducing the same property breadth in a standard DFT workflow stack would take an estimated ~3–9 months of single-thread CPU time based on literature wall-times for the constituent calculations (15 materials × 1–3 weeks each across 6+ separate workflows). FluxMateria takes a chemical formula; DFT requires an input crystal structure, pseudopotentials, and a workflow stack.

DFT wall times in this section are estimates from standard literature timings, not measured in this benchmark. Per-property engine accuracy is reported on dedicated pages (band gap, mobility, Curie temperature, melting point, elastic moduli, etc.); the head-to-head DFT cross-check on the 15-material set is in Tiers 1 and 2 above. Tier 3 is a workflow-architecture comparison only.

Comparison with DFT and ML

High-level context: DFT and ML workflow comparison.

Ranges in this table are representative, method-dependent, and included for context. The direct measured comparison on this page is FluxMateria vs the GPAW PBE setup defined above; the ML row is reference only.

| Metric | FluxMateria | DFT (GPAW PBE) | ML (universal IPs / GNN) |
| --- | --- | --- | --- |
| Lattice constant | 0.2% MAPE (median 0.1%) | ~1–2% typical (relaxed) | ~2–5% (in-domain) |
| Band gap | 7.6% MAPE (median 1.2%) | 30–50% (PBE underestimate) | 20–40% (on labeled gaps) |
| Magnetic moment | 3.6% MAPE (Fe / Ni) | ~9% on Fe / Ni at PBE | n/a in most universal IPs |
| Speed per query | ~3 milliseconds (40+ properties per call) | 50 s (fast PBE EOS) → hours (production / hybrid) → days (GW); per workflow | ~0.1–1 second per property |
| Input required | Composition only | Full crystal structure | Composition + structure |
| Training data | None | None (ab initio) | 10K–1M+ labeled cells |
| Fitted parameters | 0 fitted | XC functional choice | Millions |

Key takeaway: on this 15-material set, FluxMateria predicts lattice constants composition-only with 0.1% median error (0.2% MAPE across all 15), beats the same PBE setup on band gap (7.6% vs 45.1% MAPE) and on Fe / Ni magnetic moment (3.6% vs 9.0%), and returns bulk modulus from a single call with median 0.7% error and 6.0% MAPE across all 15 materials — all in milliseconds, and producing 40+ properties per composition-only call. The DFT side requires explicit crystal structure, pseudopotentials, and SCF settings; ML force fields additionally require thousands-to-millions of labeled training cells and degrade outside their training distribution.

Methodology

What was run, how, and where the artifacts live

Tier 1 Settings — lattice / band gap / magnetic moment

  • 15 canonical materials spanning 8 families: Group IV (Si, Ge), III–V (GaAs, GaN), II–VI (ZnO), oxides (MgO, TiO2-rutile), ionic (NaCl), metals (Al, Cu), magnetic metals (Fe, Ni), and layered (graphite, h-BN, MoS2)
  • DFT engine: GPAW 25.7, ASE 3.28, plane-wave PBE
  • DFT settings: 200 eV cutoff, 6³ k-mesh (6×6×4 for hexagonal cells), Fermi–Dirac smearing, spin-polarised SCF for the magnetic metals
  • Lattice: fixed at experimental values (no relaxation at Tier 1) — the lattice channel scores the engine's structural prediction, not DFT's
  • Three-layer comparison: engine vs DFT, engine vs experiment, DFT vs experiment, all on the same fixed set
  • Magnetic systems: Fe and Ni converged with spin-polarised PBE (110% band buffer, 200 SCF iterations)
  • Hardware: WSL2 / Ubuntu / GPAW 25.7 source build, RTX 3060 GPU (CPU used — small cells favour CPU)
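The settings listed above map fairly directly onto GPAW calculator arguments. The sketch below shows one way they might be collected; it is not the benchmark's actual run script. The helper names are hypothetical, the set of cells treated as hexagonal (the wurtzite and layered materials) is an inference from the material list, the smearing width is left as a caller-supplied argument because the page does not state it, and applying the 110% band buffer and 200-iteration cap only to Fe and Ni is an assumption based on the magnetic-systems bullet.

```python
CUTOFF_EV = 200  # plane-wave cutoff quoted for Tier 1

# Assumed hexagonal cells: wurtzite and layered materials from the manifest.
HEXAGONAL = {"GaN", "ZnO", "Graphite", "h-BN", "MoS2"}
MAGNETIC = {"Fe", "Ni"}


def calculator_kwargs(material):
    """Collect the Tier 1 settings for one material as plain Python values."""
    kwargs = {
        "xc": "PBE",
        "cutoff_ev": CUTOFF_EV,
        "kpts": (6, 6, 4) if material in HEXAGONAL else (6, 6, 6),
        "spinpol": material in MAGNETIC,
    }
    if material in MAGNETIC:
        kwargs["nbands"] = "110%"   # band buffer quoted for Fe and Ni
        kwargs["maxiter"] = 200     # SCF iteration cap quoted for Fe and Ni
    return kwargs


def make_calculator(material, smearing_width_ev):
    """Build a GPAW calculator; requires GPAW to be installed."""
    from gpaw import GPAW, PW, FermiDirac  # imported lazily on purpose

    kwargs = calculator_kwargs(material)
    cutoff = kwargs.pop("cutoff_ev")
    return GPAW(
        mode=PW(cutoff),
        occupations=FermiDirac(smearing_width_ev),  # width not given on this page
        **kwargs,
    )


print(calculator_kwargs("GaN"))
print(calculator_kwargs("Fe"))
```

Keeping the settings in a plain dictionary makes them easy to log next to the results, and the GPAW import stays inside `make_calculator` so the settings can be inspected on a machine without GPAW.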

Tier 2 Settings — relaxed lattice + bulk modulus

  • Equation-of-state scan: 7-point isotropic strain at −6%, −4%, −2%, 0, +2%, +4%, +6% volume around the experimental cell
  • Fit: Birch–Murnaghan, with safety check that rejects fits whose minimum lies outside the scan range
  • Outputs: equilibrium volume V0 → relaxed lattice constant a; curvature → bulk modulus B
  • DFT settings: same as Tier 1 (PBE / PW 200 eV / 6³ k-points) so the wall time stays affordable (about 22 minutes total for 15 materials)
  • Resilience: per-material atomic checkpointing — if the host crashes, the run resumes from the last completed material with no lost work
  • Why fast quality: the headline is "engine returns B in milliseconds where fast-PBE EOS is noisy at this cost" — not a comparison against converged production DFT. Production DFT (PW 500+ eV, 12³ k-mesh, BFGS pre-relax) recovers B more reliably than the run we did, at 10–100× the wall-time. We did not run that comparison; the literature-based estimate is in the Contextual section
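The per-material checkpointing mentioned in the list can be approximated with a write-to-temporary-file-then-rename pattern. This is a minimal sketch of that general technique, not the benchmark's own implementation; `run_all`, `save_checkpoint`, `load_checkpoint`, and the JSON layout are hypothetical names chosen for the example.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return previously saved results, or an empty dict on a fresh run."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def save_checkpoint(path, results):
    """Write results to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(results, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)  # readers see the old or the new file, never half


def run_all(materials, run_one, path):
    """Run each material once, skipping those already in the checkpoint."""
    results = load_checkpoint(path)
    for material in materials:
        if material in results:
            continue
        results[material] = run_one(material)
        save_checkpoint(path, results)
    return results
```

A usage example would pass the 15-material list and a function that runs the 7-point EOS scan for one material; after an interruption, calling `run_all` again with the same path resumes at the first material that has no saved result.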

Reproducibility

The benchmark is fully specified by:

  • A 15-material manifest with experimental lattice constants, band gaps, and (where relevant) magnetic moments from the Madelung 2004 and CRC 95th-ed. handbooks
  • The DFT settings listed above — standard plane-wave PBE that any GPAW or VASP user can reproduce
  • A three-layer scoring protocol — engine vs DFT, engine vs experiment, DFT vs experiment — emitted as CSV / JSON / Markdown
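To make the three-layer protocol concrete, the sketch below scores a single property for a single material and emits the row as JSON. The field names and the function `three_layer_score` are hypothetical, since the page does not specify the export schema; the Si band-gap values are taken from the per-material table.

```python
import json


def pct_error(value, reference):
    """Signed percent error of value relative to reference."""
    return (value - reference) / reference * 100.0


def three_layer_score(material, prop, engine, dft, experiment):
    """Score engine vs experiment, DFT vs experiment, and engine vs DFT."""
    return {
        "material": material,
        "property": prop,
        "engine_vs_experiment_pct": round(pct_error(engine, experiment), 1),
        "dft_vs_experiment_pct": round(pct_error(dft, experiment), 1),
        "engine_vs_dft_pct": round(pct_error(engine, dft), 1),
    }


row = three_layer_score("Si", "band_gap_eV", engine=1.12, dft=0.92, experiment=1.12)
print(json.dumps(row, indent=2))
```

For Si this gives 0.0% for the engine against experiment, −17.9% for DFT against experiment, and +21.7% for the engine against DFT. Properties whose reference value is zero (for example the band gap of a metal) would need separate handling, because the percent error is undefined there.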

The DFT side completes in roughly 20 minutes on a modern laptop CPU; the engine side completes in seconds. A reproducer pack with manifest + run instructions is available on request.

Downloadable Benchmark Artifacts

Public exports with headline metrics, three-layer aggregates, and per-material rows for every material

Tier 1 — lattice / band gap / magnetic moment

  • Benchmark summary (JSON): headline MAPE / median / max per property, three-layer aggregate, and full per-material rows.
  • Row-level benchmark (CSV): all 15 materials with experimental, engine, and DFT values for lattice constant, band gap, and magnetic moment, plus percent errors.
  • Human-readable summary (Markdown): methodology, sources, headline metrics, full per-material table, and notes.

Tier 2 — relaxed lattice + bulk modulus B

  • Benchmark summary (JSON): Tier 2 headlines, B and relaxed-a aggregates per layer, EOS-fit status per material, and full per-material rows.
  • Row-level benchmark (CSV): all 15 materials with engine and DFT bulk modulus, EOS-derived relaxed lattice, plus per-property percent errors.
  • Human-readable summary (Markdown): Tier 2 methodology, EOS-fit status, headlines, and per-material results.

Tier 3 — full 40-property engine panel for all 15 materials

  • Engine panel (JSON): all 40 properties × 15 materials, organized by class (structural / mechanical / thermal / electronic / optical / magnetic). One composition-only call per material, ~3 ms each.
  • Engine panel (CSV): wide table of 15 rows (materials) × 40+ columns (properties with units). One row = one composition-only call.
  • Human-readable summary (Markdown): Tier 3 scope, the property-class breakdown, and a worked Si example showing all 40 values from one engine call.

Tier 3B (direct DFPT-phonon / dielectric-tensor head-to-head against the engine on a small subset) is a planned follow-up where wall-time investment justifies the more expensive end of the DFT toolchain.

Scope & Limitations

Strengths

  • 15 canonical materials spanning 8 structural families with all DFT runs converged at Tier 1
  • Sub-1% lattice error on 14 of 15 materials (TiO2-rutile sits just outside at 1.1%); the previously-flagged wurtzite (GaN, ZnO) and layered (graphite, h-BN, MoS2) cells are now all under 1% after the latest structural-geometry refinements
  • Engine band gap median error 1.2%, vs 50.7% for DFT-PBE on the same set
  • Magnetic moment within 1.9% on Fe and 5.4% on Ni: better than PBE in aggregate on the same set (MAPE 3.6% vs 9.0%), although PBE is closer on Fe alone (−1.0%)
  • Tier 2 engine bulk modulus median error 0.7% (MAPE 6.0%) across all 15 materials
  • Composition-only input: no crystal structure required from the user
  • Full reproducibility: manifest, settings, and per-material results are listed in the Downloadable Benchmark Artifacts section above

Known Limitations

  • Lattice: TiO2-rutile is the only material outside 1% (−1.1%). The ~6% wurtzite (GaN, ZnO) and ~8% layered (graphite, h-BN) over-predictions and the +32% MoS2-2H error reported in earlier iterations are resolved; all of these cells are now under 1%
  • MgO band gap underpredicted (−21.3%: 6.16 eV vs 7.83 eV experimental); ionic wide-gap closure is being extended
  • Tier 2 now scores B across all 15 materials (median 0.7%, MAPE 6.0%); the c-axis projection issue that previously inflated layered MAPE is resolved and graphite / MoS2 are under 7% with h-BN at 21%. Wurtzite ZnO B (+32%) and h-BN B remain the elevated rows under refinement (planned Tier 2B with anisotropic c/a relaxation)
  • DFT B at fast quality is noisy even where the EOS fit converges — see footnote on Cu (1036 vs 140) and MgO (922 vs 160). Production-quality DFT is needed for reliable B
  • Layered materials' experimental B is in-plane, while the engine reports cell-isotropic B; earlier passes flagged these rows as out-of-scope for the Tier 2 B benchmark, but they are now scored alongside the rest
  • Tier 1+2 cover 15 materials; the broader 76- and 195-material benchmark sets are reported on dedicated pages

References

Primary data sources and DFT implementation

  1. O. Madelung, Semiconductors: Data Handbook, 3rd ed., Springer, 2004 — experimental lattice constants and band gaps for III–V, II–VI, IV semiconductors.
  2. D. R. Lide (Ed.), CRC Handbook of Chemistry and Physics, 95th ed., 2014 — experimental lattice constants and elastic moduli for metals and ionic compounds.
  3. J. Enkovaara et al., “Electronic structure calculations with GPAW,” J. Phys. Condens. Matter, 22, 253202 (2010) — GPAW DFT package.
  4. A. H. Larsen et al., “The atomic simulation environment—a Python library for working with atoms,” J. Phys. Condens. Matter, 29, 273002 (2017) — ASE structure-builder library.
  5. J. P. Perdew, K. Burke, M. Ernzerhof, “Generalized Gradient Approximation Made Simple,” Phys. Rev. Lett., 77, 3865 (1996) — PBE exchange–correlation functional.

Benchmark basis

This page is a head-to-head comparison of FluxMateria's composition-only predictions against locally-run plane-wave DFT (GPAW PBE) and experimental literature, on the same 15 fixed materials, across two tiers: single-point lattice / band gap / magnetic moment, and EOS-derived relaxed lattice + bulk modulus. Engine accuracy is reported in three layers: vs DFT, vs experiment, and DFT's own accuracy bound.
