DFT Cross-Check Benchmark

Lattice constant MAPE: 0.2% (median 0.1% across 15 materials)

Band gap MAPE: 7.6% (vs DFT-PBE 45.1% on the same set)

Magnetic moment MAPE: 3.6% (Fe and Ni; PBE 9.0% on the same two metals)

Speedup vs DFT (measured): ~10⁴× (measured here on fast PBE; up to ~10⁹× contextual at higher DFT quality, not measured)

Tier 1 — Single-Point Lattice / Band Gap / Magnetic Moment

Each property scored against both DFT and experiment on the same 15 materials, lattice fixed at experimental input

Property	N	Engine vs experiment	DFT (PBE) vs experiment	Verdict
Lattice constant a (Å)	15	MAPE 0.2% · median 0.1%	0.0% by construction (Tier 1 fixes DFT lattice to experiment; only Tier 2 relaxes)	Engine predicts from composition only; DFT is given the experimental lattice
Band gap (eV)	10	MAPE 7.6% · median 1.2%	MAPE 45.1% · median 50.7%	Engine beats PBE
Magnetic moment (μ_B) — Fe and Ni only, n=2	2	MAPE 3.6% · median 3.6%	MAPE 9.0% · median 9.0%	Engine beats PBE on Fe/Ni

DFT runs use plane-wave PBE (200 eV cutoff, 6³ k-points, no relaxation). The full settings are in the methodology section below.

Per-Material Results

All 15 materials, raw numbers, sorted by family

Material	Family	Exp. a (Å)	FLUX a (Å)	a error	Exp. E_g (eV)	FLUX E_g (eV)	DFT E_g (eV)	Status
Si	Group IV (diamond)	5.431	5.431	−0.0%	1.12	1.12	0.92	PASS
Ge	Group IV (diamond)	5.658	5.655	−0.1%	0.66	0.66	0.43	PASS
GaAs	III–V (zincblende)	5.653	5.647	−0.1%	1.43	1.43	1.22	PASS
GaN	III–V (wurtzite)	3.189	3.183	−0.2%	3.40	3.44	1.83	PASS
ZnO	II–VI (wurtzite)	3.249	3.230	−0.6%	3.37	3.40	0.93	PASS
MgO	Oxide (rocksalt)	4.211	4.208	−0.1%	7.83	6.16	3.13	FAIR
TiO₂	Oxide (rutile)	4.594	4.545	−1.1%	3.05	3.39	1.78	PASS
NaCl	Ionic (rocksalt)	5.640	5.630	−0.2%	8.60	9.17	5.38	PASS
Al	Metal (fcc)	4.046	4.049	+0.1%	0	0	0	PASS
Cu	Metal (fcc)	3.615	3.621	+0.2%	0	0	0	PASS
Fe	Magnetic metal (bcc)	2.866	2.864	−0.1%	moment 2.22 μ_B → engine 2.26 (+1.9%), DFT 2.20 (−1.0%)			PASS
Ni	Magnetic metal (fcc)	3.524	3.521	−0.1%	moment 0.62 μ_B → engine 0.65 (+5.4%), DFT 0.72 (+16.8%)			PASS
Graphite (C)	Layered	2.461	2.461	+0.0%	0	5.73	1.28	FAIR
h-BN	Layered	2.504	2.507	+0.1%	5.96	3.97	3.84	FAIR
MoS₂ (2H)	Layered	3.160	3.156	−0.1%	1.29	1.27	1.12	PASS

14 of 15 materials are at ≤1% lattice error (all except TiO₂-rutile, which sits just outside at 1.1%). Layered systems (graphite, h-BN, MoS₂) and the wurtzite in-plane lattice (GaN, ZnO) all came in under 1% after the structural-geometry refinements landed in this iteration.
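As a quick check on how the band gap headline relates to the rows above, the sketch below recomputes the engine's MAPE from the rounded table values. It assumes the N = 10 band gap set is the ten materials with a nonzero experimental gap (metals and graphite are excluded because a percent error against 0 is undefined); the function and variable names are illustrative, not part of the benchmark tooling. Because the inputs are rounded to two decimals, results may differ from the published aggregates in the last digit.

```python
# (material, experimental gap in eV, engine gap in eV), copied from the table above.
# Assumption: the scored set is the ten materials with a nonzero experimental gap.
GAPS = [
    ("Si", 1.12, 1.12), ("Ge", 0.66, 0.66), ("GaAs", 1.43, 1.43),
    ("GaN", 3.40, 3.44), ("ZnO", 3.37, 3.40), ("MgO", 7.83, 6.16),
    ("TiO2", 3.05, 3.39), ("NaCl", 8.60, 9.17), ("h-BN", 5.96, 3.97),
    ("MoS2", 1.29, 1.27),
]


def signed_pct_error(predicted, reference):
    """Signed percent error of a prediction relative to a nonzero reference."""
    return 100.0 * (predicted - reference) / reference


# Per-material signed errors, then the mean of their absolute values (MAPE).
errors = {name: signed_pct_error(pred, ref) for name, ref, pred in GAPS}
band_gap_mape = sum(abs(e) for e in errors.values()) / len(errors)

# The two largest absolute errors dominate the mean.
worst = sorted(errors.items(), key=lambda item: abs(item[1]), reverse=True)[:2]

print(f"engine band gap MAPE: {band_gap_mape:.1f}%")
for name, err in worst:
    print(f"largest error: {name} {err:+.1f}%")
```

The same helper applies to the lattice column; only the reference and predicted values change.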

Tier 2 — Relaxed Lattice + Bulk Modulus

7-point Birch–Murnaghan equation-of-state per material (strain −6%, −4%, −2%, 0, +2%, +4%, +6%) adds two new comparisons: relaxed lattice constant and bulk modulus B

Same task, same 15 materials	Engine	DFT (PBE, fast quality)
Bulk modulus B (GPa) accuracy vs experiment	All 15 materials: median 0.7% off exp; MAPE 6.0% (max 32.3% on ZnO). Layered systems (graphite, h-BN, MoS₂) now sit inside 21% after the structural-geometry refinements landed.	median 13.7% off exp; MAPE 176% (7 EOS-converged out of 15 — Cu and MgO fits noisy at this DFT cost; layered + Fe-bcc EOS did not converge)
Relaxed lattice a (Å) accuracy vs experiment	median 0.1% off exp; MAPE 0.2% (15 / 15 materials)	median 0.6% off exp (7 EOS-converged out of 15)
Wall time per material for the same EOS / B + relaxed-a calculation	~3 milliseconds (one composition-only call)	11 s (Al) – 294 s (TiO₂); median 50 s; mean 88 s; 7-point SCF EOS scan
Total wall time, all 15 materials	1.4 seconds	22.1 minutes
Mean speedup, engine vs DFT EOS	~25,000× (per material; ~950× including the engine's one-time 600 ms first-call import)
Inputs required	Chemical formula only	Full crystal structure + EOS scan window + SCF settings

Measured in this benchmark

DFT setting we ran	DFT wall time per material	Engine wall time per material	Measured speedup
Fast PBE EOS (PW 200 eV, 6³ k-points, 7 SCFs), a screening-grade setup	~50 s median (11 s – 294 s)	~3 ms	~25,000× (per material, measured)

Accuracy on this run, vs experiment: engine band gap median 1.2% (DFT 50.7%); engine B median 0.7%, MAPE 6.0% across all 15 materials (DFT 13.7% on the 7 EOS-converged). Engine returns 40+ properties from a single composition-only call.
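The totals quoted for this run can be re-derived from the per-material DFT wall times reported in the per-material table on this page. The sketch below is plain arithmetic over those published numbers; the dictionary and variable names are illustrative, and the 1.4 s engine total is taken as reported, on the assumption that it includes the one-time first-call import.

```python
from statistics import mean

# DFT wall time per material in seconds (7-point EOS scan), as published on this page.
DFT_SECONDS = {
    "Si": 31.0, "Ge": 31.0, "GaAs": 32.6, "GaN": 104.3, "ZnO": 212.1,
    "MgO": 39.8, "TiO2": 293.9, "NaCl": 50.9, "Al": 10.8, "Cu": 24.4,
    "Fe": 21.7, "Ni": 77.6, "Graphite": 44.4, "h-BN": 61.1, "MoS2": 292.5,
}

# Reported engine total for all 15 materials (assumed to include the first-call import).
ENGINE_TOTAL_SECONDS = 1.4

dft_total_seconds = sum(DFT_SECONDS.values())
dft_total_minutes = dft_total_seconds / 60.0
dft_mean_seconds = mean(DFT_SECONDS.values())

# Ratio of the two end-to-end totals; this is the import-inclusive speedup.
end_to_end_speedup = dft_total_seconds / ENGINE_TOTAL_SECONDS

print(f"DFT total: {dft_total_minutes:.1f} min over {len(DFT_SECONDS)} materials")
print(f"DFT mean per material: {dft_mean_seconds:.1f} s")
print(f"end-to-end speedup: about {end_to_end_speedup:.0f}x")
```

The end-to-end ratio is much smaller than the per-material figure because the one-time import dominates the engine's total when only 15 calls are made.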

Contextual: more expensive DFT settings

The numbers in this section are not measured in this benchmark. They are based on standard literature timings for the same kinds of calculations at higher quality, that is, the DFT settings a real research lab or industrial materials team would normally run. They are included as context for why the measured ~25,000× sits at the conservative end of the speedup range.

DFT setting (not run here)	Typical per-material wall time	Engine speedup (estimated)	What it's used for
Production PBE (PW 500–600 eV, 12³ k-points, BFGS pre-relax + EOS)	~30 min – 2 hr	~500,000 – 2,000,000×	Standard for a materials publication; DFT B converges to ~5–15% MAPE typical, band gap stays at PBE-functional level (~37%).
Hybrid functional (HSE06 EOS)	~hours per material	~10–100 million×	Required for accurate band gaps (MAPE typically ~10–15% on a clean reference set).
GW corrections (G₀W₀ or self-consistent GW)	~hours – days	~10⁸–10⁹×	The reference for band gaps in hard-matter physics (~5% MAPE typical).
Full ~40-property characterization (separate SCF / EOS / DFPT / BoltzTraP / magnetic-MC workflows)	~hours – days	~10⁶–10⁹×	Architectural comparison: the engine returns the suite from one composition-only call. Per-property accuracy is reported on dedicated benchmark pages, not in this DFT cross-check.

Reading guide: the engine's 3 ms per material doesn't change with DFT quality. The DFT side scales with cutoff, k-mesh density, exchange-correlation cost, and workflow count, so the speedup grows as you push DFT toward higher quality. We did not run any of the DFT settings in this section — the timings are literature-standard.

Honest note on DFT B at this quality: even on the 7 materials where the EOS curve fit converged, fast-quality DFT (PW 200 eV, 7-point ±6% strain) produces noisy B values: Cu shows 1036 GPa against the experimental 140 GPa, and MgO shows 922 GPa against 160 GPa. These are EOS-fit artifacts at affordable compute cost, not PBE failures. Production-quality DFT recovers B to ~5–15% on the same materials, but at the wall times listed in the table above. The engine returns B from a single composition-only call and matches experiment to median 0.7%, MAPE 6.0% across all 15 materials.

Per-material apples-to-apples (B + relaxed lattice + wall time)

Material	B engine (GPa)	B DFT (GPa)	B exp (GPa)	a engine (Å)	a DFT (Å)	a exp (Å)	DFT time	Speedup
Si	98.0	88.6	98.0	5.431	5.477	5.431	31.0 s	~10,000×
Ge	75.3	65.4	75.8	5.655	5.769	5.658	31.0 s	~10,000×
GaAs	75.2	—	75.5	5.647	—	5.653	32.6 s	~10,000×
GaN	210.4	—	210.0	3.183	—	3.189	104.3 s	~40,000×
ZnO	187.8	—	142.0	3.230	—	3.249	212.1 s	~77,000×
MgO	136.1	922†	160.0	4.208	4.235	4.211	39.8 s	~15,000×
TiO₂	191.4	—	210.0	4.545	—	4.594	293.9 s	~133,000×
NaCl	24.0	44.3	24.5	5.630	5.693	5.640	50.9 s	~19,000×
Al	76.0	80.2	76.0	4.049	4.042	4.046	10.8 s	~3,300×
Cu	140.1	1036†	140.0	3.621	3.592	3.615	24.4 s	~5,200×
Fe	170.1	—	170.0	2.864	—	2.866	21.7 s	~11,000×
Ni	185.7	194.3	180.0	3.521	3.541	3.524	77.6 s	~19,000×
Graphite	33.2	—	33.0	2.461	—	2.461	44.4 s	~15,000×
h-BN	28.5	—	36.0	2.507	—	2.504	61.1 s	~20,000×
MoS₂	56.6	—	53.0	3.156	—	3.160	292.5 s	~97,000×

Engine wall-time per material: ~3 ms typical (one composition-only call). DFT wall-time is for the 7-point Birch–Murnaghan equation-of-state scan. — = DFT EOS fit failed at this quality (V₀ outside the ±6% strain window, or fit non-convergent for wurtzite c/a anisotropy / spin–volume coupling on Fe). † = DFT EOS fit converged but bulk-modulus extraction is noisy at fast quality — production DFT recovers these. Engine value is unaffected. Earlier passes flagged graphite / h-BN B as out-of-scope due to a c-axis projection issue inflating the engine's isotropic B; that issue is now resolved (graphite +0.6%, h-BN -20.9%, MoS₂ -6.0%) and the layered rows are scored alongside the rest.
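The bulk modulus aggregates can be recomputed from the rows above, which also shows how the failed DFT fits are handled: they are dropped from the DFT statistics rather than scored as errors. This is a minimal sketch over the rounded table values with illustrative names; it is not the benchmark's own scoring code, and rounding in the table means the engine MAPE comes out at about 6% rather than matching the published figure to the last digit.

```python
from statistics import median

# (material, B engine, B DFT or None when the EOS fit failed, B experiment), all in GPa.
ROWS = [
    ("Si", 98.0, 88.6, 98.0), ("Ge", 75.3, 65.4, 75.8), ("GaAs", 75.2, None, 75.5),
    ("GaN", 210.4, None, 210.0), ("ZnO", 187.8, None, 142.0), ("MgO", 136.1, 922.0, 160.0),
    ("TiO2", 191.4, None, 210.0), ("NaCl", 24.0, 44.3, 24.5), ("Al", 76.0, 80.2, 76.0),
    ("Cu", 140.1, 1036.0, 140.0), ("Fe", 170.1, None, 170.0), ("Ni", 185.7, 194.3, 180.0),
    ("Graphite", 33.2, None, 33.0), ("h-BN", 28.5, None, 36.0), ("MoS2", 56.6, None, 53.0),
]


def abs_pct_errors(pairs):
    """Absolute percent error for each (predicted, reference) pair."""
    return [100.0 * abs(pred - ref) / ref for pred, ref in pairs]


def summarize(errors):
    """Count, mean absolute percent error, and median of a list of errors."""
    return {"n": len(errors), "mape": sum(errors) / len(errors), "median": median(errors)}


engine = summarize(abs_pct_errors([(b_eng, b_exp) for _, b_eng, _, b_exp in ROWS]))

# Only rows with a converged DFT fit enter the DFT statistics.
dft = summarize(abs_pct_errors(
    [(b_dft, b_exp) for _, _, b_dft, b_exp in ROWS if b_dft is not None]
))

print(f"engine: n={engine['n']}, median {engine['median']:.1f}%, MAPE {engine['mape']:.1f}%")
print(f"DFT:    n={dft['n']}, median {dft['median']:.1f}%, MAPE {dft['mape']:.0f}%")
```

The gap between the DFT median and the DFT MAPE comes almost entirely from the two noisy fits (Cu and MgO), which is why both statistics are reported side by side.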

Tier 3 — Workflow Breadth in One Call

This section is a workflow-compression comparison, not a per-property accuracy benchmark

A single composition-only call returns 40+ material properties in ~3 milliseconds. Reproducing the same property breadth in DFT requires a stack of 6+ separate workflows running for ~1–3 weeks of CPU time per material.

Note: this Tier compares workflow architecture (one engine call vs N DFT workflows), not per-property accuracy. Per-property engine accuracy against experiment is benchmarked on dedicated pages — band gap, mobility, Curie temperature, melting, elastic moduli, etc. The DFT cross-check accuracy claims are in Tiers 1 and 2 above.

FluxMateria: 40+ properties per call

FluxMateria: ~3 ms wall time per material

DFT: ~1–3 wk to reproduce, per material

FluxMateria vs DFT: ~10⁸–10⁹× speedup per material

Property class	Count	Properties returned by one engine call	DFT workflow needed to match	DFT wall time
Structural	6	Lattice constant, bond length, density, atomic volume, coordination, primitive cell	SCF + cell-shape relaxation (BFGS) or EOS scan	minutes – hours
Mechanical	8	Bulk modulus B, shear G, Young's E, Poisson's ν, elastic constants C₁₁/C₁₂/C₄₄, Cauchy ratio	DFPT or strain-matrix elastic-constants run; EOS for B	hours – 1 day
Thermal	9	Cohesive energy, formation enthalpy ΔH_f, Debye θ_D, sound velocity v_s, thermal expansion α, heat capacity C_v, thermal conductivity κ, Grüneisen γ, melting T_m	DFPT phonon dispersion + Boltzmann transport for κ; quasi-harmonic phonons across volumes for α / γ	2 – 5 days
Electronic	7	Band gap E_g, effective mass m*, mobilities μ_e / μ_h, work function W, ionization energy, electron affinity	SCF + band structure + BoltzTraP (mobility); slab calc (work function); HSE06 / GW for accurate gap	1 – 3 days
Optical	6	Refractive index n, static ε(0), optical ε(∞), absorption coefficient α(ω), reflectivity, color	LR-TDDFT or BSE for ε(ω); DFPT for static dielectric tensors	1 – 3 days
Magnetic	4	Magnetic moment μ, saturation magnetization M_s, Curie temperature T_C, susceptibility χ	Spin-polarised SCF + magnetic exchange parameters + Heisenberg Monte Carlo for T_C	2 – 5 days
TOTAL per material	40+	One composition-only engine call returns the full set	6+ separate DFT workflows, each with its own setup, convergence, and post-processing	~1–3 weeks

Workflow-time ratio, 15-material full property suite (estimated, not measured here): ~10⁸–10⁹×

FluxMateria returns the full ~40-property profile for every material on the page in under 50 milliseconds total — measured. Reproducing the same property breadth in a standard DFT workflow stack would take an estimated ~3–9 months of single-thread CPU time based on literature wall-times for the constituent calculations (15 materials × 1–3 weeks each across 6+ separate workflows). FluxMateria takes a chemical formula; DFT requires an input crystal structure, pseudopotentials, and a workflow stack.
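The per-material speedup range follows directly from the two wall-time figures in this section. The sketch below only does the unit conversion; the constant names are illustrative, the 1 to 3 week DFT range is the literature-based estimate quoted above (not a measurement), and 3 ms is the engine's typical per-call time.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600

ENGINE_SECONDS_PER_MATERIAL = 3e-3      # ~3 ms per composition-only call
DFT_WEEKS_PER_MATERIAL = (1, 3)         # estimated range, not measured here
N_MATERIALS = 15

# Ratio of estimated DFT workflow time to engine time, for one material.
per_material_ratio = tuple(
    weeks * SECONDS_PER_WEEK / ENGINE_SECONDS_PER_MATERIAL
    for weeks in DFT_WEEKS_PER_MATERIAL
)

# Estimated single-thread DFT time for the whole 15-material set, in weeks.
total_dft_weeks = tuple(N_MATERIALS * weeks for weeks in DFT_WEEKS_PER_MATERIAL)

print(f"per-material ratio: {per_material_ratio[0]:.1e} to {per_material_ratio[1]:.1e}")
print(f"15-material DFT estimate: {total_dft_weeks[0]} to {total_dft_weeks[1]} weeks")
```

Both sides scale linearly with the number of materials, so the ratio for the full set stays in the same range as the per-material ratio.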

DFT wall times in this section are estimates from standard literature timings, not measured in this benchmark. Per-property engine accuracy is reported on dedicated pages (band gap, mobility, Curie temperature, melting point, elastic moduli, etc.); the head-to-head DFT cross-check on the 15-material set is in Tiers 1 and 2 above. Tier 3 is a workflow-architecture comparison only.

Comparison with DFT and ML

High-level context: DFT and ML workflow comparison.

Ranges in this table are representative, method-dependent, and included for context. The direct measured comparison on this page is FluxMateria vs the GPAW PBE setup defined above; the ML row is reference only.

Metric	FluxMateria	DFT (GPAW PBE)	ML (universal IPs / GNN)
Lattice constant	0.2% MAPE (median 0.1%)	~1–2% typical (relaxed)	~2–5% (in-domain)
Band gap	7.6% MAPE (median 1.2%)	30–50% (PBE underestimate)	20–40% (on labeled gaps)
Magnetic moment	3.6% MAPE (Fe / Ni)	~9% on Fe / Ni at PBE	n/a in most universal IPs
Speed per query	~3 milliseconds (40+ properties per call)	50 s (fast PBE EOS) → hours (production / hybrid) → days (GW); per workflow	~0.1–1 second per property
Input required	Composition only	Full crystal structure	Composition + structure
Training data	None	None (ab initio)	10K–1M+ labeled cells
Fitted parameters	0 fitted	XC functional choice	Millions

Key takeaway: on this 15-material set, FluxMateria predicts lattice constants from composition alone with 0.1% median error (0.2% MAPE across all 15), beats the same PBE setup on band gap (7.6% vs 45.1% MAPE) and on Fe / Ni magnetic moment (3.6% vs 9.0%), and returns bulk modulus from a single call with median 0.7% error and 6.0% MAPE across all 15 materials. Each composition-only call takes milliseconds and produces 40+ properties. The DFT side requires an explicit crystal structure, pseudopotentials, and SCF settings; ML force fields additionally require thousands to millions of labeled training cells and degrade outside their training distribution.

Methodology

What was run, how, and where the artifacts live

Tier 1 Settings — lattice / band gap / magnetic moment

15 canonical materials spanning 8 families: Group IV (Si, Ge), III–V (GaAs, GaN), II–VI (ZnO), oxides (MgO, TiO₂-rutile), ionic (NaCl), metals (Al, Cu), magnetic metals (Fe, Ni), and layered (graphite, h-BN, MoS₂)
DFT engine: GPAW 25.7, ASE 3.28, plane-wave PBE
DFT settings: 200 eV cutoff, 6³ k-mesh (6×6×4 for hexagonal cells), Fermi–Dirac smearing, spin-polarised SCF for the magnetic metals
Lattice: fixed at experimental values (no relaxation at Tier 1) — the lattice channel scores the engine's structural prediction, not DFT's
Three-layer comparison: engine vs DFT, engine vs experiment, DFT vs experiment, all on the same fixed set
Magnetic systems: Fe and Ni converged with spin-polarised PBE (110% band buffer, 200 SCF iterations)
Hardware: WSL2 / Ubuntu / GPAW 25.7 source build, RTX 3060 GPU (CPU used — small cells favour CPU)
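One way to express the settings listed above is as keyword arguments for a GPAW calculator. The sketch below only builds the argument dictionary and does not import GPAW, so it can be inspected without a DFT install. Several things here are assumptions rather than the benchmark's actual script: the function name, the dictionary forms of `mode` and `occupations`, the reading of "110% band buffer" as `nbands="110%"`, the reading of "200 SCF iterations" as `maxiter=200`, and the smearing width, which the settings do not state and is therefore a placeholder argument. Parameter names should be checked against the installed GPAW version before use.

```python
# Materials whose cells are hexagonal (wurtzite and layered families in the manifest).
HEXAGONAL = {"GaN", "ZnO", "Graphite", "h-BN", "MoS2"}
# Magnetic metals that get a spin-polarised SCF.
MAGNETIC = {"Fe", "Ni"}


def tier1_calculator_kwargs(material, smearing_width_ev=0.1):
    """Build hypothetical GPAW keyword arguments for one Tier 1 material.

    smearing_width_ev is a placeholder: the width is not given in the settings.
    """
    kwargs = {
        "mode": {"name": "pw", "ecut": 200.0},       # plane waves, 200 eV cutoff
        "xc": "PBE",
        "kpts": (6, 6, 4) if material in HEXAGONAL else (6, 6, 6),
        "occupations": {"name": "fermi-dirac", "width": smearing_width_ev},
    }
    if material in MAGNETIC:
        kwargs["spinpol"] = True
        kwargs["nbands"] = "110%"    # assumed meaning of "110% band buffer"
        kwargs["maxiter"] = 200      # assumed meaning of "200 SCF iterations"
    return kwargs


for name in ("Si", "ZnO", "Fe"):
    print(name, tier1_calculator_kwargs(name))
```

Keeping the settings in one function makes it harder for a single material to drift from the shared cutoff and k-mesh between runs.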

Tier 2 Settings — relaxed lattice + bulk modulus

Equation-of-state scan: 7-point isotropic strain at −6%, −4%, −2%, 0, +2%, +4%, +6% volume around the experimental cell
Fit: Birch–Murnaghan, with safety check that rejects fits whose minimum lies outside the scan range
Outputs: equilibrium volume V₀ → relaxed lattice constant a; curvature → bulk modulus B
DFT settings: same as Tier 1 (PBE / PW 200 eV / 6³ k-points) so the wall time stays affordable (~22 minutes total for 15 materials)
Resilience: per-material atomic checkpointing — if the host crashes, the run resumes from the last completed material with no lost work
Why fast quality: the headline is "engine returns B in milliseconds where fast-PBE EOS is noisy at this cost" — not a comparison against converged production DFT. Production DFT (PW 500+ eV, 12³ k-mesh, BFGS pre-relax) recovers B more reliably than the run we did, at 10–100× the wall-time. We did not run that comparison; the literature-based estimate is in the Contextual section
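The scan, fit, and rejection steps listed above can be sketched in pure Python. The third-order Birch–Murnaghan energy is a cubic polynomial in V^(-2/3), so a linear least-squares fit in that variable recovers it without an iterative optimizer. This is an illustration under stated assumptions, not the benchmark's fitting code: the function names are hypothetical, the ±6% steps are treated as volume scale factors, and the demo data are synthetic.

```python
import math

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Å^3 expressed in GPa


def birch_murnaghan_energy(v, e0, v0, b0, bp):
    """Third-order Birch-Murnaghan E(V); b0 in eV/Å^3, volumes in Å^3."""
    eta = (v0 / v) ** (2.0 / 3.0)
    return e0 + 9.0 * v0 * b0 / 16.0 * (
        (eta - 1.0) ** 3 * bp + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_eos(volumes, energies):
    """Return (V0 in Å^3, B0 in GPa), or None if the fit is rejected."""
    xs = [v ** (-2.0 / 3.0) for v in volumes]
    mid = 0.5 * (max(xs) + min(xs))
    half = 0.5 * (max(xs) - min(xs))
    ts = [(x - mid) / half for x in xs]  # rescaled to [-1, 1] for conditioning
    # Normal equations for a cubic E(t) = c0 + c1 t + c2 t^2 + c3 t^3.
    gram = [[sum(t ** (i + j) for t in ts) for j in range(4)] for i in range(4)]
    rhs = [sum(e * t ** i for t, e in zip(ts, energies)) for i in range(4)]
    _, c1, c2, c3 = solve_linear(gram, rhs)
    disc = 4.0 * c2 * c2 - 12.0 * c3 * c1
    if disc <= 0.0:
        return None                      # no stationary point with positive curvature
    curvature = math.sqrt(disc)          # d2E/dt2 at the minimum
    denom = 2.0 * c2 + curvature
    if denom == 0.0:
        return None
    x0 = mid + half * (-2.0 * c1 / denom)
    if x0 <= 0.0:
        return None
    v0 = x0 ** -1.5
    if not (min(volumes) <= v0 <= max(volumes)):
        return None                      # safety check: minimum outside the scan range
    b0 = (4.0 / 9.0) * (curvature / half ** 2) * v0 ** (-7.0 / 3.0)
    return v0, b0 * EV_PER_A3_TO_GPA


# Synthetic demo: 7 volumes around a hypothetical 40 Å^3 reference cell.
SCALES = (0.94, 0.96, 0.98, 1.00, 1.02, 1.04, 1.06)
demo_volumes = [40.0 * s for s in SCALES]
demo_energies = [
    birch_murnaghan_energy(v, -10.0, 40.5, 98.0 / EV_PER_A3_TO_GPA, 4.2)
    for v in demo_volumes
]
print(fit_eos(demo_volumes, demo_energies))
```

With real SCF energies the cubic is only an approximation to the data, so a fit can pass the range check and still give a noisy B; the check guards against extrapolated minima, not against scatter.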

Reproducibility

The benchmark is fully specified by:

A 15-material manifest with experimental lattice constants, band gaps, and (where relevant) magnetic moments from the Madelung 2004 and CRC 95th ed. handbooks
The DFT settings listed above — standard plane-wave PBE that any GPAW or VASP user can reproduce
A three-layer scoring protocol — engine vs DFT, engine vs experiment, DFT vs experiment — emitted as CSV / JSON / Markdown
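The three-layer scoring in the list above can be sketched with the standard library alone. Everything named here is hypothetical: the column names, the choice to express engine vs DFT relative to the DFT value, and the convention of leaving a comparison empty when the reference is zero (as for metals' band gaps). The GaN band gap values are taken from the per-material table.

```python
import csv
import io
import json


def pct_diff(value, reference):
    """Signed percent difference, or None when it is undefined."""
    if value is None or reference is None or reference == 0:
        return None
    return round(100.0 * (value - reference) / reference, 1)


def three_layer_row(material, prop, experiment, engine, dft):
    """One scored row: raw values plus the three pairwise comparisons."""
    return {
        "material": material, "property": prop,
        "experiment": experiment, "engine": engine, "dft": dft,
        "engine_vs_dft_pct": pct_diff(engine, dft),
        "engine_vs_exp_pct": pct_diff(engine, experiment),
        "dft_vs_exp_pct": pct_diff(dft, experiment),
    }


def to_csv(rows):
    """Serialize rows to CSV text; undefined comparisons become empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_markdown(rows):
    """Serialize rows to a simple Markdown table."""
    cols = list(rows[0])
    lines = ["| " + " | ".join(cols) + " |", "|" + "---|" * len(cols)]
    for row in rows:
        cells = ("" if row[c] is None else str(row[c]) for c in cols)
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


rows = [
    three_layer_row("GaN", "band_gap_eV", 3.40, 3.44, 1.83),
    three_layer_row("Al", "band_gap_eV", 0.0, 0.0, 0.0),
]
csv_text = to_csv(rows)
json_text = json.dumps(rows, indent=2)
md_text = to_markdown(rows)
print(md_text)
```

Emitting all three formats from the same list of rows keeps the CSV, JSON, and Markdown exports from disagreeing with one another.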

The DFT side completes in roughly 20 minutes on a modern laptop CPU; the engine side completes in seconds. A reproducer pack with manifest + run instructions is available on request.

Downloadable Benchmark Artifacts

Public exports with headline metrics, three-layer aggregates, and per-material rows for every material

Tier 1 — lattice / band gap / magnetic moment

Benchmark summary JSON

Headline MAPE / median / max per property, three-layer aggregate, and full per-material rows.

Download JSON

Row-level benchmark CSV

All 15 materials with experimental, engine, and DFT values for lattice constant, band gap, and magnetic moment, plus percent errors.

Download CSV

Human-readable summary (Markdown)

Methodology, sources, headline metrics, full per-material table, and notes in Markdown format.

Download MD

Tier 2 — relaxed lattice + bulk modulus B

Benchmark summary JSON

Tier 2 headlines, B and relaxed-a aggregates per layer, EOS-fit status per material, and full per-material rows.

Download JSON

Row-level benchmark CSV

All 15 materials with engine and DFT bulk modulus, EOS-derived relaxed lattice, plus per-property percent errors.

Download CSV

Human-readable summary (Markdown)

Tier 2 methodology, EOS-fit status, headlines, and per-material results in Markdown format.

Download MD

Tier 3 — full 40-property engine panel for all 15 materials

Engine panel JSON

All 40 properties × 15 materials, organized by class (structural / mechanical / thermal / electronic / optical / magnetic). One composition-only call per material, ~3 ms each.

Download JSON

Engine panel CSV

Wide table: 15 rows (materials) × 40+ columns (properties with units). One row = one composition-only call.

Download CSV

Human-readable summary (Markdown)

Tier 3 scope, the property-class breakdown, and a worked Si example showing all 40 values from one engine call.

Download MD

Tier 3B (direct DFPT-phonon / dielectric-tensor head-to-head against the engine on a small subset) is a planned follow-up where wall-time investment justifies the more expensive end of the DFT toolchain.

Scope & Limitations

Strengths

15 canonical materials spanning 8 structural families with all DFT runs converged at Tier 1
Sub-1% lattice error on 14 of 15 materials (TiO₂-rutile sits just outside at 1.1%); the previously-flagged wurtzite (GaN, ZnO) and layered (graphite, h-BN, MoS₂) cells are now all under 1% after the latest structural-geometry refinements
Engine band gap median error 1.2%, vs 50.7% for DFT-PBE on the same set
Magnetic moment within 1.9% on Fe and 5.4% on Ni — better than PBE on the same set
Tier 2 engine bulk modulus median error 0.7% (MAPE 6.0%) across all 15 materials
Composition-only input: no crystal structure required from the user
Full reproducibility: manifest, settings, and per-material results are listed in the Downloadable Benchmark Artifacts section above

Known Limitations

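Wurtzite in-plane lattice (GaN, ZnO): the ~6% over-prediction from earlier passes is resolved in this iteration (now −0.2% and −0.6% in the per-material table); TiO₂-rutile at −1.1% is the one remaining row above 1%
Layered in-plane lattice (graphite, h-BN, MoS₂-2H): the earlier ~8% and +32% over-predictions are resolved (all within 0.1% in the per-material table)
MgO band gap underpredicted (−21.3%: engine 6.16 eV vs experimental 7.83 eV); ionic wide-gap closure is being extended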
Tier 2 now scores B across all 15 materials (median 0.7%, MAPE 6.0%); the c-axis projection issue that previously inflated layered MAPE is resolved and graphite / MoS₂ are under 7% with h-BN at 21%. Wurtzite ZnO B (+32%) and h-BN B remain the elevated rows under refinement (planned Tier 2B with anisotropic c/a relaxation)
DFT B at fast quality is noisy even where the EOS fit converges — see footnote on Cu (1036 vs 140) and MgO (922 vs 160). Production-quality DFT is needed for reliable B
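Layered materials' experimental B is in-plane while the engine reports a cell-isotropic B; earlier passes flagged these rows as out-of-scope for the Tier 2 B benchmark, and they are now scored with the rest (graphite +0.6%, h-BN −20.9%, MoS₂ −6.0%)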
Tier 1+2 cover 15 materials; the broader 76- and 195-material benchmark sets are reported on dedicated pages

References

Primary data sources and DFT implementation

O. Madelung, Semiconductors: Data Handbook, 3rd ed., Springer, 2004. Experimental lattice constants and band gaps for III–V, II–VI, IV semiconductors.
D. R. Lide (Ed.), CRC Handbook of Chemistry and Physics, 95th ed., 2014. Experimental lattice constants and elastic moduli for metals and ionic compounds.
J. Enkovaara et al., “Electronic structure calculations with GPAW,” J. Phys. Condens. Matter, 22, 253202 (2010). GPAW DFT package.
A. H. Larsen et al., “The atomic simulation environment—a Python library for working with atoms,” J. Phys. Condens. Matter, 29, 273002 (2017). ASE structure-builder library.
J. P. Perdew, K. Burke, M. Ernzerhof, “Generalized Gradient Approximation Made Simple,” Phys. Rev. Lett., 77, 3865 (1996). PBE exchange–correlation functional.

Try the Materials module

Predict lattice constant, band gap, magnetic moment, mobility, Debye temperature, and 30+ more properties from composition alone — in milliseconds.
