Experimental Reference Benchmark

Validated Reference Points: 1,483 (atomic + molecular scalar targets)

Weighted Raw MAPE: 0.176% (official no-fit score)

Fitted Corrections: 0 (no training, no calibration)

DFT Targets: 0 (computed-only rows excluded)

Primary result

Each of the 1,483 validated targets is scored by comparing the raw Flux formula value with its experimental/reference value. The weighted raw MAPE across all targets is 0.176%. The score involves no training, no fitted corrections, and no DFT targets.

Benchmark Breakdown

Raw Flux outputs against embedded experimental/reference targets, with no fitted correction

Target family	N	Raw MAPE	Raw MAE	Unit	Embedded reference basis
All scored rows	1,483	0.176%	weighted by target family	mixed	Validated rows only; DFT/computed-only rows excluded
Atomic ionization energy	86	0.149%	0.0110	eV	NIST atomic spectroscopy references
Atomic electron affinity	80	0.518%	0.0048	eV	NIST/CRC electron-affinity references
Atomic covalent radius	86	0.356%	0.496	pm	CRC/Cordero-style covalent-radius references
Atomic polarizability	94	0.056%	0.0078	Å³	CRC atomic polarizability references
Atomic electronegativity	15	0.000%	0.0000	Pauling	Validated embedded Pauling references only
Molecular bond length	330	0.130%	0.197	pm	CCCBDB/NIST/CRC bond-length references
Molecular bond angle	149	0.064%	0.069	deg	CCCBDB/NIST/CRC bond-angle references
Molecular dipole moment	184	0.295%	0.0044	D	CCCBDB/NIST/CRC dipole references
Expanded dipole moment	158	0.237%	0.0064	D	Alternative Flux dipole unit, same references
Molecular polarizability	151	0.166%	0.0098	Å³	CCCBDB/NIST/CRC molecular polarizability references
Molecular ionization energy	150	0.010%	0.0010	eV	CCCBDB/NIST/CRC molecular ionization references
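
The headline figure can be cross-checked from the per-family rows above. The following is a minimal sketch, assuming the aggregate is the per-family raw MAPE weighted by row count N (as the Methodology section states). The dictionary and function names are illustrative and not part of any FluxMateria code; the numbers are copied from the table.

```python
# Per-family (N, raw MAPE in percent), copied from the breakdown table.
# The names FAMILIES and weighted_mape are illustrative only.
FAMILIES = {
    "Atomic ionization energy": (86, 0.149),
    "Atomic electron affinity": (80, 0.518),
    "Atomic covalent radius": (86, 0.356),
    "Atomic polarizability": (94, 0.056),
    "Atomic electronegativity": (15, 0.000),
    "Molecular bond length": (330, 0.130),
    "Molecular bond angle": (149, 0.064),
    "Molecular dipole moment": (184, 0.295),
    "Expanded dipole moment": (158, 0.237),
    "Molecular polarizability": (151, 0.166),
    "Molecular ionization energy": (150, 0.010),
}


def weighted_mape(families):
    """Return (total rows, row-count-weighted MAPE in percent)."""
    total_rows = sum(n for n, _ in families.values())
    weighted_sum = sum(n * mape for n, mape in families.values())
    return total_rows, weighted_sum / total_rows


total_rows, headline = weighted_mape(FAMILIES)
print(len(FAMILIES), total_rows, round(headline, 3))  # 11 1483 0.176
```

Because the published per-family values are already rounded to three decimals, this check reproduces the headline only to the same precision.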

Scoring Policy

The benchmark is intentionally strict about what counts

Included

Rows with a raw Flux formula value, an explicit reference value, and validated=true.

Excluded

Rows without references, rows not marked validated, and computed-only or DFT-only targets.

No fitting

The official score uses the raw Flux value. Global linear-fit diagnostics are not counted as accuracy claims.

Reproducible

A public-safe summary export is linked below with the benchmark policy, coverage, and target-family metrics.
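
The inclusion and exclusion rules above amount to a simple row filter. The sketch below is one way to express it, assuming a row record with the fields named on this page. The class, the field names (`flux_value`, `reference_value`, `validated`, `computed_only`), and the sample rows are all hypothetical placeholders, not benchmark data or the actual table schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    # Hypothetical row layout; the real table schema is not published here.
    target: str
    flux_value: Optional[float]
    reference_value: Optional[float]
    validated: bool
    computed_only: bool = False  # True for DFT-only / computed-only targets


def is_scored(row: Row) -> bool:
    """Apply the policy: raw value present, reference present, validated, not computed-only."""
    return (
        row.flux_value is not None
        and row.reference_value is not None
        and row.validated
        and not row.computed_only
    )


# Placeholder rows for illustration only.
rows = [
    Row("example-a", 1.00, 1.01, validated=True),
    Row("example-b", 2.00, None, validated=True),    # excluded: no reference
    Row("example-c", 3.00, 3.02, validated=False),   # excluded: not validated
    Row("example-d", 4.00, 4.05, validated=True, computed_only=True),  # excluded: computed-only
]

scored = [r.target for r in rows if is_scored(r)]
print(scored)  # ['example-a']
```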

Benchmark Data

Public-safe summary artifact for reviewers and readers

Experimental reference summary

The downloadable JSON contains the benchmark policy, aggregate score, coverage totals, and per-family raw MAPE/MAE values. It intentionally omits implementation details and private repository structure.
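
For readers who want to consume the artifact programmatically, a structural check might look like the sketch below. The actual key names in the JSON are not stated on this page, so every key here (`policy`, `aggregate`, `coverage`, `families`, and the nested names) is a placeholder to be replaced with whatever the downloaded file uses. The values are taken from this page, and only two families are shown.

```python
import json

# Hypothetical summary shape: key names are placeholders, values come from this page.
summary_text = json.dumps({
    "policy": {"fitted_corrections": 0, "dft_targets": 0},
    "aggregate": {"weighted_raw_mape_percent": 0.176},
    "coverage": {"scored_targets": 1483, "target_families": 11},
    "families": {
        "Atomic ionization energy": {"n": 86, "raw_mape_percent": 0.149, "raw_mae": 0.0110, "unit": "eV"},
        "Molecular bond angle": {"n": 149, "raw_mape_percent": 0.064, "raw_mae": 0.069, "unit": "deg"},
    },
})

# The four content areas the page says the artifact contains (names assumed).
REQUIRED_SECTIONS = ("policy", "aggregate", "coverage", "families")


def missing_sections(text):
    """Return the assumed top-level sections that are absent from the JSON text."""
    data = json.loads(text)
    return [key for key in REQUIRED_SECTIONS if key not in data]


summary = json.loads(summary_text)
print(missing_sections(summary_text))          # []
print(summary["coverage"]["scored_targets"])   # 1483
```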

1,483 scored targets

11 target families

0.176% weighted raw MAPE

Summary JSON

Machine-readable public artifact for the published benchmark page.

Download summary

Runtime Context

How this benchmark sits relative to common computational routes

Method family	Typical runtime posture	What it would mean for this 1,483-point panel	Important caveat
FluxMateria raw formula engines	Closed-form Flux physics formula evaluation.	Interactive-scale evaluation for the full scalar-property panel. Cached values are convenience artifacts from the same Flux formulas, not trained or fitted surrogates.	Applies to the validated target families listed on this page.
DFT / Kohn-Sham electronic structure	Self-consistent quantum calculation; conventional diagonalization is commonly the expensive step.	Would require many independent electronic-structure jobs, often plus geometry optimization or response calculations depending on the target.	DFT output is not itself an experimental target, and sub-1% agreement for a given scalar observable depends on method choices such as functional and basis set.
Semiempirical quantum methods	Much faster than ab initio quantum chemistry; often used for geometry optimization and prescreening.	Could run many small cases quickly, but accuracy is method- and chemistry-dependent.	Parameterized/semiempirical by design; not the same claim as raw no-fit Flux formulas versus references.
Classical force fields	Very fast molecular mechanics.	Useful for structure, conformations, and dynamics, but not a general route to atomic ionization energies, electron affinities, or quantum response properties.	Parameter coverage and transferability define the valid domain.
ML potentials / learned surrogates	Fast at inference after training.	Can be excellent inside the training domain, especially for energies and forces.	This benchmark excludes training-derived predictions from the official score.

The useful comparison is not that DFT is “slow” and Flux is “fast.” The sharper point is that this page scores against experimental/reference scalar values directly. DFT, semiempirical methods, force fields, and ML models are alternative computational routes; they would need their own fixed protocol and target-family score to make a strict apples-to-apples comparison.

Background references: the electronic-structure literature, including the Acta Numerica review of Kohn-Sham numerical methods, widely discusses the diagonalization and scaling costs of conventional Kohn-Sham implementations. The ORCA semiempirical-method documentation positions the xTB/GFN methods as fast approximate quantum methods, and the xTB documentation describes GFN-FF as combining force-field speed with near-quantum-mechanical accuracy for structures and dynamics.

Methodology

How the benchmark report is generated

Benchmark source

The benchmark uses curated atomic and molecular formula tables with row-level Flux values, reference values, and validation flags. The current report spans atomic constants and molecular scalar observables used throughout the FluxMateria chemistry stack.

Official metric

The official score is raw mean absolute percentage error (MAPE) between the Flux value and the embedded reference value. The aggregate headline is weighted by the number of scored rows per target family. No model is trained, no target-specific fit is applied, and no DFT output is accepted as a target for this page.
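
Written out, the metric described above takes roughly the following form. This assumes the standard MAPE definition with the reference value in the denominator, which the page implies but does not spell out. The symbols are introduced here for illustration: $x_i^{\mathrm{Flux}}$ is the raw Flux value for row $i$, $x_i^{\mathrm{ref}}$ the embedded reference value, and $N_f$ the number of scored rows in target family $f$.

```latex
\mathrm{MAPE}_f = \frac{100\%}{N_f} \sum_{i=1}^{N_f}
  \frac{\left| x_i^{\mathrm{Flux}} - x_i^{\mathrm{ref}} \right|}{\left| x_i^{\mathrm{ref}} \right|},
\qquad
\mathrm{MAE}_f = \frac{1}{N_f} \sum_{i=1}^{N_f}
  \left| x_i^{\mathrm{Flux}} - x_i^{\mathrm{ref}} \right|,
\qquad
\mathrm{MAPE}_{\mathrm{all}} = \frac{\sum_f N_f \, \mathrm{MAPE}_f}{\sum_f N_f}
```

Under this definition the row-weighted aggregate equals the plain mean of the per-row percentage errors over all 1,483 scored rows. MAE carries the unit of its family and is therefore reported per family rather than pooled.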

Evidence packet

The public summary artifact reports target-family metrics without exposing private repository structure or implementation details. Full row-level provenance can be provided in a reviewer packet under the normal validation workflow.

What this benchmark does NOT claim

Not a validation of spatial density, visualization output, or 3D molecular shape. This page covers scalar atomic and molecular properties only.
Not a DFT comparison. DFT/computed-only targets are explicitly excluded from this score.
Not a trained surrogate result. The rows are scored from raw Flux formulas against embedded references.
Not a universal guarantee across every chemical observable. It covers the target families listed above.
Not a substitute for external blind validation. It is a published reference benchmark with explicit scope.
