The science running on AI infrastructure matters deeply.
We built the instrument that measures how well that infrastructure performs.
The cluster characterizes itself first.
PROFILE runs before any other engine. It identifies the scheduler, hardware, and workload types in use — then routes each engine to the highest-fidelity data source available. Everything that follows depends on what PROFILE finds.
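Routing each engine to the highest-fidelity source available can be sketched as a preference list per detected component. This is an illustrative sketch only; the component names, source names, and `route` function are assumptions, not PROFILE's actual interface:

```python
from typing import Optional

# Hypothetical fidelity ordering: highest-fidelity source first.
SOURCE_PREFERENCE = {
    "gpu": ["dcgm", "nvml", "nvidia-smi"],
    "scheduler": ["slurm_rest", "sacct", "scheduler_logs"],
}

def route(component: str, available: set[str]) -> Optional[str]:
    """Return the best available telemetry source for a component, or None."""
    for source in SOURCE_PREFERENCE.get(component, []):
        if source in available:
            return source
    return None
```

A cluster exposing only NVML and `nvidia-smi` would route GPU collection to NVML, the richer of the two it has.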
GPU utilization, measured honestly.
Of the GPU capacity allocated to each workload, how much was actually used? The answer is almost always surprising. ACE finds the gap between what was requested and what ran.
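The gap ACE looks for reduces to a simple comparison between allocated capacity and observed utilization. A minimal sketch, assuming per-workload utilization samples; the record shape and field names here are illustrative, not ACE's data model:

```python
from dataclasses import dataclass

@dataclass
class WorkloadSample:
    """Illustrative record: GPUs allocated vs. average utilization observed."""
    name: str
    gpus_allocated: int
    avg_gpu_utilization: float  # 0.0-1.0, averaged over the workload's runtime

def utilization_gap(sample: WorkloadSample) -> float:
    """Fraction of allocated GPU capacity that went unused."""
    return 1.0 - sample.avg_gpu_utilization

jobs = [
    WorkloadSample("train-run", gpus_allocated=64, avg_gpu_utilization=0.38),
    WorkloadSample("batch-infer", gpus_allocated=8, avg_gpu_utilization=0.81),
]
for job in jobs:
    idle = utilization_gap(job) * job.gpus_allocated
    print(f"{job.name}: ~{idle:.1f} of {job.gpus_allocated} GPUs idle on average")
```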
Energy in versus useful work out.
Cooling is the largest non-compute energy cost in most facilities. COOL grades thermal efficiency against what is genuinely achievable for this facility type and climate — not an industry average.
Is the carbon accounting actually traceable?
Carbon accounting in AI infrastructure ranges from rigorous to meaningless. FLUX grades the methodology and detects when claimed renewable coverage cannot be traced from grid to certificate to claim.
The decisions made before work starts.
Scheduler policy determines how compute gets allocated before a single workload runs. PACE grades those decisions and finds the patterns that leave hardware waiting: over-requesting, poor backfill, fairshare imbalance.
Is this the right hardware for this work?
A cluster optimized for training in 2021 may be the wrong tool for inference in 2026. CORE grades hardware-to-workload fit, fleet age, and whether the infrastructure was designed for what it is actually running.
Five engines become one number.
GRADE weighs what every engine found and produces a PTL Score between 0.0 and 1.0. The weights are published. The formula is deterministic. The same inputs always produce the same number.
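A deterministic weighted combination of engine findings can be sketched as below. The engine names come from this page, but the weights are placeholders, not PTL's published values, and `ptl_score` is an illustration of the property described (same inputs, same number), not the actual formula:

```python
# Placeholder weights -- PTL publishes its own. They must sum to 1.0 so that
# engine scores in [0.0, 1.0] yield a combined score in [0.0, 1.0].
WEIGHTS = {
    "ACE": 0.25,
    "COOL": 0.20,
    "FLUX": 0.15,
    "PACE": 0.20,
    "CORE": 0.20,
}

def ptl_score(engine_scores: dict[str, float]) -> float:
    """Deterministic weighted sum: the same inputs always produce the same number."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    total = sum(WEIGHTS[name] * engine_scores[name] for name in WEIGHTS)
    return round(total, 3)
```

Because the function has no randomness and no hidden state, re-running an assessment on identical inputs reproduces the score exactly.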
Ranked by impact, not difficulty.
A score is only useful if it points somewhere. ATLAS ranks the changes most likely to improve the next assessment. Every recommendation is specific to this cluster and this workload profile. No generic advice.
The cluster describes itself.
A NemoClaw-compatible agent that runs inside the infrastructure being assessed. No forms. No manual exports. CLAW collects metrics from DCGM, Slurm, Kubernetes, and inference servers, then packages them for assessment.
AI infrastructure should perform at the level of the science it supports.
research@plaintheory.org