Plain Theory Labs

One honest score for AI compute infrastructure.

The science running on AI infrastructure matters deeply. We built the instrument that measures how well that infrastructure performs.

0.873 · Frontier · NERSC Perlmutter · Operating at the frontier of what this hardware is capable of.

PTL Score · 0.0 to 1.0 · higher is better

0.814 · Optimized · Meridian AI · High performance. Room to grow.

0.736 · Optimized · OLCF Frontier · Strong efficiency. Notable gaps identified.

0.450 · Developing · MIT Supercloud · First assessment. Clear path forward.

Start a pilot · Read the documentation

We built PTL because the measurement didn't exist, and we believed it should.

The framework

ACE

Of the computing capacity allocated to each workload, whether a Slurm job, a Kubernetes pod, or a running agent, how much was actually used? The answer is almost always surprising.
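As a rough illustration of the question ACE asks, and not a description of how PTL computes it, the ratio of used to allocated capacity can be sketched in a few lines. The `Workload` type, the GPU-hour unit, and the sample figures are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical record; field names and units are assumptions for this sketch.
    name: str
    allocated_gpu_hours: float
    used_gpu_hours: float


def utilization(workloads):
    """Return the fraction of allocated capacity that was actually used."""
    allocated = sum(w.allocated_gpu_hours for w in workloads)
    if allocated == 0:
        return 0.0  # nothing was allocated, so there is nothing to measure
    used = sum(w.used_gpu_hours for w in workloads)
    # Cap at 1.0 in case accounting reports more usage than allocation.
    return min(used / allocated, 1.0)


# Invented sample data, not measurements from any real system.
workloads = [
    Workload("slurm-job", allocated_gpu_hours=100.0, used_gpu_hours=40.0),
    Workload("k8s-pod", allocated_gpu_hours=50.0, used_gpu_hours=35.0),
]
print(f"{utilization(workloads):.2f}")
```

The same ratio applies whether the record comes from a scheduler's accounting log or a container runtime; only the source of the two numbers changes.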

COOL

Cooling is the largest source of wasted energy in most facilities. COOL grades thermal efficiency against what is genuinely achievable for this type of facility in this climate.

FLUX

Carbon accounting in AI infrastructure ranges from rigorous to meaningless. FLUX grades the quality of the methodology and detects when claimed renewable coverage is not supported by how the numbers were calculated.

PACE

Scheduler policy determines how compute gets allocated before a single workload runs. PACE grades those decisions and finds the patterns that leave hardware waiting: over-requesting, poor backfill, and fairshare imbalance.

CORE

A cluster optimized for training in 2021 may be the wrong tool for inference in 2026. CORE grades the fit between the hardware deployed and the work being done.

GRADE

GRADE produces the number. It weighs what every engine found and arrives at a PTL Score between 0.0 and 1.0, where 1.0 represents the full potential of the hardware and configuration in use. The score reflects how the infrastructure actually performs, not how it was intended to perform.
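One simple way to picture "weighing what every engine found" is a weighted average of per-engine scores, clamped to the 0.0 to 1.0 range. This is only a sketch under that assumption: the actual weighting is not published here, and the `combine` function, the equal weights, and the per-engine values below are invented for illustration.

```python
def combine(engine_scores, weights):
    """Weighted average of per-engine scores, clamped to [0.0, 1.0].

    Both arguments are dicts keyed by engine name. The weighted-average
    form is an assumption for this sketch, not the published method.
    """
    total_weight = sum(weights[name] for name in engine_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(engine_scores[name] * weights[name] for name in engine_scores)
    return max(0.0, min(1.0, weighted / total_weight))


# Invented example values; not results from any real assessment.
engine_scores = {"ACE": 0.8, "COOL": 0.6, "FLUX": 0.5, "PACE": 0.7, "CORE": 0.9}
weights = {name: 1.0 for name in engine_scores}  # equal weights, assumed

score = combine(engine_scores, weights)
print(f"{score:.3f}")
```

With equal weights this reduces to a plain mean; raising one engine's weight pulls the combined score toward that engine's finding.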

ATLAS

A score is only useful if it points somewhere. ATLAS ranks the changes most likely to improve the next assessment, ordered by impact, not by difficulty.
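Ordering by impact rather than difficulty can be sketched as a sort on a single field. The `Change` type, its fields, and the sample entries are hypothetical and exist only to show the ordering rule; how impact is actually estimated is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Change:
    # Hypothetical record for this sketch.
    description: str
    estimated_gain: float  # assumed: expected increase in the score
    difficulty: int        # carried along, but deliberately not used for ranking


def rank_by_impact(changes):
    """Return changes sorted from largest to smallest estimated gain."""
    return sorted(changes, key=lambda c: c.estimated_gain, reverse=True)


# Invented sample entries, not real findings.
changes = [
    Change("tighten job resource requests", estimated_gain=0.02, difficulty=1),
    Change("enable scheduler backfill", estimated_gain=0.05, difficulty=3),
    Change("adjust fairshare weights", estimated_gain=0.01, difficulty=2),
]

for change in rank_by_impact(changes):
    print(f"{change.estimated_gain:+.2f}  {change.description}")
```

The hardest change in the sample list ranks first because it has the largest estimated gain, which is the behavior the ordering rule calls for.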

CLAW

A NemoClaw-compatible agent that runs inside the infrastructure being assessed. The cluster describes itself.

Work with us

We are taking three pilots this quarter. The findings belong to your organization. We keep the methodology.

research@plaintheory.org