The evaluation problem
Physics-ML vendors make bold accuracy claims, but each claim rests on the vendor's own internal benchmark and proprietary dataset, so the numbers cannot be compared directly. Comparotor gives procurement teams a single, neutral evaluation framework: the same dataset, the same SU2 reference, and the same scoring formula for every vendor.
Why Comparotor for procurement
Vendor comparison
Evaluate multiple vendors on the same sealed test set. Multi-vendor comparisons are private: no vendor sees another vendor's results or submission.
Procurement-ready reports
Signed PDF reports include the dataset version, a scoring breakdown, out-of-distribution (OOD) performance, and a procurement interpretation section. Reports are designed to survive engineering governance review.
Contamination resistance
Quarterly test-set rotation keeps vendors from overfitting to the benchmark, so evaluation results remain meaningful across procurement cycles.
Replay audit
Every evaluation can be replayed against the archived test set. Results remain verifiable for 24 months after evaluation, whether for contract disputes or post-procurement review.
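For teams that want to see what a replay check could involve, here is a minimal Python sketch. It is not Comparotor's implementation: the record fields, the `rescore` callable, the SHA-256 integrity check, and the reading of "24 months" as a calendar-month window are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass(frozen=True)
class EvaluationRecord:
    """Hypothetical audit record; field names are illustrative only."""
    dataset_version: str
    test_set_sha256: str
    reported_score: float
    evaluated_on: date


def within_audit_window(evaluated_on: date, today: date,
                        window_months: int = 24) -> bool:
    """Return True if `today` is no later than `window_months` after evaluation."""
    months = (today.year - evaluated_on.year) * 12 + (today.month - evaluated_on.month)
    if today.day > evaluated_on.day:
        months += 1  # a partly elapsed month counts as a full month
    return months <= window_months


def replay_matches(record: EvaluationRecord, archived_test_set: bytes,
                   rescore: Callable[[bytes], float],
                   tolerance: float = 1e-9) -> bool:
    """Check the archive is byte-identical, then compare the recomputed score."""
    digest = hashlib.sha256(archived_test_set).hexdigest()
    if digest != record.test_set_sha256:
        return False  # archive differs from the set used in the evaluation
    return abs(rescore(archived_test_set) - record.reported_score) <= tolerance
```

In this sketch a replay is accepted only when the archived test set is byte-identical to the one originally scored and the recomputed score agrees with the reported score within a small tolerance.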
Procurement workflow
Benchmark coverage
Frequently asked questions
Discuss your evaluation
Our team can walk you through multi-vendor evaluation design, managed submission options, and Enterprise plan terms.