
Documentation

Authentication, API reference, datasets, submission formats, scoring, error codes, rate limits, and anti-gaming — for rotorbench-aero-v0.1.

The full spec for rotorbench-aero-v0.1 is versioned in this repository; see the methodology page for the canonical benchmark definition. Below is a summary plus the fastest route to a successful submission.

Authentication

All API requests must include a Clerk JWT bearer token in the Authorization header. Get your token from the dashboard after signing in. Tokens expire after 24 hours; refresh by signing in again.

  • Header: Authorization: Bearer <token>
  • Token lifetime: 24 hours
  • Refresh: sign in again via the dashboard

curl -H "Authorization: Bearer <your-token>" \
     https://api.comparotor.com/v1/leaderboard
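
Because tokens expire after 24 hours, a long-running client may want to check how much lifetime is left before it starts a job. The following is a minimal sketch, assuming the token is a standard three-part JWT whose payload carries an `exp` claim in seconds since the Unix epoch. The helper names `seconds_until_expiry` and `auth_headers` are hypothetical, and the sketch only decodes the payload; it does not verify the signature.

```python
import base64
import json
import time


def seconds_until_expiry(token, now=None):
    """Return seconds left before the JWT's `exp` claim (negative if expired).

    Decodes the payload only; the signature is NOT verified.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload_b64 = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return payload["exp"] - current


def auth_headers(token):
    """Build the Authorization header in the format described above."""
    return {"Authorization": f"Bearer {token}"}
```

A client could call `seconds_until_expiry(token)` before queuing work and prompt for a fresh sign-in when the result is small or negative.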

1. Task

Predict aerodynamic coefficients (Cl, Cd, Cm) for a 2D airfoil at a single operating point (Re, α, M). Operating envelope:

  • Reynolds number Re ∈ [1e5, 1e7]
  • Angle of attack α ∈ [-10°, 20°] (1° resolution)
  • Mach number M ∈ [0.1, 0.8]
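
The envelope above can be expressed as a small validity check, which is handy for filtering training rows or rejecting out-of-range requests. This is a sketch only: the bounds are taken from the list above and treated as inclusive, while the names `in_envelope` and `alpha_grid` are hypothetical.

```python
# Bounds copied from the operating envelope; closed intervals.
RE_RANGE = (1e5, 1e7)
ALPHA_RANGE_DEG = (-10.0, 20.0)
MACH_RANGE = (0.1, 0.8)


def in_envelope(reynolds, alpha_deg, mach):
    """True if (Re, alpha, M) lies inside the stated operating envelope."""
    return (
        RE_RANGE[0] <= reynolds <= RE_RANGE[1]
        and ALPHA_RANGE_DEG[0] <= alpha_deg <= ALPHA_RANGE_DEG[1]
        and MACH_RANGE[0] <= mach <= MACH_RANGE[1]
    )


def alpha_grid():
    """Angles of attack at 1 degree resolution, both endpoints included."""
    return list(range(-10, 21))
```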

2. Datasets

Public training set

UIUC airfoil database (~1,600 airfoils) labelled by XFOIL across the operating envelope. Distributed from R2 as a single train.parquet, with a companion airfoils.tar.gz archive:

curl -O https://r2.comparotor.com/rotorbench-aero/v0.1/train.parquet
curl -O https://r2.comparotor.com/rotorbench-aero/v0.1/airfoils.tar.gz

Private test set

200 held-out airfoils: 100 perturbed UIUC parents, 50 supercritical sections for out-of-distribution (OOD) testing, and 50 modern eVTOL/wind-blade sections. Labels come from SU2 RANS. Geometry and labels remain private. 50 airfoils are rotated each quarter.

3. Before you submit

Verify all of the following before creating a submission:

  • Model produces valid float32 outputs for Cl, Cd, Cm
  • ONNX model passes onnxruntime inference with example inputs
  • Docker image responds to GET /healthz with HTTP 200
  • Docker image responds to POST /predict with JSON body
  • Submission name is unique within your organisation
  • You own or have rights to the model architecture and weights

4. Submission formats

Two accepted formats. Both are archived in R2 for 24 months.

ONNX + wrapper

Upload model.onnx plus a wrapper.json declaring input/output shapes. Inputs: geometry: float32[200, 2], operating_point: float32[3]. Output: coefficients: float32[3].

Docker image

Container exposing GET /healthz and POST /predict. Resources: 16 GB RAM, 1× NVIDIA T4, 60 s wall-clock per request.

/predict request / response schema

Request body:

{
  "reynolds_number": 6000000,
  "angle_of_attack_deg": 8.0,
  "mach_number": 0.15,
  "airfoil_id": "naca0012"
}

Response body:

{
  "cl": 0.923,
  "cd": 0.00812,
  "cm": -0.0234,
  "latency_ms": 12.4
}
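
To make the two endpoints concrete, here is a minimal sketch of a server that answers GET /healthz and POST /predict with the request and response fields shown above, using only the Python standard library. Several details are assumptions rather than part of the spec: the port (8080), the body returned by /healthz, the shape of error responses, and the `predict` function itself, which is a placeholder (thin-airfoil lift slope, constant drag and moment) standing in for real model inference.

```python
import json
import math
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(reynolds_number, angle_of_attack_deg, mach_number, airfoil_id):
    """Placeholder model; replace with real inference."""
    cl = 2.0 * math.pi * math.radians(angle_of_attack_deg)  # thin-airfoil slope
    return cl, 0.01, 0.0  # (cl, cd, cm)


class Handler(BaseHTTPRequestHandler):
    def _send_json(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        if self.path == "/healthz":
            self._send_json(200, {"status": "ok"})  # body is an assumption
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/predict":
            self._send_json(404, {"error": "not found"})
            return
        start = time.perf_counter()
        length = int(self.headers.get("Content-Length", 0))
        try:
            req = json.loads(self.rfile.read(length))
            cl, cd, cm = predict(
                req["reynolds_number"],
                req["angle_of_attack_deg"],
                req["mach_number"],
                req["airfoil_id"],
            )
        except (ValueError, KeyError, TypeError) as exc:
            self._send_json(400, {"error": str(exc)})
            return
        latency_ms = (time.perf_counter() - start) * 1000.0
        self._send_json(
            200, {"cl": cl, "cd": cd, "cm": cm, "latency_ms": latency_ms}
        )

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=8080):
    """Create the server; call .serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), Handler)
```

Inside a container, the entrypoint would call `make_server().serve_forever()`; a production image would more likely use a proper ASGI or WSGI server, but the request and response handling would look the same.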

5. Scoring

Five metrics. The leaderboard's default sort is the composite below; users can re-sort by any individual metric.

composite = MAE_Cl + 10·MAE_Cd + 0.5·MAE_Cm
          + 0.2·(1 - rho_LD)
          + 0.1·OOD_score
          + 0.001·latency_p50_ms

  • MAE on Cl, Cd, Cm: pointwise mean absolute error
  • ρ L/D (rho_LD in the formula): Spearman rank correlation of L/D across airfoils per operating point
  • OOD score: error on the supercritical-airfoil subset
  • Latency p50 / p99: wall-clock on the reference T4 container
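
The composite formula is a plain weighted sum, so it is easy to reproduce locally when comparing candidate models. The sketch below mirrors the weights given above; the function names are hypothetical, and it assumes the individual metrics have already been computed. Since every term is an error or a cost, a lower composite appears to be better.

```python
def mean_absolute_error(predicted, reference):
    """Pointwise mean absolute error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def composite_score(mae_cl, mae_cd, mae_cm, rho_ld, ood_score, latency_p50_ms):
    """Weighted sum from the scoring formula above."""
    return (
        mae_cl
        + 10.0 * mae_cd
        + 0.5 * mae_cm
        + 0.2 * (1.0 - rho_ld)
        + 0.1 * ood_score
        + 0.001 * latency_p50_ms
    )
```

For example, with MAE_Cl = 0.02, MAE_Cd = 0.001, MAE_Cm = 0.01, rho_LD = 0.9, OOD_score = 0.05, and a 12 ms median latency, the terms are 0.02 + 0.01 + 0.005 + 0.02 + 0.005 + 0.012, giving a composite of about 0.072. Note that at these magnitudes latency contributes as much as the drag error.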

6. Submission API (fastest path)

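# Assumes TOKEN holds your bearer token and API is the base URL:
API=https://api.comparotor.com/v1

# 1. Register the submission (returns submission_id, upload_url, finalize_url)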
curl -X POST $API/submissions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "my-rotor-net",
    "model_version": "1.0.0",
    "format": "onnx",
    "visibility": "public",
    "benchmark": "rotorbench-aero-v0.1"
  }'

# 2. Upload the ONNX artefact to the upload_url returned in step 1
curl -X PUT "$UPLOAD_URL" \
  -H "Authorization: Bearer $TOKEN" \
  --data-binary @model.onnx

# 3. Finalize and queue evaluation via the finalize_url returned in step 1
curl -X POST "$FINALIZE_URL" \
  -H "Authorization: Bearer $TOKEN"

# 4. Poll the run (every request needs the Authorization header)
curl "$API/runs/$RUN_ID" \
  -H "Authorization: Bearer $TOKEN"
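
The registration step can also be driven from Python. The sketch below builds the same POST request as step 1 with `urllib.request`; the function names `build_register_request` and `send` are hypothetical, and the response fields are assumed to be the ones listed for POST /v1/submissions (submission_id, upload_url, finalize_url).

```python
import json
import urllib.request

API = "https://api.comparotor.com/v1"


def build_register_request(
    token,
    model_name,
    model_version,
    fmt="onnx",
    visibility="public",
    benchmark="rotorbench-aero-v0.1",
):
    """Build (but do not send) the POST /submissions request from step 1."""
    body = {
        "model_name": model_name,
        "model_version": model_version,
        "format": fmt,
        "visibility": visibility,
        "benchmark": benchmark,
    }
    return urllib.request.Request(
        f"{API}/submissions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())
```

Calling `send(build_register_request(token, "my-rotor-net", "1.0.0"))` would then return the dictionary from which the upload and finalize URLs for steps 2 and 3 are read.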

7. API quick reference

All endpoints are under https://api.comparotor.com/v1. Every request requires Authorization: Bearer <token>.

Method | Endpoint | Description
POST | /v1/submissions | Create a new submission. Body: model_name, model_version, format, visibility, benchmark. Returns submission_id, upload_url, finalize_url.
GET | /v1/submissions/{id} | Get submission status. Returns status, run_id, created_at.
GET | /v1/runs/{id} | Get run details and scores. Returns composite, mae_cl, mae_cd, mae_cm, rho_ld, ood_score, latency_p50_ms, status.
GET | /v1/leaderboard | Public leaderboard entries. Query params: benchmark, sort, limit, offset.
POST | /v1/runs/{id}/replay | Request a replay evaluation against the current test set. Returns new run_id.
GET | /v1/reports/{run_id} | Get a signed PDF report URL. Returns url, expires_at.

8. Error codes

Code | Meaning
400 | Invalid submission format or missing required fields
401 | Missing or expired authentication token
403 | Plan limit reached (upgrade required)
404 | Submission or run not found
409 | Duplicate submission (same artefact SHA)
422 | Model failed health check (/healthz returned non-200)
429 | Rate limit exceeded
500 | Evaluation runner error; contact support

9. Rate limits

Plan | Runs | API calls
Free | 1 public run / week |
Vendor Proof | 50 private runs / month | 10 calls / minute
Procurement Evaluation | Unlimited runs | 60 calls / minute

Rate limit status is returned in response headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
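
A client can use these headers to back off before it hits a 429. The following sketch assumes that X-RateLimit-Reset is a Unix timestamp in seconds, which this page does not state, and that the headers arrive in a plain dictionary keyed by the exact names above. The function name `seconds_to_wait` is hypothetical.

```python
def seconds_to_wait(headers, now):
    """Return how long to pause before the next call, in seconds.

    Assumes X-RateLimit-Reset is a Unix timestamp (seconds); adjust if the
    API reports a relative delay instead.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # budget left in the current window
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

Passing `time.time()` as `now` and sleeping for the returned duration keeps a polling loop within the per-minute call limits.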

10. Anti-gaming

  • Plan-tier rate limits (Free: 1/week, Vendor Proof: 50/month)
  • 50-airfoil quarterly rotation of the private test set
  • All submission artefacts archived in R2 for 24 months — replay against fresh sets if contamination is suspected
  • Sub-percent Gaussian geometry obfuscation breaks exact-match memorisation
  • Optional manual review for SOTA-claim submissions (composite better than current best by >2σ)
  • Two-run reproducibility check; flag if results disagree above numerical tolerance
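
As an illustration of the two-run reproducibility check, the sketch below compares the accuracy metrics of two runs and reports which ones disagree. It is not the evaluator's actual implementation: the tolerances are placeholder assumptions, the function name is hypothetical, and the metric keys are borrowed from the fields returned by GET /v1/runs/{id}. Latency and the composite (which includes latency) are left out because wall-clock time naturally varies between runs.

```python
import math

# Accuracy metrics expected to be deterministic across identical runs.
DETERMINISTIC_METRICS = ("mae_cl", "mae_cd", "mae_cm", "rho_ld", "ood_score")


def disagreeing_metrics(run_a, run_b, rel_tol=1e-6, abs_tol=1e-9):
    """Return the metric names whose values differ beyond the tolerance."""
    return [
        name
        for name in DETERMINISTIC_METRICS
        if not math.isclose(
            run_a[name], run_b[name], rel_tol=rel_tol, abs_tol=abs_tol
        )
    ]
```

An empty list means the two runs agree within tolerance; any returned name would be grounds for flagging the submission.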

11. Security model

See the security page for full details. Key guarantees:

  • Docker containers run with outbound network access blocked during evaluation
  • ONNX models run in an isolated inference process
  • Private submissions are never shared or used to train any Comparotor model

Want the full spec?

The Benchmark specification on the methodology page is the source of truth, including the SU2 RANS generation pipeline, the full D1 schema, and the OpenAPI contract at apps/api/openapi.yaml.

Questions? Email [email protected] — solo founder, replies same-day.

Getting help