API reference

Integrate the Verifiable Labs platform.

A typed REST API over the verifier engines, telemetry, datasets, keys, and usage. JSON in, JSON out, with RFC 7807 errors, two auth planes, and one base URL.

Get started

Quickstart

Zero to a gated training loop in four steps: create a key, certify a reward, run your first IPT scan, then post per-batch telemetry. Everything below is copy-paste curl and Python; full request and response shapes for each call live in the sections that follow, and the two credential planes are covered in Authentication.

1 · Create an API key, then export it

keys · bash

curl -X POST https://api.verifiable-labs.com/v1/keys \
  -H "Authorization: Bearer $CLERK_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "training-loop" }'

shell · bash

export VLABS_KEY=vlk_...   # plaintext_key from the response above, or from the dashboard keys page

2 · Certify the reward before you train on it

certify · bash

curl -X POST https://api.verifiable-labs.com/v1/engines/verifier-robustness/certify \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "gcd",
    "max_mutants": 40,
    "weak_threshold": 0.9
  }'

3 · Run your first IPT scan

ipt-scan · bash

curl -X POST https://api.verifiable-labs.com/v1/engines/verifier-robustness/ipt-scan \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "gcd",
    "candidate_kind": "dict_hack",
    "n_sigma": 5,
    "n_cases": 8
  }'

Step 3 in Python — same endpoint, same fields

ipt_scan.py · python

# python -m pip install requests
import os
import requests

resp = requests.post(
    "https://api.verifiable-labs.com/v1/engines/verifier-robustness/ipt-scan",
    headers={"X-Vlabs-Key": os.environ["VLABS_KEY"]},
    json={
        "task_id": "gcd",
        "candidate_kind": "dict_hack",
        "n_sigma": 5,
        "n_cases": 8,
    },
    timeout=60,
)
resp.raise_for_status()
report = resp.json()
print(report["is_shortcut"], report["evidence"])

4 · Post per-batch telemetry from your training loop

verifier-audits/ingest · bash

curl -X POST https://api.verifiable-labs.com/v1/verifier-audits/ingest \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audit_series_id": "rl-run-2026-06",
    "batch_number": 0,
    "candidates_tested": 256,
    "shortcuts_detected": 12,
    "mean_hacking_gap": 0.18,
    "mean_invariance_violation": 0.05
  }'

Overview

Introduction

Every endpoint lives under a single versioned base URL. Send and receive JSON; authenticate with the header for the plane you are calling (see Authentication). Errors are always application/problem+json.

Base URL: https://api.verifiable-labs.com

Version prefix: /v1

Content type: application/json

Errors: application/problem+json (RFC 7807)

Auth planes: X-Vlabs-Key · Clerk Bearer

base url · http

https://api.verifiable-labs.com/v1

Auth

Authentication

Two planes, two credentials. The data plane runs your engine, audit and dataset calls and is authenticated with a vlk_ API key. The management plane issues and revokes those keys and reads your account, and is authenticated with the dashboard’s Clerk session token.

X-Vlabs-Key · Data plane

Send your secret API key in the X-Vlabs-Key header. Keys are prefixed vlk_ and shown in full exactly once, at creation — copy it then. Used by the verifier engines, audit ingest, datasets and usage.

X-Vlabs-Key: vlk_3a7c9f2b…

Bearer · Management plane

Send the dashboard-issued Clerk session token as a Bearer credential. Used to create / list / revoke API keys and to read your own usage. Never put a Clerk token into a data-plane call.

Authorization: Bearer <clerk-session-jwt>

Keep keys server-side

A vlk_ key is a bearer secret with full data-plane access for your account. Keep it on your servers (or in CI secrets), never in a browser bundle or a public repo. Rotate by creating a new key and revoking the old one — see API keys.
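
A minimal sketch of keeping the two planes apart in client code. The helper name `auth_headers` and the plane labels "data" and "management" are illustrative and not part of the API; only the header names and the vlk_ prefix come from this page.

```python
def auth_headers(plane: str, credential: str) -> dict:
    """Return the auth header for one plane (helper name is illustrative)."""
    if plane == "data":
        # Data-plane keys are documented as prefixed with vlk_.
        if not credential.startswith("vlk_"):
            raise ValueError("data-plane calls expect a vlk_ API key")
        return {"X-Vlabs-Key": credential}
    if plane == "management":
        # Catch the opposite mix-up: an API key sent as a Bearer token.
        if credential.startswith("vlk_"):
            raise ValueError("management-plane calls expect a Clerk session token")
        return {"Authorization": f"Bearer {credential}"}
    raise ValueError(f"unknown plane: {plane!r}")
```

Failing early on a mismatched credential keeps a Clerk token from ever being attached to a data-plane request.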

Core · data plane

Verifier engines

The paid engines that measure how gameable a reward is and whether a candidate is a reward shortcut. Each call debits your verification-scan quota and runs only server-constructed code on registered tasks — your reference code and test cases never leave your trust domain.

Scope

IPT (Isomorphic Perturbation Testing) is judge-free and deterministic, but it needs a trusted reference. The reference-less integrity scan covers documented tamper patterns without one. Where neither layer can reach a conclusion, the gate returns a LIMIT verdict, which is held for human review. Measured figures for both layers, from reproducible, dataset-specific runs, are published on the platform page.

POST /v1/engines/verifier-robustness/certify · X-Vlabs-Key

Certify how gameable a registered task's single-provided-test reward is before you train on it. The engine runs mutation testing on its own reference and reports deployed_kill_score (what the shipped reward catches), trusted_kill_score (the isomorphic re-grade), and a gameable verdict.

Cost: 1 score unit

Request

certify · bash

curl -X POST https://api.verifiable-labs.com/v1/engines/verifier-robustness/certify \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "gcd",
    "max_mutants": 40,
    "weak_threshold": 0.9
  }'

Response · 200

VerifierCertifyResponse · json

{
  "task_id": "gcd",
  "n_mutants": 40,
  "deployed_kill_score": 0.41,
  "trusted_kill_score": 0.97,
  "exploitable_gap": 0.56,
  "gameable": true,
  "headline": "single provided test misses 56% of reference deviations",
  "engine_version": "vr-2026.06"
}

Field	Type	Description
`task_id` (required)	string	A server-registered coding task id.
`max_mutants` (optional)	int (5–200)	Reference variants to generate. Default 40.
`weak_threshold` (optional)	float (0–1]	Kill-score below which the reward is flagged gameable. Default 0.9.
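
A small sketch of checking the request against the ranges in the table before sending it, and of reading the two kill scores back. The function names are hypothetical. The relation `trusted_kill_score - deployed_kill_score` is an assumption: it matches the sample response (0.97 - 0.41 = 0.56) but is not stated on this page.

```python
def build_certify_request(task_id: str, max_mutants: int = 40,
                          weak_threshold: float = 0.9) -> dict:
    """Validate against the documented ranges and return the JSON body."""
    if not 5 <= max_mutants <= 200:
        raise ValueError("max_mutants must be in 5..200")
    if not 0 < weak_threshold <= 1:
        raise ValueError("weak_threshold must be in (0, 1]")
    return {
        "task_id": task_id,
        "max_mutants": max_mutants,
        "weak_threshold": weak_threshold,
    }


def kill_score_gap(response: dict) -> float:
    """Assumed relation: trusted_kill_score minus deployed_kill_score."""
    gap = response["trusted_kill_score"] - response["deployed_kill_score"]
    return round(gap, 6)  # rounding hides binary floating-point noise
```

In practice the server-reported `exploitable_gap` and `gameable` fields are the values to act on; the local computation is only a sanity check.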

POST /v1/engines/verifier-robustness/ipt-scan · X-Vlabs-Key

Run Isomorphic Perturbation Testing — an established verifier-integrity technique — over a registered task and a declarative candidate. Reports whether the candidate passed the public suite but is not invariant under a semantics-preserving relabeling — i.e. is a reward shortcut.

Cost: 1 score unit

Request

ipt-scan · bash

curl -X POST https://api.verifiable-labs.com/v1/engines/verifier-robustness/ipt-scan \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "gcd",
    "candidate_kind": "dict_hack",
    "n_sigma": 5,
    "n_cases": 8
  }'

Response · 200

IPTScanResponse · json

{
  "task_id": "gcd",
  "extensional_pass": true,
  "isomorphic_pass": false,
  "is_shortcut": true,
  "decisive": true,
  "hacking_gap": 0.31,
  "invariance_violation_rate": 0.22,
  "extensional_pass_rate": 1.0,
  "isomorphic_pass_rate": 0.78,
  "n_sigma": 5,
  "isomorphic_cases_evaluated": 8,
  "broken_dimension": "relabel_invariance",
  "evidence": "passes provided test; fails isomorphic re-grade",
  "config_hash": "c0ffee…",
  "candidate_hash": "9a1b…",
  "reference_hash": "4d2e…"
}

Field	Type	Description
`task_id` (required)	string	A server-registered coding task id.
`candidate_kind` (optional)	"genuine" \| "dict_hack" \| "ifchain_hack"	The server-constructed candidate to scan. Provide this (preferred) instead of candidate_code.
`n_sigma` (optional)	int (1–20)	Isomorphic relabelings per case. Default 5.
`n_cases` (optional)	int (1–50)	Isomorphic cases evaluated. Default 8.
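
One possible way to turn an IPTScanResponse into a training-loop decision. The function names and the labels "accept", "reject" and "review" are hypothetical. Treating `decisive: false` as "send to review" is an assumption, since this page does not define that field.

```python
def passes_suite_but_not_invariant(report: dict) -> bool:
    """The shortcut pattern described above: extensional pass, isomorphic fail."""
    return bool(report["extensional_pass"]) and not report["isomorphic_pass"]


def ipt_decision(report: dict) -> str:
    """Map a scan report to a hypothetical gate label."""
    if not report.get("decisive", False):
        # Assumption: a non-decisive scan should not be acted on automatically.
        return "review"
    return "reject" if report["is_shortcut"] else "accept"
```

The server's `is_shortcut` field drives the decision; the first helper only cross-checks it against the two pass flags.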

POST /v1/engines/contamination/score · X-Vlabs-Key

Score candidate items for data-contamination risk and resolve release / train / hidden-eval permissions. Returns the Contamination Firewall report — an aggregate DCR score, risk band, the three decisions and a per-scorer breakdown.

Cost: 1 score unit per item

Request

contamination/score · bash

curl -X POST https://api.verifiable-labs.com/v1/engines/contamination/score \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "items": ["def add(a, b): return a + b", "..."],
    "split": "train",
    "train_allowed": false,
    "public_release_allowed": false
  }'

Response · 200

ContaminationReport · json

{
  "dcr_score": 0.12,
  "risk_level": "low",
  "release_allowed": true,
  "train_allowed": true,
  "hidden_eval_allowed": true,
  "reasons": ["no near-duplicate eval overlap detected"],
  "components": [
    { "component": "exact_duplicate", "score": 0.10, "weight": 1.0, "placeholder": false, "detail": "…" },
    { "component": "near_duplicate", "score": 0.14, "weight": 1.0, "placeholder": false, "detail": "…" }
  ]
}

Field	Type	Description
`items` (required)	string[] (1–1000)	Candidate strings scored against the eval. Quota debit = items.length.
`split` (required)	"train" \| "validation" \| "hidden_eval" \| "public_demo"	Which split these items belong to.
`train_allowed` (optional)	bool	Caller's training-use intent. Default false.
`public_release_allowed` (optional)	bool	Caller's public-release intent. Default false.
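
Because `items` is capped at 1000 per call and each item debits one score unit, a larger corpus has to be split across requests. The sketch below builds the request bodies and totals the expected debit; the function names are hypothetical, and it only prepares payloads without sending anything.

```python
from typing import Sequence

MAX_ITEMS_PER_CALL = 1000  # documented upper bound for `items`
SPLITS = {"train", "validation", "hidden_eval", "public_demo"}


def build_score_requests(items: Sequence[str], split: str,
                         train_allowed: bool = False,
                         public_release_allowed: bool = False) -> list:
    """Return one request body per chunk of at most 1000 items."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return [
        {
            "items": list(items[start:start + MAX_ITEMS_PER_CALL]),
            "split": split,
            "train_allowed": train_allowed,
            "public_release_allowed": public_release_allowed,
        }
        for start in range(0, len(items), MAX_ITEMS_PER_CALL)
    ]


def quota_debit(requests: list) -> int:
    """Expected score units: one per item across all requests."""
    return sum(len(body["items"]) for body in requests)
```

Comparing `quota_debit` with the remaining `scores` allowance before sending avoids a 402 partway through a batch.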

Core · data plane

Audit telemetry

Run IPT over each batch of rollouts inside your training loop and post the per-batch aggregate. The server computes hack_rate from candidates_tested and shortcuts_detected, so candidate code and grader stay in your trust domain. Ingest is idempotent on (key, series, batch).

POST /v1/verifier-audits/ingest · X-Vlabs-Key

Record one per-batch reward-hacking snapshot for a series. hack_rate is computed server-side from candidates_tested and shortcuts_detected; a retry of the same batch returns the stored value without a second charge.

Cost: 1 score unit per new batch · idempotent on (key, series, batch)

Request

verifier-audits/ingest · bash

export VLABS_KEY=vlk_...   # from the dashboard keys page

curl -X POST https://api.verifiable-labs.com/v1/verifier-audits/ingest \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audit_series_id": "rl-run-2026-06",
    "batch_number": 0,
    "candidates_tested": 256,
    "shortcuts_detected": 12,
    "mean_hacking_gap": 0.18,
    "mean_invariance_violation": 0.05
  }'

Response · 201

VerifierAuditIngestResponse · json

// 201 Created
{
  "audit_series_id": "rl-run-2026-06",
  "batch_number": 0,
  "hack_rate": 0.046875
}

Field	Type	Description
`audit_series_id` (required)	string (1–200)	Your id for one training run. Batches accumulate into the run curve.
`batch_number` (required)	int (≥ 0)	Monotonic batch index within the series.
`candidates_tested` (required)	int (1–100000)	Rollouts you ran IPT over this batch.
`shortcuts_detected` (required)	int (≥ 0)	Reward shortcuts found. Must not exceed candidates_tested.
`mean_hacking_gap` (optional)	float (0–1)	Mean extensional-vs-isomorphic gap. Default 0.
`mean_invariance_violation` (optional)	float (0–1)	Mean isomorphic invariance-violation rate. Default 0.
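
A sketch of assembling one ingest payload inside a training loop, with the table's constraints checked client-side. The function names are hypothetical. The local `hack_rate` formula (shortcuts divided by candidates) is an assumption that agrees with the sample response (12 / 256 = 0.046875); the server's value remains authoritative.

```python
def build_ingest_payload(series_id: str, batch_number: int,
                         candidates_tested: int, shortcuts_detected: int,
                         mean_hacking_gap: float = 0.0,
                         mean_invariance_violation: float = 0.0) -> dict:
    """Validate the documented constraints and return the JSON body."""
    if not 1 <= len(series_id) <= 200:
        raise ValueError("audit_series_id must be 1..200 characters")
    if batch_number < 0:
        raise ValueError("batch_number must be >= 0")
    if not 1 <= candidates_tested <= 100000:
        raise ValueError("candidates_tested must be in 1..100000")
    if not 0 <= shortcuts_detected <= candidates_tested:
        raise ValueError("shortcuts_detected must be in 0..candidates_tested")
    for name, value in (("mean_hacking_gap", mean_hacking_gap),
                        ("mean_invariance_violation", mean_invariance_violation)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be in 0..1")
    return {
        "audit_series_id": series_id,
        "batch_number": batch_number,
        "candidates_tested": candidates_tested,
        "shortcuts_detected": shortcuts_detected,
        "mean_hacking_gap": mean_hacking_gap,
        "mean_invariance_violation": mean_invariance_violation,
    }


def expected_hack_rate(payload: dict) -> float:
    """Assumed formula: shortcuts_detected / candidates_tested."""
    return payload["shortcuts_detected"] / payload["candidates_tested"]
```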

Core · data plane

Datasets

Asynchronous synthetic-dataset jobs. You bring your own LLM endpoint and key (it is encrypted at rest and never returned); the job runs against a registered environment and emits parquet or jsonl. Creation returns immediately — poll GET /v1/datasets/{id} for status.

POST /v1/datasets · X-Vlabs-Key

Enqueue a synthetic-dataset generation job. Validates the env, guards the outbound LLM URL (SSRF), reserves the tuples quota and returns the queued job. Supply idempotency_key for permanent exact-request replay.

Cost: requested_tuples against the tuples quota · idempotent via idempotency_key

Request

datasets · bash

curl -X POST https://api.verifiable-labs.com/v1/datasets \
  -H "X-Vlabs-Key: $VLABS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "env_id": "math-algebra",
    "idempotency_key": "7d1f0c9a-…",
    "requested_tuples": 1000,
    "seed_start": 0,
    "llm_endpoint_url": "https://api.openai.com/v1",
    "llm_api_key": "sk-…",
    "llm_model": "gpt-4o-mini",
    "output_format": "parquet"
  }'

Response · 201

DatasetCreateResponse · json

// 201 Created — job is queued; poll GET /v1/datasets/{dataset_id}
{
  "dataset_id": "ds_3f9a…",
  "state": "queued",
  "requested_tuples": 1000,
  "seed_start": 0,
  "seed_end": 999,
  "output_format": "parquet",
  "env_version": "0.42.0",
  "created_at": "2026-06-21T10:04:12Z"
}

Field	Type	Description
`env_id` (required)	string	A registered environment id. Unknown ids return 404 unknown_environment.
`idempotency_key` (optional)	string	Replay key. An exact re-issue with the same key returns the original job without a second quota charge; see Idempotency.
`requested_tuples` (required)	int (1–100000)	Number of tuples to generate. Charged against the tuples quota.
`seed_start` (required)	int (≥ 0)	First seed; the job spans seed_start … seed_start+requested_tuples-1.
`llm_endpoint_url` (required)	string	Your model endpoint. Internal / metadata hosts are rejected (400 blocked_outbound_url).
`llm_api_key` (required)	string	Your model credential. Encrypted at rest; never returned.
`llm_model` (required)	string	Model name passed to your endpoint.
`output_format` (optional)	"parquet" \| "jsonl"	Output encoding. Default parquet.

Reading a job back

GET /v1/datasets lists your jobs; GET /v1/datasets/{id} returns full status with reward stats once succeeded; and GET /v1/datasets/{id}/download 302-redirects to a presigned URL (or, with Accept: application/json, returns the URL inline with the SHA-256 and size).
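
A sketch of the poll-until-done pattern described above, with the HTTP call injected so the loop can be exercised without a network. `fetch_status` stands in for a GET /v1/datasets/{id} call and is hypothetical. The "queued" state comes from the sample response; "succeeded" as a state value is inferred from the wording above, and "failed" is an assumed terminal state that this page does not confirm.

```python
import time
from typing import Callable

# "failed" is an assumed terminal state name, not confirmed by this page.
TERMINAL_STATES = frozenset({"succeeded", "failed"})


def seed_span(seed_start: int, requested_tuples: int) -> tuple:
    """First and last seed of a job, per the seed_start field description."""
    return seed_start, seed_start + requested_tuples - 1


def poll_dataset(fetch_status: Callable[[], dict],
                 interval_s: float = 5.0,
                 max_polls: int = 120,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the job reaches a terminal state."""
    for attempt in range(max_polls):
        job = fetch_status()
        if job["state"] in TERMINAL_STATES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job still not terminal after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the function testable and lets a caller substitute a backoff schedule.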

Core · management plane

API keys

Mint and revoke data-plane keys from your Clerk session. The plaintext vlk_ key is returned only on creation. Revocation is a soft revoke — the record is retained for audit with revoked_at set.

POST /v1/keys · Bearer

Create a new API key. The response includes plaintext_key exactly once — copy it immediately. Only the prefix is ever shown again.

Request

keys · bash

curl -X POST https://api.verifiable-labs.com/v1/keys \
  -H "Authorization: Bearer $CLERK_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "training-loop" }'

Response · 200

APIKeyCreated · json

{
  "id": "3a7c…",
  "prefix": "vlk_3a7c",
  "name": "training-loop",
  "created_at": "2026-06-21T10:00:00Z",
  "last_used_at": null,
  "revoked_at": null,
  "plaintext_key": "vlk_3a7c9f2b…"   // shown ONCE; copy it now
}

Field	Type	Description
`name` (required)	string (1–64)	A label for the key, shown in the dashboard and the list response.

GET /v1/keys · Bearer

List the caller's API keys (metadata only — plaintext is never listed). Returns an { items: [...] } envelope.

Response · 200

APIKeyList · json

{
  "items": [
    {
      "id": "3a7c…",
      "prefix": "vlk_3a7c",
      "name": "training-loop",
      "created_at": "2026-06-21T10:00:00Z",
      "last_used_at": "2026-06-21T10:42:11Z",
      "revoked_at": null
    }
  ]
}

DELETE /v1/keys/{key_id} · Bearer

Soft-revoke a key by id. Returns 200 with the now-revoked record (revoked_at set). In a team, revoking another member's key requires an org admin/owner role (else 403 insufficient_org_role).
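
Key rotation (create a new key, then revoke the old one) can be sketched as below. The `call` transport and the `deploy` callback are hypothetical: `call(method, path, body)` is assumed to send a management-plane request with the Clerk Bearer token and return the parsed JSON, and `deploy` is wherever you store the new secret.

```python
from typing import Callable, Optional

# Hypothetical transport: (method, path, json_body) -> parsed JSON response.
Call = Callable[[str, str, Optional[dict]], dict]


def rotate_key(call: Call, old_key_id: str, name: str,
               deploy: Callable[[str], None]) -> dict:
    """Create a replacement key, hand it to deploy, then revoke the old key."""
    created = call("POST", "/v1/keys", {"name": name})
    # plaintext_key is only present in the creation response, so store it now.
    deploy(created["plaintext_key"])
    # Revoke last, so callers are never left without a working key.
    return call("DELETE", f"/v1/keys/{old_key_id}", None)
```

The return value is the revoked record, so the caller can confirm that `revoked_at` is set.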

Core · management plane

Usage

Read your aggregate monthly usage and tier limits across all your keys. The same metrics back the dashboard usage view.

GET /v1/me/usage · Bearer

This month's usage (traces, verification scans, tuples) against your tier caps, plus your per-minute rate limit and active key count.

Request

me/usage · bash

curl https://api.verifiable-labs.com/v1/me/usage \
  -H "Authorization: Bearer $CLERK_JWT"

Response · 200

MeUsageResponse · json

{
  "tier": "free",
  "period_start": "2026-06-01",
  "period_end": "2026-07-01",
  "traces": { "used": 1240, "limit": 10000 },
  "scores": { "used": 312,  "limit": 1000 },
  "tuples": { "used": 0,    "limit": 1000 },
  "rpm_limit": 100,
  "api_keys_active": 2
}

Usage from a data-plane key

If you only have a vlk_ key, call GET /v1/usage with the X-Vlabs-Key header — it returns the same usage shape for that key’s account.
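
A short sketch of turning the usage response into remaining headroom per metric, for example to decide whether a batch of scans fits before sending it. The function name is hypothetical; the field names follow the MeUsageResponse sample.

```python
METRICS = ("traces", "scores", "tuples")


def remaining_quota(usage: dict) -> dict:
    """Units left this period for each metric, never below zero."""
    return {
        metric: max(usage[metric]["limit"] - usage[metric]["used"], 0)
        for metric in METRICS
    }
```

With the sample response above this yields 8760 traces, 688 scores and 1000 tuples.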

Reference

Errors

Every error — validation failures and uncaught exceptions included — is returned as application/problem+json (RFC 7807) with a stable machine-readable code. Branch on code and status, not on the human-readable title.

problem+json · json

// HTTP/1.1 402 Payment Required
// Content-Type: application/problem+json
{
  "type": "https://api.verifiable-labs.com/errors/quota_exceeded",
  "title": "monthly scores quota exhausted for this tier",
  "status": 402,
  "code": "quota_exceeded",
  "detail": "tier=free scores_cap=1000, used=1000, requested=1; upgrade or wait for next month"
}

Field	Type	Description
`type` (optional)	string (URI)	Stable error URI, https://api.verifiable-labs.com/errors/{code}.
`title` (optional)	string	Short human-readable summary.
`status` (optional)	int	HTTP status code, mirrored in the body.
`code` (optional)	string	Machine-readable code. Branch on this.
`detail` (optional)	string	Optional context for this specific occurrence.

Common codes

Status	code	When it happens
400	`blocked_outbound_url`	A supplied URL targets a blocked internal / cloud-metadata host (SSRF guard).
401	`invalid_api_key`	Missing or invalid X-Vlabs-Key header.
402	`quota_exceeded`	Monthly quota for this tier is exhausted — upgrade or wait for reset.
403	`insufficient_org_role`	Your org role can't perform this action (e.g. revoking another member's key).
422	`validation_error`	Request body failed schema validation; see the errors array.
429	`rate_limited`	Per-tier rate limit exceeded; honor the Retry-After header.
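
A sketch of branching on `code` rather than on `title`, using the codes listed above. The function names and the returned action labels are hypothetical, and the fallback for unlisted codes (retry on 5xx, otherwise report) is an assumption rather than documented behavior.

```python
import json
from typing import Optional


def parse_problem(content_type: str, body: str) -> Optional[dict]:
    """Return the problem document, or None if the body is not problem+json."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/problem+json":
        return None
    return json.loads(body)


def next_action(problem: dict) -> str:
    """Map a problem document to a hypothetical client-side action label."""
    code = problem.get("code")
    if code == "rate_limited":
        return "retry_after_delay"
    if code == "quota_exceeded":
        return "stop_until_reset_or_upgrade"
    if code == "invalid_api_key":
        return "fix_credentials"
    if code in {"validation_error", "blocked_outbound_url"}:
        return "fix_request"
    if code == "insufficient_org_role":
        return "ask_org_admin"
    # Assumption: unlisted codes are retried only for server-side failures.
    status = problem.get("status")
    return "retry_later" if isinstance(status, int) and status >= 500 else "report"
```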

Reference

Rate limits

Requests are rate-limited per billing owner (your personal account or active organization) over a 60-second sliding window, so creating extra keys does not multiply the allowance. The Free tier allows 100 requests / minute; paid tiers raise the ceiling. Exceeding it returns 429 rate_limited with a Retry-After header.

Window: 60 s sliding

Free tier: 100 req / min

Pro tier: 1,000 req / min

Team tier: 10,000 req / min

On excess: 429 rate_limited + Retry-After (s)

Monthly quotas are separate

Per-minute rate limits throttle burst traffic; monthly quotas (traces, verification scans, tuples) meter paid work and reset at the start of each calendar month. A scores debit that would overflow the monthly cap returns 402 quota_exceeded, not a 429. Read your current standing at GET /v1/me/usage.
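
A sketch of honoring Retry-After on a 429 while leaving a 402 alone, since waiting a few seconds does not restore a monthly quota. The `send` callable is a hypothetical transport returning `(status, headers, body)`. Parsing Retry-After as a number of seconds follows the "Retry-After (s)" note above; the one-second fallback is an assumption.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def retry_after_seconds(headers: Mapping[str, str], default_s: float = 1.0) -> float:
    """Read Retry-After (seconds), matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(float(value), 0.0)
            except ValueError:
                return default_s
    return default_s


def send_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Re-send only on 429, waiting for the server-advised delay each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_after_seconds(response[1]))
        response = send()
    return response
```

If every attempt is throttled, the last 429 response is returned so the caller can surface it.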

Reference

Idempotency

Write endpoints are safe to retry. Two mechanisms apply depending on the endpoint.

Field	Type	Description
`idempotency_key` (optional)	body field	On `POST /v1/score`, `POST /v1/datasets`, and `POST /v1/compute/runs`, an exact re-issue with the same `idempotency_key` returns the original resource and current state without creating a duplicate or consuming quota. The binding is permanent. Reusing that key with a different canonical request payload, or reusing a legacy key whose original request cannot be proven, returns `409 idempotency_conflict`; use a new key for a new request. Idempotency never substitutes one operation for another.
`(key, series, batch)` (optional)	natural key	On `POST /v1/verifier-audits/ingest`, a retry of the same `audit_series_id` + `batch_number` returns the stored `hack_rate` with no second charge and no duplicate row.
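
A sketch of the client side of the first mechanism: generate the key once, attach it to the body, and re-send that exact body on retry. The helper name is hypothetical, and using a UUID4 string is an assumption, since this page does not specify a key format.

```python
import uuid
from typing import Optional


def with_idempotency_key(payload: dict, key: Optional[str] = None) -> dict:
    """Return a copy of payload carrying an idempotency_key (UUID4 by default)."""
    body = dict(payload)  # copy, so the caller's dict is left untouched
    body.setdefault("idempotency_key", key or str(uuid.uuid4()))
    return body
```

Build the body once per logical request and reuse it for every retry. Generating a fresh key on each attempt would turn a retry into a new request, and changing the payload under the same key returns `409 idempotency_conflict`.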

Get started

Next steps

Create a key, certify a reward, then wire IPT into your training loop and post telemetry — the Quickstart walks all four steps with copy-paste curl and Python.
