Five releases shipped · April–May 2026
Company

AI infrastructure, built by operators.

Verifiable Labs is the verification and audit infrastructure layer for AI systems. Built by engineers and operators who have shipped production AI, audit pipelines, and enterprise-grade infrastructure: Stelios Zacharioudakis (engineering and platform) and Stefanos Eleftheriou (operations and go-to-market).

Founders

The two people behind Verifiable Labs

Stelios Zacharioudakis

Co-founder · Engineering & Platform · Athens, Greece

Production ML engineer. Builds verification infrastructure at scale: LLM inference systems, formal SMT-based reasoning, trustworthy classification pipelines, and the conformal-calibrated reward stack that powers Verifiable Labs. Founder and head engineer at AsklepiosMed.

Highlights
  • AsklepiosMed · 480 doctors · 99.5% uptime over 12 months
  • Production infra: PyTorch, vLLM, Z3 SMT, GNNs
  • vlabs-audit: signed multi-page reports in under five minutes
  • Built and validated the verification stack across 5 model families (Alibaba, Microsoft, Meta, Google, Anthropic)
Experience
  • Head Engineer · Jun 2022 – Present
    Paphos Medical Association · AsklepiosMed

    Platform serving 480 doctors across Cyprus. Reduced onboarding 60% via automated digital identity (Stripe → Apple/Google Wallet). Cut deployment cycle from 2 days to 15 minutes; 99.5% uptime over 12 months.

  • Full-Stack Engineer · Jan 2025 – Present
    Medihyal Clinic

    Clinic booking platform on Next.js 16 + Supabase handling 200+ monthly reservations. Integrated Groq LLM for inventory reorder review, reducing manual review time by 80%.

Education
  • National and Kapodistrian University of Athens · BSc Computer Science · 2022–2026

Stefanos Eleftheriou

Co-founder · Operations · Paphos, Cyprus

Operations and go-to-market lead. Owns the customer-facing surface: brand, web, content systems, and the enterprise touchpoints that turn a developer install into a procurement signature. Former Cyprus National Swimming Team member: 17 years of compounding execution discipline.

Highlights
  • Owns brand, web, and content systems across Verifiable Labs
  • Domenica Group: SEO and digital communication across real-estate portfolio
  • Cyprus National Swimming Team · 17 years
  • Athletes' Unit · Cyprus National Guard 2021–22
Experience
  • Marketing & Web Administration · Jul 2025 – Present
    Domenica Group

    WordPress site administration, SEO, and content strategy across the company's real-estate portfolio. Front- and back-end CMS work, content publishing, and digital brand communication.

  • Accounting Assistant · Jul – Aug 2024
    ProVision Accountants

    Invoice processing and bank reconciliations during a summer placement; financial-records accuracy and compliance work.

Education
  • University of Groningen · BSc Communication & Information Studies · 2023–2026
Mission

Turn every AI model claim into reproducible, audit-grade evidence across six capability domains, on a stack that procurement and post-training teams can verify together.

How

Contamination-resistant tasks, objective ground truth, calibrated uncertainty, and signed receipts. Open-source SDK at the edge, hosted calibration and attestation in the middle, self-hosted or VPC for regulated AI.

Why now

AI is being deployed and post-trained faster than it can be verified. Without an audit layer, every model release is a leap of faith and every benchmark score is a marketing number.

Thesis

AI is being deployed faster than it can be verified.

Verifiable Labs gives AI teams the missing audit layer: fresh tasks, objective rewards, calibrated uncertainty, and signed reports that survive procurement, safety review, and release gates.

Validated across five model families with statistically significant results on 13 of 15 paired held-out comparisons. Strongest single result: +2064% reward improvement for Llama 3.2-1B on code-execution tasks.
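
One way to read "paired held-out comparisons" is as a significance test on per-task reward differences before and after post-training. The sketch below uses a sign-flip permutation test, which makes no distributional assumption. The test choice, the function name, and the reward values are all hypothetical and chosen only for illustration; they are not the published evaluation protocol or its data.

```python
import random

def paired_sign_flip_test(before, after, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    Returns (mean difference, p-value). Under the null hypothesis of no
    effect, each paired difference is equally likely to be + or -.
    """
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    observed = sum(diffs) / n
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly flip the sign of every difference and recompute the mean.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(flipped) >= abs(observed):
            extreme += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value

# Hypothetical per-task rewards on the same held-out tasks (illustrative only).
before = [0.20, 0.25, 0.22, 0.30, 0.18, 0.27, 0.24, 0.21, 0.26, 0.23, 0.19, 0.28]
after = [0.31, 0.33, 0.29, 0.41, 0.27, 0.35, 0.30, 0.33, 0.34, 0.36, 0.25, 0.39]

mean_gain, p_value = paired_sign_flip_test(before, after)
print(f"mean gain: {mean_gain:.3f}, p-value: {p_value:.4f}")
```

Pairing matters here: comparing the same tasks before and after removes task-difficulty variance, so a smaller held-out set can still support a significance claim.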

The bottleneck in post-training is not compute or data. It’s a reward signal you can trust. When the benchmark is memorised, the reward measures recall, not capability. The model improves on paper while shipping the same failures.

Open-source SDK at the edge, hosted calibration in the middle, self-hosted or VPC for regulated AI. If you train, evaluate, or buy AI models, this is the substrate you can verify. A sample report is available now via vlabs-audit.

The wedge today is audit; the platform is reward infrastructure for post-training. Every signed audit also ships a paired trace, a calibrated interval, and a documented failure mode. Those are exactly the inputs a frontier-AI team needs to train better models. The first product makes the verification layer usable. The next products turn that layer into training-grade reward signals: proprietary paired-evaluation data, distribution-free reward labels, and domain-specialist verifiers for math, code, SQL, tools, and scientific reconstruction.

Shipping cadence

Five releases in two months.

Every release ships tagged code, CI-green tests, and a public writeup with reproducible numbers. No marketing math, no roadmap theatre. Just artefacts you can verify in the repo.

  1. Verification protocol · Apr 2026

    Procedural task generation + conformal-calibrated rewards

    Six capability domains, contamination-resistant by construction. Objective ground truth (executable or closed-form) and calibrated uncertainty on every reward. Public artefacts on the Resources page.

  2. Post-training proof-point · Apr 2026

    Cross-model validation across five model families

    GRPO post-training showed measurable reward improvement on contamination-resistant tasks, validated across Alibaba, Microsoft, Meta, Google, and Anthropic open models. Honest writeup: early gains skew toward format compliance, not breakthrough capability.

  3. vlabs-calibrate · Apr 2026

    Distribution-free calibration library (0.1.0)

    Calibrated reward intervals ship inside every trace. 90.1% empirical coverage against the 0.90 target across the production environments.

  4. vlabs-api · May 2026

    Hosted calibration API in production

    FastAPI service on Fly: API-key auth, tier-aware rate limits, Stripe billing, audit history. Production-grade, currently serving customer evaluation calls.

  5. vlabs-audit · May 2026

    Signed audit pipeline with reproducible reports

    Drop in a model, get a signed multi-page report in under five minutes. Every chart, score, seed, and trace links back to source evidence. Anonymisable for procurement, safety review, or release-gate sign-off.
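
The coverage figure quoted for release 3 can be made concrete with split conformal prediction, a standard distribution-free way to build intervals that hit a target coverage. This is a minimal sketch on simulated data, not the vlabs-calibrate API: the function names, the absolute-residual score, and the noise model are assumptions made for illustration.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank into sorted scores
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]

def empirical_coverage(preds, truths, q):
    """Fraction of true values that fall inside [pred - q, pred + q]."""
    hits = sum(1 for p, y in zip(preds, truths) if abs(y - p) <= q)
    return hits / len(preds)

def simulate(n, rng):
    # Hypothetical data: true reward = predicted reward + Gaussian noise.
    preds = [rng.random() for _ in range(n)]
    truths = [p + rng.gauss(0.0, 0.1) for p in preds]
    return preds, truths

rng = random.Random(0)
cal_preds, cal_truths = simulate(2000, rng)    # calibration split
test_preds, test_truths = simulate(2000, rng)  # held-out split

# Nonconformity score: absolute residual on the calibration split.
scores = [abs(y - p) for p, y in zip(cal_preds, cal_truths)]
q = conformal_quantile(scores, alpha=0.10)     # target coverage 0.90

coverage = empirical_coverage(test_preds, test_truths, q)
print(f"interval half-width: {q:.3f}, empirical coverage: {coverage:.3f}")
```

When calibration and test data are exchangeable, the interval covers the true value with probability at least 0.90 on average, so measured coverage on a held-out split should land close to the target, which is the kind of check the 90.1% figure reports.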

Talk to the founders

Enterprise deployments, post-training partnerships, and procurement conversations. Same inbox.