Proof and rigor

Audit-ready engineering. Every run is reproducible by design.

Engineering

Engineering rigor

Extensive automated tests across modules; reproducibility by design.

Test coverage across all modules

Safety

Safety guardrails

Anti-reward-hacking mechanisms: a shaping cap, evidence auditing, and anomaly detection.

Shaping cap and anomaly detection guardrails
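
As a rough illustration of two of the guardrails named above, the sketch below clips a shaping bonus to a fixed bound and flags rewards that sit far from the batch mean. The function names, the `max_shaping` bound, and the z-score rule are assumptions made for this example; the actual mechanisms may differ.

```python
import statistics


def capped_reward(base, shaping, max_shaping=0.5):
    """Add a shaping bonus to the base reward, clipped to +/- max_shaping.

    max_shaping is a hypothetical bound: it stops the shaping term from
    dominating the task reward, one common way to limit reward hacking.
    """
    clipped = max(-max_shaping, min(max_shaping, shaping))
    return base + clipped


def flag_anomalies(rewards, z_threshold=3.0):
    """Return indices of rewards more than z_threshold std devs from the mean.

    A plain z-score check, used here as a stand-in for anomaly detection.
    """
    if len(rewards) < 2:
        return []
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return []
    return [i for i, r in enumerate(rewards) if abs(r - mean) / std > z_threshold]
```

For example, `capped_reward(1.0, 5.0)` keeps only 0.5 of the shaping bonus, and `flag_anomalies` on a batch of twenty rewards of 1.0 plus one reward of 100.0 flags the last index for review.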

Reproducibility contract

Frozen runs (hashes, seeds, budgets)

Per-item JSONL logs

Artifacts indexed and replayable

Cost tracked (time / tokens)
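
The contract above could be sketched roughly as follows: a manifest that freezes the config hash, seed, and budget; one JSONL record per item with time and token cost; and a reader that supports replay comparison. All names here (`run_frozen`, `read_scores`, the record fields, whitespace token counting, the random stand-in score) are hypothetical and chosen only for illustration.

```python
import hashlib
import json
import random
import time


def config_hash(config):
    """Hash a canonical JSON form of the config so key order cannot change it."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_frozen(config, items, log_path):
    """Run over items with a fixed seed and token budget, logging one JSON line each."""
    rng = random.Random(config["seed"])  # private RNG: no global state involved
    manifest = {
        "config_hash": config_hash(config),
        "seed": config["seed"],
        "token_budget": config["token_budget"],
    }
    tokens_used = 0
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(json.dumps({"type": "manifest", **manifest}) + "\n")
        for item in items:
            start = time.perf_counter()
            tokens = len(item.split())  # assumption: whitespace tokens as a cost proxy
            if tokens_used + tokens > config["token_budget"]:
                break  # stop before exceeding the frozen budget
            tokens_used += tokens
            record = {
                "type": "item",
                "item": item,
                "score": rng.random(),  # stand-in for a real evaluation result
                "tokens": tokens,
                "seconds": time.perf_counter() - start,
            }
            log.write(json.dumps(record) + "\n")
    return manifest, tokens_used


def read_scores(log_path):
    """Read per-item scores back from a JSONL log, e.g. to compare a replay."""
    scores = []
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            if record["type"] == "item":
                scores.append(record["score"])
    return scores
```

Under these assumptions, running `run_frozen` twice with the same config and items yields identical manifests and identical per-item scores, which is the property a replay check would verify. Timing fields will differ between runs and are recorded for cost tracking only.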