This note covers reproducibility contracts, evidence guarantees, budgets, and gated evolution. It describes the interface and guarantees, not internal routing details.
Every UCogNet evaluation run operates under a strict reproducibility contract. The system is designed so that any result can be independently verified by replaying the frozen state with the same seeds and budgets.
Every evaluation run is sealed with SHA-256 hashes, fixed random seeds, and declared token/time budgets. Nothing is changed after the run is sealed.
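As a rough illustration of what sealing a run could look like, the sketch below hashes a canonical JSON encoding of a run manifest with SHA-256. The manifest fields (`seed`, `token_budget`, `time_budget_s`) and the function names are hypothetical, chosen for this example; they are not UCogNet's actual schema or API.

```python
import hashlib
import json


def seal_manifest(manifest: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the manifest."""
    # Sorting keys and fixing separators makes the encoding independent of
    # dict insertion order, so equal manifests always hash to the same digest.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_manifest(manifest: dict, expected_digest: str) -> bool:
    """Check that a manifest still matches the digest recorded at seal time."""
    return seal_manifest(manifest) == expected_digest


# Hypothetical manifest: field names are assumptions for illustration only.
manifest = {"seed": 1234, "token_budget": 4096, "time_budget_s": 60.0}
digest = seal_manifest(manifest)
```

Any later change to the seed or a budget produces a different digest, which is what makes post-hoc edits detectable.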
Each benchmark item produces a JSONL record containing the task ID, prompt hash, raw output, extracted answer, and scoring trace, so every score can be audited.
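A minimal sketch of such a per-item record follows. The five fields come from the description above, but the exact key names, the use of SHA-256 for the prompt hash, and the shape of the scoring trace are assumptions made for this example.

```python
import hashlib
import json


def make_record(task_id, prompt, raw_output, extracted_answer, scoring_trace):
    """Build one per-item record; key names are illustrative assumptions."""
    return {
        "task_id": task_id,
        # Assumption: the prompt hash is a SHA-256 digest of the UTF-8 prompt.
        "prompt_hash": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "raw_output": raw_output,
        "extracted_answer": extracted_answer,
        "scoring_trace": scoring_trace,
    }


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    # json.dumps escapes embedded newlines, so each record stays on one line.
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


record = make_record(
    task_id="t1",
    prompt="2+2?",
    raw_output="The answer is 4.",
    extracted_answer="4",
    scoring_trace=[{"step": "exact_match", "passed": True}],
)
jsonl_text = to_jsonl([record])
```

Storing the hash rather than the prompt text lets an auditor confirm which prompt was used without the record having to carry the prompt itself.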
All generated artifacts (tool code, sandbox outputs, intermediate reasoning) are indexed and replayable from the same frozen state.
Wall-clock time, token counts (prompt + completion), and inference cost are tracked per item. Budget overruns trigger automatic rollback.
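The per-item budget check described above might be sketched as follows. The class names, field names, and the choice to return a list of exceeded dimensions are hypothetical; the source does not say how rollback is carried out, so the sketch only reports overruns and leaves the rollback action to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """Declared limits for one item (hypothetical field names)."""
    max_tokens: int
    max_seconds: float
    max_cost: float


@dataclass
class ItemUsage:
    """Measured usage for one item."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    seconds: float = 0.0
    cost: float = 0.0

    @property
    def total_tokens(self) -> int:
        # Token count is prompt plus completion, as stated in the text.
        return self.prompt_tokens + self.completion_tokens


def overruns(usage: ItemUsage, budget: Budget) -> list:
    """Return the names of every budget dimension the item exceeded."""
    exceeded = []
    if usage.total_tokens > budget.max_tokens:
        exceeded.append("tokens")
    if usage.seconds > budget.max_seconds:
        exceeded.append("seconds")
    if usage.cost > budget.max_cost:
        exceeded.append("cost")
    return exceeded


budget = Budget(max_tokens=1000, max_seconds=30.0, max_cost=0.05)
within = ItemUsage(prompt_tokens=400, completion_tokens=500, seconds=12.0, cost=0.02)
over = ItemUsage(prompt_tokens=800, completion_tokens=300, seconds=31.0, cost=0.02)
```

A non-empty result from `overruns` is the point at which a caller would trigger the rollback.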
UCogNet does not produce bare answers. Every response carries structured evidence:
Claims are explicit, provenance is machine-readable, and every output can be replayed. This is the foundation of audit-ready AI.
Every task execution operates under declared budgets:
UCogNet evolves its policies through controlled mutations. Every candidate policy must pass through a series of gates before replacing the current best:
Improvement threshold: Candidate must exceed baseline by a statistically significant margin (bootstrap CI).
Cost cap: New mutation cannot exceed 1.2x the cost of current best policy.
Safety anomaly detection: Reward spikes > 3σ from rolling mean trigger automatic audit.
Gradual rollout: Mutations deploy to 10% → 30% → 100% traffic with gates at each stage.
Rollback: If any gate fails, system reverts to previous policy within one evaluation cycle.
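The gates above can be sketched as small independent checks. This is an illustration under stated assumptions, not the actual gating code: it assumes paired per-item scores for candidate and baseline, uses a percentile bootstrap at the 95% level, treats a deviation of more than 3σ in either direction as an anomaly, and represents rollback as returning a traffic fraction of 0.0. All function names and defaults are hypothetical.

```python
import random
import statistics


def bootstrap_ci(candidate, baseline, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (candidate - baseline)."""
    rng = random.Random(seed)  # fixed seed keeps the gate decision reproducible
    diffs = [c - b for c, b in zip(candidate, baseline)]
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=len(diffs)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def passes_improvement(candidate, baseline):
    """Improvement gate: the whole CI must lie above zero."""
    lo, _ = bootstrap_ci(candidate, baseline)
    return lo > 0.0


def passes_cost_cap(candidate_cost, best_cost, cap=1.2):
    """Cost gate: candidate may cost at most `cap` times the current best."""
    return candidate_cost <= cap * best_cost


def is_reward_anomaly(reward, window, n_sigma=3.0):
    """Safety gate: flag rewards more than n_sigma from the rolling-window mean."""
    mean = statistics.fmean(window)
    sd = statistics.stdev(window)  # requires at least two values in the window
    return abs(reward - mean) > n_sigma * sd


ROLLOUT_STAGES = (0.10, 0.30, 1.00)


def rollout(stage_gate):
    """Advance through the stages; return 0.0 (rolled back) if any stage gate fails."""
    for fraction in ROLLOUT_STAGES:
        if not stage_gate(fraction):
            return 0.0
    return ROLLOUT_STAGES[-1]
```

Requiring the lower CI bound to be above zero is one common reading of "statistically significant margin"; a stricter gate could require it to clear a minimum effect size instead.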
This technical note describes interfaces and guarantees, not internal implementation. Specifically, we do not disclose:
These are available under NDA for qualified partners and investors. Contact samuel@ucognet.pro for access.