How it works
A request becomes a matched job, real inference, two layers of verification, and an on-chain payment — in a couple of seconds.
Request
You call POST /v1/chat/completions (OpenAI-compatible) with a Bearer API key. Quota and prompt-abuse limits run before anything touches the chain.
Match
The gateway normalizes the request to a canonical job and the matching engine picks a node by price and reputation.
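As a rough illustration of picking a node by price and reputation, the sketch below scores each candidate with a weighted blend of the two. The `Node` fields, the 0-to-1 reputation scale, the `reputation_weight` parameter, and the scoring formula are all assumptions made for illustration; the actual matching rule is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical fields; the real node record may look different.
    node_id: str
    price_per_1k_tokens: float
    reputation: float  # assumed to be normalized to the range 0..1


def pick_node(nodes, reputation_weight=0.5):
    """Return the candidate with the best blend of low price and high reputation."""
    if not nodes:
        raise ValueError("no candidate nodes")
    max_price = max(n.price_per_1k_tokens for n in nodes)

    def score(n):
        # Cheapness is 0 for the most expensive node and approaches 1 as price falls.
        cheapness = 1.0 - n.price_per_1k_tokens / max_price if max_price > 0 else 1.0
        return reputation_weight * n.reputation + (1.0 - reputation_weight) * cheapness

    return max(nodes, key=score)


candidates = [
    Node("a", price_per_1k_tokens=1.0, reputation=0.9),
    Node("b", price_per_1k_tokens=2.0, reputation=0.95),
    Node("c", price_per_1k_tokens=0.5, reputation=0.2),
]
print(pick_node(candidates).node_id)  # prints "a"
```

Raising `reputation_weight` toward 1 favors well-rated nodes regardless of cost, and lowering it toward 0 favors the cheapest node.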
Serve
The node streams tokens back over a WebSocket; the gateway proxies them to you and counts them independently. You're billed on min(node, gateway) tokens.
Verify
Layer-B checks structure (non-empty, length, loop detection, the node is pinned to what it sent). Layer-A re-runs ~5% of jobs on oracle inference and compares embedding similarity — anomalies are flagged for review, never auto-slashed.
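The two verification layers described above might look roughly like the following. This is a minimal sketch: the length limit, the n-gram loop heuristic, the hash-based 5% sampling, and the similarity threshold are assumptions chosen for illustration, and the embeddings are taken as plain lists of floats produced elsewhere.

```python
import hashlib
import math


def layer_b_ok(text, max_chars=32_000, ngram=8, max_repeats=4):
    """Structural checks: non-empty, bounded length, no obvious repetition loop."""
    if not text.strip():
        return False
    if len(text) > max_chars:
        return False
    words = text.split()
    counts = {}
    # Treat the same word n-gram recurring too often as a generation loop.
    for i in range(len(words) - ngram + 1):
        key = tuple(words[i:i + ngram])
        counts[key] = counts.get(key, 0) + 1
        if counts[key] > max_repeats:
            return False
    return True


def sampled_for_layer_a(job_id, rate=0.05):
    """Deterministically select roughly `rate` of jobs from a hash of the job id."""
    digest = hashlib.sha256(job_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def layer_a_flag(node_embedding, oracle_embedding, threshold=0.8):
    """Return True when the job should be flagged for review (nothing is slashed here)."""
    return cosine_similarity(node_embedding, oracle_embedding) < threshold
```

Hashing the job id, rather than drawing a random number, keeps the sampling decision reproducible for a given job; whether the real system samples this way is not stated.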
Settle
With an open credit session, the job settles off-chain against your signed cap and flushes in one batchSettle; otherwise it's a per-job escrow release. Either way: 95% to the node, 5% to the protocol treasury.
Reputation & economics
The node's 5-dimension score updates and snapshots on-chain daily. The treasury sweeps fees 60/20/20 (ops / stakers / burn) — the burn shrinks supply.
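To make the arithmetic concrete, here is a small worked sketch of the billing and fee flow: tokens billed as the lower of the two counts, the 95/5 split per job, and the 60/20/20 treasury sweep. The per-token price, the use of integer base units, and the rounding rule (remainders go to the node and to the burn) are assumptions for illustration only.

```python
def billed_tokens(node_count, gateway_count):
    """Bill on the lower of the node's and the gateway's token counts."""
    return min(node_count, gateway_count)


def split_payment(amount):
    """Split a job payment 95% to the node and 5% to the treasury.

    Amounts are integers in a hypothetical smallest unit; any rounding
    remainder goes to the node (an assumption, not a documented rule).
    """
    treasury = amount * 5 // 100
    node = amount - treasury
    return node, treasury


def sweep_treasury(fees):
    """Split swept fees 60/20/20 into ops, stakers, and burn.

    Any rounding remainder lands in the burn (again an assumption).
    """
    ops = fees * 60 // 100
    stakers = fees * 20 // 100
    burn = fees - ops - stakers
    return ops, stakers, burn


price_per_token = 1_000  # hypothetical price in base units
tokens = billed_tokens(node_count=1_000, gateway_count=990)
node_share, treasury_share = split_payment(tokens * price_per_token)
print(tokens, node_share, treasury_share)  # prints "990 940500 49500"
print(sweep_treasury(treasury_share))      # prints "(29700, 9900, 9900)"
```

Working in integer units avoids floating-point drift, so the shares always sum back to the original amount.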