Architecture
How QueraIS works
A coordinator (the gateway) matches your request to an open market of staked GPU nodes, streams the result back, samples it for honesty, and settles payment on-chain — 95% to the node, 5% to the protocol. Here’s the whole machine, and how the trusted parts get removed over time.
The pieces
| Component | Role |
|---|---|
| Requester / developer | Sends OpenAI-style requests; pre-funds a credit account and signs one spending cap. |
| Gateway | The coordinator — auth, matching, streaming, verification, and batched settlement. Trusted today; bounded by signed prices. |
| GPU node | Stakes $QAIS, advertises models + price, runs inference, streams tokens, earns 95%. |
| Matching engine | Picks the serving node per job by price, reputation, latency, and capability. |
| Verification oracle | Re-runs ~5% of jobs on its own nodes and updates reputation; flags anomalies into disputes. |
| Smart contracts | Token, node registry + stake, credit/escrow, dispute resolution, and treasury — on Arbitrum. |
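
As a rough illustration of the matching engine's role in the table above, the sketch below filters nodes by capability and price, then ranks the rest by a weighted score. The field names, weights, and latency normalization are all assumptions made for this example; the actual scoring formula is not specified here.

```typescript
// Hypothetical shapes: field names and weights are illustrative, not the QueraIS schema.
interface NodeOffer {
  id: string;
  models: string[];        // models the node advertises
  pricePerKToken: number;  // advertised price, lower is better
  reputation: number;      // 0..1, higher is better
  latencyMs: number;       // recent latency, lower is better
}

interface JobSpec {
  model: string;
  maxPricePerKToken: number; // price ceiling taken from the signed job spec
}

// Assumed weights; they sum to 1 so the score stays in a 0..1 range.
const WEIGHTS = { price: 0.4, reputation: 0.4, latency: 0.2 };

function scoreNode(node: NodeOffer, job: JobSpec, maxLatencyMs: number = 2000): number {
  // 0 at the price ceiling, approaching 1 as the price approaches zero.
  const priceScore =
    job.maxPricePerKToken > 0 ? 1 - node.pricePerKToken / job.maxPricePerKToken : 0;
  // Latency is clamped so very slow nodes score 0 rather than going negative.
  const latencyScore = 1 - Math.min(node.latencyMs, maxLatencyMs) / maxLatencyMs;
  return (
    WEIGHTS.price * priceScore +
    WEIGHTS.reputation * node.reputation +
    WEIGHTS.latency * latencyScore
  );
}

function pickNode(nodes: NodeOffer[], job: JobSpec): NodeOffer | undefined {
  // Capability and the price ceiling act as hard filters before scoring.
  const eligible = nodes.filter(
    (n) => n.models.includes(job.model) && n.pricePerKToken <= job.maxPricePerKToken
  );
  let best: NodeOffer | undefined;
  let bestScore = -Infinity;
  for (const n of eligible) {
    const s = scoreNode(n, job);
    if (s > bestScore) {
      best = n;
      bestScore = s;
    }
  }
  return best; // undefined when no node can serve the job
}
```

Treating capability as a filter rather than a score term means a cheap node that lacks the requested model can never win a job.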
A job, end to end
- Request. A developer calls /v1/chat/completions (OpenAI-compatible). The gateway normalizes it to a job spec and checks the requester’s signed credit headroom.
- Match. The matching engine scores eligible nodes — price, reputation, latency, capability — and assigns the winner over a live WebSocket channel.
- Serve. The node runs the model and streams tokens back through the gateway to the caller in real time.
- Verify. Every job gets cheap format/length checks; ~5% are re-run on oracle nodes and compared by embedding similarity. The result updates the node’s reputation.
- Settle. Debits accrue off-chain and flush in a batched on-chain transaction — 95% to the node, 5% to the protocol — amortizing gas to a fraction of a cent per job.
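
The Settle step can be pictured as folding pending debits into one batch before it goes on-chain. This is a minimal sketch under stated assumptions: the `Debit` and `SettlementBatch` shapes are hypothetical, amounts are integers in the token's smallest unit, and the fee is rounded down per job so the node receives the remainder. Only the 95/5 split and the 500-job upper bound come from this page.

```typescript
// Hypothetical record of one served job awaiting settlement.
interface Debit {
  jobId: string;
  requester: string;
  node: string;
  amount: number; // integer, smallest token unit (assumption)
}

// Hypothetical batch shape: totals per requester and per node, plus the protocol fee.
interface SettlementBatch {
  jobCount: number;
  requesterDebits: Record<string, number>;
  nodePayouts: Record<string, number>;
  protocolFee: number;
}

const PROTOCOL_FEE_BPS = 500; // 5% protocol share, in basis points
const MAX_JOBS_PER_TX = 500;  // upper end of the 50-500 jobs/tx range

function buildBatch(pending: Debit[], maxJobs: number = MAX_JOBS_PER_TX): SettlementBatch {
  const batch: SettlementBatch = {
    jobCount: 0,
    requesterDebits: {},
    nodePayouts: {},
    protocolFee: 0,
  };
  for (const d of pending.slice(0, maxJobs)) {
    // Round the fee down; the node gets the rest, so no unit is lost to rounding.
    const fee = Math.floor((d.amount * PROTOCOL_FEE_BPS) / 10000);
    batch.requesterDebits[d.requester] = (batch.requesterDebits[d.requester] || 0) + d.amount;
    batch.nodePayouts[d.node] = (batch.nodePayouts[d.node] || 0) + (d.amount - fee);
    batch.protocolFee += fee;
    batch.jobCount += 1;
  }
  return batch;
}
```

Aggregating per address is what lets one transaction cover many jobs: the contract call would carry a handful of totals instead of one transfer per job.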
Trust model — and why the worst case is bounded
Today a single gateway coordinates matching and settlement. That’s a real trust assumption, but a tightly fenced one:
- Can’t steal deposits. Requester funds are locked in the credit contract; the gateway can only settle at the prices you already signed in each job spec.
- Can’t exceed your cap. Your EIP-712 spending cap bounds the most the gateway can ever spend on your behalf. You can revoke it in one transaction.
- Can’t block refunds. Unclaimed deposits are withdrawable on-chain after a short notice window.
- Can’t fake quality. Sampled re-runs and staking/slashing make dishonest serving a losing trade (see Security).
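
To make the spending cap concrete, here is a hedged sketch of what an EIP-712 typed-data payload for a cap might look like, plus the check a credit contract would enforce before each settlement. The struct name `SpendingCap`, its fields, the domain name, and the zero `verifyingContract` address are placeholders invented for this example; only the EIP-712 envelope (`types`, `primaryType`, `domain`, `message`) and the Arbitrum Sepolia chain ID follow public specifications.

```typescript
// Hypothetical EIP-712 payload. Struct and field names are illustrative only.
const spendingCapTypedData = {
  types: {
    EIP712Domain: [
      { name: "name", type: "string" },
      { name: "version", type: "string" },
      { name: "chainId", type: "uint256" },
      { name: "verifyingContract", type: "address" },
    ],
    SpendingCap: [
      { name: "requester", type: "address" },
      { name: "spender", type: "address" },  // the gateway's settlement address
      { name: "maxSpend", type: "uint256" }, // total the gateway may ever settle
      { name: "nonce", type: "uint256" },
    ],
  },
  primaryType: "SpendingCap",
  domain: {
    name: "QueraISCredit", // placeholder domain name
    version: "1",
    chainId: 421614,       // Arbitrum Sepolia
    verifyingContract: "0x0000000000000000000000000000000000000000", // placeholder
  },
  message: {
    requester: "0x0000000000000000000000000000000000000001", // placeholder
    spender: "0x0000000000000000000000000000000000000002",   // placeholder
    maxSpend: "1000000",
    nonce: "0",
  },
};

// Off-chain mirror of the on-chain rule: cumulative spend never exceeds the cap.
interface CapState {
  maxSpend: number;
  spent: number;
  revoked: boolean;
}

function trySettle(cap: CapState, amount: number): boolean {
  if (cap.revoked) return false;                        // revocation stops all spending
  if (!Number.isInteger(amount) || amount <= 0) return false;
  if (cap.spent + amount > cap.maxSpend) return false;  // would exceed the signed cap
  cap.spent += amount;
  return true;
}
```

Because the cap is cumulative, a compromised or buggy gateway can drain at most `maxSpend` minus what has already been settled, regardless of how many batches it submits.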
The path to decentralization
Each step removes a piece of trust from the gateway. Sequencing is directional, not dated.
| Stage | Step | What changes |
|---|---|---|
| Today | Trusted gateway + on-chain settlement | One gateway does matching and batched settlement; stake, reputation, and payment are already on-chain. Worst case is bounded — see the trust model above. |
| Next | On-chain auction | Job specs post on-chain; nodes submit sealed bids in a short window and a contract selects the winner — removing the gateway’s matching role. |
| Then | P2P mesh + decentralized oracle | Nodes discover each other over a libp2p DHT and gossip jobs; verification moves to a decentralized oracle instead of protocol-run infrastructure. |
| Goal | DAO governance | Arbitration and parameters move to on-chain governance; node participation is fully permissionless and the trusted gateway is gone. |
Tech stack
| Layer | Today |
|---|---|
| Blockchain | Arbitrum Sepolia (EVM L2) — testnet today |
| Contracts | Solidity 0.8 + OpenZeppelin, transparent proxy pattern (5 core contracts) |
| Gateway / API | Node.js + Fastify (TypeScript), OpenAI-compatible REST |
| Node daemon | TypeScript daemon wrapping llama.cpp / Ollama |
| Inference | llama.cpp · Ollama (vLLM optional) |
| Settlement | EIP-712 signed sessions → batched on-chain (50–500 jobs/tx) |
| Verification | Sampled re-runs + format checks; 5-dimension EMA reputation |
| Model integrity | SHA-256 digests (IPFS pinning on the roadmap) |
| Frontend | React dashboard (served by the gateway) + this Next.js site |
| P2P (roadmap) | libp2p mesh for discovery + gossip |
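
The verification row above (sampled re-runs compared by embedding similarity, feeding an EMA reputation) can be sketched in a few small functions. This is an illustration under assumptions: the smoothing factor and the dimension names used in the usage note are hypothetical, since the five reputation dimensions are not named on this page. Only the ~5% sampling rate, the similarity comparison, and the use of an exponential moving average come from the text.

```typescript
// Decide whether a job is re-run by the oracle (~5% per the text).
// The rng parameter exists so the decision can be made deterministic in tests.
function shouldSample(rate: number = 0.05, rng: () => number = Math.random): boolean {
  return rng() < rate;
}

// Cosine similarity between the node's output embedding and the oracle's.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("embeddings must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector carries no signal
  return dot / Math.sqrt(normA * normB);
}

// Exponential moving average: recent observations count more than old ones.
function ema(previous: number, observation: number, alpha: number = 0.1): number {
  return alpha * observation + (1 - alpha) * previous;
}

// Update only the dimensions observed for this job; the rest are left as-is.
function updateReputation(
  rep: Record<string, number>,
  observed: Record<string, number>,
  alpha: number = 0.1
): Record<string, number> {
  const next: Record<string, number> = { ...rep };
  for (const key of Object.keys(observed)) {
    // A dimension seen for the first time is seeded with its first observation.
    const prev = key in rep ? rep[key] : observed[key];
    next[key] = ema(prev, observed[key], alpha);
  }
  return next;
}
```

For example, a sampled job might call `updateReputation(rep, { quality: cosineSimilarity(nodeEmbedding, oracleEmbedding) })`, where `quality` is a made-up dimension name. A small `alpha` makes reputation slow to build and slow to recover, which raises the cost of occasional cheating.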