wave GPU Edge

On-demand GPU inference at the edge — gateway-authenticated requests proxied to a managed serverless GPU backend, with usage metered in GPU-hours.

inferenceon-demand GPUedge

  gateway-authenticated request
    │
    ▼   GPU Edge on the WAVE protocol plane
  POST /v1/gpu/infer ─▶ {model, input, max_tokens}
    │
    └─ auth + metering federate through api.wave.online

run inferencePOST /v1/gpu/infer

livenessGET /v1/healthz

Gateway-fronted — authentication, quota enforcement, and metering all happen at the gateway before this spoke is reached.