What if every AI model had an on-chain identity, a wallet, and a price tag?
Not a hypothetical. We built it. A tokenised model that charges USDC per inference, pays its owner automatically, and logs every invocation on-chain. The gallery and metering calculator are live at kcolbchain.com/erc721-ai/.
This is not a product announcement. This is a design pattern for what we believe will become the default way AI models are distributed and monetised within two years.
The problem with AI model distribution
Today, if you train a useful model, you have roughly three options for monetisation:
Option one: host it behind an API and charge per request. This works, but you are now running infrastructure. You need servers, load balancers, rate limiters, billing systems, customer support. You have become a SaaS company when all you wanted to do was train models.
Option two: upload it to Hugging Face and hope someone gives you a job. The open-source route. Noble, but it does not pay rent.
Option three: sell exclusive access to a company. The consulting route. High margin, low volume, and the model sits in one customer's data centre instead of being useful to the world.
None of these options are great. Option one has high operational overhead. Option two has zero revenue. Option three has zero distribution. What we want is option four: the model is publicly accessible, charges per use automatically, and pays its creator without any infrastructure burden on the creator's side.
ERC-721 AI: the model as an NFT
ERC-721 AI extends the standard NFT metadata schema with fields specific to machine learning models. A tokenised model includes:
- Model hash. A SHA-256 hash of the model weights, pinned to IPFS or Arweave. This is the cryptographic commitment that the model being served is the model that was tokenised.
- Architecture descriptor. Model type, parameter count, input/output schema, framework version. Enough metadata for an agent to determine programmatically whether this model can handle its task.
- Performance benchmarks. Accuracy, latency, and throughput numbers on standard benchmarks. Stored on-chain so they are immutable and auditable. No more marketing claims that evaporate under scrutiny.
- Pricing parameters. Base price per inference in USDC, volume discount tiers, and maximum batch size. These parameters are readable by any agent or protocol on-chain.
- Owner address. The wallet that receives payment. Ownership is transferable via standard ERC-721 mechanics — you can sell your model by selling the NFT.
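The fields above can be sketched as a metadata schema. This is an illustrative shape, not the normative ERC-721 AI spec — field names and the `canHandle` helper are our own assumptions:

```typescript
// Hypothetical shape of ERC-721 AI token metadata. Field names are
// illustrative; consult the actual spec for normative names.
interface ModelMetadata {
  modelHash: string;        // SHA-256 of the model weights
  weightsURI: string;       // ipfs:// or ar:// pointer to the pinned weights
  architecture: {
    modelType: string;      // e.g. "transformer", "gradient-boosted-trees"
    parameterCount: number;
    inputSchema: string;    // JSON schema describing accepted inputs
    outputSchema: string;
    framework: string;      // e.g. "pytorch-2.3"
  };
  benchmarks: Record<string, number>; // benchmark name -> on-chain score
  pricing: {
    basePriceUSDC: number;  // price per inference, in USDC
    volumeTiers: { minCalls: number; discountPct: number }[];
    maxBatchSize: number;
  };
  owner: string;            // 0x… address that receives payment
}

// An agent can decide programmatically whether a model fits its task:
function canHandle(
  meta: ModelMetadata,
  task: { minAccuracy: number; maxPrice: number },
): boolean {
  const accuracy = meta.benchmarks["accuracy"] ?? 0;
  return accuracy >= task.minAccuracy && meta.pricing.basePriceUSDC <= task.maxPrice;
}
```

Because the benchmarks and pricing live on-chain, this check needs no out-of-band trust in the model's marketing page.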
The gallery at kcolbchain.com/erc721-ai/ showcases example tokenised models with full metadata, and includes an interactive metering calculator where you can estimate costs for different usage patterns.
x402: the payment rail
A tokenised model needs a way to charge for inference. This is where x402 comes in.
The model is served behind an HTTP endpoint. When an agent sends an inference request, the server returns HTTP 402 with a payment envelope specifying the cost in USDC and the settlement chain. The agent's wallet runtime (this is where switchboard fits) signs the payment, attaches the receipt as a header, and retries the request. The server verifies the receipt, runs inference, and returns the result.
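The request/pay/retry loop can be sketched as follows. The envelope fields, the receipt header name, and `signPayment` are placeholders for whatever the wallet runtime (e.g. switchboard) actually provides — this is not the normative x402 wire format:

```typescript
// Sketch of an x402 client loop: request, receive 402 + payment envelope,
// sign, attach receipt, retry. Names here are illustrative.
type PaymentEnvelope = { amountUSDC: number; chain: string; payTo: string };

async function payableFetch(
  url: string,
  body: unknown,
  signPayment: (env: PaymentEnvelope) => Promise<string>, // returns a receipt
  fetchImpl: typeof fetch = fetch,
): Promise<Response> {
  // First attempt: no payment attached.
  const first = await fetchImpl(url, { method: "POST", body: JSON.stringify(body) });
  if (first.status !== 402) return first;

  // Server replied 402 with a payment envelope; sign it and retry once.
  const envelope = (await first.json()) as PaymentEnvelope;
  const receipt = await signPayment(envelope);
  return fetchImpl(url, {
    method: "POST",
    headers: { "X-Payment-Receipt": receipt }, // header name is a placeholder
    body: JSON.stringify(body),
  });
}
```

The important property is that the retry is mechanical: an agent never needs an API key or a billing account, only a wallet that can sign the envelope.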
The payment settles on Base. The USDC goes directly to the model owner's address — the address stored in the ERC-721 AI token metadata. No intermediary. No platform fee (unless the hosting provider adds one). The model earns for its creator every time it is used, automatically.
This is the key insight: x402 turns every HTTP server into a paywall, and ERC-721 AI turns every model into an asset that knows its own price. Put them together and you get a self-pricing, self-monetising model that works without a billing system, without a sales team, and without a platform.
The missing piece: verifiable inference
There is an obvious trust problem. When an agent pays for inference, how does it know the model actually ran? How does it know the response came from the tokenised model and not from a cheaper substitute? How does it know the model weights match the hash in the NFT metadata?
This is the ZKML frontier. Zero-knowledge proofs that attest: "this output was produced by running this input through a model whose weights hash to this value." The proof is verifiable on-chain, so a smart contract can confirm that the inference was legitimate before releasing payment from escrow.
We opened an issue for Giza ZKML integration — Giza is the leading framework for generating ZK proofs of ML inference. The integration path is: the model server generates a Giza proof alongside the inference result, the proof is included in the x402 response, and the agent's wallet runtime verifies it before marking the payment as settled.
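Under those assumptions, the client-side settlement gate might look like this. `verifyProof` stands in for whatever verifier Giza ultimately exposes — the real proof format and verification API are not settled here:

```typescript
// Hypothetical settlement gate: only mark a payment settled once the
// inference proof checks out against the hash in the NFT metadata.
type InferenceResponse = {
  output: unknown;
  proof: string;       // ZK proof bytes; encoding unspecified here
  modelHash: string;   // claimed weights hash, to compare against the token
};

function settleIfProven(
  resp: InferenceResponse,
  expectedHash: string, // modelHash from the ERC-721 AI metadata
  verifyProof: (proof: string, modelHash: string) => boolean, // placeholder verifier
): "settled" | "disputed" {
  if (resp.modelHash !== expectedHash) return "disputed"; // wrong model served
  if (!verifyProof(resp.proof, resp.modelHash)) return "disputed"; // invalid proof
  return "settled";
}
```

The same gate could run inside a smart contract releasing escrow, which is where on-chain verifiability pays off.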
This is not production-ready today. ZKML proofs for large models are still too slow and too expensive for real-time inference. But the technology is improving rapidly — proof generation time has dropped 100x in the past year — and for smaller models (classifiers, embeddings, decision trees), it is already practical.
The metering calculator
The calculator at kcolbchain.com/erc721-ai/ helps model creators price their work. Input your model's parameter count, expected latency, hosting cost, and target margin, and it outputs a recommended per-inference price in USDC with volume discount tiers.
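A minimal version of that creator-side calculation looks like this. The formula and tier thresholds are our own illustrative choices, not necessarily what the live calculator uses:

```typescript
// Illustrative per-inference pricing: amortise monthly hosting cost over
// expected call volume, then apply the creator's target margin.
function recommendPrice(
  hostingCostPerMonthUSD: number,
  expectedCallsPerMonth: number,
  targetMarginPct: number, // e.g. 30 for a 30% margin
): number {
  const costPerCall = hostingCostPerMonthUSD / expectedCallsPerMonth;
  return costPerCall * (1 + targetMarginPct / 100);
}

// Volume discount tiers: the discount grows with committed call volume.
function tieredPrice(basePriceUSDC: number, monthlyCalls: number): number {
  if (monthlyCalls >= 1_000_000) return basePriceUSDC * 0.7; // 30% off
  if (monthlyCalls >= 100_000) return basePriceUSDC * 0.85;  // 15% off
  return basePriceUSDC;
}
```

For example, $300/month of hosting amortised over one million calls at a 30% margin gives a base price of $0.00039 per inference.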
It also shows the economics from the consumer side. If you are an agent making 10,000 inference calls per day, the calculator shows your daily cost across different models, compared against the cost of self-hosting the same model on various cloud providers. The breakeven analysis helps agents decide: rent inference via x402, or run your own GPU?
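The consumer-side breakeven reduces to a one-line comparison. The GPU cost figure is whatever the agent's cloud provider charges; this sketch just makes the decision rule explicit:

```typescript
// Illustrative rent-vs-self-host breakeven: at what daily call volume does
// a dedicated GPU become cheaper than paying per inference via x402?
function breakevenCallsPerDay(
  pricePerInferenceUSDC: number,
  gpuCostPerDayUSD: number, // e.g. an on-demand cloud GPU instance
): number {
  return gpuCostPerDayUSD / pricePerInferenceUSDC;
}

function cheaperOption(
  callsPerDay: number,
  pricePerInferenceUSDC: number,
  gpuCostPerDayUSD: number,
): "rent" | "self-host" {
  return callsPerDay * pricePerInferenceUSDC < gpuCostPerDayUSD ? "rent" : "self-host";
}
```

At $0.001 per inference against a $24/day GPU, the breakeven is 24,000 calls/day — below that, renting inference wins.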
The grand vision
We are heading toward a world where AI models are first-class economic actors. They have identities (ERC-721 tokens). They have prices (on-chain metadata). They have revenue (x402 payments). They have provenance (ZKML proofs). They have owners who earn passively from their creation.
Imagine a marketplace — not a centralised platform, but an on-chain registry — where thousands of models are listed. An agent needs a sentiment classifier. It queries the registry, filters by accuracy above 0.95 and price below $0.001 per inference, checks the ZKML verification status, and starts making paid requests. No API key signup. No terms of service. No billing dashboard. Just cryptographic commerce.
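That query could be as simple as a filter over registry entries. The entry shape below is assumed for illustration, not a real registry API:

```typescript
// Hypothetical on-chain registry entry, as read back off-chain by an agent.
type RegistryEntry = {
  tokenId: number;
  accuracy: number;        // from the on-chain benchmark metadata
  priceUSDC: number;       // per inference
  zkmlVerified: boolean;   // whether ZK proofs of inference are available
};

// Find candidate models for the agent's task, cheapest first.
function findModels(
  registry: RegistryEntry[],
  minAccuracy: number,
  maxPriceUSDC: number,
): RegistryEntry[] {
  return registry
    .filter((m) => m.accuracy >= minAccuracy && m.priceUSDC <= maxPriceUSDC && m.zkmlVerified)
    .sort((a, b) => a.priceUSDC - b.priceUSDC);
}
```

The agent then feeds the winner straight into the x402 payment loop — no signup step anywhere in the pipeline.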
The model creator — maybe a PhD student in Nairobi, maybe a retired engineer in Osaka — earns USDC every time their model is useful to someone. They never set up a server. They never wrote a billing system. They trained a model, tokenised it, and let the infrastructure do the rest.
This is not a utopian fantasy. Every piece of the stack exists today, in various stages of maturity. ERC-721 is battle-tested. x402 is in production. ZKML is in active development. The work left is integration — wiring these pieces together into a coherent developer experience.
That is what we are building at kcolbchain. Not any single piece of this stack, but the connective tissue between them. The dashboards, the simulators, the reference implementations, and the open-source tooling that makes this future accessible to anyone who wants to build on it.
Browse tokenised AI models, calculate per-inference costs, and see how ERC-721 AI + x402 creates a self-monetising model economy. Then check the switchboard dashboard for the wallet runtime that ties it all together.