Tokenized AI models: Who profits from open source?

Pavel Krylov · 25.01.2026, 20:17:13

Author: Pavel Krylov | AI Economics Researcher | Former ML Lead at Yandex

Meta spent hundreds of millions training Llama. Then they gave it away for free. Stability AI burned through investor cash creating Stable Diffusion, then open-sourced it. Mistral raised €400 million, released their models openly.

The pattern is confusing. Why spend fortunes building something, then give it away? Who actually profits from open-source AI? And could tokenization change the economics entirely?

The open-source AI paradox

Let me map the current landscape honestly.

Big tech releases open models as competitive weapons. Meta's Llama threatens OpenAI and Google. If open models are "good enough," why pay for GPT-4? Meta doesn't need AI revenue directly — they need AI capabilities for ads, Metaverse, and keeping developers in their ecosystem.

Startups open-source to build reputation. Mistral, Hugging Face, EleutherAI — they gain influence, talent, and eventual monetization paths through open work. The model is free; consulting, enterprise support, and hosted services are not.

Researchers open-source for scientific norms. Academic culture expects publication. Open models get citations, collaborations, and career advancement. The incentives are reputational, not financial.

But here's the uncomfortable truth: most individual contributors get nothing. The researcher who improves a model, the engineer who optimizes inference, the community member who finds bugs — they contribute value but capture none of it.

Where the value actually flows

Follow the money in open-source AI.

Cloud providers profit enormously. AWS, Google Cloud, Azure host open models and charge for compute. They contribute relatively little but capture massive value from inference traffic. The model creators get nothing from this revenue.

Hardware companies benefit. NVIDIA sells more GPUs because open models democratize AI development. More developers means more compute demand. They're the arms dealers in this gold rush.

Companies using models capture value. Startups build products on Llama, charge customers, and pay nothing to Meta. That's exactly what the license permits, but it means the value created flows to the application layer, not the model layer.

Aggregators and platforms win. Hugging Face hosts models, builds community, captures attention. They raised $235 million at a $4.5 billion valuation, largely on the strength of hosting other people's work.

The people who actually build models? Mostly salaries and reputation. Sometimes equity in successful companies. But no direct participation in the value their models create downstream.

The tokenization thesis

Here's where crypto enters the conversation.

What if model contributors could own tokens representing their contribution? What if those tokens captured value from model usage? What if open-source AI had native economic incentives?

The basic mechanism: mint tokens representing ownership shares in a model. Contributors earn tokens proportional to their contribution. Token holders receive revenue from model usage — inference fees, fine-tuning fees, licensing fees. Value flows back to creators.
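
A minimal sketch of that loop in Python, assuming the simplest possible accounting (one shared ledger, pro-rata payouts); the names and numbers are illustrative, not any existing protocol:

```python
from dataclasses import dataclass, field

@dataclass
class ModelCooperative:
    """Toy ledger: contributors earn token balances, usage revenue is split pro rata."""
    balances: dict[str, float] = field(default_factory=dict)

    def mint_for_contribution(self, contributor: str, score: float) -> None:
        # Tokens are minted in proportion to a contribution score (defined elsewhere).
        self.balances[contributor] = self.balances.get(contributor, 0.0) + score

    def distribute_revenue(self, amount: float) -> dict[str, float]:
        # Inference, fine-tuning and licensing fees are split across token holders.
        total = sum(self.balances.values())
        return {who: amount * bal / total for who, bal in self.balances.items()}

coop = ModelCooperative()
coop.mint_for_contribution("data_team", 400)
coop.mint_for_contribution("training_team", 500)
coop.mint_for_contribution("eval_team", 100)
print(coop.distribute_revenue(10_000))
# {'data_team': 4000.0, 'training_team': 5000.0, 'eval_team': 1000.0}
```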

This isn't theoretical. Projects like Bittensor, Ocean Protocol, and various AI DAOs are experimenting with these mechanisms. The execution is early, but the concept is being tested.

How it could work technically

Let me sketch a plausible architecture.

Contribution tracking uses cryptographic attestation. Every training run, every fine-tuning contribution, every dataset addition gets recorded with contributor signatures. The provenance is on-chain and verifiable.
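
A toy version of such an attestation, assuming Ed25519 signatures via the `cryptography` package and a hash-chained JSON record; the record fields are made up for illustration, not any real standard:

```python
import hashlib
import json
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def attest(key: Ed25519PrivateKey, record: dict, prev_hash: str) -> dict:
    """Hash-chain a contribution record and sign it with the contributor's key."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True).encode()
    return {
        "record": record,
        "prev": prev_hash,
        "hash": hashlib.sha256(payload).hexdigest(),      # links the next entry
        "signature": key.sign(payload).hex(),              # contributor's signature
        "signer": key.public_key().public_bytes(
            serialization.Encoding.Raw, serialization.PublicFormat.Raw
        ).hex(),
    }

key = Ed25519PrivateKey.generate()
entry = attest(key, {"type": "fine-tune", "dataset": "qa-v2", "epochs": 3}, prev_hash="genesis")
print(entry["hash"][:16], entry["signature"][:16])
```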

Token allocation follows contribution metrics. Compute contributed, data provided, improvements measured by benchmarks — various metrics could determine token distribution. The formula would be governance-controlled and transparent.
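
For example, a governance-set weighting over a handful of metrics might look like the sketch below; the metric names and weights are assumptions for illustration, not a proposal:

```python
# Governance-controlled weights over contribution metrics (illustrative values).
WEIGHTS = {"gpu_hours": 0.5, "data_gb": 0.3, "benchmark_delta": 0.2}

def allocate_tokens(contributions: dict[str, dict[str, float]], supply: float) -> dict[str, float]:
    """Split a fixed token supply across contributors by weighted, normalized metrics."""
    totals = {m: sum(c.get(m, 0.0) for c in contributions.values()) or 1.0 for m in WEIGHTS}
    scores = {
        who: sum(w * c.get(m, 0.0) / totals[m] for m, w in WEIGHTS.items())
        for who, c in contributions.items()
    }
    total_score = sum(scores.values())
    return {who: supply * s / total_score for who, s in scores.items()}

print(allocate_tokens(
    {"alice": {"gpu_hours": 1200, "data_gb": 50},
     "bob": {"data_gb": 450, "benchmark_delta": 1.5}},
    supply=1_000_000,
))
# roughly {'alice': 530000.0, 'bob': 470000.0}
```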

Revenue capture happens at inference. Every API call, every model download, every commercial use triggers a small fee. Fees flow to a smart contract that distributes to token holders proportionally.
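
In code, the fee flow might look like this off-chain sketch; a production version would live in a smart contract, and the per-call fee and the pull-based `claim` pattern are assumptions:

```python
class RevenueSplitter:
    """Toy fee meter: accrue per-inference fees, let holders pull their pro-rata share."""

    def __init__(self, holdings: dict[str, float], fee_per_call: float = 0.0002):
        self.holdings = holdings                       # token balances per holder
        self.fee_per_call = fee_per_call               # charged on every inference call
        self.pool = 0.0                                # cumulative fees collected
        self.claimed = {h: 0.0 for h in holdings}      # what each holder already withdrew

    def record_inference(self, n_calls: int = 1) -> None:
        self.pool += n_calls * self.fee_per_call

    def claimable(self, holder: str) -> float:
        total_tokens = sum(self.holdings.values())
        share = self.pool * self.holdings[holder] / total_tokens
        return share - self.claimed[holder]

    def claim(self, holder: str) -> float:
        amount = self.claimable(holder)
        self.claimed[holder] += amount
        return amount

splitter = RevenueSplitter({"alice": 700_000, "bob": 300_000})
splitter.record_inference(n_calls=5_000_000)
print(splitter.claim("alice"), splitter.claim("bob"))  # roughly 700 and 300 paid out
```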

Governance through token voting. Token holders decide model direction, licensing terms, fee structures, which contributions qualify. Decentralized control over a collectively-owned asset.
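
A bare-bones token-weighted tally, with a quorum threshold as the only safeguard (both rules are assumptions), also makes the plutocracy risk discussed below easy to see:

```python
def tally(balances: dict[str, float], votes: dict[str, str], quorum: float = 0.4) -> str:
    """One token, one vote: return the winning option, or 'no quorum'."""
    total_supply = sum(balances.values())
    weight: dict[str, float] = {}
    for holder, choice in votes.items():
        weight[choice] = weight.get(choice, 0.0) + balances.get(holder, 0.0)
    if sum(weight.values()) < quorum * total_supply:
        return "no quorum"
    return max(weight, key=weight.get)

balances = {"alice": 700_000, "bob": 250_000, "carol": 50_000}
votes = {"alice": "apache-2.0", "bob": "custom-license", "carol": "custom-license"}
print(tally(balances, votes))  # 'apache-2.0' -- whoever holds the tokens decides
```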

The model becomes a digital cooperative. Contributors are members. Usage generates revenue. Revenue flows to members. The economics of open source get inverted.

The problems are real

I'm not naive about the challenges.

Attribution is genuinely hard. How do you value a training dataset versus an architectural insight versus an optimization trick? Contribution measurement is unsolved in traditional open source, let alone for AI.

Free-rider problems persist. If you can use a model without paying, why pay? Open models can be downloaded and run locally, bypassing any fee mechanism. Enforcement requires either technical restrictions or social norms.

Token speculation distorts incentives. If tokens trade on speculation rather than fundamentals, the economic signals get corrupted. Contributors might optimize for token price rather than model quality.

Legal ambiguity is significant. Are AI model tokens securities? Does token ownership convey intellectual property rights? The legal framework doesn't exist yet.

Governance at scale fails. We've seen this with DAOs — meaningful participation is hard, plutocracy emerges, decisions become slow. Applying this to rapidly-evolving AI development seems problematic.

What's actually being built

Despite challenges, experiments are underway.

Bittensor runs a network where "miners" contribute AI capabilities and earn TAO tokens. It's more about inference than training, but demonstrates tokenized AI coordination. Current market cap over $3 billion — real money validating the thesis.

Ocean Protocol enables data tokenization. Data providers mint tokens representing datasets, earn when data is used for training. It's infrastructure for tokenized AI inputs, not models directly.

Various "AI DAOs" are forming. Collectively-funded model training with token-based governance. Most are small experiments, but they're testing mechanisms.

Render Network and similar projects tokenize compute contributions. Not model ownership exactly, but related infrastructure — you can own tokens representing GPU contribution to AI workloads.

The ecosystem is fragmented and experimental. Nothing has achieved scale. But the building blocks are being assembled.

My honest assessment

Where do I think this actually goes?

Tokenized models won't replace traditional open source. The frictionless nature of truly free models is too valuable. Llama without tokens will always exist alongside tokenized alternatives.

But tokenization creates a new tier. Premium models with contributor compensation. Higher quality because incentives align. Commercial users who want to pay fairly can contribute to the ecosystem's sustainability.

Data tokenization might matter more than model tokenization. The training data bottleneck is real. If data contributors could earn from AI trained on their data, that unlocks datasets currently locked away. The legal battles over training data might be solved economically rather than judicially.

Compute tokenization is already working. Render, Akash, io.net — functional markets for contributed compute. This is the proven wedge into tokenized AI infrastructure.

The full vision — contributor-owned AI models generating revenue for all participants — is years away and may never fully arrive. But partial implementations solving specific problems? Those are happening now.

The bigger question

Here's what keeps me thinking about this space.

AI is becoming the most economically significant technology in human history. The value created will be measured in trillions. Who captures that value matters enormously.

The current trajectory concentrates value in a few large companies. OpenAI, Google, Meta, Anthropic — they build models, capture revenue, employ the talent. Everyone else gets to use the tools and compete for scraps.

Tokenization offers an alternative path. Distributed ownership. Contributor compensation. Value flowing to those who create it. It's not guaranteed to work, but it's worth trying.

Open source revolutionized software by enabling collaboration without centralized control. Tokenized AI could do the same for machine learning — collaboration with economic participation.

I don't know if this vision succeeds. But I know the current model — where contributors work for free while platforms capture billions — isn't sustainable or fair. Something needs to change. Tokenization might be part of the answer.

Pavel Krylov researches AI economics and incentive mechanisms. He previously led machine learning at Yandex and advises several tokenized AI projects. He holds a PhD in Economics from Moscow State University.

#Crypto

