a16z: What is the key to the explosive growth of prediction markets?

Jan 23, 2026 14:36:30

Author: a16z

Compiled by: Jiahua, ChainCatcher

Last year, trading volume in the prediction market on the outcome of the Venezuelan presidential election exceeded $6 million. But when the vote counting ended, the market faced an impossible situation: the government announced that Nicolás Maduro had won, while the opposition and international observers alleged fraud. Should the market resolve according to the "official information" (a Maduro victory) or the "consensus of credible reports" (an opposition victory)?

In the Venezuelan case, the accusations escalated: critics first condemned the platform for disregarding its own rules and "stealing" user funds, then denounced the resolution mechanism for monopolizing power in this political game, acting as "judge, jury, and executioner," and even claimed it had been severely manipulated.

This is not an isolated incident. It is, in my opinion, one of the biggest bottlenecks faced by prediction markets in the scaling process: contract adjudication.

The stakes here are extremely high. If adjudication is handled properly, people will trust your market, be willing to trade in it, and prices will become meaningful signals for society. If adjudication is mishandled, trading will feel frustrating and unpredictable. Participants may drop out, liquidity may dry up, and prices will no longer reflect accurate forecasts of a stable target. Instead, prices begin to reflect a murky mixture: the actual probability of the outcome occurring, blended with traders' beliefs about how a distorted resolution mechanism will rule.

The controversy in Venezuela was relatively high-profile, but more subtle failures frequently occur across various platforms:

  • The Ukraine Map Manipulation Case demonstrated how adversaries can profit directly by gaming the resolution mechanism. A contract regarding territorial control stipulated that it would resolve based on a specific online map. It was alleged that someone modified the map to influence the contract's outcome. When your "source of truth" can be manipulated, your market can be manipulated as well.

  • The Government Shutdown Contract showed how the choice of resolution source can produce inaccurate, or at least unpredictable, results. The resolution rules stipulated that the market would pay out based on the time at which the U.S. Office of Personnel Management (OPM) website showed the shutdown as ended. President Trump signed the funding bill on November 12, but for unknown reasons the OPM website did not update until November 13. Traders who correctly predicted the shutdown would end on the 12th lost their bets because of a website administrator's delay.

  • The Zelensky Suit Market raised concerns about conflicts of interest. The contract asked whether Ukrainian President Zelensky would wear a suit at a specific event—seemingly a trivial question, yet it attracted over $200 million in bets. When Zelensky attended a NATO summit in attire described as a suit by the BBC, the New York Post, and other media, the market initially resolved as "yes." However, UMA token holders contested the outcome, and the resolution was subsequently flipped to "no."

In this article, I will explore how combining large language models (LLMs) with cryptographic techniques can help us create a prediction market resolution method that is difficult to manipulate, accurate, fully transparent, and credibly neutral at scale.

The Challenge is Not Limited to Prediction Markets

Similar issues have also plagued traditional financial markets. For years, the International Swaps and Derivatives Association (ISDA) has been grappling with an adjudication dilemma in the credit default swap (CDS) market. A CDS is a contract that pays out when a company or sovereign defaults on its debt. ISDA's 2024 review candidly highlights the difficulties. Its determinations committee, composed of major market participants, votes on whether a credit event has occurred, but the process has been criticized for its lack of transparency, potential conflicts of interest, and inconsistent outcomes, much like the UMA process.

The fundamental issue is the same: when vast sums of money depend on the adjudication of ambiguous situations, every resolution mechanism becomes a target for gaming, and every point of ambiguity becomes a potential flashpoint.

Four Pillars of an Ideal Adjudication Solution

Any viable solution needs to achieve several key attributes simultaneously:

  1. Resistance to Manipulation: If adversaries can influence adjudication, for example by editing Wikipedia, planting fake news stories, bribing oracles, or exploiting software vulnerabilities, the market becomes a contest of who can manipulate best rather than who can predict best.

  2. Reasonable Accuracy: The mechanism must make correct adjudications most of the time. In a world filled with genuine ambiguities, perfect accuracy is impossible, but systematic errors or glaring mistakes will destroy credibility.

  3. Pre-emptive Transparency: Traders need to know exactly how adjudication will occur before placing their bets. Changing the rules mid-course violates the fundamental contract between the platform and participants.

  4. Credible Neutrality: Participants need to believe that the mechanism does not favor any particular trader or outcome. This is why letting large UMA holders adjudicate contracts they have bet on is so problematic: even if they act fairly, the appearance of a conflict of interest undermines trust.

Human review panels can satisfy some of these attributes, but they struggle to achieve others—especially resistance to manipulation and credible neutrality—at scale. Token-based voting systems like UMA also have their own well-documented issues regarding whale dominance and conflicts of interest.

This is where AI comes in.

The Case for LLM Judges

One proposal has garnered attention within the prediction market community: use large language models as judges, and lock the specific model and prompts on the blockchain at the time of contract creation.

The basic architecture is as follows: at contract creation, market creators must specify not only the adjudication criteria in natural language, but also the exact LLM (identified by a timestamped model version) and the exact prompts that will be used to determine the outcome.

This specification is cryptographically committed to the blockchain. When trading opens, participants can inspect the complete adjudication mechanism; they know exactly which AI model will adjudicate the outcome, what prompts it will receive, and what information sources it can access.

If they do not like this setup, they do not trade.

At adjudication time, the committed LLM runs with the committed prompts, accesses the specified information sources, and generates a judgment. That output determines who receives payouts.
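
To make this concrete, here is a minimal sketch of what such a commitment could look like, assuming a simple JSON specification hashed with SHA-256; the field names, model identifier, prompt, and sources are illustrative, not any platform's actual schema.

```python
import hashlib
import json

# Hypothetical adjudication spec for one market; field names, the model
# identifier, and the sources are illustrative only, not a real platform schema.
adjudication_spec = {
    "model": "example-reasoning-llm@2026-01-15",  # pinned, timestamped version
    "prompt": ("Did the U.S. federal government shutdown end on or before "
               "November 12? Answer YES or NO, citing only the listed sources."),
    "sources": [
        "https://www.opm.gov/",
        "https://apnews.com/",
        "https://www.reuters.com/",
    ],
}

# Canonical serialization so every party hashes identical bytes; the SHA-256
# digest is what would be recorded on-chain at contract creation.
canonical = json.dumps(adjudication_spec, sort_keys=True, separators=(",", ":"))
commitment = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
print("on-chain commitment:", commitment)
```

Publishing the full specification alongside the on-chain digest lets anyone re-serialize it, re-hash it, and confirm that nothing changed after trading opened.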

This approach simultaneously addresses several key constraints:

  • Strong (Though Not Absolute) Resistance to Manipulation: Unlike a Wikipedia page or a small news site, you cannot easily edit the output of a mainstream LLM. The model's weights are fixed at the time of commitment. To manipulate the adjudication, adversaries would need to compromise the information sources the model relies on, or somehow have poisoned the model's training data long in advance; both attack vectors are costly and uncertain compared to bribing an oracle or editing a map.

  • Reasonable Accuracy: With reasoning models improving rapidly and handling an astonishing range of intellectual tasks, especially when they can browse the web for fresh information, LLM judges should be able to adjudicate many markets accurately; experiments are underway to assess just how accurate they are.

  • Built-in Transparency: The entire adjudication mechanism is visible and auditable before anyone places a bet. No mid-course rule changes, no discretionary judgments, no backroom negotiations. You know exactly what you are signing up for.

  • Significantly Enhanced Credible Neutrality: LLMs have no economic stake in the outcomes. They cannot be bribed. They do not own UMA tokens. Their biases, whatever they may be, are attributes of the model itself, not of ad hoc decisions made by stakeholders.

Limitations of AI and Defensive Measures

  • Models Can Make Mistakes: LLMs may misinterpret news articles, hallucinate facts, or apply adjudication standards inconsistently. However, as long as traders know which model they are betting on, they can price these flaws in. If a specific model has known tendencies to resolve ambiguities a certain way, seasoned traders will take that into account. The model does not need to be perfect; it needs to be predictable.

  • Not Manipulation-Proof: If prompts specify particular news sources, adversaries may attempt to plant stories in those sources. This kind of attack is costly against mainstream media but may be feasible against smaller outlets; it is another form of the map-editing problem. Prompt design is crucial here: adjudication mechanisms that rely on diverse, redundant sources are more robust than those that depend on a single point of failure (see the sketch after this list for one such multi-source rule).

  • Poisoning Attacks Are Theoretically Possible: Adversaries with sufficient resources might attempt to influence an LLM's training data to bias its future judgments. However, this requires acting long before the contract is created, with uncertain returns and high costs, a much higher bar than bribing committee members.

  • A Proliferation of LLM Judges Can Create Coordination Problems: If different market creators use different LLMs with different prompts, liquidity will fragment. Traders will find it difficult to compare contracts or aggregate information across markets. Standardization is valuable, but so is letting the market discover which combinations of LLMs and prompts work best. The right answer is probably some combination: allow experimentation while establishing mechanisms for the community to converge on well-tested defaults over time.
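
As an illustration of the redundancy point above, here is a minimal sketch of a multi-source resolution rule, assuming the contract queries the committed model once per source and only resolves when a quorum of sources agree. `ask_model` is a hypothetical stand-in for a call to whatever pinned LLM the contract specifies.

```python
from collections import Counter
from typing import Callable, Optional

def resolve(question: str,
            sources: list[str],
            ask_model: Callable[[str, str], str],
            quorum: int) -> Optional[str]:
    """Ask the committed model once per source; resolve only on a quorum."""
    votes = Counter(ask_model(question, source) for source in sources)
    answer, count = votes.most_common(1)[0]
    # If too few independent sources agree, return None so the contract can
    # fall back to whatever escalation rule it committed to in advance.
    return answer if count >= quorum else None

# Illustrative usage with a fake model: two of three sources answer YES.
fake_model = lambda q, s: "YES" if s in ("apnews.com", "reuters.com") else "NO"
print(resolve("Did X happen?", ["apnews.com", "reuters.com", "example.org"],
              fake_model, quorum=2))
```

Requiring agreement across several independent sources raises the cost of the planted-story attack from compromising one outlet to compromising most of them.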

Four Recommendations for Builders

In summary: AI-based adjudication essentially trades one set of problems (human biases, conflicts of interest, opacity) for another set of problems (model limitations, prompt engineering challenges, information source vulnerabilities), and the latter set may be more manageable. So how should we move forward? Platforms should:

  1. Experiment: Test LLM adjudication on lower-risk contracts to establish a track record. Which models perform best? Which prompt structures are most robust? What failure modes emerge in practice?

  2. Standardize: As best practices emerge, the community should work towards establishing standardized combinations of LLMs and prompts as defaults. This does not preclude innovation but helps concentrate liquidity in well-understood markets.

  3. Build Transparent Tools: For example, create interfaces that make it easy for traders to check the complete adjudication mechanism (models, prompts, information sources) before trading; a minimal sketch of such a check follows this list. Resolution rules should not be buried in fine print.

  4. Engage in Ongoing Governance: Even with AI judges, humans still need to be responsible for setting the top-level rules: which models to trust, how to handle situations where models give clearly incorrect answers, when to update defaults. The goal is not to remove humans entirely from the loop but to shift them from ad hoc case-by-case judgments to systematic rule-making.
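
On the third recommendation, the trader-facing check can be as simple as re-hashing the published specification, comparing it with the on-chain commitment, and surfacing the parts worth reading. Here is a minimal sketch, reusing the illustrative JSON-plus-SHA-256 scheme from earlier rather than any platform's actual interface.

```python
import hashlib
import json

def check_before_trading(published_spec: dict, onchain_commitment: str) -> bool:
    """Confirm the spec shown to traders matches the on-chain commitment."""
    canonical = json.dumps(published_spec, sort_keys=True, separators=(",", ":"))
    if hashlib.sha256(canonical.encode("utf-8")).hexdigest() != onchain_commitment:
        return False  # the displayed rules are not the rules that will be used
    # Surface the fields a trader should actually read before betting.
    print("model:  ", published_spec["model"])
    print("prompt: ", published_spec["prompt"])
    print("sources:", ", ".join(published_spec["sources"]))
    return True
```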

Prediction markets have extraordinary potential to help us understand a noisy, complex world. But that potential depends on trust, and trust depends on fair contract adjudication. We have seen the consequences of failed resolution mechanisms: confusion, anger, and traders walking away. I have watched people exit prediction markets entirely, frustrated and feeling deceived by an outcome that seemed to violate the spirit of their bets, vowing never to use their once-favorite platforms again. Every such exit is a missed opportunity to unlock the benefits of prediction markets and their broader applications.

LLM judges are not perfect. But combined with cryptographic commitments, they are transparent, neutral, and resistant to the kinds of manipulation that have long plagued human systems. In a world where prediction markets are scaling faster than our governance mechanisms, that may be exactly what we need.
