Wall Street Keeps Testing AI Traders, But Most Are Still Underperforming

What they're not telling you: Large language models are losing money at scale when given actual trading authority, and Wall Street's quiet experiments reveal a technology industry overselling capabilities it hasn't yet delivered. Recent competitions testing AI models from OpenAI, Anthropic, Google, and xAI show a pattern of consistent underperformance that contradicts the hype surrounding artificial intelligence in finance. According to Bloomberg's analysis of these tests, models frequently lost money, traded excessively, and made erratic decisions even when receiving identical instructions.

The Take

Diana Reeves · Corporate Watchdog & Markets

Wall Street's AI Gamble Is Exactly What We Should Fear

The underperformance *is* the feature, not the bug. Wall Street doesn't need AI traders that work; it needs plausible deniability. When algorithms underperform, losses are "systematic risks." When human traders tank portfolios, executives get clawed back. The testing phase buys cover: "We're *responsibly* innovating." Meanwhile, banks extract billions in government liquidity while offloading volatility risk onto retail investors who can't compete with algorithmic speed anyway. The real play isn't building better AI traders. It's automating decision-making far enough removed from human accountability that when the next cascade happens, no single person held the knife. These competitions? Theater. Wall Street's already integrated AI into execution, pricing, and risk models. We're watching them workshop the narrative for when it all breaks.

What the Documents Show

In Alpha Arena, a competition created by startup Nof1, eight AI models were each given $10,000 to trade U.S. tech stocks over two weeks. Across four separate competitions, the models collectively lost roughly one-third of their capital, with only six of 32 outcomes generating profit. The results suggest that despite months of headlines promising AI-driven investing, the technology remains fundamentally unreliable for actual portfolio management. The inconsistency extends beyond mere losses.
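To put the headline figures in proportion, here is a minimal sketch of the arithmetic behind "six of 32 outcomes." It uses only the counts quoted above; the variable names are ours, and it assumes each of the eight models produced one outcome per competition.

```python
from fractions import Fraction

# Counts as reported in the article.
MODELS = 8          # models per competition
COMPETITIONS = 4    # separate competitions
PROFITABLE = 6      # outcomes that ended in profit

# Assumption: one outcome per model per competition.
outcomes = MODELS * COMPETITIONS
win_rate = Fraction(PROFITABLE, outcomes)

print(f"{PROFITABLE} of {outcomes} outcomes were profitable ({float(win_rate):.2%})")
```

In other words, fewer than one in five model runs finished in the black.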

🔎 Mainstream angle: The corporate press either ignored this story entirely or buried it in a 3-sentence brief. The framing, when it appeared at all, focused on process rather than impact.

Follow the Money

Under identical prompts, competing models behaved in wildly different ways: xAI's Grok 4.20 executed just 158 trades in one contest, while Alibaba's Qwen made 1,418 trades under the exact same conditions, roughly nine times as many. This variation suggests the models lack the coherent decision-making frameworks necessary for investing, instead generating responses that diverge dramatically based on subtle differences in how they process information. Nof1 founder Jay Azhang identified the core problem: current models struggle with foundational trading concepts including "position sizing, timing, signal weighting and overtrading." These aren't edge-case failures; they're fundamental gaps in how these systems approach risk management.

Results from other contests point the same way. Research blog Flat Circle analyzed 11 public AI trading competitions and found that while each event produced at least one profitable model, only two generated a profitable median return. This distinction matters: in nine of the 11 competitions the typical model failed to turn a profit even though a standout winner emerged, a red flag for any technology being positioned as a replacement for human judgment.
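The gap between "at least one profitable model" and "a profitable median return" is easy to see with a toy example. The sketch below is illustrative only: the model names and return figures are invented for this purpose and do not come from Flat Circle, Nof1, or any actual competition.

```python
from statistics import median

# Hypothetical returns for one contest, as a fraction of starting capital.
# These numbers are made up for illustration.
hypothetical_returns = {
    "model_a": 0.12,
    "model_b": -0.05,
    "model_c": -0.20,
    "model_d": -0.31,
    "model_e": -0.08,
}


def summarize(returns):
    """Compare the best result in a contest with the typical (median) one."""
    values = list(returns.values())
    return {
        "best": max(values),
        "median": median(values),
        "has_winner": max(values) > 0,
        "median_profitable": median(values) > 0,
    }


summary = summarize(hypothetical_returns)
print(summary)
```

A contest like this one would count toward "produced at least one profitable model" while still showing a losing median, which is why a single winner says little about how the typical bot fared.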

What Else We Know

What's notable is how cautiously Wall Street itself has approached this technology despite years of promotional messaging. JPMorgan Chase and Balyasny Asset Management already deploy AI for research and fraud detection, but both firms have deliberately avoided delegating actual investment decisions to these systems. Azhang was direct about the current state: deploying an LLM with autonomous trading authority "isn't a thing yet." For ordinary investors, this gap between promise and performance carries real implications. While venture capital and tech companies have aggressively marketed AI as the future of wealth management, the empirical evidence suggests retail investors should remain skeptical of robo-advisors or AI-driven services making independent trading decisions. The technology may eventually reach that capability, but current testing shows we're nowhere close. Until these systems demonstrate consistent profitability and coherent decision-making, the safest assumption is that human oversight—or traditional passive strategies—remains the more reliable path to protecting wealth.

Primary Sources

Source: ZeroHedge
Category: Money & Markets
Cross-reference independently — don't take our word for it.

What are they not saying? Who benefits from this story staying buried? Follow the regulatory filings, the court dockets, and the FOIA releases. The truth is in the paperwork — it always is.

Disclosure: NewsAnarchist aggregates from public records, API feeds (Federal Register, CourtListener, MuckRock, Hacker News), and independent media. AI-assisted synthesis. Always verify primary sources linked above.
