The post OpenAI and Paradigm Launch EVMbench to Test AI Agents Against Smart Contract Vulnerabilities appeared on BitcoinEthereumNews.com.

OpenAI and Paradigm Launch EVMbench to Test AI Agents Against Smart Contract Vulnerabilities

TLDR:

  • EVMbench draws from 120 high-severity vulnerabilities curated across 40 real-world smart contract audits. 
  • GPT-5.3-Codex scored 72.2% in exploit mode, far outperforming GPT-5, which only reached 31.9% in testing. 
  • Moonwell and CrossCurve both suffered smart contract exploits recently, adding urgency to AI-driven security tools. 
  • Anthropic’s late 2025 report warned that AI agents can already identify smart contract flaws autonomously.

EVMbench is the latest collaborative effort between OpenAI and crypto investment firm Paradigm. The tool is designed to measure how well AI agents detect, patch, and exploit vulnerabilities in smart contracts.

Built from 120 high-severity vulnerabilities across 40 audits, EVMbench targets the Ethereum Virtual Machine ecosystem.

This development comes amid recent DeFi exploits that have renewed industry focus on smarter, faster contract auditing through artificial intelligence.

EVMbench Tests AI Agents Across Multiple Capability Modes

EVMbench evaluates AI agents across several distinct capability modes. These include detecting vulnerabilities and patching contract code to eliminate exploitability.

The benchmark also tests an agent’s ability to execute end-to-end fund-draining attacks in a sandboxed blockchain environment.

OpenAI explained the rationale behind the tool in a blog post on Wednesday. “Smart contracts secure billions of dollars in assets, and AI agents are likely to be transformative for both attackers and defenders,” the company stated. That framing sets the tone for why the benchmark was built in the first place.

The vulnerabilities used in EVMbench were drawn from sponsored open-code audit competitions. They also include security audits conducted for Tempo, a Layer 1 blockchain co-developed by Paradigm and Stripe. This gives the benchmark a real-world foundation rooted in active protocol development.

Early test results show a clear performance gap between AI models. GPT-5.3-Codex scored 72.2% in exploit mode, compared with 31.9% for GPT-5, a gap of roughly 40 percentage points. However, coverage for vulnerability detection and patching tasks remains incomplete across both models.

Recent DeFi Attacks Add Urgency to AI-Driven Security Tools

The release of EVMbench follows a series of high-profile smart contract attacks in the DeFi space. Moonwell, a DeFi lending protocol, suffered an exploit this month involving vulnerable code written with AI assistance.

The incident raised fresh concerns about AI-generated code entering production environments without sufficient review.

Around the same time, CrossCurve, a cross-chain liquidity protocol, was compromised through a smart contract vulnerability.

The attack resulted in losses of roughly $3 million across multiple networks. Both incidents point to the growing financial risk tied to unaudited contract code.

OpenAI addressed the broader stakes directly in its blog post. “As AI agents improve at reading, writing, and executing code, it becomes increasingly important to measure their capabilities in economically meaningful environments,” the company wrote. The statement reinforces why a structured benchmark like EVMbench is being introduced now.

Late last year, Anthropic published a separate report on this topic. The report argued that AI agents have already advanced enough to identify smart contract vulnerabilities independently.

As a result, the cost of crypto exploits could decrease over time as AI-powered auditing becomes standard practice.

The post OpenAI and Paradigm Launch EVMbench to Test AI Agents Against Smart Contract Vulnerabilities appeared first on Blockonomi.

Source: https://blockonomi.com/openai-and-paradigm-launch-evmbench-to-test-ai-agents-against-smart-contract-vulnerabilities/
