Notion Slashes AI Embedding Costs 80% After Ditching Spark for Ray

2026/04/10 00:48

James Ding Apr 09, 2026 16:48

Notion migrated from Spark on EMR to Ray, cutting embedding costs 80% and improving query latency 10x. Uber and Salesforce shared similar AI infrastructure wins.

Notion has slashed its AI embedding pipeline costs by more than 80% after migrating from Apache Spark to Ray, the distributed computing framework backed by Anyscale. The productivity software company also achieved 10x improvements in query latency while consolidating three separate jobs per region into one.

The migration details emerged at Ray Day Seattle on April 9, 2026, where ML engineers from Notion, Uber, Salesforce, and Apple shared hard-won lessons about scaling AI infrastructure.

What Notion Actually Changed

Mickey Liu, a software engineer on Notion's search platform team, walked through the overhaul. Their original setup used a three-step Spark pipeline running on Amazon EMR: data chunking, third-party API calls for embedding generation, and writes to a vector store.

The pain points were predictable but severe. Double compute costs. Third-party API rate limits throttling throughput. Debugging nightmares when failures occurred across tools—driver and executor logs weren't even persisted in YARN.

The new architecture streams Kafka data directly into a Ray cluster handling CPU chunking, GPU embedding generation, and vector store writes in a single pipeline. No intermediate S3 handoffs. What started as the backend for a Q&A feature in 2023 now powers all of Notion AI and custom agents.
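
To make the single-pipeline shape concrete, here is a minimal sketch in plain Python. It is not Ray code and not Notion's implementation. The three stages are chained as generators, so each record flows from chunking to embedding to the store with no intermediate write between them. The names `chunk_text`, `fake_embed`, and `InMemoryVectorStore`, the chunk size, and the hash-based "embedding" are all hypothetical stand-ins; in a real deployment the input would come from a Kafka consumer and the embedding stage would run a model on GPU workers.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple

CHUNK_WORDS = 4  # hypothetical chunk size, kept small for the demo


def chunk_text(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Stage 1 (CPU work): split each (doc_id, text) record into word windows."""
    for doc_id, text in records:
        words = text.split()
        for start in range(0, len(words), CHUNK_WORDS):
            chunk_id = f"{doc_id}:{start // CHUNK_WORDS}"
            yield chunk_id, " ".join(words[start:start + CHUNK_WORDS])


def fake_embed(chunks: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, List[float]]]:
    """Stage 2 (GPU work in production): stand-in that hashes text into 4 floats."""
    for chunk_id, text in chunks:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        yield chunk_id, [byte / 255.0 for byte in digest[:4]]


class InMemoryVectorStore:
    """Stage 3: hypothetical stand-in for a real vector store client."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}

    def write(self, embedded: Iterable[Tuple[str, List[float]]]) -> int:
        count = 0
        for chunk_id, vector in embedded:
            self.vectors[chunk_id] = vector
            count += 1
        return count


def run_pipeline(records: Iterable[Tuple[str, str]], store: InMemoryVectorStore) -> int:
    # Generators are lazy: nothing is materialized between the stages.
    return store.write(fake_embed(chunk_text(records)))


# Stand-in for a stream of update events (a Kafka topic in the article).
events = [("doc-a", "one two three four five six"), ("doc-b", "seven eight")]
store = InMemoryVectorStore()
written = run_pipeline(events, store)
```

The point of the sketch is the composition in `run_pipeline`: the stages share one process boundary and one failure domain, which is the property the article credits for removing the S3 handoffs and simplifying debugging.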

Uber and Salesforce Report Similar Gains

Uber's Peng Zhang detailed how their Michelangelo ML platform evolved from TensorFlow/Horovod to Ray with PyTorch. The standout move: separating CPU data-loading nodes from GPU training nodes in a heterogeneous cluster design. Result? GPU utilization jumped 20%, and training time dropped roughly 50% in select pipelines.
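
The decoupling idea can be illustrated with a single-process analogy in plain Python. This is only a sketch of the producer/consumer pattern, not Uber's design and not Ray code: loader threads play the role of the CPU data-loading nodes, and the `train` loop plays the role of a GPU training node that only ever consumes ready batches. The function names, the doubling "preprocessing", and the queue size are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks the end of one loader's shard


def loader(shard, out_q):
    """CPU-side worker: preprocess a shard and enqueue ready-to-train batches."""
    for item in shard:
        out_q.put([x * 2 for x in item])  # stand-in for decode/augment work
    out_q.put(SENTINEL)


def train(in_q, num_loaders):
    """Trainer-side consumer: pulls prepared batches and does no preprocessing."""
    finished = 0
    steps = 0
    total = 0
    while finished < num_loaders:
        batch = in_q.get()
        if batch is SENTINEL:
            finished += 1
            continue
        total += sum(batch)  # stand-in for a training step
        steps += 1
    return steps, total


shards = [[[1, 2], [3, 4]], [[5, 6]], [[7, 8], [9, 10]]]
batch_queue = queue.Queue(maxsize=4)  # bounded, so loaders cannot run far ahead
threads = [threading.Thread(target=loader, args=(s, batch_queue)) for s in shards]
for t in threads:
    t.start()
steps, total = train(batch_queue, num_loaders=len(shards))
for t in threads:
    t.join()
```

Because loading and training scale independently in this arrangement, the number of loaders can be raised until the consumer is never left waiting on an empty queue, which is the intuition behind the reported utilization gain.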

Salesforce tackled a different beast—summarizing documents up to 200,000 tokens long (roughly a short novel) with P95 latency under 15 seconds. Their team used Ray to chunk documents and run parallel inference across a distributed actor pool with vLLM, then merge results. They landed on 1-2 GPU data parallelism as the sweet spot after running scaling experiments directly on Ray.
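
The chunk, parallel-inference, and merge flow described above follows a map-reduce shape, sketched below in plain Python. This is an illustration under stated assumptions rather than Salesforce's pipeline: a thread pool stands in for the Ray actor pool, `summarize_chunk` is a hypothetical placeholder for a vLLM call, the merge step is simple concatenation where a real system would likely run a final summarization pass, and the 8,000-token chunk size is made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_tokens(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Cut the document into consecutive fixed-size token chunks."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def summarize_chunk(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for model inference: keep the first 3 tokens."""
    return chunk[:3]


def summarize_document(
    tokens: List[str], chunk_size: int, workers: int
) -> Tuple[List[str], int]:
    chunks = split_tokens(tokens, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the merge stays ordered.
        partials = list(pool.map(summarize_chunk, chunks))
    merged = [token for part in partials for token in part]
    return merged, len(chunks)


# A synthetic document at the 200,000-token upper bound cited in the article.
document = [f"t{i}" for i in range(200_000)]
summary, num_chunks = summarize_document(document, chunk_size=8_000, workers=2)
```

With these assumed numbers the document splits into 25 chunks, so end-to-end latency is governed by how many chunks each worker must process in sequence plus the merge. That is the kind of trade-off the scaling experiments mentioned above would measure.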

Why This Matters Beyond These Companies

Robert Nishihara, Ray's co-creator and Anyscale co-founder, opened the event by framing the core problem: AI infrastructure keeps getting harder. Multimodal data processing, reinforcement learning workloads, and multi-node LLM inference are pushing existing tools past their limits.

Every speaker landed on the same conclusion from different angles—their previous tooling ran out of road.

Apple engineers Charlie Chen and Haocheng Bian highlighted foundation model training challenges: massive unstructured data, billion-plus parameters, and sparse architectures like Mixture of Experts. Traditional engines fail because data pipelines and training frameworks run in separate environments with no shared context.

What's Next

Ray Day Seattle kicked off Anyscale's 2026 "Ray on the Road" tour—eight cities across three countries. The company is also running invite-only customer roundtables at each stop to preview their product roadmap.

For teams hitting similar walls with Spark or other distributed frameworks, Notion's full technical writeup is available on their engineering blog under "Two Years of Vector Search at Notion." The 80% cost reduction and 10x latency improvement offer a concrete benchmark for anyone evaluating similar migrations.
