
Vishal Misra: Transformers learn correlations, not causations, the significance of in-context learning, and the role of Bayesian updating in AI

2026/04/11 10:00
11 min read


Understanding transformers’ limitations reveals why AI must make a crucial shift from learning correlation to understanding causation.

Key Takeaways

  • Transformers primarily learn correlations, not causations, which limits their path to true intelligence; the transition from correlation to causation remains a significant hurdle for AGI.
  • Large language models generate text by predicting the next token from a probability distribution.
  • The context provided in a prompt significantly influences a model’s output, making prompt selection important.
  • Language models operate on sparse matrices: most token combinations are nonsensical, and filtering them out improves efficiency.
  • In-context learning lets LLMs solve problems in real time from examples, and it closely resembles Bayesian updating: probabilities are adjusted as new evidence arrives.
  • Domain-specific languages (DSLs) can reduce complex database queries to natural language.
  • The debate between Bayesian and frequentist approaches shapes how new machine-learning models are received.
  • The Bayesian wind tunnel concept offers a controlled environment for testing and evaluating machine-learning architectures.
  • Understanding the mechanics of LLMs is crucial for applying them effectively.

Guest intro

Vishal Misra is Professor of Computer Science and Electrical Engineering and Vice Dean of Computing and AI at Columbia University’s School of Engineering. He returns to the a16z Podcast to discuss his latest research revealing how transformers in LLMs update predictions in a precise, mathematically predictable manner as they process new information. His work highlights the gap to AGI, emphasizing the need for continuous post-training learning and causal understanding over pattern matching.

Understanding transformers and LLMs

  • “LLMs primarily learn correlations rather than causations, which limits their intelligence.” — Vishal Misra
  • “Achieving AGI requires models that can learn causations, not just correlations.” — Vishal Misra
  • “LLMs generate text by constructing a probability distribution for the next token.” — Vishal Misra
  • Understanding the mechanics of LLMs is crucial for leveraging their applications effectively.
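The next-token mechanism described above can be sketched directly: a softmax turns raw scores into a probability distribution over candidate tokens, and generation samples from that distribution. This is a minimal illustration, not any particular model’s implementation; the token scores are invented.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Turn raw scores (logits) into a probability distribution via
    softmax, then sample one token from that distribution."""
    rng = rng or random.Random(0)
    scaled = [x / temperature for x in logits.values()]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]   # subtract max for stability
    total = sum(exps)
    probs = {tok: e / total for tok, e in zip(logits, exps)}
    r = rng.random()
    cum = 0.0
    for tok, p in probs.items():
        cum += p
        if r < cum:
            return tok, probs
    return tok, probs

# Invented scores for the continuation of "The cat sat on the ..."
tokens = {"mat": 2.0, "dog": 0.5, "moon": -1.0}
choice, dist = sample_next_token(tokens)
```

Lowering `temperature` sharpens the distribution toward the highest-scoring token; raising it flattens the distribution and makes sampling more diverse.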

The role of context in language models

  • “The behavior of language models is influenced by the prior context provided in prompts.” — Vishal Misra
  • “Language models operate on a sparse matrix where many combinations of tokens are nonsensical.” — Vishal Misra
  • Contextual relevance in LLMs highlights the importance of prompt selection.
  • Sparse matrices enhance efficiency by filtering out irrelevant token combinations.
  • The context provided can drastically change a model’s output, so understanding how models generate text from input prompts is essential.
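The sparsity point can be made concrete with a toy bigram model: a full table over a vocabulary V would have |V|² cells, but almost all token pairs never occur, so only the nonzero entries need to be stored. The corpus here is invented for illustration.

```python
from collections import defaultdict

def build_bigram_model(corpus):
    """Count which token actually follows which. A nested dict stores
    only the pairs that occur; every absent pair is implicitly zero."""
    counts = defaultdict(lambda: defaultdict(int))
    toks = corpus.split()
    for a, b in zip(toks, toks[1:]):
        counts[a][b] += 1
    return counts

def next_token_probs(counts, token):
    """Normalize one row of the sparse table into probabilities."""
    row = counts.get(token, {})
    total = sum(row.values())
    return {t: c / total for t, c in row.items()} if total else {}

model = build_bigram_model("the cat sat on the mat the cat ran")
probs = next_token_probs(model, "cat")
```

Here the vocabulary has 6 tokens (36 possible pairs) but only 7 pairs ever occur; a query for a token never seen in first position simply returns the empty distribution.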

In-context learning and real-time problem solving

  • “In-context learning allows LLMs to learn and solve problems in real-time.” — Vishal Misra
  • “In-context learning resembles Bayesian updating, adjusting probabilities with new evidence.” — Vishal Misra
  • LLMs process and learn from new information through examples, and this adaptability showcases what the models can do.
  • This mechanism is crucial for understanding the capabilities of LLMs.
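A minimal sketch of in-context learning as probability adjustment, under the assumption that the model maintains beliefs over candidate “tasks” implied by the prompt. The two hypotheses and the likelihood numbers below are invented purely for illustration:

```python
def update_posterior(prior, likelihoods):
    """One Bayesian update: multiply the prior by the likelihood of the
    new evidence under each hypothesis, then renormalize."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Hypothetical tasks a model might infer from few-shot examples.
posterior = {"uppercase": 0.5, "reverse": 0.5}

# Each in-context example ("abc" -> "ABC") is far more likely under the
# 'uppercase' hypothesis; 0.9 / 0.1 are invented illustration values.
examples = [{"uppercase": 0.9, "reverse": 0.1}] * 3
for lik in examples:
    posterior = update_posterior(posterior, lik)
```

After three consistent examples the posterior concentrates almost entirely on the correct task, mirroring how a few-shot prompt rapidly sharpens an LLM’s behavior.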

Domain-specific languages and data accessibility

  • “Domain-specific languages (DSLs) convert natural language queries into a processable format.” — Vishal Misra
  • DSLs simplify complex database queries by letting users phrase them in natural language, which enhances how users interact with data.
  • This approach offers a technical solution to common data-accessibility problems and showcases innovation in applying AI to specific applications.
  • Understanding the challenges of querying complex databases is essential context for why DSLs matter.
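As a toy illustration of the DSL idea (not Misra’s actual system), a single restricted query form can be translated mechanically into SQL. The grammar, table name, and query below are all hypothetical:

```python
import re

# Hypothetical one-rule grammar: "show <column> where <column> is <value>"
PATTERN = re.compile(r"show (\w+) where (\w+) is (\w+)")

def dsl_to_sql(query, table="events"):
    """Translate one restricted natural-language form into SQL.
    A constrained DSL keeps the space of outputs small enough to
    validate, unlike free-form text-to-SQL generation."""
    m = PATTERN.fullmatch(query.strip().lower())
    if not m:
        raise ValueError(f"not in the DSL: {query!r}")
    select_col, where_col, value = m.groups()
    return f"SELECT {select_col} FROM {table} WHERE {where_col} = '{value}'"

sql = dsl_to_sql("show amount where region is emea")
```

The design trade-off is the usual one for DSLs: expressiveness is deliberately limited so that every accepted input maps to a well-formed, checkable query, and everything else is rejected explicitly.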

Bayesian updating and statistical approaches in AI

  • “In-context learning in language models resembles Bayesian updating.” — Vishal Misra
  • “The distinction between Bayesian and frequentist approaches affects AI model perceptions.” — Vishal Misra
  • Understanding Bayesian inference is crucial for grasping how LLMs process information.
  • The debate between the two statistical camps affects how new research is received.
  • Bayesian updating provides a clear mechanism for in-context learning, linking a well-established statistical methodology with modern AI.
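The Bayesian updating invoked here can be shown in its simplest conjugate form, a Beta-Bernoulli update; the uniform prior and the observation counts below are arbitrary illustration values, not anything from the episode:

```python
def beta_update(alpha, beta, successes, failures):
    """Conjugate Beta-Bernoulli update: each observation simply adds to
    the prior pseudo-counts, moving the posterior toward the data --
    the same shape of adjustment the episode attributes to in-context
    learning."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Posterior mean of the Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

a, b = 1.0, 1.0                       # uniform Beta(1, 1) prior
a, b = beta_update(a, b, successes=8, failures=2)
```

Starting from total ignorance, 8 successes and 2 failures pull the estimated success probability from 0.5 to 0.75; more evidence would pull it further toward the empirical rate of 0.8.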

The Bayesian wind tunnel and model testing

  • “The Bayesian wind tunnel concept allows for testing machine learning architectures.” — Vishal Misra
  • Like a wind tunnel in aerospace, it provides a controlled environment for evaluating models, which makes assessments more reliable.
  • Architectures such as transformers, Mamba, LSTMs, and MLPs can all be tested within this framework.
  • The Bayesian wind tunnel offers a novel framework for evaluating and improving machine-learning models.
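A toy version of the wind-tunnel idea, under our own simplifying assumption (not necessarily Misra’s exact setup): generate data from a process whose exact Bayesian answer is known in closed form, so any architecture’s predictions can be scored against ground truth rather than against other models.

```python
import random

def exact_posterior_predictive(seq):
    """For coin flips with a uniform prior on the bias, the exact
    next-flip probability is Laplace's rule: (heads + 1) / (n + 2).
    Because ground truth is known in closed form, predictions can be
    scored against it -- the 'wind tunnel'."""
    return (sum(seq) + 1) / (len(seq) + 2)

def score_predictor(predict_fn, n_trials=200, seq_len=20, seed=0):
    """Mean absolute gap between a predictor and the exact posterior
    over many sequences drawn with random hidden biases."""
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(n_trials):
        p = rng.random()                               # hidden coin bias
        seq = [int(rng.random() < p) for _ in range(seq_len)]
        gap += abs(predict_fn(seq) - exact_posterior_predictive(seq))
    return gap / n_trials

# A naive frequency-based predictor stands in for the model under test.
naive = lambda seq: (sum(seq) + 0.5) / (len(seq) + 1)
error = score_predictor(naive)
```

In a real study the `naive` stand-in would be replaced by a trained transformer, Mamba, LSTM, or MLP, and the gap to the exact posterior would quantify how faithfully each architecture performs Bayesian updating.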
Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.



© Decentral Media and Crypto Briefing® 2026.

Source: https://cryptobriefing.com/vishal-misra-transformers-learn-correlations-not-causations-the-significance-of-in-context-learning-and-the-role-of-bayesian-updating-in-ai-ai-a16z/

