LLMs and the Causal Conundrum: Inferring Reality or Mere Illusion?
Updated: Oct 19
In the burgeoning field of generative AI, Large Language Models (LLMs) stand out for their ability to engage in "meaningful" human interaction. Their prowess owes much to modern processing power and a carefully curated compendium of texts, from literature to scientific papers. While still nascent, the notion that LLMs can achieve some form of "causal inference" has divided experts. On one side is Geoffrey Hinton, who argues that these models are already showing signs of causal reasoning and cautions against uncontrolled development. On the opposing bench is Yann LeCun, who insists that robust evidence is lacking and advocates open-sourcing AI to ensure its safety, a position that aligns with the open-source strategy of his employer, Meta, for its LLaMA model. This mirrors Google's earlier playbook with Android in its battle against Apple's iPhone. At stake is whether LLMs' "meta-causal representations" are synonymous with true "causal inference." This week, we delve deep into this schism, while also taking a glance at the gold rush among AI engineers, reminiscent of the erstwhile "golden boys" of the financial sector.
Last week's discourse pivoted on the contentious issue of AI sentience, questioning whether artificial entities, constrained by limited sensory capabilities, could ever grasp human-like "sentience." This week, however, we delve into an even murkier intellectual swamp: the quest for "causal inference" in Large Language Models (LLMs).
In the world of linear regression, linking independent and dependent variables might offer insights, but does it unveil causality? Herein lies a nuanced challenge. Correlation is not causation, a maxim as old as statistics itself. Traditional statistical engines, LLMs included, excel at flagging correlations, but inferring causation is another matter. It typically demands rigorous accounting for confounding variables: hidden factors that influence both the presumed cause and the outcome, creating the illusion of causality where none exists.
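The role of a confounder can be made concrete with a small simulation. The sketch below is illustrative only: the variable names (z, x, y), the coefficients, and the sample size are assumptions chosen for the example, not figures from any study. A hidden factor z drives both x and y, while x has no effect on y at all.

```python
import numpy as np


def ols_slope(y, *regressors):
    """Least-squares coefficient on the first regressor (an intercept is included)."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1]


def simulate_confounding(n=20_000, seed=0):
    """Hypothetical data: z causes both x and y; x has zero causal effect on y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                # confounder
    x = 2.0 * z + rng.normal(size=n)      # "cause" candidate, driven by z
    y = 3.0 * z + rng.normal(size=n)      # outcome, driven by z only
    naive = ols_slope(y, x)               # regress y on x alone
    adjusted = ols_slope(y, x, z)         # regress y on x while controlling for z
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate_confounding()
    print(f"naive slope of y on x:       {naive:.2f}")
    print(f"slope after adjusting for z: {adjusted:.2f}")
```

Under these assumptions the naive slope should land near 1.2 (the theoretical value is 6/5) even though the true effect is zero, and it should shrink toward zero once z enters the regression. The adjustment is only possible here because z is observed, which is rarely guaranteed in practice.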
Where controlled, randomized experiments (the gold standard for establishing causality) are impractical, causal AI methods such as instrumental variables, counterfactual reasoning, or causal Bayesian networks offer alternative pathways. These methods hinge on underlying assumptions that demand validation; if those assumptions fail, their trustworthiness evaporates.
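As one illustration of these alternatives, here is a minimal sketch of instrumental-variable estimation by two-stage least squares on simulated data. Everything in it is an assumption made for the example: the data-generating process, the true effect of 2.0, and the premise that the instrument z influences y only through x and is independent of the unobserved confounder u. Those are exactly the kind of assumptions that would need validation in a real study.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares intercept and slope of y regressed on x."""
    design = np.column_stack([np.ones(len(x)), x])
    coefs = np.linalg.lstsq(design, y, rcond=None)[0]
    return coefs[0], coefs[1]


def simulate_iv(n=50_000, seed=0):
    """Hypothetical data with an unobserved confounder u and a valid instrument z."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                       # unobserved confounder
    z = rng.normal(size=n)                       # instrument, independent of u
    x = z + u + rng.normal(size=n)               # treatment depends on z and u
    y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # true causal effect of x is 2.0

    # Ordinary regression of y on x is biased because u is left out.
    _, ols = fit_line(x, y)

    # Stage 1: keep only the part of x that is explained by the instrument.
    a, b = fit_line(z, x)
    x_hat = a + b * z
    # Stage 2: regress y on that predicted part of x.
    _, iv = fit_line(x_hat, y)
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate_iv()
    print(f"OLS estimate:  {ols:.2f}")
    print(f"2SLS estimate: {iv:.2f}")
```

In this setup the ordinary regression should overstate the effect (theoretically about 3.0), while the two-stage estimate should sit close to the true 2.0. If z also affected y directly, or were correlated with u, the second estimate would be biased as well, which is why the method is only as trustworthy as its assumptions.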
Consider LLMs: even if these advanced systems could untangle the Gordian knot of causality, their "black box" architecture presents an obstacle to human interpretation. In high-stakes domains, such as healthcare, this is more than a mere academic concern. There, a statistical link between a drug and patient recovery rates is insufficient; clinical trials and expert vetting are non-negotiable.
The Classic Debate
The academic wrangling over "causal inference" is hardly a fresh debate. One could trace its modern contours back to 1994 with the seminal work "Designing Social Inquiry: Scientific Inference in Qualitative Research," penned by the trio of Gary King, Robert Keohane, and Sidney Verba, collectively known as "KKV." At the time, KKV leveled a critique against the emerging field of qualitative research (QUAL). Their contention was that QUAL, often hobbled by limited sample sizes and confined to case studies, suffered from a dearth of robust causal inference. The implication was clear: QUAL needed to adopt the rigors of its quantitative cousin to gain academic respectability.
Yet the counter-argument came fast and fierce, notably in the form of "After KKV: The New Methodology of Qualitative Research." This rejoinder argued that KKV had missed the point by focusing on Data-Set Observations (DSOs) and an overreliance on deductive reasoning. QUAL, the argument went, operates in a higher dimension, one of Causal-Process Observations (CPOs), be they Independent Variable CPOs, Mechanism CPOs, or Auxiliary Outcome CPOs. These CPOs capture nuances that a DSO-based approach cannot fathom. Consequently, QUAL can both "build theory" and "test theory," showcasing a versatility and depth that quantitative methods (QUAN) struggle to match.
What "Language" Really Is
The debate over causal inference in the academic sphere has echoes in the rapidly evolving realm of artificial intelligence, especially in the subfield dedicated to Large Language Models (LLMs). However, the stakes are higher here, and the problem is further complicated by an array of variables often too labyrinthine for even the most advanced models to navigate. In the real world, capturing "every" variable and its interrelationships is a Sisyphean task. In this context, human language serves as a distilled form of causal inference. It is a simplifying lens: a reductionist, yet remarkably efficient, framework for conceptual communication.
The same holds for human cognition: the brain relies on selective attention mechanisms, prioritizing immediate and crucial information to prevent data overflow. This filtering is essential for managing the flood of sensory data, ensuring that only what is "necessary" for causal inference is processed.
Large Language Models (LLMs) may well be stumbling upon a sort of "concise causal inference." Trained on enormous datasets, these models generate responses by identifying patterns, some of which may quietly echo causal relationships, even if they aren't explicitly designed to do so. This emerging capability is spotlighted in a recent arXiv paper, "Causal Reasoning and Large Language Models: Opening a New Frontier for Causality," which posits LLMs as potential game-changers in the arena of causal analysis.
Yet, there are caveats to consider. For starters, LLMs should be viewed as complements to existing causal methods, rather than as replacements. They offer a fresh perspective, yes, but traditional approaches continue to hold their own unique value.
Meta-Causal Representation: Language as a "Concise Instrument" for LLMs' Indirect Approach to Reality—Yet Calibration with "Ground Truth" Remains Essential for Model Efficiency.
It's the domain knowledge encapsulated within LLMs that often empowers them to venture into causal analysis. This rich tapestry of human insight is an invaluable asset for any form of causal scrutiny. But let's not confuse potential with present capabilities. As revolutionary as LLMs may appear, empirical validation is required to gauge the reliability and accuracy of their generated causal inferences in real-world applications.
Furthermore, while reaching the "Holy Grail" of true causal understanding is an aspiration worth its weight in computational gold, we must proceed with caution. Navigating the complex labyrinth of causal inference involves a careful calibration of experimental setups, a consideration of potential confounding variables, and an intimate understanding of domain-specific nuances.
AI Industrial Monitoring
In a sector marked by eyebrow-raising compensation figures, AI researchers are in a league of their own. According to Rora's 2023 Salary Negotiation Report, the pay packages at leading AI firms defy gravity. OpenAI takes the lead with an annual remuneration of $865,000, a 30% increase from the previous year's $665,000. Trailing closely are Anthropic with $855,000 and Inflection at $825,000, the latter of which hasn't provided comparative data. Tesla follows with an 11% year-on-year increase, bringing the annual compensation to $780,000. Amazon's AI researchers enjoy a 38% hike, sitting at $719,000, while Google Brain trails at $695,000, up 17% from last year.
Despite a grim landscape of tech layoffs, numbering in the hundreds of thousands in 2023, the market for AI experts is robust. New entrants like Anthropic and Inflection are flexing their financial muscles, raising significant funding to poach the best talent. Established tech giants like Google and Meta may have slowed their hiring compared with previous years, but they're far from stagnant. Meanwhile, the surge of venture-backed AI startups, particularly since the advent of models like ChatGPT, shows no sign of slowing down. Venture capitalists, armed with ample reserves, appear undeterred by macroeconomic conditions. As reported by the New York Times, the hunger for AI startups has inflated valuations to an extent that dwarfs the "everything bubble" of 2021. Even industries peripheral to tech (finance, healthcare, and science) are joining the race, each vying for machine learning experts to apply cutting-edge technology in their respective fields.
In the final analysis, the AI sector emerges as a capital-intensive juggernaut, where the towering annual compensations for researchers (as well as AI engineers) are but the tip of the iceberg. When one considers the colossal financial commitments required for data acquisition, model pretraining, and infrastructure, the barriers to entry seem increasingly insurmountable for fledgling startups. Unless these nascent companies can secure a significant capital infusion, their odds of catching up with established industry titans are slim. In this gold rush, it appears that not everyone will strike it rich, raising questions about the democratization of AI and the concentration of intellectual and financial capital.