Market Pulse

AI in Theory vs AI in Practice

20 March 2026 | AIMG

There is a growing volume of discussion around “AI”, but much of it diverges at a very basic level: we are not always talking about the same thing.

Some of that is expected. AI, as a field, has existed for decades and spans a wide range of approaches – from highly constrained, domain-specific machine learning systems to newer transformer-based models that can generate language, code, and images. Large language models are only one part of that landscape, but they currently dominate the conversation.

My perspective here is not academic. I am not an AI researcher, and I am not trying to be one. I build and operate systems in real environments, with real constraints, for clients who are less interested in theory and more interested in whether something actually works.

These systems are being interpreted through different lenses across research, industry, and implementation contexts. From the implementation side, a consistent pattern emerges.

The conversation about AI often mixes together very different types of systems, expectations, and capabilities. Academic research, product development, and practical implementation are all valid in their own contexts – but when they are combined without clear boundaries, the result is confusion.

That confusion shows up in predictable ways.

At one end, there is an assumption that modern AI systems can act as general-purpose reasoning engines, capable of replacing complex human roles across domains. At the other, there is a dismissal of these systems as unreliable or overhyped. Most views sit somewhere in between, but often without a clear model of what the underlying systems actually are.

What we are seeing is not unique to AI. It follows a familiar pattern described in diffusion of innovations theory, where technologies spread unevenly across groups, each forming its own interpretation of what the technology is and what it can realistically do.

Early adopters tend to explore possibilities and extend capability, sometimes beyond what is currently practical. Later groups, encountering the same technology without that context, may either overestimate or dismiss it entirely. Between these positions sits a gap – not just in knowledge, but in expectations.

Much of the current conversation around AI sits inside that gap.

In practice, the distinctions between these different kinds of systems matter.

There are machine learning systems that operate within well-defined domains, trained on structured data, with measurable and repeatable outcomes. These systems have been in use for years in areas such as fraud detection, optimisation, and risk analysis. They are engineered, tested, and understood within clear boundaries.

There are also large language models – probabilistic systems built on transformer architectures – that generate responses based on patterns in large, partially unverified datasets. These systems are highly flexible and capable of producing coherent outputs, but coherence is not the same as correctness. When they encounter uncertainty, they do not stop; they complete.

Then there are what are often described as “agents”. In most cases, these are orchestrated workflows: prompts, tools, APIs, and control logic combined into multi-step processes. They can appear autonomous, but they do not originate intent or form independent goals. They execute based on signals and constraints defined by the system around them.
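To make that concrete, the sketch below shows roughly what such an "agent" looks like under the hood: a plain control loop in which the model proposes an action and the surrounding code decides whether, and how, it actually runs. Every name in it (call_model, TOOLS, the step limit) is an illustrative placeholder rather than any specific framework's API.

    # Minimal, illustrative sketch of an "agent" as an orchestrated workflow.
    # All names here are placeholders. The point is that the control loop,
    # not the model, defines the allowed actions and when the process stops.

    def lookup_order(order_id: str) -> str:
        # Placeholder tool: in practice this would call a real API or database.
        return f"Order {order_id}: shipped on 18 March"

    TOOLS = {"lookup_order": lookup_order}

    def call_model(context: str) -> dict:
        # Placeholder for a language model call, returning a canned plan so the
        # sketch runs without any external service.
        if "Tool result" in context:
            return {"action": "finish", "argument": "The order shipped on 18 March."}
        return {"action": "lookup_order", "argument": "A-1042"}

    def run_agent(task: str, max_steps: int = 3) -> str:
        context = task
        for _ in range(max_steps):              # step limit set by the workflow, not the model
            decision = call_model(context)
            if decision["action"] == "finish":
                return decision["argument"]
            tool = TOOLS.get(decision["action"])
            if tool is None:                    # constraint enforced by the surrounding code
                return "Unrecognised action; stopping."
            context += "\nTool result: " + tool(decision["argument"])
        return "Step limit reached without an answer."

    print(run_agent("Where is order A-1042?"))

Swap the canned call_model for a real model and the shape stays the same: the orchestration layer still owns the tool list, the step limit, and the stopping condition.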

All of these are useful. None of them are interchangeable.

Much of the current confusion comes from treating them as if they were.

The recent surge in interest has been driven largely by large language models, particularly once they became accessible through conversational interfaces. That shift made interaction feel natural, fluid, and at times surprisingly capable. It also created the impression that these systems possess a form of general intelligence.

In practice, they do not.

They do not have a grounded model of the world. They do not verify their own outputs. They do not know what they do not know. They generate plausible responses based on patterns, and those responses can be right, wrong, or somewhere in between.

This is not a flaw in the sense of something broken. It is a property of how they work.

The issues that have emerged – hallucinations, prompt injection, governance concerns – are not edge cases. They are natural consequences of using probabilistic systems in contexts that assume reliability. Many of the current solutions, such as retrieval augmentation or multi-step orchestration, are attempts to introduce constraints and verification into systems that do not inherently provide them.
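As a rough illustration of what "introducing constraints and verification" means in practice, here is a hypothetical retrieval-augmentation sketch. The retrieval, generation, and checking steps all live outside the model; the function names and the naive keyword matching are stand-ins for real components, not a specific library.

    # Illustrative sketch of retrieval augmentation as an external constraint.
    # The document store, retrieve(), and generate() are simplified placeholders;
    # grounding and checking happen around the model, not inside it.
    import re

    DOCUMENTS = [
        "Refund requests must be submitted within 30 days of purchase.",
        "Standard delivery takes 3 to 5 working days.",
    ]

    def tokens(text: str) -> set:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def retrieve(question: str, docs: list) -> list:
        # Naive keyword overlap, standing in for a real vector search.
        q = tokens(question)
        return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:1]

    def generate(prompt: str) -> str:
        # Placeholder for a model call; returns a canned answer so the sketch runs.
        return "Refund requests must be submitted within 30 days of purchase."

    def answer_with_retrieval(question: str) -> str:
        sources = retrieve(question, DOCUMENTS)
        prompt = ("Answer using only these sources:\n"
                  + "\n".join(sources) + "\nQuestion: " + question)
        draft = generate(prompt)
        # Verification lives outside the model: a crude check that the draft
        # is actually supported by a retrieved source.
        if not any(draft in s or s in draft for s in sources):
            return "No supported answer found in the sources."
        return draft

    print(answer_with_retrieval("Within how many days must a refund be requested?"))

The same logic applies to more sophisticated pipelines: the retrieval step narrows what the model sees and the check afterwards decides whether its output is trusted, but the generation step in the middle remains probabilistic.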

That work is necessary. But it does not change the underlying nature of the system.

A similar point applies to the current discussion around “agents”. The term suggests autonomy and decision-making capability. In reality, most implementations today are structured automation systems that use language models as components. They can chain actions together, call tools, and respond dynamically, but they do not independently decide what they should be doing.

This distinction matters because language shapes expectation.

When systems are described in ways that imply more capability than they actually have, they tend to be used in ways that exceed their limits. When they inevitably fall short, the response often swings in the opposite direction, toward dismissal.

Both reactions come from the same place: a lack of shared understanding.

There is also a practical dimension to this.

In real-world environments, systems have to be maintained, monitored, and trusted to behave within acceptable limits. It is not enough for something to work most of the time, or to produce outputs that look convincing. There needs to be a clear understanding of where it works, where it fails, and how those failures are handled.

This is where the gap between theory and implementation becomes most visible.

From an implementation perspective, large language models are not sources of truth. They are tools for generating possibilities. They can accelerate exploration, assist with synthesis, and reduce the time taken to work through familiar problems. But they do not remove the need for reasoning, validation, or accountability.

Those responsibilities sit outside the model.

A more effective way to use these systems is to treat them as part of a broader process. The model generates and expands. The surrounding system – whether human or technical – provides structure, verification, and decision-making.

Used this way, they are valuable. Used as substitutes for those functions, they become risky.
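One way to picture that division of labour is the sketch below, where the model only drafts and a separate, deterministic layer decides whether the draft is accepted or escalated to a person. The function names and the policy rules are invented for illustration, not drawn from any particular system.

    # Hypothetical sketch of the "model generates, surrounding system decides" pattern.
    # draft_reply() stands in for a model call; the validation rules and the
    # escalation path are the parts the organisation actually controls.

    def draft_reply(ticket: str) -> str:
        # Placeholder generation step; a real system would call a model here.
        return "You are eligible for a full refund, processed within 30 days."

    def passes_policy(draft: str) -> bool:
        # Deterministic checks defined outside the model, e.g. commitments
        # the organisation does not allow automated replies to make.
        banned = ["guarantee", "full refund"]
        return not any(term in draft.lower() for term in banned)

    def handle(ticket: str) -> str:
        draft = draft_reply(ticket)
        if passes_policy(draft):
            return draft                       # model output accepted within defined limits
        return "ESCALATE: " + draft            # decision-making stays with people

    print(handle("Customer asking about a refund for order A-1042"))

Here the model accelerates the drafting, but acceptance, rejection, and escalation remain explicit decisions made by the surrounding system.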

I have been through multiple technology waves over the years. Each one arrives with a similar pattern: a rapid expansion of possibility, followed by a period where reality catches up with expectation. This is not unusual, and it is not a reason for pessimism.

Equally, it is not a reason for blind optimism.

From a practical standpoint, the constraint is always the same: resources are finite. The objective, whether acknowledged explicitly or not, is to match a requirement to a solution as efficiently as possible. That is what organisations and individuals optimise for over time. Deviations from that path may be framed as innovation or experimentation, but they come with cost, and sometimes that cost is significant.

The same principle applies here.

Generative systems and other machine learning approaches are tools. Some are highly reliable within defined boundaries. Others are flexible but inherently uncertain. Understanding where each fits, and using them accordingly, is what determines whether they deliver value.

There is an old adage that a poor craftsman blames his tools. It still applies.

These systems can be powerful when used well, and problematic when used poorly. The difference does not sit in the tool itself, but in how it is understood, applied, and validated.

Over time, the terminology will likely become more precise, and expectations will align more closely with reality. For now, however, much of the disconnect comes down to a simple issue:

We are using one word to describe very different things, and then wondering why the conversation does not quite hold together.

AIMG Expert Insight Source: Jay Girvan, AIMG Advisory Board Member