
If you spend any time talking to vendors right now, you’ll hear the same thing over and over: “We have AI.” In many cases, it sounds compelling. The demos are polished. The language is confident. The promises are big.
And yet, when you start peeling back the layers, it becomes clear that much of what’s being sold isn’t actually ready for the real world. That’s the tension we’re living in today: AI is absolutely real, and in many ways transformative, but the gap between what’s marketed and what’s truly production-ready is still enormous.
At Liminal, we monitor thousands of companies across fraud, compliance, risk, and cybersecurity. Over the past year, AI, both generative and agentic, has quickly become the most common topic in our conversations. Everyone is trying to figure out the same thing: how to use it, how to trust it, and how not to get burned by it.
I recently joined a fascinating webinar discussion with Unit21 on this exact topic, and I’ve summarized my thoughts here. This post isn’t a shortcut to answering those questions, but patterns are emerging, both in what works and in what doesn’t.
One of the more telling moments for me recently was hosting a demo-focused event around agentic AI. We reached out to dozens of companies that claimed to have real, working solutions. Only a fraction participated. And of those, only a couple were able to show something that resembled a product operating beyond a controlled demo.
The rest were, quite frankly, still in the realm of ideas.
That’s not necessarily a criticism. This is a fast-moving space, and many teams are building as quickly as they can. But it does highlight a core challenge for buyers: you’re being asked to make decisions in a market where the language has outpaced the reality.
Terms like “agentic,” “autonomous,” and “AI-powered” are being used loosely, often to describe very different things. In one case, it might mean a system that can execute multi-step workflows with minimal intervention. In another, it’s a chatbot layered on top of an existing product.
If you don’t slow down and clarify what those terms actually mean in practice, it becomes almost impossible to evaluate vendors effectively.
Before you even get into product comparisons, there’s a more fundamental step that too many organizations skip: establishing a shared understanding of what you’re talking about.
When someone says they offer “agentic AI,” you should be able to answer a few simple questions: What does the system actually do on its own? What decisions does it make without a human in the loop? Where does a person step in, and what happens when the system gets something wrong?
If those answers aren’t clear, or if they rely heavily on generalities, that’s usually a signal that the product hasn’t matured yet.
This might sound basic, but in practice, it’s one of the most effective ways to cut through noise. The vendors who have done the hard work can explain their systems in concrete terms. The ones who haven’t tend to stay at a higher level.
It has never been easier to build something that looks like AI.
With access to large language models, you can stand up a compelling demo in a matter of days. You can show a workflow, generate outputs, and create the impression that something meaningful is happening. But that’s not the same as building something that can operate reliably in production.
Once you move beyond the demo, the real challenges start to surface: accuracy, consistency, edge cases, auditability, cost, and integration into existing processes. These are not trivial problems, and they don’t get solved by simply adding an AI layer on top of what already exists.
That’s why one of the simplest and most underrated evaluation tactics is also the most powerful: ask to see the product working in production today.
Not a walkthrough. Not a roadmap. Not a simulated example. Real usage, with real customers, solving real problems. If that doesn’t exist yet, you’re not evaluating a finished product. You’re evaluating a bet.
Despite all of this, there are clear signs of where AI is gaining traction, and those patterns are instructive.
The most successful deployments tend to start in areas that are repetitive, structured, and relatively well understood. Think about processes like sanctions screening, negative news checks, or initial alert triage. These are workflows that already follow defined steps, often at high volume, and in many cases have been outsourced or standardized over time.
That combination of high repetition, clear inputs, and manageable risk makes them a natural entry point for AI.
What’s interesting is that once organizations see success in these areas, their thinking starts to shift. The conversation moves from “Can we automate this?” to “What else becomes possible now that this is automated?”
And that’s where AI starts to move from incremental improvement to something more transformative.
A common framing around AI is efficiency: faster processing, lower costs, fewer manual steps.
That’s part of the story, but it’s not the most interesting part.
What we’re seeing more and more of is that AI enables tasks that previously didn’t make sense to do at scale. Activities that required too much time or effort, like deep research across multiple data sources or continuous optimization of detection logic, suddenly become feasible.
That opens the door to better coverage, earlier detection, and more consistent decision-making.
In other words, AI doesn’t just help you do the same work more efficiently. It expands the scope of what your organization can realistically take on.
One of the quieter but more important shifts happening right now is how roles are evolving alongside these systems.
There’s a tendency to frame AI in terms of replacement, but what we’re actually seeing is a rebalancing. Work that was previously focused on throughput (moving cases, reviewing alerts, completing repetitive tasks) is gradually giving way to work focused on judgment, oversight, and quality.
The “level one” work that has traditionally been high-volume and process-driven doesn’t disappear entirely, but it does change. It becomes more about validating outputs, handling exceptions, and ensuring that the system is behaving as expected.
At the same time, new responsibilities emerge around monitoring performance, managing risk, and demonstrating that these systems are operating in a way that is explainable and defensible.
Organizations that treat AI as just another tool layered onto existing workflows will miss this shift. The ones that step back and rethink how work gets done will be in a much stronger position.
For all the focus on capabilities and performance, there’s one question that consistently gets glossed over: who is accountable when the system gets it wrong?
It’s an uncomfortable question, but it’s unavoidable, especially in regulated environments. If a decision leads to a missed risk or a compliance issue, where does that responsibility sit? With the vendor? With your internal team? With the underlying model provider?
Right now, there isn’t a universal answer. But if you don’t have a clear point of view before deployment, you’re creating exposure that’s difficult to manage after the fact. In many ways, this is where the real maturity of the market will be tested, not just in what these systems can do, but in how responsibility is understood and managed around them.
It’s easy to feel like AI is already everywhere, but the reality is more nuanced. Most organizations are still in the early stages: testing, piloting, and learning what works and what doesn’t. That creates a strange dynamic. There’s urgency to move forward, but also uncertainty about how to do it correctly.
The organizations that navigate this well won’t be the ones that rush into every new tool or, conversely, sit on the sidelines waiting for perfect clarity. They’ll be the ones that approach AI with discipline: asking better questions, demanding real evidence, and building their understanding as they go. Because in a market full of noise, that’s what ultimately separates signal from hype.