
Agentic AI for AML Compliance: A Practitioner's Guide

Published April 13, 2026 · 10 min read
Gal Perelman, Product Marketing Lead, Unit21

If you work in AML compliance, you already know the math doesn't add up. Alert volumes are growing. Regulatory expectations are rising. And hiring more analysts is expensive, slow, and doesn't scale.

Agentic AI for AML compliance has emerged as one of the most-discussed solutions to this problem, but the term is used loosely. Vendors apply it to everything from basic workflow automation to large language models that summarize alerts. For practitioners trying to evaluate what's real and what's hype, that ambiguity is a problem.

This guide cuts through it. We'll cover what agentic AI actually means in an AML context, where it fits (and doesn't fit) in the compliance workflow, what separates genuinely useful implementations from demos that don't hold up under regulatory scrutiny, and how to evaluate solutions if your team is considering adoption.

Let's start with what agentic actually means.

What Agentic AI Actually Means for AML Compliance

Most AI tools you've encountered in compliance are assistive: they surface information, score risk, or highlight anomalies, and then hand the work back to an analyst. They reduce friction. They don't reduce the analyst's workload in any fundamental way.

Agentic AI is different in one important respect: it executes the work, not just informs it.

An agentic AI system doesn't just flag an alert as higher risk. It pulls transaction history, checks the entity against watchlists, analyzes counterparty relationships, runs the alert against your SOPs, drafts an investigation narrative, and recommends a disposition, all before a human analyst opens the case.

The distinction matters because it determines what kind of impact you should realistically expect:

  • AI-assisted tools reduce time per alert. An analyst still touches every case; they just have better information faster.
  • Agentic AI reduces the number of alerts that require analyst attention altogether. L1 cases get handled autonomously; analysts focus on escalations, edge cases, and final decisions.

That's not a subtle difference. For AML teams handling thousands of alerts per month, it's the difference between managing the queue and clearing it.

Where Agentic AI Fits in the AML Investigation Workflow

To understand what agentic AI can realistically do, it helps to map it to the actual AML investigation workflow: a process that has changed surprisingly little over the past decade despite the explosion in transaction volumes.

Alert Detection: AI That Writes and Tunes Its Own Rules

Transaction monitoring rules flag activity that meets predefined thresholds: velocity limits, structuring patterns, geographic anomalies, and counterparty risks. This step generates the alert queue that analysts work through.

Agentic AI is now entering this layer too, not just executing rules, but recommending improvements to them. Detection agents analyze which rules are generating high false-positive rates, stale configurations, or unnecessary noise, and then suggest optimized variations backed by real alert data. The analyst reviews and approves; the AI does the pattern recognition and drafting.

This matters because rule tuning has historically been one of the most time-intensive tasks for compliance teams, often requiring a combination of analyst expertise, engineering resources, and weeks of iteration. Detection agents compress that into hours.
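To make the human-in-the-loop tuning loop concrete, here is a minimal sketch in Python. All rule names, thresholds, and the 25% adjustment heuristic are hypothetical illustrations, not any vendor's actual logic; the point is that the agent drafts suggestions from disposition data and a human approves them.

```python
from dataclasses import dataclass

@dataclass
class RuleStats:
    rule_id: str
    threshold: float          # current rule threshold (e.g., 24h velocity)
    alerts: int               # alerts fired in the review window
    false_positives: int      # alerts dispositioned as non-suspicious

def suggest_tuning(stats: list[RuleStats], fp_limit: float = 0.90) -> list[dict]:
    """Flag rules whose false-positive rate exceeds fp_limit and
    propose a tightened threshold; nothing is ever auto-applied."""
    suggestions = []
    for s in stats:
        fp_rate = s.false_positives / s.alerts if s.alerts else 0.0
        if fp_rate > fp_limit:
            suggestions.append({
                "rule_id": s.rule_id,
                "fp_rate": round(fp_rate, 2),
                # hypothetical heuristic: raise the threshold 25% to cut noise
                "proposed_threshold": round(s.threshold * 1.25, 2),
                "status": "pending_analyst_approval",
            })
    return suggestions

rules = [
    RuleStats("velocity_10k_24h", 10_000, alerts=400, false_positives=388),
    RuleStats("geo_anomaly_v2", 1.0, alerts=120, false_positives=60),
]
print(suggest_tuning(rules))
```

Note the `status` field: the suggestion exits the agent as a draft, and only an approved change reaches production.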

L1 Alert Triage: The Highest-Volume, Highest-Impact Entry Point

This is where agentic AI has the most immediate and measurable impact today. L1 triage (the first review of an alert to determine whether it warrants escalation) is high-volume and often repetitive. Analysts are reviewing the same alert types, pulling the same contextual data, and making the same categories of decisions hundreds of times per week.

An AI investigation agent handles this autonomously for routine cases: ingesting the alert context, gathering transaction history, checking watchlists, evaluating the alert against policy thresholds, and producing a recommendation with its full reasoning visible. The analyst reviews the output, can override it, and approves the disposition.
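The triage flow just described can be sketched as a simple pipeline. Everything below (function names, thresholds, data shapes) is illustrative rather than a real product API; real agents query live systems rather than in-memory dictionaries.

```python
def triage_alert(alert: dict, sop: dict) -> dict:
    """Illustrative L1 triage pass: gather context, apply policy
    criteria, and emit a recommendation with visible reasoning."""
    reasoning = []

    # 1. Gather context (stubbed; a production agent pulls live history)
    volume = sum(t["amount"] for t in alert["transactions"])
    reasoning.append(f"recent transaction volume: {volume}")

    # 2. Watchlist check
    on_watchlist = alert["entity"] in sop["watchlist"]
    reasoning.append(f"watchlist hit: {on_watchlist}")

    # 3. Evaluate against policy thresholds from the SOP
    over_threshold = volume > sop["volume_threshold"]
    reasoning.append(f"over volume threshold: {over_threshold}")

    # 4. Recommend: escalate on any risk signal, otherwise close
    recommendation = ("escalate_to_L2" if on_watchlist or over_threshold
                      else "close_no_action")
    return {"recommendation": recommendation, "reasoning": reasoning,
            "requires_human_review": True}  # analyst always sees it

alert = {"entity": "acct_123",
         "transactions": [{"amount": 900}, {"amount": 450}]}
sop = {"watchlist": {"acct_999"}, "volume_threshold": 5_000}
print(triage_alert(alert, sop)["recommendation"])  # close_no_action
```

The structure of the return value is the point: a disposition recommendation plus the reasoning that produced it, never a bare score.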

The results from teams that have gone live with this are meaningful. Nexo automated 57% of their alert reviews through AI agents and achieved a 93% reduction in false positives. Uphold reduced alert review time by 44%. Across 62+ institutions running AI agents today, the system processes more than 213,000 alerts per month: work that would otherwise require a significant headcount expansion.

Case Investigation: Evidence Gathering at Scale

For alerts that escalate to full cases, agentic AI supports the investigation by handling the evidence-gathering work that would otherwise fall to an analyst: expanding entity networks, surfacing linked accounts, identifying correlated activity across time periods, and assembling the investigation timeline.

This is where the quality of the underlying system architecture matters significantly. A well-configured AI agent doesn't just retrieve data; it reasons against it, applying your SOPs and escalation criteria to determine what's relevant and why. The output is a structured investigation package, not a data dump.

For multi-entity cases or investigations requiring cross-product or cross-jurisdictional data, this step alone can save hours per case.

Narrative Drafting and SAR Filing: The Underestimated Bottleneck

One of the most time-consuming parts of AML compliance isn't the investigation itself; it's the documentation. Writing a SAR narrative that meets regulatory expectations, accurately reflects the investigation, and is consistent across analysts and cases is a significant and persistent source of analyst hours.

AI agents are handling this step in production today. They draft SAR narratives in the format and structure your compliance program requires, incorporating investigation findings, entity relationships, and relevant transaction details, ready for analyst review and approval before filing. Uphold reduced SAR preparation time from approximately one week to under 30 minutes after implementing AI agents.

Why Agentic AI Is a Different Category, Not Just an Upgrade

For most of its history, the technology debate in AML compliance has been framed as a choice between rules and machine learning. It's worth understanding why neither alone was sufficient, because it explains what makes the agentic AI approach structurally different.

Rules-based systems are well-suited to AML because they're transparent, auditable, and can capture emerging patterns with very few examples. A fraud analyst can write a rule today based on a single incident and deploy it by the end of the week, with full visibility into exactly what it does. But rules require expertise to build, effort to maintain, and don't learn or adapt on their own.

Machine learning models promised automation and scale. But traditional ML struggles with novel fraud vectors (which require significant historical examples to detect), produces outputs that are difficult to explain to regulators, and typically requires engineering resources to retrain or update.

Agentic AI doesn't choose between them; it integrates them. Rules remain the foundation for detection: transparent, self-service, auditable. AI agents operate on top, doing the reasoning, evidence gathering, and narrative drafting. Detection agents recommend rule improvements based on observed patterns, with human approval required before any changes are made. And every step in the AI's work produces an auditable log of what data it accessed, what criteria it applied, what it found, and why it recommended what it recommended.
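A sketch of what "an auditable log of every step" can mean in practice. The record shape is hypothetical, but the fields follow the paragraph above: what data was accessed, what criteria applied, what was found, and why.

```python
import json
from datetime import datetime, timezone

audit_log: list[dict] = []

def log_step(action: str, data_accessed: list[str],
             criteria: str, finding: str, rationale: str) -> None:
    """Append one timestamped, append-only record per agent action."""
    audit_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "data_accessed": data_accessed,
        "criteria_applied": criteria,
        "finding": finding,
        "rationale": rationale,
    })

# Hypothetical SOP reference and list name, for illustration only
log_step(
    action="watchlist_check",
    data_accessed=["entity:acct_123", "list:OFAC_SDN"],
    criteria="SOP 4.2: screen all counterparties on alert open",
    finding="no match",
    rationale="entity name and known aliases absent from screened lists",
)
print(json.dumps(audit_log[0], indent=2))
```

An examiner asking "why did the AI close this alert?" gets answered by replaying these records, not by citing a model score.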

This combination is what makes agentic AI viable in regulated environments where "the model said so" has never been an acceptable explanation to an examiner.

The Human-in-the-Loop Requirement (And Why It's Not a Limitation)

One of the more important things to understand about agentic AI in AML is what it isn't trying to do. Full automation (AI making final compliance decisions with no human review) isn't the goal, and the most credible implementations don't pitch it.

This isn't primarily a technical limitation. It's a compliance reality. AML decisions carry consequences: accounts get frozen, SARs get filed, and criminal referrals are made. Regulators expect institutions to explain every decision, demonstrate that controls operate as designed, and confirm that human judgment is part of the process.

The right framework for thinking about this is progressive autonomy, a trust ladder that teams can move up as confidence in the AI's performance grows:

  • Level 1: The AI researches, summarizes, and recommends. The analyst decides everything. Full review of every AI output before any action is taken.
  • Level 2: The AI handles routine, lower-risk cases autonomously. Analysts review a defined sample and handle all escalations and edge cases.
  • Level 3: The AI operates with higher autonomy on well-defined case types. Human review reserved for exceptions, overrides, and quality sampling.
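One way to read the ladder above is as a routing policy: the autonomy level plus the case profile determine whether an AI disposition auto-applies or goes to an analyst. A minimal sketch, with the case types, risk labels, and 20% sampling rate as illustrative assumptions:

```python
import random

def route_disposition(level: int, case_type: str, risk: str,
                      sample_rate: float = 0.2) -> str:
    """Route an AI disposition based on the trust-ladder level (1-3)."""
    if level == 1:
        return "analyst_review"  # AI recommends; analyst decides everything
    routine = case_type in {"velocity", "structuring"} and risk == "low"
    if level == 2:
        if routine:
            # autonomous for routine cases, but a defined sample is reviewed
            return ("analyst_sample_review"
                    if random.random() < sample_rate else "auto_apply")
        return "analyst_review"  # escalations and edge cases stay human
    # level 3: human review reserved for exceptions and quality sampling
    if risk == "high":
        return "analyst_review"
    return ("analyst_sample_review"
            if random.random() < sample_rate else "auto_apply")

print(route_disposition(1, "velocity", "low"))  # analyst_review
```

Moving up the ladder is then a configuration change, made once validation data justifies it, rather than a rebuild.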

Most production implementations today operate at Level 1 or early Level 2. That's the right starting point; it lets teams validate AI quality against their own actual cases before expanding autonomy, builds internal trust incrementally, and keeps the compliance program on solid regulatory ground.

What should be non-negotiable regardless of where a team sits on this spectrum: a full audit trail of every AI action, clear analyst override controls, and reasoning tied to evidence rather than opaque scores.

What to Look For When Evaluating Agentic AI for AML

If your team is evaluating solutions, these are the questions that matter most in practice, not the ones that make for a good demo.

Can you show a regulator the AI's reasoning? Every disposition an AI agent makes should be traceable: what data it accessed, which policy criteria it applied, what it found, and why it recommended what it recommended. If the answer to "why did the AI close this alert?" is a probability score with no supporting rationale, that's not going to hold up under exam. Look for systems that produce explicit reasoning chains, not just outputs.

Can analysts review, override, and document disagreement? Override isn't just a nice-to-have; it's how AI systems remain accountable and how they improve over time. Pay attention to how easy it is for an analyst to reject a recommendation, document why, and have that feedback inform future performance.

Is the AI configured to your SOPs rather than a generic model? Agentic AI that isn't calibrated to your specific thresholds, risk appetite, escalation criteria, and narrative format will produce outputs that require heavy editing, which negates much of the time savings. The best implementations allow compliance teams to configure this without requiring engineering changes.

How is quality validated before go-live? Rigorous implementations run AI recommendations in parallel against historical disposition data before anything changes in production. Shadow mode or validation mode testing lets you see how the AI performs against your own cases before it touches live decisions. If a vendor can't support this, treat that as a signal.
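Shadow-mode validation reduces to replaying historical cases and comparing the AI's recommendation with the analyst's recorded disposition. A toy sketch of that comparison (field names are assumptions for illustration):

```python
def shadow_report(cases: list[dict]) -> dict:
    """Compare AI recommendations to historical analyst dispositions
    without touching production; surface disagreements for review."""
    agree = sum(1 for c in cases if c["ai_rec"] == c["analyst_disposition"])
    disagreements = [c["case_id"] for c in cases
                     if c["ai_rec"] != c["analyst_disposition"]]
    return {
        "cases": len(cases),
        "agreement_rate": round(agree / len(cases), 3),
        "disagreements": disagreements,  # each one gets a manual look
    }

history = [
    {"case_id": "c1", "ai_rec": "close", "analyst_disposition": "close"},
    {"case_id": "c2", "ai_rec": "escalate", "analyst_disposition": "escalate"},
    {"case_id": "c3", "ai_rec": "close", "analyst_disposition": "escalate"},
]
print(shadow_report(history))
```

The disagreement list matters as much as the headline rate: cases where the AI would have closed what an analyst escalated are exactly where configuration gaps hide.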

Does the system improve over time? AI that learns from analyst feedback (which cases were correctly handled, where escalations were warranted, and where the AI was wrong) compounds in value. Static AI that doesn't incorporate feedback quickly loses relevance as fraud patterns evolve. Ask specifically what feedback loops exist between analyst decisions and AI improvement.

What's the network intelligence story? This is an underappreciated dimension. AI that only learns from your institution's transaction history is limited by your own data. The most advanced implementations pool anonymized intelligence across multiple institutions, so the system benefits from patterns identified across an entire network, not just your own portfolio. For emerging fraud typologies or money-laundering patterns that appear at one institution before spreading, this early-warning capability can be the difference between detecting a pattern on day 10 and on day 100.

What AML Teams Are Seeing in Production

The impact numbers from early agentic AI implementations are significant, with the caveat that results vary by alert type, data quality, and system configuration.

A few data points from teams that have gone live:

  • Up to 93% reduction in false positives in high-volume monitoring environments
  • Up to 80% reduction in investigation handle time
  • SAR preparation has been reduced from approximately one week to under 30 minutes
  • 90% faster investigations with greater than 99% accuracy, validated across 125,000+ alerts
  • $10M+ in analyst time saved across production deployments

The pattern across implementations is consistent: the highest gains come from L1 triage automation on well-defined, high-volume alert types. More complex investigative work, novel typologies, multi-entity networks, and cross-border activity still benefit substantially from AI support but require more configuration and entail a higher degree of analyst involvement.

Teams that try to start with the most complex use case before validating on simpler ones tend to move more slowly, not faster. The compliance teams seeing the best outcomes are the ones that defined a narrow starting scope, validated rigorously, and expanded from there.

A Practical Path to Adoption

For most AML compliance teams, the right starting point isn't the most ambitious use case; it's the highest-volume, most routine one.

Start narrow. Pick a single alert type with high volume and a relatively consistent pattern (transaction velocity alerts, structuring flags, or geographic anomalies are common starting points). Configure the AI to align with your SOPs for that alert type specifically.

Run in parallel before going live. Shadow or validation mode lets you see how the AI recommendation compares to your analysts' decisions on the same cases, without any real-world impact. Use this to measure accuracy, identify configuration gaps, and build internal confidence.

Measure what matters. Track handle time, escalation rate, override rate, and SAR quality, not just how many alerts the AI processed. The compliance metrics your examiners care about should improve, not just your operational efficiency numbers.
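The metrics above are all derivable from case records. A sketch of the computation, with the record shape assumed for illustration:

```python
from statistics import mean

def compliance_metrics(cases: list[dict]) -> dict:
    """Track the review metrics that matter, not just alert throughput:
    handle time, escalation rate, and analyst override rate."""
    return {
        "avg_handle_minutes": round(mean(c["handle_minutes"] for c in cases), 1),
        "escalation_rate": round(
            sum(c["escalated"] for c in cases) / len(cases), 3),
        "override_rate": round(
            sum(c["analyst_overrode_ai"] for c in cases) / len(cases), 3),
    }

cases = [
    {"handle_minutes": 4, "escalated": False, "analyst_overrode_ai": False},
    {"handle_minutes": 12, "escalated": True, "analyst_overrode_ai": True},
    {"handle_minutes": 6, "escalated": False, "analyst_overrode_ai": False},
    {"handle_minutes": 10, "escalated": True, "analyst_overrode_ai": False},
]
print(compliance_metrics(cases))
```

A rising override rate is the early-warning signal here: it means the AI's configuration is drifting from analyst judgment and needs retuning before autonomy expands.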

Expand incrementally. Once you've validated quality on the first alert type, expand to the next. The teams that scale fastest are the ones that move deliberately through this process rather than trying to deploy broadly before the foundation is solid.

Conclusion

Agentic AI for AML compliance isn't a future concept; it's in production at real compliance teams handling real alert volumes today. The question for most practitioners isn't whether it works; it's whether a given implementation is production-ready, regulatory-defensible, and configured to your specific program.

The compliance professionals who get the most out of it are the ones who approach it the way they'd approach any significant operational change: with clear questions about explainability and oversight, realistic expectations about where human judgment remains essential, and a concrete validation plan before full deployment.

If you're evaluating how AI agents could fit into your AML operations, from L1 triage through SAR filing, see how Unit21's AI Agents work in practice.

Gal Perelman
Product Marketing Lead, Unit21

Gal Perelman is the Product Marketing Lead at Unit21, where she spearheads go-to-market strategies for AI-driven risk and compliance solutions. With over a decade of experience in the fintech and fraud sectors, she has led high-impact launches for products like Watchlist Screening and AI Rule Recommendations.

Previously, Gal held marketing leadership roles at Design Pickle, Sightfull, and Lusha. She holds a Master’s degree from American University and a Bachelor’s from UCLA, and is dedicated to helping banks and fintechs navigate complex regulatory landscapes through innovative technology.

Learn more about Unit21
Unit21 is the leader in AI Risk Infrastructure, trusted by over 200 customers across 90 countries, including Sallie Mae, Chime, Intuit, and Green Dot. Our platform unifies fraud and AML with agentic AI that executes investigations end-to-end—gathering evidence, drafting narratives, and filing reports—so teams can scale safely without expanding headcount.