How AI Agents for Financial Crime Run the Full Compliance Lifecycle


Alert volumes are growing faster than any compliance team can hire to keep up. Legacy tools don’t adapt. And the typical response of throwing more analysts at the problem neither scales nor improves the quality of your program. This is the challenge that AI agents for financial crime were built to solve, and it’s no longer theoretical.

Unit21 recently walked through exactly how its AI agents operate in production. Not a demo of what’s coming, but a look at what’s already running across dozens of financial institutions and fintechs today. The system has processed over 500,000 alert reviews, saved more than $10 million in analyst time, and delivered up to 93% fewer false positives for customers already live on the platform.

Here’s what’s actually happening under the hood, and why it matters for compliance and fraud teams evaluating where AI fits in their operations.

Two Kinds of AI Agents for Financial Crime, and Why You Need Both

Most conversations about AI in compliance focus on one thing: automating alert reviews. That’s important, but it’s only half the picture.

Unit21’s approach splits AI agents into two categories that mirror what human teams actually do:

Investigation agents handle the alert-level work: reviewing flagged activity, pulling transaction histories, checking watchlists, analyzing behavioral patterns, and drafting investigation narratives. They operate at the L1 triage level, producing a complete evidence package and a recommendation (escalate for human review, or close as false positive) for every alert they touch.

Detection agents work on the rule side. They analyze completed investigations, both AI-completed and human-completed, look at patterns in false positives and true positives, and recommend new rules or adjustments to existing ones. If your many-to-one rule is generating noise because it’s catching business entities that don’t need scrutiny, the detection agent will surface that and suggest a threshold change or an exclusion.

The critical insight: these two agent types feed each other. Investigation outcomes become training data for detection improvements. Better detection produces cleaner alerts. Cleaner alerts produce more consistent investigations. It’s a compounding loop that gets smarter over time, the kind of system-level improvement that hiring more analysts alone can never produce.

What an Investigation Agent Actually Does

The phrase “AI agent” gets thrown around loosely in compliance tech. What Unit21 means by it is specific: a system of discrete, parallelized tasks, each one engineered to do a single investigative step well.

An agent might include tasks for:

  • Flagged activity review: analyzing the current alert, plus historical alerts, cases, and filings for the same entity
  • Account risk factors: evaluating customer age, geographic risk, and network connections (shared phone numbers, addresses, Social Security numbers linked to other entities)
  • Behavioral pattern analysis: checking for structuring, impossible travel, repeat payments, or corridor anomalies
  • Online research: running adverse media searches, employment verification, or counterparty research
  • Sanctions and watchlist matching: analyzing 314(a) matches, OFAC hits, and PEP screening with context-aware fuzzy matching
  • Document analysis: reading PDFs, spreadsheets, and other unstructured evidence alongside structured data

Each task runs independently and in parallel, producing its own output. A meta-prompt then synthesizes all task outputs into a draft narrative, written in whatever format the customer’s compliance program requires.
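The fan-out-and-synthesize flow described above can be sketched roughly as follows. The task functions, their outputs, and the synthesis step are hypothetical stand-ins for illustration, not Unit21's actual implementation:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical task functions standing in for discrete investigative steps.
# In production each would be its own engineered prompt; here they return
# illustrative findings.
def review_flagged_activity(alert):
    return f"Reviewed {alert['id']}: prior alerts and filings checked for this entity."

def check_watchlists(alert):
    return f"No OFAC or 314(a) matches found for {alert['entity']}."

def analyze_behavior(alert):
    return f"No structuring or corridor anomalies observed for {alert['entity']}."

TASKS = [review_flagged_activity, check_watchlists, analyze_behavior]

def run_investigation(alert):
    # Run every task independently and in parallel, each producing its own output.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(lambda task: task(alert), TASKS))
    # In production, a meta-prompt synthesizes all task outputs into a draft
    # narrative in the customer's required format; here we simply join them.
    return "\n".join(findings)

alert = {"id": "A-1029", "entity": "Acme Corp"}
print(run_investigation(alert))
```

The point of the sketch is the shape, not the content: tasks are independent units that can be swapped, tuned, or added without touching the rest of the pipeline.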

This task-based architecture matters for two reasons. First, each task can be individually optimized with different prompts, different context windows, and even different underlying LLMs, without affecting the others. Second, customers can customize the agent to match their actual SOPs. If your compliance program requires a specific narrative format for your sponsor bank, or you need a task that checks industry-specific risk factors, you can configure that without starting from scratch.

Unit21 provides a set of out-of-the-box tasks that have been tuned against years of historical investigation data. The platform also supports building custom tasks from scratch, including testing them against historical alerts before putting them into production.

How Detection Agents Close the Loop

Investigation agents handle what’s already been flagged. Detection agents ask a different question: are you flagging the right things?

The AI rule recommendation engine analyzes your completed alert dispositions, both the ones closed by AI and the ones resolved by human analysts, and looks for patterns. It might notice that a particular rule generates a high false positive rate because it catches low-risk business entities, or that true positives in your data share characteristics (rapid multi-counterparty activity, account compromise indicators) that your current rules don’t specifically target.

From there, it recommends concrete changes: adjust a threshold from 5 to 6 transactions, exclude new accounts, add a NAICS code filter for business entities, or incorporate KYC data that the rule currently ignores. Every recommendation comes with a justification: not just “change this number” but “here’s the pattern in your data that supports this change.”
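As a rough sketch, a recommendation of this kind might be represented and applied like this. The field names and schema are illustrative assumptions, not Unit21's actual data model:

```python
# Hypothetical shape for a rule-change recommendation: the proposed change,
# any exclusions, and the data-backed justification travel together.
recommendation = {
    "rule": "many-to-one-transfers",
    "change": {"field": "min_transaction_count", "from": 5, "to": 6},
    "exclusions": ["new_accounts"],
    "justification": (
        "Most alerts triggering at exactly the current threshold were "
        "closed as false positives over the review window."
    ),
}

def apply_recommendation(rule_config, rec):
    """Return a new rule config with the recommended change applied."""
    updated = dict(rule_config)
    updated[rec["change"]["field"]] = rec["change"]["to"]
    updated["excluded_segments"] = (
        rule_config.get("excluded_segments", []) + rec["exclusions"]
    )
    return updated

current = {"min_transaction_count": 5}
print(apply_recommendation(current, recommendation))
# → {'min_transaction_count': 6, 'excluded_segments': ['new_accounts']}
```

Keeping the justification attached to the change is what makes the recommendation usable in a regulated setting: the rationale is ready for backtesting, shadow deployment, or an examiner's file.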

This is the part that’s hardest to do manually. Most compliance teams know their rules could be better, but the analysis required to justify a change to a regulator (pulling disposition data, running statistical comparisons, documenting the rationale) takes time that teams don’t have. The detection agent surfaces those insights automatically, and customers can backtest recommended rule changes against historical data or deploy them as shadow rules before going live.

Unit21 has also shipped a text-to-rule capability: describe the kind of activity you’re looking for in plain language, and the system will analyze your available data fields, interpret the intent, and generate a complete rule configuration, including variables, trigger conditions, and thresholds. It’s designed for the compliance practitioner who knows exactly what fraud pattern they want to catch but doesn’t want to manually map data fields and build conditions from scratch.
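A text-to-rule round trip might look something like the following. The generated configuration is a hypothetical sketch of the inputs and outputs described above, not the platform's real schema:

```python
# Plain-language input from a compliance practitioner.
description = (
    "Flag personal accounts that receive more than 3 inbound transfers "
    "from distinct counterparties within 24 hours."
)

# A hypothetical generated rule configuration: the system maps the intent
# onto available data fields, trigger conditions, and thresholds.
generated_rule = {
    "entity_type": "personal",
    "variables": {
        "inbound_transfers": {"direction": "inbound", "window_hours": 24},
    },
    "trigger": {
        "field": "distinct_counterparties",
        "operator": ">",
        "threshold": 3,
    },
}
```

The practitioner reviews and edits the generated structure rather than building the field mappings and conditions by hand.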

A Multi-Model Architecture, and Why That Matters for Privacy

One of the more technically interesting details: Unit21 doesn’t rely on a single LLM provider. The platform uses a multi-model orchestration system where different tasks route to different models based on benchmark performance.

For example, document analysis tasks currently run on Mistral because it outperforms other models on that specific task type. Other tasks might use Claude or another frontier model. The team continuously monitors AI benchmarks and swaps models when performance shifts, which in this landscape happens frequently.
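The routing idea can be sketched as a benchmark-maintained lookup table. Beyond the article's Mistral-for-document-analysis example, the table contents and function names here are assumptions:

```python
# A minimal sketch of benchmark-driven model routing. The mapping would be
# updated as benchmark results shift; only the document_analysis entry
# reflects the article, the rest is illustrative.
ROUTING_TABLE = {
    "document_analysis": "mistral",
    "narrative_drafting": "claude",
    "default": "claude",
}

def route_model(task_type):
    """Pick the model for a task type, falling back to the default."""
    return ROUTING_TABLE.get(task_type, ROUTING_TABLE["default"])

print(route_model("document_analysis"))   # → mistral
print(route_model("sanctions_matching"))  # → claude (falls back to default)
```

Because routing lives in configuration rather than code, swapping a model when benchmarks shift is a table edit, not a re-architecture.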

All of this runs within AWS Bedrock, which means customer data never leaves the cloud environment. Models are hosted within AWS rather than sending data to external model provider servers. And critically, customer data is never used to train the underlying models. For compliance teams operating in regulated environments, this architecture addresses the privacy and model risk concerns that are often the first objection to AI adoption.

The Trust Question: When Are You Ready to Auto-Close?

The agents produce two recommendation types: escalate for human review, or close as false positive. The question every compliance leader eventually asks is whether the AI can close false positives on its own.

Unit21 supports auto-closure as an opt-in capability, not a default, and configurable per agent and per queue. But the more interesting question is what it takes to get there. In a recent poll of compliance practitioners, the top requirements were:

  • A full audit trail of reasoning: not just the decision, but every step, every data source, every assumption the agent made
  • Human review of a sample: random sampling of AI agent decisions, using the same QA/QC process teams already run for human analysts
  • Proven accuracy over time: demonstrated consistency across a meaningful volume of alerts

Unit21 provides all three. The platform retains a complete audit trail for every agent decision. Its QA sampling feature, originally built for reviewing human analyst work, is now used by customers to review AI agent output. And customers can segment agents by queue, running them only on specific alert types (like sanctions screening) where they’ve built enough confidence.

An emerging pattern worth noting: some customers are now using AI agents to QA human analyst work, not just the other way around. The agent reviews a sample of human dispositions for consistency with SOPs and flags deviations. It’s a natural evolution that strengthens the overall quality of the compliance program regardless of whether a human or AI made the original decision.

What This Means for Your Team

The compliance industry is at an inflection point. Alert volumes aren’t going back down. Regulatory expectations are increasing; FinCEN’s 2026 proposed rule explicitly encourages AI adoption, provided it’s explainable, validated, and keeps humans in the loop. And the teams that figure out how to deploy AI agents operationally, not experimentally, will reduce false positives, scale their programs without proportional headcount growth, and produce higher-quality investigations across the board.

The key word is operationally. Unit21’s AI agents have been in production for nearly a year; this isn’t a proof of concept. It’s infrastructure.

If your team is still manually triaging every alert, writing every narrative, and tuning every rule by hand, the math doesn’t work anymore. The question isn’t whether AI agents belong in your financial crime program. It’s how quickly you can get them running.

See how Unit21’s AI agents work →

Gal Perelman
Product Marketing Lead, Unit21

Gal Perelman is the Product Marketing Lead at Unit21, where she spearheads go-to-market strategies for AI-driven risk and compliance solutions. With over a decade of experience in the fintech and fraud sectors, she has led high-impact launches for products like Watchlist Screening and AI Rule Recommendations.

Previously, Gal held marketing leadership roles at Design Pickle, Sightfull, and Lusha. She holds a Master’s degree from American University and a Bachelor’s from UCLA, and is dedicated to helping banks and fintechs navigate complex regulatory landscapes through innovative technology.

Learn more about Unit21
Unit21 is the leader in AI Risk Infrastructure, trusted by over 200 customers across 90 countries, including Sallie Mae, Chime, Intuit, and Green Dot. Our platform unifies fraud and AML with agentic AI that executes investigations end-to-end—gathering evidence, drafting narratives, and filing reports—so teams can scale safely without expanding headcount.