

Alert volumes are growing faster than any compliance team can hire to keep up. Legacy tools don’t adapt. And the typical response, throwing more analysts at the problem, doesn’t scale and doesn’t improve the quality of your program. This is the challenge that AI agents for financial crime were built to solve, and it’s no longer theoretical.
Unit21 recently walked through exactly how its AI agents operate in production. Not a demo of what’s coming, but a look at what’s already running across dozens of financial institutions and fintechs today. The system has processed over 500,000 alert reviews, saved more than $10 million in analyst time, and delivered up to 93% fewer false positives for customers already live on the platform.
Here’s what’s actually happening under the hood, and why it matters for compliance and fraud teams evaluating where AI fits in their operations.
Most conversations about AI in compliance focus on one thing: automating alert reviews. That’s important, but it’s only half the picture.
Unit21’s approach splits AI agents into two categories that mirror what human teams actually do:
Investigation agents handle the alert-level work: reviewing flagged activity, pulling transaction histories, checking watchlists, analyzing behavioral patterns, and drafting investigation narratives. They operate at the L1 triage level, producing a complete evidence package and a recommendation (escalate for human review, or close as false positive) for every alert they touch.
Detection agents work on the rule side. They analyze completed investigations, both AI-completed and human-completed, look at patterns in false positives and true positives, and recommend new rules or adjustments to existing ones. If your many-to-one rule is generating noise because it’s catching business entities that don’t need scrutiny, the detection agent will surface that and suggest a threshold change or an exclusion.
The critical insight: these two agent types feed each other. Investigation outcomes become training data for detection improvements. Better detection produces cleaner alerts. Cleaner alerts produce more consistent investigations. It’s a compounding loop that gets smarter over time, the kind of system-level improvement that hiring more analysts alone can never produce.
The phrase “AI agent” gets thrown around loosely in compliance tech. What Unit21 means by it is specific: a system of discrete, parallelized tasks, each one engineered to do a single investigative step well.
An agent might include tasks for pulling transaction histories, checking watchlists and sanctions lists, analyzing behavioral patterns, and reviewing supporting documents.
Each task runs independently and in parallel, producing its own output. A meta-prompt then synthesizes all task outputs into a draft narrative, written in whatever format the customer’s compliance program requires.
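The execution model described above can be sketched in a few lines. This is a minimal illustration, not Unit21's actual implementation: the task functions, alert fields, and string-based synthesis are all stand-ins (in production, the synthesis step is an LLM meta-prompt).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical task functions: stand-ins for discrete investigative steps.
# Names and outputs are illustrative, not Unit21's API.
def pull_transaction_history(alert):
    return f"Reviewed {alert['txn_count']} transactions for {alert['entity']}"

def check_watchlists(alert):
    return f"No watchlist hits for {alert['entity']}"

def analyze_behavior(alert):
    return f"Activity consistent with {alert['entity']}'s historical baseline"

TASKS = [pull_transaction_history, check_watchlists, analyze_behavior]

def run_agent(alert):
    # Each task runs independently and in parallel, producing its own output.
    with ThreadPoolExecutor() as pool:
        outputs = list(pool.map(lambda task: task(alert), TASKS))
    # A meta-step then synthesizes all task outputs into a draft narrative.
    # (A string join keeps this sketch runnable without an LLM.)
    return "Investigation summary:\n- " + "\n- ".join(outputs)

print(run_agent({"entity": "ACME LLC", "txn_count": 14}))
```

Because each task is an independent unit, any one of them can be swapped, re-prompted, or re-pointed at a different model without touching the rest, which is the property the next paragraph relies on.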
This task-based architecture matters for two reasons. First, each task can be individually optimized with different prompts, different context windows, and even different LLM models, without affecting the others. Second, customers can customize the agent to match their actual SOPs. If your compliance program requires a specific narrative format for your sponsor bank, or you need a task that checks industry-specific risk factors, you can configure that without starting from scratch.
Unit21 provides a set of out-of-the-box tasks that have been tuned against years of historical investigation data. The platform also supports building custom tasks from scratch, including testing them against historical alerts before putting them into production.
Investigation agents handle what’s already been flagged. Detection agents ask a different question: are you flagging the right things?
The AI rule recommendation engine analyzes your completed alert dispositions, both the ones closed by AI and the ones resolved by human analysts, and looks for patterns. It might notice that a particular rule generates a high false positive rate because it catches low-risk business entities, or that true positives in your data share characteristics (rapid multi-counterparty activity, account compromise indicators) that your current rules don’t specifically target.
From there, it recommends concrete changes: adjust a threshold from 5 to 6 transactions, exclude new accounts, add a NAICS code filter for business entities, or incorporate KYC data that the rule currently ignores. Every recommendation comes with a justification: not just "change this number" but "here's the pattern in your data that supports this change."
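The shape of that analysis can be sketched simply. Everything here is invented for illustration: the disposition records, field names, and the 5-to-6 threshold example from above are stand-ins, not Unit21's data model.

```python
# Made-up historical dispositions for the sketch: each record is an alert
# that fired at a given transaction count, with its final disposition.
dispositions = [
    {"txn_count": 5, "true_positive": False},
    {"txn_count": 5, "true_positive": False},
    {"txn_count": 6, "true_positive": True},
    {"txn_count": 7, "true_positive": True},
    {"txn_count": 5, "true_positive": False},
]

def recommend_threshold(dispositions, current=5):
    candidate = current + 1

    def fp_rate(threshold):
        # Alerts that would still fire at this threshold.
        hits = [d for d in dispositions if d["txn_count"] >= threshold]
        if not hits:
            return 0.0
        return sum(not d["true_positive"] for d in hits) / len(hits)

    # The recommendation pairs the change with its data-backed justification.
    return {
        "change": f"raise threshold from {current} to {candidate}",
        "justification": (
            f"false positive rate drops from {fp_rate(current):.0%} "
            f"to {fp_rate(candidate):.0%} on historical dispositions"
        ),
    }

print(recommend_threshold(dispositions))
```

On this toy data, raising the threshold eliminates the false positives while keeping both true positives, which is exactly the kind of justification-with-evidence described above.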
This is the part that’s hardest to do manually. Most compliance teams know their rules could be better, but the analysis required to justify a change to a regulator (pulling disposition data, running statistical comparisons, documenting the rationale) takes time that teams don’t have. The detection agent surfaces those insights automatically, and customers can backtest recommended rule changes against historical data or deploy them as shadow rules before going live.
Unit21 has also shipped a text-to-rule capability: describe the kind of activity you’re looking for in plain language, and the system will analyze your available data fields, interpret the intent, and generate a complete rule configuration, including variables, trigger conditions, and thresholds. It’s designed for the compliance practitioner who knows exactly what fraud pattern they want to catch but doesn’t want to manually map data fields and build conditions from scratch.
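To make the idea concrete, here is what a text-to-rule feature *might* emit for a plain-language prompt. The schema, field names, and threshold below are invented for this sketch; Unit21's actual output format is not public in this article.

```python
# Purely illustrative: a possible generated configuration for the prompt
# "flag accounts that receive funds from many counterparties within a day".
rule = {
    "description": "accounts receiving funds from many counterparties in a day",
    "variables": {
        "counterparty_count": "distinct senders per receiving account",
        "window": "24h",
    },
    "trigger": "counterparty_count >= threshold within window",
    "threshold": 6,
}

def evaluate(rule, counterparty_count):
    # Minimal evaluation of the generated trigger condition.
    return counterparty_count >= rule["threshold"]

print(evaluate(rule, 8))  # alert fires
print(evaluate(rule, 3))  # no alert
```

The point of the feature is that the practitioner writes only the description line; the variables, trigger, and threshold are generated from the available data fields.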
One of the more technically interesting details: Unit21 doesn’t rely on a single LLM provider. The platform uses a multi-model orchestration system where different tasks route to different models based on benchmark performance.
For example, document analysis tasks currently run on Mistral because it outperforms other models on that specific task type. Other tasks might use Claude or another frontier model. The team continuously monitors AI benchmarks and swaps models when performance shifts, which in this landscape happens frequently.
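Benchmark-driven routing reduces to a simple lookup once per-task scores are tracked. This is a sketch under assumptions: the model names echo the examples above, but the scores and table structure are invented.

```python
# Hypothetical benchmark scores per task type (higher is better).
BENCHMARKS = {
    "document_analysis": {"mistral-large": 0.91, "claude-sonnet": 0.88},
    "narrative_drafting": {"mistral-large": 0.84, "claude-sonnet": 0.93},
}

def route(task_type):
    # Pick the best-scoring model for this task type. Re-running this after
    # a benchmark refresh is all a "model swap" requires.
    scores = BENCHMARKS[task_type]
    return max(scores, key=scores.get)

print(route("document_analysis"))
print(route("narrative_drafting"))
```

Because routing is per task rather than per agent, a shift in one model's benchmark performance changes only the tasks where it was winning.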
All of this runs within AWS Bedrock, which means customer data never leaves the cloud environment. Models are hosted within AWS rather than sending data to external model provider servers. And critically, customer data is never used to train the underlying models. For compliance teams operating in regulated environments, this architecture addresses the privacy and model risk concerns that are often the first objection to AI adoption.
The agents produce two recommendation types: escalate for human review, or close as false positive. The question every compliance leader eventually asks is whether the AI can close false positives on its own.
Unit21 supports auto-closure as an opt-in capability, not a default, and configurable per agent and per queue. But the more interesting question is what it takes to get there. In a recent poll of compliance practitioners, the top requirements were a complete audit trail for every decision, the ability to QA the agent's output, and control over which alert types the agents handle.
Unit21 provides all three. The platform retains a complete audit trail for every agent decision. Its QA sampling feature, originally built for reviewing human analyst work, is now used by customers to review AI agent output. And customers can segment agents by queue, running them only on specific alert types (like sanctions screening) where they’ve built enough confidence.
An emerging pattern worth noting: some customers are now using AI agents to QA human analyst work, not just the other way around. The agent reviews a sample of human dispositions for consistency with SOPs and flags deviations. It’s a natural evolution that strengthens the overall quality of the compliance program regardless of whether a human or AI made the original decision.
The compliance industry is at an inflection point. Alert volumes aren’t going back down. Regulatory expectations are increasing; FinCEN’s 2026 proposed rule explicitly encourages AI adoption, provided it’s explainable, validated, and keeps humans in the loop. And the teams that figure out how to deploy AI agents operationally, not experimentally, will reduce false positives, scale their programs without proportional headcount growth, and produce higher-quality investigations across the board.
The key word is operationally. Unit21's AI agents have been in production for nearly a year; this isn't a proof of concept. It's infrastructure.
If your team is still manually triaging every alert, writing every narrative, and tuning every rule by hand, the math doesn’t work anymore. The question isn’t whether AI agents belong in your financial crime program. It’s how quickly you can get them running.

Gal Perelman is the Product Marketing Lead at Unit21, where she spearheads go-to-market strategies for AI-driven risk and compliance solutions. With over a decade of experience in the fintech and fraud sectors, she has led high-impact launches for products like Watchlist Screening and AI Rule Recommendations.
Previously, Gal held marketing leadership roles at Design Pickle, Sightfull, and Lusha. She holds a Master’s degree from American University and a Bachelor’s from UCLA, and is dedicated to helping banks and fintechs navigate complex regulatory landscapes through innovative technology.