Confidential Case Study
Financial Crime · Graph Analytics · Banking Risk

Network Link Analytics for Mule Account and Fraud Network Detection

A graph-based analytical framework designed to detect hidden mule account networks, suspicious transaction communities, and multi-hop fraud relationships across banking data.

Python · DuckDB · Parquet · SQL · Network Analytics · Community Detection · Centrality Analysis · Entity Resolution · Risk Indicator Design

Executive Summary

Traditional account-level monitoring evaluates accounts in isolation. Sophisticated fraud networks — operating through shared identifiers, indirect relationships, and coordinated multi-account behaviour — are largely invisible to rule-based systems.

This project addressed that gap by designing a network analytics framework capable of mapping relationships across customers, accounts, transactions, and shared identifiers to identify suspicious clusters and bridge accounts.

Recognised with the CIMB Group Excellence Award 2025 under the "Safeguarding the Bank" category.

Business Problem

Financial institutions face increasing sophistication in fraud and financial crime. Mule account networks — groups of accounts used to move, layer, and extract illicit funds — are a core mechanism in money laundering, fraud schemes, and scam operations.

Standard monitoring approaches evaluate each account against predefined thresholds. They do not capture the relationships between accounts: who shares a phone number, which accounts appear as intermediaries in multi-hop fund movements, or which communities of accounts transact unusually with each other.

My Role

I led the end-to-end design and development: defining scope with stakeholders, designing the data model and graph architecture, building the analytical pipeline in Python and DuckDB, developing risk indicator logic, and presenting methodology and findings to leadership. I also managed validation and produced governance documentation for internal audit review.

Architecture

The pipeline moves from raw banking data through entity resolution and graph construction to risk-scored investigative output:

Data Sources → Data Cleaning → Entity Resolution → Nodes & Edges → Graph Construction → Community Detection → Risk Indicators → Investigation Output
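
As a minimal sketch of the entity resolution stage, the Python below links customer records that share a normalised phone number or device identifier, using a union-find structure. The record fields, the sample values, the `normalise_phone` rule, and the choice of exact matching are all assumptions made for illustration. The actual matching logic is not disclosed and would likely also need fuzzy matching and manual review.

```python
from collections import defaultdict

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"customer_id": "C1", "phone": "+60 12-345 6789", "device_id": "dev-A"},
    {"customer_id": "C2", "phone": "0123456789", "device_id": "dev-B"},
    {"customer_id": "C3", "phone": "019-888 7777", "device_id": "dev-B"},
    {"customer_id": "C4", "phone": "017-222 3333", "device_id": "dev-C"},
]


def normalise_phone(raw):
    """Keep digits only, then drop an assumed '60' country code and leading zeros."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if digits.startswith("60"):
        digits = digits[2:]
    return digits.lstrip("0")


class UnionFind:
    """Disjoint-set structure used to merge records that share an identifier."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def resolve_entities(records):
    """Group customer IDs that are linked by any shared identifier."""
    uf = UnionFind()
    first_seen = {}  # (identifier type, value) -> first customer seen with it
    for rec in records:
        cid = rec["customer_id"]
        uf.find(cid)  # register the customer even if it links to nobody
        keys = [
            ("phone", normalise_phone(rec["phone"])),
            ("device", rec["device_id"]),
        ]
        for key in keys:
            if key in first_seen:
                uf.union(first_seen[key], cid)
            else:
                first_seen[key] = cid
    groups = defaultdict(set)
    for rec in records:
        groups[uf.find(rec["customer_id"])].add(rec["customer_id"])
    return sorted(sorted(group) for group in groups.values())


entities = resolve_entities(records)
print(entities)
```

In this toy data, C1 and C2 share a phone number once formatting is stripped, and C2 and C3 share a device, so all three fall into one linked group while C4 stays separate. Whether a shared identifier means "same party" or only "related parties" is a modelling decision that this sketch does not settle.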

Graph Model

The graph model represents the full relationship landscape around banking entities:

Nodes
· Customers
· Accounts
· Transactions
· Shared identifiers
Edges
· Account ownership
· Transaction flows
· Shared attributes
· Indirect relationships
Analytics
· Multi-hop traversal
· Community detection
· Centrality scoring
· Suspicious clusters
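
To make the graph model more concrete, here is a small sketch, under stated assumptions, of building an undirected graph from typed edges and running a bounded multi-hop traversal. The node naming scheme (`cust:`, `acct:`, `phone:`), the edge types, and the sample data are hypothetical. For brevity, transfers are modelled as edges between accounts rather than as transaction nodes. Connected components are used only as a crude stand-in for community detection, since the source does not name the algorithm that was used.

```python
from collections import defaultdict, deque

# Hypothetical edge list: (source node, target node, relationship type).
edges = [
    ("cust:C1", "acct:A1", "owns"),
    ("cust:C2", "acct:A2", "owns"),
    ("cust:C3", "acct:A3", "owns"),
    ("acct:A1", "acct:A2", "transfers_to"),
    ("acct:A2", "acct:A3", "transfers_to"),
    ("cust:C1", "phone:123456789", "uses"),
    ("cust:C3", "phone:123456789", "uses"),
    ("cust:C9", "acct:A9", "owns"),  # unrelated customer and account
]


def build_graph(edges):
    """Build an undirected adjacency map; edge types are ignored in this sketch."""
    graph = defaultdict(set)
    for src, dst, _kind in edges:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search returning {node: hop distance} up to max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def connected_components(graph):
    """Group nodes that are reachable from one another (a simple clustering proxy)."""
    seen, components = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        members = set(within_hops(graph, node, len(graph)))
        seen |= members
        components.append(sorted(members))
    return components


graph = build_graph(edges)
reach = within_hops(graph, "acct:A1", 2)
components = connected_components(graph)
print(sorted(reach.items(), key=lambda item: (item[1], item[0])))
print(components)
```

Within two hops of `acct:A1` the traversal reaches the downstream account `acct:A3` and the shared phone number, while customer `cust:C3` only appears at three hops. This is the kind of indirect relationship that account-level thresholds cannot express.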

Centrality analysis identified bridge accounts — nodes connecting otherwise separate communities, often key intermediaries in fraud networks.
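
The bridge-account idea can be illustrated with betweenness centrality, which counts how many shortest paths between other nodes pass through a given node. The source does not state which centrality measure was used, so betweenness is an assumption here, and the seven-node graph below is invented for the example. The function follows Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque

# Hypothetical graph: two tightly connected clusters joined only through account "X".
edge_list = [
    ("A1", "A2"), ("A1", "A3"), ("A2", "A3"),  # cluster A
    ("B1", "B2"), ("B1", "B3"), ("B2", "B3"),  # cluster B
    ("A3", "X"), ("X", "B1"),                  # X bridges the two clusters
]


def build_graph(edge_list):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted graph)."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Phase 1: BFS from source, counting shortest paths (sigma) to each node.
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: total / 2 for v, total in score.items()}


centrality = betweenness(build_graph(edge_list))
bridge = max(centrality, key=centrality.get)
print(bridge, centrality[bridge])
```

In this toy graph every path between the two clusters runs through `X`, so it receives the highest score even though it has only two connections. Ranking by degree alone would not single it out.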

Validation

Validation was conducted against a labelled set of known suspicious accounts and confirmed mule network cases, with expert review by the financial crime investigation team.

Validation showed very strong true-positive performance, surfacing confirmed mule network members not flagged by existing rule-based monitoring systems.

False positive management was an explicit design consideration, with risk scoring thresholds calibrated in consultation with investigators to prioritise actionable output.
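
One way to picture the calibration step is a threshold sweep against a labelled set: for each candidate threshold, compute precision and recall, then choose the lowest threshold that still meets a precision floor agreed with investigators. The sketch below assumes that workflow. The scores, labels, candidate thresholds, and the 0.75 precision floor are invented for illustration and are not the project's figures.

```python
# Hypothetical risk scores (0 to 1) and investigator-confirmed mule accounts.
scores = {"A1": 0.95, "A2": 0.80, "A3": 0.65, "A4": 0.40, "A5": 0.30, "A6": 0.10}
confirmed = {"A1", "A2", "A4"}
candidate_thresholds = [0.2, 0.5, 0.7, 0.9]


def precision_recall(scores, confirmed, threshold):
    """Precision and recall when every account scoring >= threshold is flagged."""
    flagged = {acct for acct, score in scores.items() if score >= threshold}
    true_positives = len(flagged & confirmed)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(confirmed) if confirmed else 0.0
    return precision, recall


def pick_threshold(scores, confirmed, min_precision, candidates):
    """Lowest candidate threshold meeting the precision floor, or None.

    Lowering the threshold can only flag more accounts, so recall never
    decreases; the lowest qualifying threshold therefore has the highest
    recall among the thresholds that satisfy the floor.
    """
    for threshold in sorted(candidates):
        precision, _recall = precision_recall(scores, confirmed, threshold)
        if precision >= min_precision:
            return threshold
    return None


for threshold in candidate_thresholds:
    precision, recall = precision_recall(scores, confirmed, threshold)
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")

chosen = pick_threshold(scores, confirmed, 0.75, candidate_thresholds)
print("chosen threshold:", chosen)
```

With these made-up numbers, a threshold of 0.2 catches every confirmed account but two of the five flagged accounts are false positives. Raising it to 0.7 removes the false positives at the cost of missing one confirmed account. That trade-off is what the investigators' input has to settle.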

Business Impact

· Identified hidden mule account networks that existing transaction monitoring did not surface
· Supported investigator workflows with structured network visualisation and risk scoring output
· Enabled targeted account review and escalation based on network risk signals
· Validation demonstrated very strong true-positive detection performance
· Received CIMB Group Excellence Award 2025 (Safeguarding the Bank category)

Lessons Learned

01 Entity resolution is often the hardest part. The quality of the graph depends entirely on the quality of entity linkage, so invest here before building analytics.
02 Investigators need interpretable outputs, not raw graph objects. Designing the output with end-users was as important as the model itself.
03 Risk indicator thresholds need calibration against real investigation outcomes. The first set rarely survives first contact with domain experts.
04 Performance engineering matters at scale: DuckDB and columnar formats were essential for handling large relationship tables efficiently.

Confidentiality Note: Due to employer obligations, code, raw data, proprietary models, and internal investigation details are not disclosed. This case study presents architecture, methodology, and business impact only.
