Varaksha — UPI Fraud Defense Network

Together

Feb 28

Defining the Architecture

“Demonstrability is a first-class design constraint.”

Five-layer architecture scoped: Rust privacy gateway, ML classifier, graph topology analyser, multilingual alert agent, and ops dashboard. The system is designed to be comprehensible in under a minute, because it will be evaluated under time pressure.

5-Layer Design · Architecture · Core Pipeline

Security

Mar 1

Privacy Gateway in Rust

“Sensitive identifiers must not persist beyond the perimeter. Everything downstream operates on hashes.”

The Actix-Web 4 gateway is the sole component that handles raw Virtual Payment Addresses — SHA-256 hashing is applied at ingress so all downstream services receive only derived identifiers. DashMap provides a lock-free concurrent risk cache across the Actix worker pool; score_to_verdict() threshold logic determines ALLOW, FLAG, and BLOCK classifications.

Rust · SHA-256 · DashMap · Actix-Web 4
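
The ingress step described above can be sketched in a few lines. The gateway itself is written in Rust; the Python below is only a minimal illustration of the same idea. The normalisation step, the threshold values, and the plain dict standing in for the DashMap cache are all assumptions, since the source names score_to_verdict() but does not give its cut-offs.

```python
import hashlib

# Hypothetical thresholds: the source does not state the real cut-offs.
FLAG_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.8


def hash_vpa(raw_vpa):
    """Return the SHA-256 hex digest of a VPA (strip/lowercase is an assumption)."""
    normalised = raw_vpa.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def score_to_verdict(score):
    """Map a risk score in [0, 1] to ALLOW, FLAG, or BLOCK."""
    if score >= BLOCK_THRESHOLD:
        return "BLOCK"
    if score >= FLAG_THRESHOLD:
        return "FLAG"
    return "ALLOW"


# A plain dict stands in for the concurrent risk cache; keys are hashes only.
risk_cache = {}
key = hash_vpa("alice@upi")
risk_cache[key] = 0.91
verdict = score_to_verdict(risk_cache[key])
```

Only the derived key ever reaches the cache, so anything reading from it downstream never sees the raw address.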

ML

Mar 1

ML Baseline Established

“A working baseline yields insights that an unimplemented optimal architecture cannot.”

Random Forest + XGBoost soft-vote ensemble on transaction velocity, round-amount flag, network out-degree, and time-of-day encoding. Stratified 50K PaySim sample with SMOTE rebalancing. Reference point established for subsequent iterations.

RF + XGBoost · SMOTE · PaySim
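
Two of the baseline features and the soft-vote step are simple enough to sketch without any ML library. The definitions below are assumptions made for illustration: the source names a round-amount flag and a time-of-day encoding but does not define them, so the multiple of 1000 and the sin/cos encoding are hypothetical choices. Soft voting is shown as an equal-weight average of the two models' fraud probabilities.

```python
import math
from datetime import datetime


def round_amount_flag(amount, step=1000):
    """1 if the amount is a positive whole multiple of `step` (assumed definition)."""
    return int(amount > 0 and amount % step == 0)


def time_of_day_encoding(ts):
    """Cyclic sin/cos encoding so that 23:59 and 00:00 map to nearby points."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    angle = 2 * math.pi * seconds / 86400
    return math.sin(angle), math.cos(angle)


def soft_vote(p_rf, p_xgb):
    """Equal-weight soft vote: average the two predicted fraud probabilities."""
    return (p_rf + p_xgb) / 2


sin_t, cos_t = time_of_day_encoding(datetime(2025, 3, 1, 6, 0, 0))
```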

Security

Mar 2

Graph-Based Mule Detection

“Network fan-out is a consistent topological signature across all known money-mule architectures.”

A NetworkX graph agent runs asynchronously outside the payment critical path, detecting all four BIS Project Hertha mule typologies: fan-out, fan-in, directed cycles, and scatter patterns. Score aggregation uses the maximum across detected patterns to prevent false positives on legitimate high-volume merchants; results push to the Rust risk cache via HMAC-SHA256-signed webhooks.

NetworkX · Fan-out · Directed Cycles · Async · HMAC-SHA256
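
The max-aggregation idea can be illustrated with a small dependency-free sketch. It uses plain dictionaries rather than NetworkX, covers only fan-out, fan-in, and directed cycles (the scatter pattern is left out), and the DEGREE_CAP scaling constant is a hypothetical choice rather than anything stated in the source.

```python
from collections import defaultdict

# Hypothetical saturation point: the degree at which a pattern score reaches 1.0.
DEGREE_CAP = 10


def build_graph(edges):
    """Build out- and in-neighbour sets from (sender, receiver) pairs."""
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)
    return out_nbrs, in_nbrs


def on_cycle(node, out_nbrs):
    """True if some directed path leads from `node` back to itself."""
    stack, seen = list(out_nbrs.get(node, ())), set()
    while stack:
        current = stack.pop()
        if current == node:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(out_nbrs.get(current, ()))
    return False


def mule_score(node, out_nbrs, in_nbrs):
    """Score each pattern separately, then aggregate with max rather than sum."""
    pattern_scores = {
        "fan_out": min(len(out_nbrs.get(node, ())) / DEGREE_CAP, 1.0),
        "fan_in": min(len(in_nbrs.get(node, ())) / DEGREE_CAP, 1.0),
        "cycle": 1.0 if on_cycle(node, out_nbrs) else 0.0,
    }
    return max(pattern_scores.values()), pattern_scores


edges = [("a", "b"), ("b", "c"), ("c", "a")] + [("m", f"x{i}") for i in range(4)]
out_nbrs, in_nbrs = build_graph(edges)
score_a, _ = mule_score("a", out_nbrs, in_nbrs)
score_m, _ = mule_score("m", out_nbrs, in_nbrs)
```

Taking the maximum means a node is scored by its single strongest pattern, so weak signals on several patterns do not stack into a high combined score.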

Security

Mar 2–3

Multilingual Alert Delivery

“A fraud alert has no utility if the recipient cannot read the language in which it is issued.”

Alerts synthesised in 8 Indian languages via Microsoft Neural TTS (edge-tts) embed the transaction ID, blocked amount, and risk score in the recipient’s preferred language. BLOCK verdicts cite IT Act 2000 §66D and BNS §318(4) verbatim; the template engine is swappable for IndicTrans2 at production time.

8 languages · Neural TTS · IT Act 2000 §66D · edge-tts
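
A swappable template engine of the kind described above might look like the sketch below. The template wording, the language codes, and the English fallback are hypothetical; they are not the project's actual alert text, and the speech synthesis step is not shown.

```python
# Hypothetical templates: illustrative wording only, not the real alert text.
TEMPLATES = {
    "en": "Transaction {txn_id} was blocked. Amount: Rs {amount:.2f}. Risk score: {score:.2f}.",
    "hi": "लेनदेन {txn_id} रोक दिया गया है। राशि: ₹{amount:.2f}। जोखिम स्कोर: {score:.2f}।",
}


def render_alert(lang, txn_id, amount, score):
    """Fill the template for `lang`, falling back to English if none exists."""
    template = TEMPLATES.get(lang, TEMPLATES["en"])
    return template.format(txn_id=txn_id, amount=amount, score=score)


english = render_alert("en", "TXN123", 2500, 0.93)
hindi = render_alert("hi", "TXN123", 2500, 0.93)
```

Because rendering goes through a single function, replacing the dictionary lookup with a translation model would not change the callers.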

Together

Mar 3

Integration Proof-of-Concept

“End-to-end verdicts validated — from Rust ingress to multilingual alert.”

A live operations dashboard confirmed verdicts flowing through all five layers: transaction ingress, hashing, ML scoring, graph analysis, and multilingual alert dispatch. The dashboard includes a force-directed network visualization, a Hindi alert panel, and a 50-event audit log. All data is synthetic; no real PII is processed.

5-Layer Pipeline · Live Dashboard · Audit Log · Synthetic
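
A fixed-size audit log like the 50-event one mentioned above is commonly built as a ring buffer. The sketch below is one minimal way to do that; the event fields are hypothetical.

```python
from collections import deque

AUDIT_LOG_SIZE = 50  # matches the 50-event log described above

# A deque with maxlen silently drops the oldest entry once it is full.
audit_log = deque(maxlen=AUDIT_LOG_SIZE)


def record_event(txn_id, verdict):
    """Append one verdict event (field names are illustrative)."""
    audit_log.append({"txn_id": txn_id, "verdict": verdict})


for i in range(60):
    record_event(f"TXN{i}", "ALLOW")
```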

ML

Mar 5–7

Model Architecture Overhaul

“At 450 MB combined, the ensemble consumed nearly the entire memory budget for a sub-0.005 accuracy gain.”

XGBoost was removed from the serving stack: RF-300 achieves ROC-AUC 0.9869 in isolation and the marginal ensemble gain was insufficient to justify 450 MB combined weight. Feature engineering expanded from 8 to 16 variables, incorporating balance_drain_ratio, account_age_days, previous_failed_attempts, and transfer_cashout_flag; the output artefact became varaksha_rf_model.onnx.

RF-300 only · 16 features · 75K rows · ONNX · ROC-AUC 0.9869
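
Two of the added features can be sketched directly from PaySim-style columns. The formulas are assumptions: the source names balance_drain_ratio and transfer_cashout_flag but does not define them, so the ratio of amount to opening balance and the membership test on the PaySim types TRANSFER and CASH_OUT are plausible readings, not the project's confirmed definitions.

```python
def balance_drain_ratio(amount, old_balance):
    """Share of the sender's opening balance consumed (assumed definition)."""
    if old_balance <= 0:
        return 0.0
    return min(amount / old_balance, 1.0)


def transfer_cashout_flag(txn_type):
    """1 for the PaySim transaction types TRANSFER and CASH_OUT, else 0."""
    return int(txn_type in {"TRANSFER", "CASH_OUT"})


# One synthetic row using PaySim column names.
row = {"type": "TRANSFER", "amount": 9000.0, "oldbalanceOrg": 10000.0}
features = {
    "balance_drain_ratio": balance_drain_ratio(row["amount"], row["oldbalanceOrg"]),
    "transfer_cashout_flag": transfer_cashout_flag(row["type"]),
}
```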

Together

Mar 9–10

Production Deployment

“Static export to a global edge network eliminates cold starts and infrastructure overhead from the demonstration path entirely.”

Next.js 15, configured for static export and deployed to Cloudflare Pages, removes cold starts and Node.js server overhead from the demonstration path. The frontend ships three routes: a live stats landing page, an animated architecture walkthrough, and a real-time transaction feed with Security Arena and Cache Visualizer panels.

Next.js 15 · Cloudflare Pages · Static Export · framer-motion

ML

Mar 11 AM

Dataset Coverage Audit

“Model timestamps revealed the training pipeline had never ingested the complete dataset.”

Three missing dataset files discovered: supervised_dataset.csv, remaining_behavior_ext.csv, and ton-iot.csv. All loaders written, validated against schema, and integrated into the merge pipeline. 54,142 rows recovered.

Dataset Audit · 54K Rows · 3 Loaders

ML

Mar 11 PM

85.24% (V1)

“V1 retraining on the leakage-corrected dataset reached 85.24% accuracy (later superseded by V2).”

The expanded 111,499-row dataset rebalanced by SMOTE to 51,735/51,735 yielded: RF Accuracy 85.24%, ROC-AUC 0.9546, Precision 0.7709, Recall 0.9229, F1 0.8401. Stale artefacts — lightgbm, xgboost, voting ensemble — were removed from the repository.

111K rows · V1 baseline · ROC-AUC 0.9546 · Artefact cleanup
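
As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet uses only the figures quoted above.

```python
precision = 0.7709
recall = 0.9229

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
f1_rounded = round(f1, 4)  # 0.8401, matching the reported value
```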

Together

Mar 11

V1 Finalisation and Deployment

“A deployable system is defined by finishing details—texture, colour, and interactive feedback.”

Frontend polish: dot-grid body texture, surface-gradient card utility, amber token separated from saffron for distinct FLAG verdict rendering. Next.js static export deployed to Cloudflare Pages. Core pipeline hardened and ready for production integration.

Next.js 15 · Static Export · Polish · Production-Ready

Security

Mar 12–14

Gateway Hardening — Rate Limiting & Auth

“A gateway without rate limiting is a door without a lock.”

Production-grade security hardening: a per-VPA rate limiter enforcing NPCI OC-215/2025-26 caps (100 req/24h), an mTLS authentication layer, HMAC-SHA256 webhook signing for all graph agent push events, and an audit log ring buffer in DashMap. CORS policy tightened to an allowlist of PSP bank origins.

Rate Limiter · mTLS · HMAC-SHA256 · CORS · Audit Log
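
Two of the hardening measures above, the per-VPA cap and HMAC-SHA256 webhook signing, can be sketched with the standard library. This is a minimal illustration, not the gateway's Rust implementation: the sliding-window strategy, the in-memory store, and the function names are assumptions; only the 100 requests per 24 hours figure comes from the text.

```python
import hashlib
import hmac
import time
from collections import defaultdict, deque

MAX_REQUESTS = 100          # per-VPA cap quoted above
WINDOW_SECONDS = 24 * 3600  # 24-hour window

# Timestamps of accepted requests, keyed by hashed VPA (in-memory for the sketch).
_request_times = defaultdict(deque)


def allow_request(vpa_hash, now=None):
    """Sliding-window limiter: True if this request fits under the cap."""
    now = time.time() if now is None else now
    window = _request_times[vpa_hash]
    # Drop timestamps that have aged out of the window.
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return False
    window.append(now)
    return True


def sign_payload(secret, body):
    """Hex HMAC-SHA256 signature of the webhook body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret, body, signature):
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign_payload(secret, body), signature)
```

The receiver recomputes the signature over the raw body with the shared secret, so a payload altered in transit fails verification.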

ML

Mar 14–16

LightGBM Secondary Model & Feature Expansion

“The ensemble gap that justified removing XGBoost does not apply to a gradient-boosted lightweight.”

LightGBM trained as a secondary scorer on the 111K-row corpus. Feature set expanded to 18 variables, adding merchant_risk_freq and amount_log. Both models exported to ONNX; inference pipeline updated to fuse RF and LightGBM scores via weighted average (0.7 / 0.3). Sweep artefacts saved as lgbm_sweeper.onnx.

LightGBM · 18 features · ONNX fusion · Weighted 0.7/0.3
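
The score fusion step reduces to a weighted average. The sketch below assumes the 0.7 weight applies to the Random Forest and 0.3 to LightGBM, following the order in which they are listed, and assumes amount_log is computed as log(1 + amount); the source confirms neither detail.

```python
import math

RF_WEIGHT = 0.7    # assumed to apply to the Random Forest score
LGBM_WEIGHT = 0.3  # assumed to apply to the LightGBM score


def amount_log(amount):
    """log(1 + amount), a common way to compress a heavy-tailed amount."""
    return math.log1p(amount)


def fuse_scores(rf_score, lgbm_score):
    """Weighted average of the two models' fraud probabilities."""
    return RF_WEIGHT * rf_score + LGBM_WEIGHT * lgbm_score


fused = fuse_scores(0.9, 0.5)  # 0.7 * 0.9 + 0.3 * 0.5
```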

Together

Mar 17–19

Three-Tier Deployment Architecture

“Cloud, enterprise, and edge are not three products — they are one system at three integration depths.”

Formalised the three-tier deployment model: Cloud (hosted Rust gateway on Railway + Cloudflare Pages), Enterprise (API-first with HMAC webhooks, graph topology streaming, PSP bank integration), and Embedded SDK (quantized ONNX < 5 MB, ONNX Runtime Mobile, zero round-trip on-device scoring). Each tier shares the same model artefacts and scoring logic.

Cloud Tier · Enterprise API · Embedded SDK · ONNX Mobile

ML

Mar 20–22

IsolationForest Calibration & Sweep

“An anomaly detector miscalibrated at 5% contamination flags legitimate high-value merchants every hour.”

IsolationForest contamination tuned from 5% to 2% after simulation revealed excessive false positives on recurring high-value UTILITY payments. Bayesian sweep across n_estimators (100–400) and max_samples: optimal 300 trees, 256 max_samples. False positive rate halved without recall loss. Saved as isolation_forest.onnx v2.

Contamination 2% · Bayesian sweep · FP reduction · v2
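
The effect of the contamination setting can be seen without training a model: it fixes the fraction of points that end up flagged, so lowering it from 5% to 2% shrinks the flagged set accordingly. The sketch below applies that cut-off to synthetic scores; it illustrates the thresholding idea only and is not an IsolationForest.

```python
def flag_anomalies(anomaly_scores, contamination):
    """Flag the `contamination` fraction of points with the highest scores."""
    n_flagged = round(len(anomaly_scores) * contamination)
    ranked = sorted(range(len(anomaly_scores)),
                    key=lambda i: anomaly_scores[i], reverse=True)
    return set(ranked[:n_flagged])


# Synthetic scores for illustration; higher means more anomalous.
scores = [i / 1000 for i in range(1000)]
flagged_5 = flag_anomalies(scores, 0.05)
flagged_2 = flag_anomalies(scores, 0.02)
```

With 1,000 synthetic points, the 5% setting flags 50 and the 2% setting flags 20, and the 30 points released are the least anomalous of the original 50.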

Security

Mar 24–28

Graph Agent Streaming & Consortium Layer

“Batch topology analysis finds yesterday's mules. Streaming graph analytics finds today's.”

Graph agent migrated from batch NetworkX snapshots to event-driven incremental updates: each transaction edge appended and fan-out/fan-in/cycle metrics recomputed in O(k) per event. Consortium risk-sharing prototype implemented: anonymised score deltas federated via HMAC-signed shared registry — zero PII exposure.

Streaming graph · Incremental O(k) · Consortium · No PII
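
An event-driven update of the kind described above can be sketched as follows. Each new edge touches only its two endpoints plus a bounded neighbourhood for the cycle check. The class name, the max_hops bound, and the returned metric names are hypothetical, and whether this matches the O(k) bound depends on what k denotes, which the text does not specify.

```python
from collections import defaultdict


class StreamingGraph:
    """Keeps per-account neighbour sets current as transaction edges arrive."""

    def __init__(self, max_hops=3):
        self.out_nbrs = defaultdict(set)
        self.in_nbrs = defaultdict(set)
        self.max_hops = max_hops  # hypothetical bound on the cycle search

    def add_edge(self, src, dst):
        """Append one edge and return metrics for the nodes it touches."""
        # The new edge closes a cycle if dst could already reach src.
        closes_cycle = self._reaches(dst, src)
        self.out_nbrs[src].add(dst)
        self.in_nbrs[dst].add(src)
        return {
            "fan_out": len(self.out_nbrs[src]),
            "fan_in": len(self.in_nbrs[dst]),
            "closes_cycle": closes_cycle,
        }

    def _reaches(self, start, target):
        """Bounded breadth-first search from `start` towards `target`."""
        if start == target:
            return True
        frontier, seen = {start}, {start}
        for _ in range(self.max_hops):
            next_frontier = set()
            for node in frontier:
                for nbr in self.out_nbrs.get(node, ()):
                    if nbr == target:
                        return True
                    if nbr not in seen:
                        seen.add(nbr)
                        next_frontier.add(nbr)
            frontier = next_frontier
        return False


graph = StreamingGraph()
first = graph.add_edge("a", "b")
second = graph.add_edge("b", "c")
third = graph.add_edge("c", "a")  # completes the cycle a -> b -> c -> a
```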

Together

Mar 29–31

V2 Three-Tier Launch

“One codebase. Three deployment surfaces. One fraud intelligence system.”

Varaksha V2 deployed across all three tiers. Cloud: Rust gateway on Railway + Cloudflare Pages CDN, live SSE stream, <10ms P99. Enterprise: graph topology network monitor, SecurityArena attack simulations, webhook delivery. Embedded SDK: on-device ONNX scoring simulation. All tiers share model artefacts from the 111K-row corpus.

V2 Launch · All Three Tiers · Cloudflare · Railway · Live

How We Built Varaksha in a Sprint

The Problem

The Architecture

The Outcome

Defining the Architecture

Privacy Gateway in Rust

ML Baseline Established

Graph-Based Mule Detection

Multilingual Alert Delivery

Integration Proof-of-Concept

Model Architecture Overhaul

Production Deployment

Dataset Coverage Audit

85.24% (V1)

V1 Finalisation and Deployment

Gateway Hardening — Rate Limiting & Auth

LightGBM Secondary Model & Feature Expansion

Three-Tier Deployment Architecture

IsolationForest Calibration & Sweep

Graph Agent Streaming & Consortium Layer

V2 Three-Tier Launch

What We Build Next

All 22 Scheduled Languages

Mobile SDK Packaging

On-Device Edge Inference

Streaming Graph Analytics

Live LLM Legal Summaries

NPCI Consortium Risk Sharing

Automated Regulatory Reporting

Open-Source Release
