Canso AI

Any company that moves money in some form is likely to be affected by fraud. Financial services, gaming, e-commerce and advertising all facilitate transactions in one form or another, and all of them take a hit on margins because of fraud. Unsurprisingly, given how vast this space is, fraud detection has become one of the most widely discussed areas in ML research.

The first time I built a Fraud Detection and Prevention System, it took me 1.5 years to see it running in production and creating value. Yes, 1.5 years is far too long to invest in a single ML use case. Even after the first release, we would spend a few months on each upgrade to address new fraud patterns.


Although the value we created through those systems runs into millions of USD per year, many companies would hesitate to pick up initiatives with such a long time to value.

In this blog, I’m going to cover the challenges I experienced on the journey from developing robust real-time solutions to scaling them in production. More specifically, I’ll dive deep into what it took to convince business leaders of the value these systems can unlock. If you are leading fraud control initiatives, my recommendation is to address these challenges at the very beginning to avoid surprises later in your journey.

Fig 2 — Focus areas to achieve faster time to value

Aligning Initiatives with Company’s North Star

Imagine you’ve identified that 10% of your company’s transactions are fraudulent, but only 4% of all transactions actually result in financial losses. Would you still prioritize solving for the additional 6%? While eliminating all fraud may seem like a straightforward decision, solving for fraud can inadvertently conflict with your company’s core growth objectives.


Fig 3 — Fraud may not always result in losses. How to tackle such cases!


As a leader in fraud risk management, you want to see your work create tangible value. That’s your primary motivator. But growth leaders and founders are often more focused on driving topline revenue growth. This creates a natural tension across the organization: fraud teams want to minimize risk, while growth teams want to maximize opportunity. This tension shows up in every sector:

In Banking, companies are often not liable for fraud that’s perpetrated due to customer negligence, such as social engineering.
In AdTech, networks don’t lose money if advertising partners don’t ask for clawbacks.
In Gaming, companies don’t always lose money when a bot is playing games.

At times, it may even be the case that companies make more revenue due to fraud.

In these cases, fraud may not appear to be an urgent problem for the company itself. However, it’s critical to recognize that fraud always results in a cost to someone — whether it’s your company or your customers. And if you’re not solving for that additional 6%, it’s your customers who are bearing the brunt of those losses.

Forward-thinking leaders understand that tackling fraud today may come with short-term costs but will ultimately strengthen customer trust and drive long-term ROI. While some leaders focus solely on reducing direct financial losses, those with foresight prioritize building sustainable trust and protecting the broader ecosystem.

Although addressing both fraud and financial losses may appear similar, they require distinct strategies. The way you analyze data, develop machine learning models, and implement decision-making processes in production is driven by your KPIs. That’s why it’s essential to know what you’re solving for and plan the roadmap accordingly.

Based on my experience of working with various types of leaders, I’ve found a phased approach to be most effective. Here’s how you can align your fraud detection efforts with your company’s North Star:

Fig 4 — Achieve small wins to develop trust with stakeholders


1. Prioritize Solving for Direct Losses — Start by addressing fraud that results in direct financial losses. This is the easiest win, as there’s minimal internal resistance to solving for losses that are clearly impacting the company’s bottom line. Seeing the immediate impact of these efforts builds trust and buy-in from stakeholders across the organization.

2. Conduct Controlled Experiments in Other Areas — Once you’ve addressed direct losses, gradually expand your fraud detection efforts to other areas where fraud doesn’t result in immediate financial loss but still impacts customer trust. Use controlled experiments and feedback loops to measure the impact of your solutions. The key metrics will vary by industry:

a. In Banking, focus on improving customer satisfaction and reducing churn.
b. In AdTech, measure the impact on return on ad spend (ROAS).
c. In Gaming, track user engagement and retention rates.

These experiments will help you build a case for the broader value of your fraud detection efforts.

3. Scale Solutions with Proven ROI — Once you have sufficient evidence of ROI, it becomes much easier to achieve consensus across the organization. Use the data from your experiments to demonstrate how fraud detection initiatives align with the company’s growth goals. At this stage, a full-scale rollout is typically met with little resistance, as you’ve already proven the value of your solutions.

This phased approach ensures that your fraud detection initiatives are focused on specific, measurable goals. It minimizes development time and reduces the risk of pushback when it’s time to deploy your solutions. Additionally, by aligning your initiatives with the company’s North Star, you ensure that your work creates long-term value — not just for your company, but for your customers and partners as well.
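The controlled experiments in step 2 need a way to separate a real effect from noise. Below is a minimal sketch, assuming a simple two-group comparison of a rate metric such as retention. The function name and the sample numbers are hypothetical and only illustrate the idea; a real experiment would also need proper randomization and sample-size planning.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (difference in rates, z statistic, two-sided p-value)."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal CDF, via the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return rate_b - rate_a, z, p_value


# Hypothetical numbers: retained users out of 10,000 in the control group
# (no new fraud control) and in the treatment group (new fraud control on).
lift, z, p = two_proportion_z_test(4100, 10000, 4300, 10000)
print(f"retention lift={lift:.3f}, z={z:.2f}, p={p:.4f}")
```

The same comparison works for any rate-style KPI from the list above (churn, retention, conversion); only the counts change.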

Developing Complex Engineering Systems

Ideally, your ML and Engineering teams and their leadership have prior experience of developing scalable systems of this kind. That experience can easily save you months of effort; without it, expect many surprises that lengthen your time to value. If you don’t have that expertise in the team, the list of potential challenges below should help, because being aware of them minimizes surprises. The list is not exhaustive, but it covers the pitfalls that can delay your plan by months if left unaddressed:

Low latency Feature Engineering

Fraud detection is one of the biggest use cases of real-time data. Many teams start with batch features, only to realize later that relying on stale batch data significantly hinders their ability to identify fraud. That is when they have to migrate all features to real-time streaming applications, and streaming applications are a different ballgame altogether. Processing and aggregating large volumes of data within seconds, and making the results available for serving from a feature store such as Redis, requires deep engineering skills. Even with the right skills in the team, developing such applications and optimizing their performance is a big effort. Without those skills, it can easily add months to your timeline.
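To make the idea of a real-time aggregate concrete, here is a minimal in-memory sketch of a sliding-window count per key, such as the number of events per device in the last 24 hours. It is a toy stand-in for what a streaming engine and a feature store would do at scale, not a production design. The class name and keys are hypothetical, and it assumes events for a key arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key over a trailing time window (in seconds)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> event timestamps, oldest first

    def _evict(self, key, now):
        # Drop timestamps that have fallen out of the window (now - window, now].
        timestamps = self._events[key]
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()

    def add(self, key, timestamp):
        self._events[key].append(timestamp)
        self._evict(key, timestamp)

    def count(self, key, now):
        self._evict(key, now)
        return len(self._events[key])


counter = SlidingWindowCounter(window_seconds=24 * 3600)
counter.add("device-1", 0)
counter.add("device-1", 7200)    # 2 hours later
counter.add("device-1", 90000)   # 25 hours after the first event
count_now = counter.count("device-1", now=90000)
print(count_now)  # the first event has expired, two remain
```

The hard part in production is not this logic but running it for millions of keys with low latency, handling late or out-of-order events, and keeping the serving store in sync.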

I recently worked with an e-commerce company incurring ~4% revenue loss due to fraud. When we measured the effectiveness of features at different latencies, we found that the same feature missed 33% of fraudulent transactions when served at a latency of 2 minutes, compared to only 2% when served at a latency of 30 seconds. In other words, if a feature has the potential to add $100K of incremental value in identifying fraud, you will realize only $67K of it when the feature is served with 2 minutes of latency. These numbers are specific to one set of features and one product, and may vary for others.


To accelerate time to value, gather such insights before you start developing features, and make sure the team has the right expertise to implement real-time streaming applications.
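The arithmetic behind the e-commerce example can be written down in a few lines. This is only a sketch of the calculation: the miss rates and the $100K figure come from the example above, while the function name is made up for illustration.

```python
def realized_value(potential_value, miss_rate):
    """Value actually captured when a fraction `miss_rate` of fraud is missed."""
    return potential_value * (1 - miss_rate)


potential = 100_000  # incremental value of the feature in USD
miss_rate_by_latency = {"30 seconds": 0.02, "2 minutes": 0.33}

for latency, miss_rate in miss_rate_by_latency.items():
    value = realized_value(potential, miss_rate)
    print(f"{latency}: ${value:,.0f} realized")
```

Running the same calculation per feature and per candidate latency gives a rough way to decide which features justify the cost of a streaming pipeline.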

Explainability

Fraud detection systems rarely succeed without an explainability framework. Stakeholders who consume the decisions made by fraud detection systems need proper reasoning. When it is missing, Analysts and Data Scientists spend hours, and often days, digging into data to reconstruct it.

Before jumping into explainability, let’s briefly look at how the outcomes of fraud detection systems are consumed by various stakeholders:

Investigation: A fraud alert is often followed by data analysis to confirm it before the final decision to block the account or transaction is made. Investigators and Analysts often spend hours to days deep-diving into the data to determine whether the alert was a true or false positive. Proper justifications from an explainability framework give them a head start and save hours of effort on analyzing suspicious alerts.
Customer Support: Customers may come back with concerns about blocked transactions or accounts, in which case support teams need easily understandable reasons for why the transaction or account was blocked.
Chargeback: Customers may raise fraud complaints and chargebacks, in which case support or analytics teams need proper tools to investigate the complaints and resolve them quickly.
Reporting: Regulated entities (REs) need to report fraud events, with proper justification, to central bodies.

While most stakeholders are only interested in the reasons for flagging an event or account as suspicious, there are scenarios where it is equally important to justify why certain accounts, devices or users are deemed trustworthy. For example, in Gaming, identifying and safeguarding whale users is essential to ensure they are not mistakenly affected by the fraud detection systems. In Financial Services and AdTech as well, customer support teams may use such justifications to fast-track resolution for high-value users.

Here is an example of what an explanation should look like:

1. The transaction has been assigned a high-risk score of 0.97 due to suspicious device activity.
2. The device has a total of 11 geo-country mismatches across all events in the last 24 hours, which is suspicious. The feature contribution is 37%.
3. Additionally, the device shows very fast click activity that resembles that of a bot. The feature contribution is 13%.
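An explanation like the one above can be assembled from a risk score and per-feature contributions. The sketch below assumes the contributions are already available as shares of the score; how they are computed depends on the model and is out of scope here. The function name and the input structure are hypothetical.

```python
def explain(risk_score, contributions, top_n=2):
    """Build a short, ordered explanation from per-feature contributions.

    `contributions` maps a human-readable feature description to its share
    of the score (between 0 and 1).
    """
    # Show the most influential features first.
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    lines = [f"Risk score: {risk_score:.2f}"]
    for description, share in ranked[:top_n]:
        lines.append(f"- {description} (contribution: {share:.0%})")
    return lines


explanation = explain(
    0.97,
    {
        "very fast click activity resembling a bot": 0.13,
        "11 geo-country mismatches on the device in the last 24 hours": 0.37,
    },
)
print("\n".join(explanation))
```

Keeping the feature descriptions in plain language is what makes the output usable by investigators and support teams, not only by Data Scientists.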


Besides this, the explainability framework must also provide details about the account’s historical data and alerts for deeper investigation. While basic explainability is a must-have feature, better frameworks can provide a 50–70% lift in productivity.

Minimizing Adverse Impact of False Positives

Machine Learning models are not perfect, and neither are the business rules used for fraud detection. Whichever method you implement, you are bound to see false positives, i.e. genuine customers affected by incorrect alerts from fraud detection systems.

If you’re a Risk or Machine Learning leader, you would want to take a shot at the problem thinking that you’ll be able to develop a good ML powered fraud detection system with minimal false positives. While that’s the right approach and top teams globally are using ML, even for real-time fraud detection, it’s tough to develop models with good performance.

Imagine your team invested 6 months in training an anomaly detection model with 5% false positives. When you take this to growth stakeholders, they may not be happy about it, because they expected a model with “great” performance. This leads either to weeks of conversation, alignment and pushback before deployment, or to your team going back to find ways of reducing false positives, which takes even more time.

Your stakeholders or leadership may not really understand false positives or how to tackle them. However, as Risk or Machine Learning leaders, we must accept that ML models are not perfect and that there will always be false positives, primarily for the following reasons:

Limitations of the algorithm and of the data fed to the model: ML models learn from data, and data is not perfect. This introduces errors in predictions.
Many fraudulent and genuine devices or attempts show similar behavior, which makes it difficult for models to distinguish fraud from genuine activity.
In the absence of labeled data for the types of fraud we intend to identify, there is no clear line between what’s fraud and what’s genuine. ML models, with oversight from Data Scientists, try to improve accuracy, but it’s never perfect.

It is important for stakeholders to understand this and to agree on strategies early on, to avoid conflicts or delays later. A few things that help in this regard are:

Acceptable level of false positives: Stakeholders agree on the level of false positives that the business can live with. For example, if the model flags $100 worth of transactions as fraud, we’ll be fine if up to $5 of that was actually legitimate. This gives the data science team enough room to experiment and train the best possible model within the false-positive threshold set by the business.
Aim for much lower recall and higher precision: Capturing more fraud means more false positives. Hence, begin by capturing fraud at a very small scale to keep false positives to a minimum, then gradually scale up while keeping other business health metrics in check.
Leveraging a hybrid “Machine Learning + Rules” model: Like ML models, many risk rules produce false positives. For example, imagine you use device-check rules to identify mule account activity, such as “number of accounts accessed from the same device within the last 24 hours is greater than 3 and there are at least 2 linked transactions among those accounts”. Accounts accessed from a shared device in a cyber cafe or at home may also get flagged, which leads to false positives and a poor user experience if they are blocked. One way to tackle that is to keep adding rules, but this only increases the overhead of managing them. A more efficient approach is to use a Machine Learning model trained on device and account behavioral history to identify anomalies and suspicious activity. Combining ML risk scores with a few important rules can increase the precision of your decisioning engine.
Human in the loop to investigate and reduce false alarms: Wherever the cost of false positives is high, as in payments, investigation is a must-have step in the process. Teams develop multiple ML models that run hourly or daily to monitor accounts or transactions, and these monitoring systems trigger hundreds of alerts. The investigation team filters the high-risk alerts and investigates their historical data thoroughly. A few of those accounts are eventually marked as suspicious and blocked. This is how the “human in the loop” layer reduces the adverse impact of false alarms. Even in real-time systems, many events with critical alerts are held from processing for 10–15 minutes, during which an investigation is performed and a decision is made to block or approve.
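The hybrid and human-in-the-loop ideas above can be combined in a small decision function. This is a sketch under stated assumptions: the rule is the device rule quoted above, while the function names, the thresholds and the three-way outcome are illustrative placeholders rather than recommended values.

```python
def mule_rule_triggered(accounts_on_device_24h, linked_transactions):
    """Device rule from the text: more than 3 accounts on the same device
    in 24 hours and at least 2 linked transactions among them."""
    return accounts_on_device_24h > 3 and linked_transactions >= 2


def decide(ml_risk_score, rule_hit, block_threshold=0.9, review_threshold=0.6):
    """Combine an ML risk score with a rule hit into approve / review / block."""
    if rule_hit and ml_risk_score >= block_threshold:
        return "block"   # rule and model agree strongly
    if rule_hit or ml_risk_score >= review_threshold:
        return "review"  # held for a human investigator
    return "approve"


# A shared device in a cyber cafe: the rule fires, but behavior looks normal.
cafe_hit = mule_rule_triggered(accounts_on_device_24h=5, linked_transactions=2)
print(decide(ml_risk_score=0.2, rule_hit=cafe_hit))   # review, not block
print(decide(ml_risk_score=0.95, rule_hit=True))      # block
print(decide(ml_risk_score=0.2, rule_hit=False))      # approve
```

Requiring agreement between the rule and the model before blocking is one way to trade some recall for precision; sending disagreements to review keeps a human in the loop for the ambiguous cases.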

Getting to a consensus early on ensures your team has a target that all stakeholders have agreed to, and that there are no surprises when it’s time to deploy.
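Once a false-positive budget is agreed, it can be turned into a concrete score threshold. The sketch below assumes a labeled validation set of (risk score, amount, is fraud) records and measures the budget by value, mirroring the "$5 in every $100 flagged" example. The function name and the sample data are hypothetical.

```python
def pick_threshold(scored, max_fp_share=0.05):
    """Return the lowest score threshold whose flagged value stays within the
    false-positive budget, or None if no threshold qualifies.

    `scored` is a list of (risk_score, amount, is_fraud) tuples.
    `max_fp_share` is the share of flagged value that may be legitimate.
    """
    best = None
    # Try every observed score as a threshold, from strictest to loosest.
    for threshold in sorted({score for score, _, _ in scored}, reverse=True):
        flagged = [(amount, is_fraud) for score, amount, is_fraud in scored
                   if score >= threshold]
        flagged_value = sum(amount for amount, _ in flagged)
        fp_value = sum(amount for amount, is_fraud in flagged if not is_fraud)
        if flagged_value and fp_value / flagged_value <= max_fp_share:
            best = threshold  # remember the lowest threshold that fits the budget
    return best


validation = [
    (0.99, 60, True),
    (0.95, 40, True),
    (0.90, 5, False),
    (0.80, 50, False),
    (0.70, 30, True),
]
print(pick_threshold(validation))
```

A lower threshold captures more fraud, so picking the lowest one that still fits the budget maximizes recall within what the business agreed to accept.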

Of all the challenges discussed here and beyond, False Positives is probably the most critical one to take care of beforehand.

Laying a Strong Foundation for Continuous Innovation

Fraud is an evolving space. With the democratization of the dark web and easy access to sophisticated tools, fraudsters continuously upgrade their strategies to bypass controls. As a result, even after deploying your first fraud detection system, your Risk and Data Science/Analytics teams will need to engage in continuous enhancements to keep up with emerging threats.

However, making these enhancements can be a significant struggle due to the time-consuming nature of the process. Here’s why:

Identifying Emerging Fraud Patterns: Detecting new fraud tactics requires weeks or even months of analysis across various data sources. Teams must validate hypotheses using both internal data (such as transactional logs) and external intelligence (such as crime reports or industry trends).
Retraining ML Models: When a new pattern is identified, teams often need to retrain machine learning models and optimize their performance. This process must ensure that false positives remain within acceptable limits, which is crucial to maintaining business operations and customer experience.
Taking Systems from Development to Production: Moving a fraud control system from development to production is a complex, multi-step process. It involves developing streaming applications, deploying updated models, running experiments, and evaluating performance to ensure the enhancement is effective without introducing new risks.

Even minor enhancements can take 4–8 weeks of effort to go from ideation to deployment. In practice, this means your team might only have 3–4 chances per year to improve fraud detection systems. However, by reducing your time to value by 50%, you could potentially double the number of production releases, allowing you to stay ahead of evolving threats.

Invest in Efficient Systems to Boost Time to Value

To achieve faster time to value, it’s critical to build fraud detection systems with scalability and efficiency in mind from the outset. Investing in MLOps practices and AI platforms can significantly enhance your team’s productivity and reduce time spent on repetitive processes.

Here’s how these tools can help:

Real-Time Feature Engineering and Model Deployment: By leveraging MLOps and Feature Platforms, your team can reduce the time required for feature engineering and model deployments by up to 90%. This allows for faster adaptation to new fraud patterns.
No-Code Frameworks to Manage Fraud Controls: Use no-code platforms to deploy and gradually scale fraud controls without extensive engineering efforts. With such tools your teams can create and deploy new workflows in minutes — a task that would otherwise take days or weeks.
AI Agents for Threat Intelligence: Use AI agents to automate manual analysis tasks such as reconciliation and extraction of insights from various sources like crime reports, customer complaints, and external threat feeds. These agents with Data Scientists and Analysts in the loop can help surface emerging fraud threats faster and ensure that your systems are addressing new attack vectors before they cause significant impact.

By adopting these strategies, you can make your fraud detection systems more agile, scalable, and future-proof, ensuring your organization remains one step ahead in the ongoing battle against fraud.

Concluding Thoughts

For a Chief Risk Officer or Head of Fraud and Risk Management, balancing compliance, risk and growth simultaneously is no small feat. It requires not only strategic navigation through complex business hurdles, but also a deep understanding of how to equip data, analytics, ML and platform teams with the right tools and strategies to build advanced fraud detection systems that are fast, adaptive and resilient to emerging threats.

I hope this article gave you insight into the key challenges, and practical approaches to overcome them, on the way to a “Faster Time to Value” in fraud detection.

I’m happy to connect and exchange thoughts if this interests you. Please drop an email to kumar.sanjog@canso.ai. Alternatively, you can connect with me on LinkedIn.


Driving Faster Time to Value in Fraud Detection — A CRO Guide


Kumar Sanjog
