A Multi-Agent System Based Approach to Fight Financial Fraud: An Application to Money Laundering

Claudio Alexandre 1,* and João Balsa 1
1 Faculdade de Ciências da Universidade de Lisboa (BioISI-MAS), Campo Grande, 1749-016, Lisboa, Portugal
* Author to whom correspondence should be addressed.
Version December 19, 2017, submitted to Appl. Sci.

Abstract: The anti-money laundering (AML) process has failed both in identifying suspicious cases in due time and in assisting AML analysts in decision making. Starting from a new generic anti-fraud approach, this article presents the main aspects of the development of a multi-agent system that goes beyond the capture of suspicious transactions, seeking to assist the human expert in the analysis of suspicious behaviour. First, a transactional behavioural profile of clients is obtained through a data mining process. A set of rules obtained through data mining over a real database, in conjunction with specific rules based on legal aspects and on the expertise of AML analysts, makes up the agents' knowledge base. The cases for which the system is unable to suggest a decision are flagged as requiring more detailed analysis. The system analysed six months of real transactions and flagged several suspicious profiles; a subset of these suspects was investigated by the AML analysts, who confirmed the suspicion in several cases, including some that had not been identified by the systems in execution.


Introduction
In the last decades, money laundering has been increasingly recognized as a significant global problem and has been given special attention by almost every government in the world. Evidence that money laundering is a global concern is the prioritization of its combat at the same level as the most relevant global issues [1]. The large amount of money involved in this crime, and the social issues associated with it, justify the prioritization of anti-money laundering (AML) [2]. The stages and a graphical scheme of a typical money laundering process were shown and explained in [3].
Most of the current rules and recommendations target transactions involving cash, which are the most common at the beginning of the money laundering process. This type of bank transaction, normally carried out in person, favours monitoring, unlike the subsequent virtual transactions, whose objective is to hinder the tracking of the money's trajectory.
Most financial institutions already use semi-automated processes, determined by current regulations, for flagging transactions suspected of money laundering, based on clients' bank registration information, averages, standard deviations and pre-established fixed rules, usually originating in empirical observations or in the experience of the AML analysts. However, the growing volume of transactions, coupled with the frequent publication of new national and international regulations, has made this human analysis process inefficient.
With the aim of making the AML process more effective, we developed a multi-agent based approach to support decision making in this process. Using a client behavioural profile methodology, intelligent agents analyse and flag suspicious transactions, and assist the AML analyst in making signalling decisions. The Belief-Desire-Intention (BDI) model, adopted in the design and implementation of the system, readily accommodates the behavioural profiles found, as well as the implementation of specific rules, both norm-based and risk-based. These characteristics, among other benefits, improve the efficiency of the process, coping with the increase in transaction volume while gradually reducing human intervention.
The experiment we conducted, over one year of real transactions, revealed a set of suspicious profiles considered adequate by the human specialists of the bank that funds this project: in addition to cases already known, new suspects were flagged and validated.
In the following sections, the new anti-fraud strategy is presented (section two), followed by the analysis of related work (section three). The process of creating and building the agents' knowledge bases is presented in section four. The AML multi-agent system design and implementation is described in section five. The article concludes with an analysis of results (section six) and with some conclusions and comments on future work (section seven).

Standard Anti-Fraud Strategy and a New Approach
The Know Your Customer (KYC) policy, defined by the Basel Committee 1 in [4], works as a best-practices guide, in the sense that it details the procedures to be followed to prevent fraud. Based on the cited document, it is possible to map a generic flow to combat fraud or swindling in economic activity (Figure 1), for the cases in which this type of crime occurs in directly or indirectly computerized operations. The generic process of fraud prevention is based on regulations and recommendations issued by control offices and uses parameters that set limits on the quantities and values involved in transactions.
The analysis of the transactions is carried out over a short period of time, and most of it is carried out by a human analyst.
Some authors classify money laundering as a predicate crime, i.e., a crime that always stems from some other underlying crime that illicitly provides its author with proceeds that he, or others, will later try to camouflage [5]. However, it is possible to analyse the money laundering crime as an instance of a generic process of fraud or swindle, and to see the AML activity as a special case of the general flow of fraud-fighting in the financial sector.
A new version of the general flow to fight fraud is shown in Figure 2, whose objective is to mitigate the identified risk. It is based on: 1) the creation of profiles of the actors participating in the activity, built from their complete history of performed operations; 2) the replacement of the current fixed parameters by production rules, based on these profiles and on the existing norms and recommendations; and 3) the use of intelligence at some points of the process. In this way, the quality of the capture of suspicious operations and, especially, of the decision making by the specialist is clearly improved.
1 The BCBS (Basel Committee on Banking Supervision), linked to the BIS (Bank for International Settlements), was established in 1974 to enhance financial stability by improving the quality of banking supervision worldwide, and to serve as a forum for regular cooperation between its member countries on banking supervisory matters, strengthening regulation, supervision and best practices in the financial market.
Creating a new instance of the generic flow proposed in Figure 2 for the money laundering crime, it is possible to draw a stream like the one shown in Figure 3, which corresponds to what is commonly used by financial institutions, as presented in [6].

Related Work
The adaptability of the "modus operandi" of fraudsters, and the lack of systematized information linking suspect transactions to evidence of crime, are obstacles to a more rapid advance in automating the process of preventing and combating money laundering activities. However, since the first widely publicized system in the AML area, the FinCEN Artificial Intelligence System (FAIS) [7], developed and used by the Financial Crimes Enforcement Network (FinCEN), many artificial intelligence techniques have been used in the search for a good system.
Data mining, machine learning and clustering techniques have been widely used in the attempt to identify suspected money laundering cases, as in Zhang [8], where a discretization process was applied to a data set to find a more adequate set of clusters. Kingdon [9] proposed that an artificial intelligence approach should model individual clients and look for unusual rather than suspicious behaviour. There are also statistical approaches, like those described in Liu and Zhang [10] and Tang and Yin [11].
In Le-Khac and Kechadi [12], the authors present a case study on the application of a knowledge base solution that combines data mining techniques, clustering, neural networks and genetic algorithms to detect money laundering patterns. Chang and Chang [13] proposed the use of decision trees based on C4.5 to induce rules and use them to validate the identified clusters. Larik and Haider [14] focus their work on the debit and credit operations made by clients of a financial institution to identify suspicious transactions.
In his survey about clustering, Sabau [15] asserts that clustering has proven to be a recurrently applied solution for detecting fraud, and concludes that k-means based clustering algorithms with Euclidean distance as the dissimilarity metric are the most commonly used. Regarding the use of agent-based approaches in AML, few authors have considered them. In Gao [16], an agent architecture is defined to include a set of specialized agents, such as data collecting agents, monitoring agents, a behaviour diagnosis agent, and a reporting agent.
Xuan and Pengzhu [17] present an agent-based approach that, besides reporting and user agents, includes Negotiation and Diagnosing agents that are ultimately responsible for the most critical decisions, taken on the basis of information provided by two other groups of agents: data collecting and supervising. Rajput [18] uses ontologies and rules to create a specialized system that detects transactions suspected of money laundering.
The works mentioned above focus only on the stage of flagging suspicious transactions, use small amounts of data for testing, and do not assist the human expert in decision making. The system we present in this paper seeks to overcome these limitations.

Building the Agents' Knowledge Bases
A database with real data reflecting the transactional behaviour of clients has a significant number of attributes necessary for the control and management of the business involved. Of course, not all of these attributes are relevant to a search for suspicious transactions. The selection of the relevant attributes, and the possible generation of new ones, must reflect the transactional database and allow the identification of suspicious behaviour [19].
Two years of current account transactions from a Brazilian bank were used in the first stage of the process. The accounts of the 5.2 million clients of this bank receive, on average, 85 million transactions per year. The analysis of the database showed that less than 10% of the clients are corporate entities (commercial companies, industries, governments, etc.); however, they are responsible for more than 90% of the values transacted.
Tests carried out showed that the client type attribute alone was not enough to offer a good characterization of the groups formed in this unlabelled database. Thus, to better characterize the clients, the database was divided in two: one with individual clients and another with corporate clients.
The procedure described below was performed independently for each database.

Client Transactional Behaviour Strategy
The rules that constitute the agents' knowledge bases were obtained through the process detailed in Algorithm 1. From each base, transactions whose characteristics are irrelevant in this context (for example tariffs, commissions, interest, taxes, etc.) were excluded, resulting in 35 million relevant transactions (Algorithm 1 - Step 1).
With the purpose of establishing transactional behaviour profiles over a certain period, we created a set of attributes that aggregate quantities and segment characteristics for each actor of the process. The measurement period is directly related to the nature of the business involved and should be as long as possible: quarterly, semi-annual, annual, etc.
One year of transactions was used to generate the client transactional behaviour profile, that is, database information such as: the account age, the number of transactions generated, the number of services used, the amounts sent to other banks and to accounts of the same bank, and the number of movements divided into six ranges of values. A twelfth attribute was created and named debt percentage, representing, in a weighted way over the period analysed, the time that the money remained in the customer's account [20]. The fragment of Algorithm 1 recovered here corresponds to its clustering loop, executed while the number of clusters k varies:
7: SetOfClusters-Cl ← Classification algorithm (algc(CP, k))
8: VectorError1-E1(k) ← Calculates Classification Error (algr1(Cl))
9: VectorError2-E2(k) ← Calculates Classification Error (algr2(Cl))
10: end while
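This clustering-and-rule-induction loop can be sketched as follows. This is a minimal illustration, not the bank's implementation: the turnover values are invented, a tiny one-dimensional k-means stands in for the K-means step (algc), and a single threshold rule stands in for the PART/J48 rule induction whose classification errors (algr1/algr2) the loop measures.

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Minimal 1-D k-means, standing in for algc (K-means) in Algorithm 1."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # keep the old centroid when a cluster ends up empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

def rule_error(points, labels, threshold):
    """Error of a single induced rule 'value > threshold -> cluster 1',
    standing in for the algr1/algr2 (PART/J48) error vectors."""
    wrong = sum(1 for p, lab in zip(points, labels)
                if (p > threshold) != (lab == 1))
    return wrong / len(points)

# Hypothetical monthly-turnover values for six client profiles.
turnover = [1.0, 2.0, 3.0, 100.0, 110.0, 120.0]
centroids, clusters = kmeans_1d(turnover, k=2)
lo, hi = sorted(centroids)
labels = [0 if abs(p - lo) <= abs(p - hi) else 1 for p in turnover]
err = rule_error(turnover, labels, threshold=(lo + hi) / 2)
```

For well-separated data like this, k-means converges to the two group means and the induced threshold rule reproduces the clusters with zero error; on the real profile table, it is precisely the non-zero residual error that motivates the cautious reclassification described next.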

Cautious Approach to Risk
The analysis of the generated clusters, for the two customer segments, allowed the identification of characteristics such as: high turnover of high values, with full transfer to other financial institutions (high risk); or movement of values close to the legal limit for communication to regulatory agencies (moderate risk). This analysis resulted in the classification shown in Table 1.
With this classification, it is possible to define a better strategy, offering differentiated treatment to the groups of clients according to their level of risk. Despite the excellent accuracy obtained in the evaluation of the generated rules, around 99% for both customer segments, an error of one percent represents more than 26 thousand transactions and cannot be ignored.
The confusion matrix generated by the rule-induction algorithms identified the rules that, because they are applicable to two or more groups of clients, account for the one percent error mentioned. That is, some rules classify customers as belonging to more than one profile. There is probably a configuration of the algorithms' parameters that would minimize this error; however, the number of rules would increase significantly, making the cost/benefit unattractive. The decision was to reclassify the profiles that represent no or low risk (profiles one, two and three), as shown in Table 1.
Thus, a transaction belonging to one of these three groups is reclassified, for the analysis process only, if there is a rule that satisfies the condition. In this way, the problem was corrected conservatively, using the same number of rules.
For example, in the database used for data mining, 33 rules classify customers as individual clients belonging to risk profiles two and three, corresponding to 1.85% of the total. However, these rules also classify 0.06% of customers originally belonging to the standard profile. The reclassification consists of, during the search for suspicious transactions, considering these standard clients as belonging to the risk groups, without modifying their original classification.
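The conservative reclassification can be sketched as follows; the profile names, rule format and threshold are hypothetical, but the mechanism mirrors the text: a DM rule that also matches low-risk clients promotes them to the rule's risk group for the duration of the analysis, without touching the stored classification.

```python
# Hypothetical tier names; in the paper, profiles one to three are no/low risk.
LOW_RISK = {"standard", "low_1", "low_2"}

def dm_rules_reclassify(client, dm_rules):
    """Conservative reclassification: a low-risk client matched by any
    DM rule is *treated as* belonging to the rule's risk profile during
    the search for suspicious transactions; the stored classification
    is left untouched."""
    if client["profile"] not in LOW_RISK:
        return client["profile"]
    for rule in dm_rules:
        if rule["matches"](client):
            return rule["risk_profile"]   # analysis-time view only
    return client["profile"]

# Invented client attributes and a single invented DM rule.
client = {"profile": "standard", "monthly_transfers_out": 95000}
dm_rules = [{"risk_profile": "high",
             "matches": lambda c: c["monthly_transfers_out"] > 90000}]
view = dm_rules_reclassify(client, dm_rules)
# client["profile"] remains "standard"; only the analysis view is "high"
```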
The classification of profiles also allows the creation of specific rules, whether based on current regulations or inspired by transactional behaviour. As already mentioned, this work used one year of information to generate the profiles, obtaining monthly totals and selecting the maximum value in the year for each relevant attribute. It was established that the search for suspicious transactions always starts one month before the date requested for analysis. In this way, one month of transactional behaviour is always compared with the profiles.
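A minimal sketch of this comparison, with invented attribute names: one month of aggregated behaviour is checked against the yearly per-attribute maxima recorded in the client's profile, and the attributes that exceed their maximum are reported.

```python
def monthly_window_exceeds(profile_max, month_totals, tolerance=1.0):
    """Compare one month of aggregated behaviour against the yearly
    per-attribute maxima stored in the client's profile; return the
    attributes whose monthly total exceeds the recorded maximum."""
    return [attr for attr, mx in profile_max.items()
            if month_totals.get(attr, 0) > mx * tolerance]

# Invented profile maxima and monthly totals for one client.
profile_max = {"n_transactions": 40, "sent_other_banks": 50000.0}
month = {"n_transactions": 12, "sent_other_banks": 82000.0}
flags = monthly_window_exceeds(profile_max, month)
```

The `tolerance` factor is an assumption added for illustration; it would let a profile-based rule flag only deviations above a configurable margin.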

AML Multi-Agent System
The goal of this system is to support the AML process in a financial institution and the global system architecture is presented in Figure 4.
The system will keep a profile for each customer, based on the transaction history, which will be used, along with the rules created from official anti-money laundering regulations, to capture and flag suspicious transactions processed by the various business systems. The system will decide on some flagged cases and learn from the aid provided by the AML analyst during the decision making on the most complex cases. It will also monitor norms and recommendations published by control bodies, flagging those involving money laundering. New rules to capture suspicious transactions, reflecting new regulations and possible changes in profiles, will be suggested.
The database with profile history reflecting the learning period, together with a set of rules, constitutes the primary knowledge base of the agents. The set of rules is formed by:
1. DM Rules - rules generated in the data mining process, which are used to review the original classification of clusters;
2. Legal Rules - rules based on the legal regulations and general guidelines that drive the fight against the crime of money laundering;
3. Profile Rules - rules that reflect the bank's specific knowledge about AML and about risk management according to the groups of profiles generated.
The profiles to be analysed complete the set of information manipulated by the system in the process of capture and analysis of suspicious transactions, the main focus of this work.
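One possible representation of these three rule sets, sketched with invented rule names and thresholds: each rule carries its kind (DM, Legal or Profile) and a predicate, and evaluating a transaction returns the fired rules grouped by kind.

```python
from collections import namedtuple

# kind is one of "DM", "Legal", "Profile"; test is a predicate on a transaction
Rule = namedtuple("Rule", "kind name test")

def fired_rules(rule_base, txn):
    """Evaluate every rule against a transaction and return the names of
    those that fire, grouped by rule kind."""
    out = {"DM": [], "Legal": [], "Profile": []}
    for r in rule_base:
        if r.test(txn):
            out[r.kind].append(r.name)
    return out

# Invented rules: a legal cash-reporting limit and a profile-based check.
rule_base = [
    Rule("Legal", "cash_over_limit", lambda t: t["cash_in"] > 10000),
    Rule("Profile", "high_risk_transfer",
         lambda t: t["profile"] == "high" and t["transfer_out"] > 0),
]
hits = fired_rules(rule_base, {"cash_in": 15000, "profile": "standard",
                               "transfer_out": 0})
```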
To achieve the proposed objectives, it is necessary to define a set of entities capable of making decisions, using existing knowledge and learning from the decisions taken by the human analysts.
These entities need to have autonomy of action and be able to communicate with one another. Besides, the system should be scalable and flexible. These features point to a set of deliberative agents that are able to review and extend their knowledge, working in an environment that is fully observable, deterministic, static and discrete [21].
To model this system we chose a methodology based on Wooldridge [22] and on the study of the major Agent-Oriented Software Engineering (AOSE) methodologies carried out by Bawa [23], who compared five of them. The Prometheus methodology was selected, tested, and verified to comply with all the criteria we consider relevant to this work: ease of use; concise graphical representation; accessible documentation, clear and preferably with examples; and a support tool that is easy to install, preferably for the Windows environment.
Padgham [24] presents all aspects of this methodology, accompanied by a set of examples. The support tool is called PDT (Prometheus Design Tool) and works as an Eclipse plug-in. The required agent and environmental characteristics are supported by the selected methodology. Although the detailed design produced by PDT can be straightforwardly converted to a specific proprietary language, it is generic and can be used in a range of agent programming platforms.
Considering characteristics such as the quantity and relevance of the available historical data used for decision making, the aforementioned need to review and expand the acquired knowledge during the process, and the possibility of developing new objectives and sub-objectives, we consider Belief-Desire-Intention (BDI) the appropriate model to adopt. Moreover, Prometheus is strongly targeted at BDI. Figure 5 shows the main system elements that make up the adopted BDI model and how they link to our problem. Many other diagrams, with additional details, were generated; however, they are omitted in this paper. The system overview is enough in this context, and the following sections describe the agents.

Agents Description
There is a Suspect Profiles Captor (SPC) agent for each product (current accounts, investment funds, currency exchange, etc.). This approach has two advantages. Firstly, it allows us to model each agent's knowledge according to the specificities of each product. Secondly, it makes scalability easier, in the sense that a new product can be incorporated into the system just by adding a new agent specialized in it. In the analysis of transactions, agents use the current control rules generated by the previously described process, based on the customer profiles, the norms on AML, and the internal rules of the financial institution.
Whenever an SPC agent identifies a suspicious transaction, it informs the Suspect Capture Manager (SCM) agent, which in turn is responsible for forwarding it to the other SPC agents. SPC agents thus have two working modes: transaction oriented, in which the agents try to capture suspicious transactions with no assumptions regarding clients; and client oriented, in which the agents try to capture the suspicious transactions of a specific, already flagged client. An SPC agent's decision making occurs when at least one rule in Legal Rules or Profile Rules is triggered, and the decision is based on the risk level of the profile, characterized mainly by the cluster to which it belongs.
The Suspect Capture Manager agent can receive an external analysis request and command its execution by the specialist SPCs, or trigger it autonomously. When receiving information from an SPC that a suspicious transaction has been identified, it forwards the case to the other SPC agents, which search in client mode. After receiving all the reply messages, the SCM announces the existence of suspicious transactions to the Decision Making Helper (DMH) agent. The default time for automatic capture execution and the knowledge about the existing SPC agents compose the basic information for this agent's decision making.
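The SPC/SCM interaction can be sketched as follows. The classes, rule predicates and transaction fields are invented for illustration; the point is the two working modes: a hit found in transaction mode causes the SCM to ask the other SPC agents to re-examine the same client in client mode.

```python
class SPCAgent:
    """Sketch of a Suspect Profiles Captor (SPC) agent for one product."""
    def __init__(self, product, is_suspicious):
        self.product = product
        self.is_suspicious = is_suspicious  # stands in for the rule base

    def transaction_mode(self, txns):
        # no assumption about clients: flag any transaction a rule fires on
        return [t for t in txns if self.is_suspicious(t)]

    def client_mode(self, client_id, txns):
        # the client is already under suspicion: report all of the
        # client's activity in this product for analysis
        return [t for t in txns if t["client"] == client_id]

class SCMAgent:
    """Sketch of the Suspect Capture Manager: on a hit from one SPC,
    it asks the other SPC agents to re-examine the same client."""
    def __init__(self, spcs):
        self.spcs = spcs

    def run(self, txns_by_product):
        suspects = []
        for spc in self.spcs:
            for hit in spc.transaction_mode(txns_by_product[spc.product]):
                suspects.append((spc.product, hit))
                for other in self.spcs:
                    if other is not spc:
                        found = other.client_mode(hit["client"],
                                                  txns_by_product[other.product])
                        suspects.extend((other.product, t) for t in found)
        return suspects

# Invented products, rules and transactions.
accounts = SPCAgent("accounts", lambda t: t["amount"] > 90000)
funds = SPCAgent("funds", lambda t: t["amount"] > 1_000_000)
scm = SCMAgent([accounts, funds])
suspects = scm.run({
    "accounts": [{"client": 7, "amount": 95000}, {"client": 8, "amount": 10}],
    "funds": [{"client": 7, "amount": 500}, {"client": 8, "amount": 5}],
})
```

Here the funds SPC finds nothing on its own, but still reports client 7's fund activity once the accounts SPC has flagged that client.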
The Decision Making Helper agent performs the analysis of the previously flagged transactions and conducts a learning process. The DMH agent has the autonomy to decide amongst three possibilities regarding a flagged transaction: accept it, discard it, or send it for further (human) analysis. This agent thus assumes the role of the AML Analyst in the analysis of suspicious transactions. The decisions taken by this agent, and those reported by the AML Analyst, are stored in the historical decisions database and are used in the learning process to evolve the decision matrix. This agent is also responsible for the changes suggested to the decision matrix, and for updating this knowledge base after the suggestions are validated by the AML Analyst.
The decision making of this agent uses the defined decision matrix and what was learnt from the history of decisions about complex cases.
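A minimal sketch of this decision step, with an invented decision-matrix key: the DMH first consults the decision matrix, then what was learnt from past complex cases, and escalates to the human analyst when neither applies.

```python
def dmh_decide(signal, decision_matrix, learnt_history):
    """Sketch of the Decision Making Helper's choice: look the signal up
    in the decision matrix; fall back on what was learnt from the history
    of complex cases; otherwise escalate to the human AML analyst."""
    key = (signal["risk"], signal["rule_kind"])   # invented key shape
    if key in decision_matrix:
        return decision_matrix[key]               # "accept" or "discard"
    if key in learnt_history:
        return learnt_history[key]
    return "human_analysis"

# Invented matrix and learnt entries.
matrix = {("high", "Legal"): "accept", ("low", "Profile"): "discard"}
history = {("medium", "Profile"): "accept"}

d1 = dmh_decide({"risk": "high", "rule_kind": "Legal"}, matrix, history)
d2 = dmh_decide({"risk": "medium", "rule_kind": "DM"}, matrix, history)
```

Decisions returned for d2-like cases would be stored, validated by the analyst, and eventually folded back into the matrix, which is the learning loop the text describes.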
The internal knowledge of the system is formed by the profiles base, the norms and recommendations base, the control rules base and the decision matrix.For each one of these knowledge bases there is an agent responsible for its evolution.
The Client Profiles Database Updater (CPDU) agent analyses the transaction history to generate client profiles and compares them with the base of existing profiles. This process can be triggered by a user request or autonomously by the agent. Newly arising profiles are suggested to the AML Analyst, and the CPDU agent updates the profiles database with the validated ones.
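The CPDU cycle can be sketched as follows, with invented profile attributes: profiles are rebuilt from the transaction history, the changed ones are suggested, and only analyst-validated ones are written back to the profiles database.

```python
def cpdu_update(history, stored_profiles, validated_by_analyst):
    """Sketch of the Client Profiles Database Updater: rebuild each
    client's profile from transaction history, suggest to the AML
    Analyst only the profiles that changed, and store validated ones."""
    suggestions = {}
    for client, txns in history.items():
        new_profile = {"n_transactions": len(txns),
                       "total_moved": sum(t["amount"] for t in txns)}
        if stored_profiles.get(client) != new_profile:
            suggestions[client] = new_profile
    for client, profile in suggestions.items():
        if validated_by_analyst(client, profile):   # analyst in the loop
            stored_profiles[client] = profile
    return suggestions

# Invented stored profiles and transaction history.
stored = {7: {"n_transactions": 2, "total_moved": 150.0}}
history = {7: [{"amount": 100.0}, {"amount": 50.0}, {"amount": 900.0}],
           8: [{"amount": 10.0}]}
suggested = cpdu_update(history, stored, lambda c, p: True)
```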
The Norms Database Updater (NDU) agent's purpose is to search for norms and regulations newly published by official agencies that are related to the AML process. This agent acts independently, seeking newly published norms, saving them, selecting those related to AML, and suggesting them to the AML Analyst for study and possible validation. The validated norms are then incorporated into the historical database of norms.
The Control Rules Database Updater (CRDU) agent's main objective is to ensure the evolution of the control rules database, a key element in the system architecture. The first stage of this process is the generation of new rules, confronted with the current rules and based on the new profiles and norms. The CRDU then suggests these new rules to the AML Analyst, and the validated rules are incorporated into the control rules database.
The Knowledge Evolution Manager (KEM) agent executes tasks such as: controlling the suggestions made by the agents for the evolution of the knowledge bases; maintaining the interface with the AML Analyst for the validation, deletion or extension of these suggestions; and maintaining communication with the other agents to command and control the whole process. The interface maintained with the authenticated AML Analyst allows the suggestions to be analysed and validated.
The evolution of the knowledge bases and the learning process in the system are intended, primarily, to mitigate the risk of false positives and/or false negatives, common in systems based only on a set of rules and behaviour norms [16].

Agents Interaction
The framework chosen to implement the system, which will be discussed below, natively uses the internal structure of the decision making mechanism of the Procedural Reasoning System (PRS). Messages exchanged between agents carry performatives expressing "the intention of achieving goals"; that is, with the performative label it is possible to identify the intent of the message sender [25].
Cooperation among SPC agents happens through their direct interaction with the SCM agent, which coordinates the tasks and receives the results. Only the SCM agent knows all the existing SPC agents; besides, it can identify the products that the suspect client uses in the institution. Figure 7 shows, briefly, an example of the interaction process between agents.
Interaction amongst the other agents (DMH, KEM, NDU, CRDU, CPDU) serves to trigger in each of them the goal of performing the task under its responsibility. In other words, all agents have their own specific expertise, with independent and non-conflicting goals. At the same time, there is plenty of cooperation towards the achievement of a common goal.

System Implementation
Departing from [26], and based on [27] and [28], the JaCaMo framework [29] was chosen. The tutorial presented at AAMAS 2015 [30] reinforced this choice. The framework offers flexible interaction with other programming languages, mainly Java; native access to the major database platforms; a good interface and documentation; a free license; and native BDI support.
JaCaMo is based on three independent platforms: a) Jason, for programming the agent level, inspired by the BDI architecture; b) CArtAgO, for programming the environment level, which is composed of one or more workspaces used to define the topology of the environment; c) Moise, for programming the organization level, defining the existing social groups and sub-groups, social tasks, and the behaviour rules that allow the achievement of social goals [29,25].
The rules obtained in the data mining process were written as Jason rules and used in a process similar to Prolog. The knowledge bases and external databases are accessed in MySQL, using artefacts written in Java to improve access performance, given the volume of data. All knowledge and external databases are used by the system as beliefs, representing the states of the environment and the knowledge acquired.

Results
The real data used in this work refer to two years, with 30.5 million and 35.2 million relevant transactions, respectively. The client transactional behavioural profiles were generated from the data of the first year, the reference base. The search for suspicious transactions was executed over six months of relevant transactions in the second year, covering 17.1 million transactions, from which 3.2 million transactional client behavioural profiles were generated for those months.
The systems in use in most banking institutions are strongly based on clients' bank registration information, which is why they can flag as suspicious an incompatibility between the sum of transaction values and the customer's income or billing information. The system proposed in this article does not use this registration information and therefore will not flag such situations as suspicious. It is based on the client's transactional behaviour and should be adopted as a complementary tool. The initial strategy used to select suspect profiles for verification by human analysts, and examples of reports issued by the system, were shown in [31].
The characteristics described above explain why the system presented here did not identify the same suspects identified by the systems currently running. However, it is important to highlight that this system flagged cases that were confirmed by human analysts and that had not previously been identified by the systems in execution.
The process of searching for suspicious transactions, as implemented to date, can be divided into three phases: the reclassification of profiles (adjusting for the confusion matrix); the capture of suspicious transactions; and the analysis of the captured transactions. The system uses as beliefs a set of 132 rules (104 classification rules, plus 28 normative and profile-based rules), as well as the classification of customers into the indicated profiles. These beliefs are re-evaluated only once for each capture process. For example, in the first month evaluated, 418 profiles were reclassified from the "standard" profile to the "medium" and "high risk" profiles, for that analysis only. Table 2 shows a summary of the results obtained, indicating the profiles flagged as suspicious in the six months analysed. Figure 8 shows some examples of the rules used in the process. The average of 0.05% of suspect profiles, with respect to the period and the number of profiles analysed, is a feasible amount for human analysts to examine in a normal AML process. However, considering that the analysis of these results is a task added to the analysts' daily routine, we reduced the number of profiles to be investigated. The strategy adopted was to investigate 38 profiles: 16 repeatedly flagged as suspects in all six months analysed, plus 22 other profiles flagged in five of the months.
Table 3 shows the results of the AML Analyst's investigation. It is important to highlight the following aspects of the 38 analysed cases: a) 26 cases were not considered suspects, but the analysts did not consider them false positives, because there were indications of non-standard procedures; b) 12 cases were confirmed as suspects, six of which require more in-depth investigation by the bank branch involved, while six were clearly identified as suspects; c) all cases classified by the system as "high risk" had not been previously reported; d) of the six fully confirmed suspects, four had never been previously reported.
Considering the purpose of the system to assist the AML analyst, and its learning ability, it is possible to observe that the analysis phase needs refinement, given the 26 cases flagged as suspicious but not confirmed, even though these were not considered false positives.

Conclusions
The systems currently in use by financial institutions are strongly based on clients' bank registration information. They have failed in the step of capturing suspicious transactions and do not assist the human specialists.
With our work we explored new approaches to combat both fraud and money laundering, and described a multi-agent system that is successful in this task, using specialization and cooperation between intelligent agents in order to optimize and improve the quality of the process of flagging suspicious profiles in the anti-money laundering process. To build the agents' knowledge bases we used machine learning techniques to identify risk groups and to create the client transactional behaviour profiles. These profiles were used as a marker for future behaviour.
The results obtained show the feasibility of systematic use and establish a new front to combat this crime. The quality of the results was attested by the verification, performed by the anti-money laundering analysts, of the flagged suspicious transactions. It is worth highlighting that all cases classified by the system as "high risk" suspects had not been previously reported, and that of the six fully confirmed suspects, four had never been previously reported by any other running system.
As a next step, we will review and improve the analysis phase, considering the 26 cases flagged as suspicious but not confirmed, even though these were not considered false positives. The strategy will be to complete the learning based on the decisions of the AML analyst, thus allowing better signalling.

Figure 1. General Flow to Fight Fraud
Figure 2. New General Flow to Fight Fraud

Figure 6 shows a partial system overview (only the main elements) that results from the application of the methodology. The graphical symbols used in the system overview diagram represent actors, like the AML Analyst; data, which can be internal, indicating knowledge bases such as the KB Control Rules, or external, such as the DB Transaction History; percepts, such as Requests Suspicious Transactions; and agents.

Figure 7. Example of the Interaction Process Between Agents

Figure 8. Examples of rules, as codified in the KB.

22 January 2018 doi:10.20944/preprints201801.0193.v1
Two sets of rules, one from each rule-induction algorithm used, with the smallest number of incorrectly classified instances, were selected (Algorithm 1 - Step 4) and, together with the clusters that gave rise to these rules, represent the result of the process (Algorithm 1 - Step 5).
[3] (Algorithm 1 - Step 2). This database table, with the active customer profiles in the analysed year, has 2.4 million rows, each row representing a unique element formed by the 3-tuple: client, agency and account. Over this table, a non-supervised inductive learning procedure was executed using clustering, seeking groups of clients with similar and mutually exclusive characteristics. The K-means algorithm was used for clustering, and the PART and J48 algorithms for the generation of production rules, executed 11 times (the number of attributes minus 1); they are called algc, algr1 and algr2, respectively, in Algorithm 1 - Step 3.


Table 2. Summary of results for 6 months