Graphomaly - software package for anomaly detection in graphs modeling financial transactions |
||||
Project PN-III-P2-2.1-PED-2019-3248, PED 2019 | ||||
Description and objectives | ||||
The Graphomaly project aims to create a Python software package for anomaly detection in graphs that model financial transactions, with the purpose of discovering fraudulent behavior such as money laundering, illegal networks, tax evasion, and scams. Such a toolbox is needed in banks, where fraud detection departments still rely mostly on human experts. We aim to propose and test specific algorithms for financial graph analysis and to apply anomaly detection tools, among which those based on dictionary learning will have a prominent place, to the resulting features and characteristic information. The implemented methods will be able to process large graphs. Online and distributed forms of the algorithms will be derived, so that reaction time is decreased and fraud can be discovered in its incipient stages. This work is funded by the Romanian Ministry of Education and Research, CCCDI - UEFISCDI, project number PN-III-P2-2.1-PED-2019-3248, within PNCDI III. |
||||
Team | ||||
The three partners involved in this project and the team members are
|
||||
Documents | ||||
|
||||
Journal papers | ||||
|
||||
Conference papers | ||||
|
||||
Software | ||||
| ||||
Datasets | ||||
| ||||
Short presentation of the results | ||||
The Graphomaly library has been implemented in Python. The structure of its processing flow is shown in the figure below. The list of bank transactions over a certain period of time is transformed into a graph, the nodes being clients and the edges being cumulated transactions. Features are extracted from the graph using innovative methods that are the theoretical contribution of the project. The most successful features were those derived from egonets (the graph formed by a central node and its neighbors) and reduced egonets (the egonet from which the nodes having a single edge, the one with the central node, are eliminated). The figure below illustrates these concepts for a real case. The red node is the center of the egonet, and the transactions between it and the orange node are suspicious. The reduced egonet is obtained by removing the green nodes. The set of features derived from the egonet and the reduced egonet, such as edge density, transferred amounts, and central node degree, allows good anomaly detection using unsupervised machine learning methods. The experimental validation of the Graphomaly library was carried out on two types of data: i) real data provided by Libra Internet Bank; ii) synthetic data (here is an example), consisting of random graphs into which abnormal structures (cliques, rings, stars) are inserted. The results are very good on both types of data; other methods tend to perform well only in certain cases, especially on artificial data. Execution times are reasonable for rather large graphs, with hundreds of thousands of nodes, so the library can be used in banking applications. More information on results and contributions can be found in this paper. Taking also into account the software structure of the library, which is similar to that of popular machine learning packages and hence easy to use for people working in this field, we can say that we have produced an efficient, flexible and robust software package. | ||||
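The egonet and reduced egonet features described above can be illustrated with a minimal, standard-library-only sketch. It is not the Graphomaly API: the function names (`build_graph`, `egonet`, `reduced_egonet`, `subgraph_features`) and the toy transactions are hypothetical, the graph is treated as undirected for simplicity, and "single edge" is interpreted as a neighbor having no edge to any other member of the egonet, which is an assumption about the definition.

```python
from collections import defaultdict


def build_graph(transactions):
    """Cumulate (sender, receiver, amount) rows into an undirected weighted graph.

    Assumption: direction is ignored and amounts between the same two
    clients are summed into a single edge weight.
    """
    graph = defaultdict(dict)
    for sender, receiver, amount in transactions:
        if sender == receiver:
            continue  # self-transfers are skipped in this sketch
        total = graph[sender].get(receiver, 0.0) + amount
        graph[sender][receiver] = total
        graph[receiver][sender] = total
    return dict(graph)


def egonet(graph, center):
    """Node set of the egonet: the central node and its neighbors."""
    return {center} | set(graph[center])


def reduced_egonet(graph, center):
    """Egonet without neighbors whose only edge inside the egonet is to the center."""
    ego = egonet(graph, center)
    keep = {
        n for n in graph[center]
        if any(m in ego and m != center for m in graph[n])
    }
    return {center} | keep


def subgraph_features(graph, nodes, center):
    """Edge density, total transferred amount and central node degree."""
    edges2 = 0       # each undirected edge is visited from both endpoints
    amount2 = 0.0
    for u in nodes:
        for v, w in graph[u].items():
            if v in nodes:
                edges2 += 1
                amount2 += w
    n = len(nodes)
    max_edges = n * (n - 1) / 2
    return {
        "density": (edges2 / 2) / max_edges if max_edges else 0.0,
        "amount": amount2 / 2,
        "center_degree": sum(1 for v in graph[center] if v in nodes),
    }


# Hypothetical toy data: client A transacts with B, C, D and E.
transactions = [
    ("A", "B", 100.0), ("A", "B", 50.0), ("A", "C", 20.0),
    ("B", "C", 30.0), ("A", "D", 10.0), ("A", "E", 5.0), ("E", "F", 7.0),
]
graph = build_graph(transactions)
ego_features = subgraph_features(graph, egonet(graph, "A"), "A")
reduced_features = subgraph_features(graph, reduced_egonet(graph, "A"), "A")
print(ego_features)
print(reduced_features)
```

In this toy case, nodes D and E are dropped from the reduced egonet because neither is connected to another neighbor of A. Feature vectors of this kind, computed for every node, could then be passed to an unsupervised anomaly detector; which detector to use is left open here.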