
Graphomaly - software package for anomaly detection in graphs modeling financial transactions

Project PN-III-P2-2.1-PED-2019-3248, PED 2019

Description and objectives

The Graphomaly project aims to create a Python software package for anomaly detection in graphs that model financial transactions, with the purpose of discovering fraudulent behavior such as money laundering, illegal networks, tax evasion and scams. Such a toolbox is necessary in banks, where fraud detection departments still rely mostly on human experts.

We aim to propose and test algorithms specific to financial graph analysis, and to apply anomaly detection tools to the resulting features and characteristic information. Among these tools, those based on dictionary learning will have a prominent place.

The implemented methods will be able to process large graphs. Online and distributed forms of the algorithms will be derived, so that reaction time decreases and fraud can be discovered in its incipient stages.

This work is funded by the Romanian Ministry of Education and Research, CCCDI - UEFISCDI, project number PN-III-P2-2.1-PED-2019-3248, within PNCDI III.

Team
The three partners involved in this project and their team members are:
  • University Politehnica of Bucharest: prof. Bogdan Dumitrescu, prof. Florin Stoican, PhD student Denis Ilie-Ablachim
  • University of Bucharest: assoc. prof. Paul Irofti, assoc. prof. Andrei Pătraşcu, PhD student Andra Băltoiu, prof. Marius Popescu
  • Tremend Software Consulting SRL: Ioana Rădulescu, Ştefania Budulan, Alexandra Bodîrlău, Andrei Anton
External partner: Libra Internet Bank, providing transaction data.
Documents
Journal papers
  • B.Dumitrescu, A.Băltoiu, Ş.Budulan, "Anomaly Detection in Graphs of Bank Transactions for Anti Money Laundering Applications", IEEE Access, vol.10, pp.47699-47714, June 2022. Files: revised version, published version
  • A.Pătraşcu, P.Irofti, "Computational complexity of Inexact Proximal Point Algorithm for Convex Optimization under Holderian Growth", 2022. Files: arXiv version
Conference papers
  • B.Dumitrescu, D.Ilie-Ablachim, "Classification with Incoherent Kernel Dictionary Learning", Int. Conf. Control Systems and Computer Science, Bucharest, May 2021. Files: final version
  • F.I.Miertoiu, B.Dumitrescu, "Shape Parameter and Sparse Representation Recovery under Generalized Gaussian Noise", EUSIPCO, Dublin, Ireland, Aug. 2021. Files: final version
  • C.Rusu, P.Irofti, "Efficient and Parallel Separable Dictionary Learning", IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS), Beijing, China, Dec. 2021. Files: final version
  • P.Irofti, C.Rusu, A.Pătraşcu, "Dictionary Learning with Uniform Sparse Representations for Anomaly Detection", ICASSP, 2022. Files: accepted version
  • D.C.Ilie-Ablachim, B.Dumitrescu, "Anomaly Detection with Selective Dictionary Learning", CoDIT, Istanbul, Turkey, May 2022. Files: accepted version
  • P.Irofti, L.Romero-Ben, F.Stoican, V.Puig, "Data-driven Leak Localization in Water Distribution Networks via Dictionary Learning and Graph-based Interpolation", submitted to 2022 American Control Conference (ACC). Files: submitted version
  • D.C.Ilie-Ablachim, B.Dumitrescu, "Reduced Kernel Dictionary Learning", submitted to EUSIPCO, Belgrade, Serbia, Aug. 2022. Files: submitted version
Software
  • Graphomaly - a Python library for anomaly detection in graphs, the main purpose of this project: sources, docs
  • dictlearn - a dictionary learning library in Python: sources, docs
  • Anomaly detection in graphs with reduced egonet and random walk features: sources. Programs for the paper "Anomaly Detection in Graphs of Bank Transactions for Anti Money Laundering Applications", IEEE Access.
  • Synthetic graph generation, with anomalies: sources
Datasets
Short presentation of the results

The Graphomaly library has been implemented in Python. The structure of its processing flow is shown in the figure below.

Figure: Graphomaly block structure

The list of bank transactions over a certain period of time is transformed into a graph, the nodes being clients and the edges being cumulated transactions. Features are extracted from the graph using innovative methods that are the theoretical contribution of the project. The most successful features were those derived from egonets (the subgraph formed by a central node and its neighbors) and reduced egonets (the egonet from which the nodes having a single edge, namely the one with the central node, are eliminated). The figure below illustrates these concepts for a real case. The red node is the center of the egonet and the transactions between it and the orange node are suspect. The reduced egonet is obtained by removing the green nodes. The set of features derived from the egonet and the reduced egonet, such as edge density, transferred amounts and central node degree, allows good anomaly detection using unsupervised machine learning methods.

Figure: egonet and reduced egonet
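
The egonet concepts described above can be illustrated with a minimal sketch in plain Python. It is not the Graphomaly implementation or its API: the function names, the toy transactions and the exact feature definitions are assumptions made for illustration. In particular, the sketch assumes that neighbors are taken regardless of transfer direction, that a neighbor is dropped from the reduced egonet when its only link inside the egonet is with the center, and that density is the fraction of possible directed edges that are present.

```python
from collections import defaultdict


def build_graph(transactions):
    """Cumulate (sender, receiver, amount) records into weighted directed edges."""
    edges = defaultdict(float)
    for sender, receiver, amount in transactions:
        if sender != receiver:  # assumption: self-transfers are ignored
            edges[(sender, receiver)] += amount
    return dict(edges)


def adjacency(edges):
    """Map each node to the nodes it exchanges money with, in either direction."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def egonet(adj, center, reduced=False):
    """Return the node set of the egonet (or reduced egonet) of `center`."""
    ego = {center} | adj[center]
    if not reduced:
        return ego
    # Keep a neighbor only if it is linked to an egonet node other than the center.
    kept = {n for n in adj[center] if (adj[n] & ego) - {center}}
    return {center} | kept


def egonet_features(edges, adj, center, reduced=False):
    """Compute a few illustrative features of the (reduced) egonet of `center`."""
    nodes = egonet(adj, center, reduced)
    inner = [w for (u, v), w in edges.items() if u in nodes and v in nodes]
    n = len(nodes)
    pairs = n * (n - 1)  # number of possible directed edges
    return {
        "nodes": n,
        "edges": len(inner),
        "density": len(inner) / pairs if pairs else 0.0,
        "amount": sum(inner),
        "center_degree": len(adj[center] & nodes),
    }


# Hypothetical toy data: clients A..E and the amounts transferred between them.
transactions = [
    ("A", "B", 100.0), ("A", "B", 50.0), ("B", "C", 30.0),
    ("A", "C", 20.0), ("A", "D", 10.0), ("E", "A", 5.0),
]
edges = build_graph(transactions)
adj = adjacency(edges)
full = egonet_features(edges, adj, "A")
small = egonet_features(edges, adj, "A", reduced=True)
print(full)
print(small)
```

In this toy example, clients D and E transact only with A, so they are removed from the reduced egonet of A. Feature vectors of this kind, computed for every node, could then be passed to any unsupervised anomaly detector.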

The Graphomaly library was validated experimentally on two types of data: i) real data provided by Libra Internet Bank; ii) synthetic data (here is an example), consisting of random graphs into which abnormal structures (cliques, rings, stars) are inserted. The results are very good on both types of data, whereas other methods tend to perform well only in certain cases, especially on artificial data. Execution times are reasonable for rather large graphs, with hundreds of thousands of nodes, so the library can be used in banking applications. More information on results and contributions can be found in this paper. Given also that the software structure of the library is similar to that of popular machine learning packages, and hence easy to use for people working in this field, we can say that we have produced an efficient, flexible and robust software package.
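
The construction of synthetic data described above (a random graph with inserted cliques, rings and stars) can be sketched as follows. This is only an illustration in plain Python, not the project's generator: the function names, graph sizes, amount ranges and the choice of five nodes per anomaly are all hypothetical.

```python
import random
from itertools import permutations


def random_graph(n_nodes, n_edges, rng):
    """Draw distinct directed edges at random, each with a random 'normal' amount."""
    if n_edges > n_nodes * (n_nodes - 1):
        raise ValueError("too many edges for this number of nodes")
    edges = {}
    while len(edges) < n_edges:
        u, v = rng.sample(range(n_nodes), 2)  # two distinct nodes
        edges[(u, v)] = round(rng.uniform(10, 1000), 2)
    return edges


def insert_anomaly(edges, kind, members, rng, amount=(5000, 10000)):
    """Add the edges of an abnormal structure over the given member nodes."""
    if kind == "clique":
        pairs = permutations(members, 2)  # every ordered pair of members
    elif kind == "ring":
        pairs = zip(members, members[1:] + members[:1])  # closed cycle
    elif kind == "star":
        pairs = ((members[0], m) for m in members[1:])  # hub to leaves
    else:
        raise ValueError(f"unknown anomaly kind: {kind}")
    for u, v in pairs:
        # Assumption: anomalous transfers use larger amounts than normal ones.
        edges[(u, v)] = round(rng.uniform(*amount), 2)


def generate(n_nodes=200, n_edges=600, seed=0):
    """Return (edges, labels); labels maps each anomalous node to its structure."""
    rng = random.Random(seed)
    edges = random_graph(n_nodes, n_edges, rng)
    pool = rng.sample(range(n_nodes), 15)  # disjoint member sets
    labels = {}
    for kind, members in (("clique", pool[:5]), ("ring", pool[5:10]), ("star", pool[10:])):
        insert_anomaly(edges, kind, members, rng)
        labels.update({m: kind for m in members})
    return edges, labels


edges, labels = generate()
print(len(edges), "edges,", len(labels), "anomalous nodes")
```

The returned labels give the ground truth needed to score a detector on the synthetic graph; fixing the seed keeps the generated graph reproducible.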