AntiBenford Subgraphs: Unsupervised Anomaly Detection in Financial Networks
Benford's law describes the distribution of the first digit of numbers appearing in a wide variety of numerical data, including tax records and election outcomes, and has been used to raise "red flags" about potential anomalies in the data, such as tax evasion. In this work, we ask the following novel question: given a large transaction or financial graph, how do we find a set of nodes that perform many transactions among each other and whose transactions also deviate significantly from Benford's law? We propose the AntiBenford subgraph framework, which is founded on well-established statistical principles. Furthermore, we design an efficient algorithm that finds AntiBenford subgraphs in near-linear time on real data. We evaluate our framework on both real and synthetic data against a variety of competitors. We show empirically that our proposed framework enables the detection of anomalous subgraphs in cryptocurrency transaction networks that go undetected by state-of-the-art graph-based anomaly detection methods. Our empirical findings also show that the framework provides novel insights into financial transaction data. The code and the datasets are available at <https://github.com/tsourakakis-lab/antibenford-subgraphs>.
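To make the notion of "deviating from Benford's law" concrete, the following is a minimal Python sketch that compares the first-digit counts of a list of transaction amounts against the Benford probabilities P(d) = log10(1 + 1/d). The use of a Pearson chi-square statistic here is an assumption chosen for illustration; the abstract does not specify the deviation measure used by the framework, and the function names `first_digit` and `chi_square_vs_benford` are hypothetical, not taken from the authors' repository.

```python
import math
from collections import Counter

# Benford's law: probability that the leading digit equals d, for d = 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the leading nonzero decimal digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '4.2...e-03'.
    return int(f"{abs(x):.15e}"[0])


def chi_square_vs_benford(amounts):
    """Pearson chi-square statistic of first digits against Benford's law.

    Illustrative assumption: larger values mean stronger deviation.
    Zero amounts are skipped because they have no leading digit.
    """
    digits = [first_digit(a) for a in amounts if a != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero amount")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )


# Powers of 2 are a classic sequence whose first digits follow Benford's law.
benford_like = [2 ** k for k in range(1, 200)]
# Hypothetical suspicious amounts: every value starts with the digit 9.
suspicious = [9000 + k for k in range(199)]

print(chi_square_vs_benford(benford_like))
print(chi_square_vs_benford(suspicious))
```

In a graph setting, one could imagine applying such a statistic to the edge weights inside a candidate set of nodes; how the framework actually combines deviation with subgraph density is described in the full paper rather than in this abstract.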