Unsupervised Anomaly Detection in Journal-Level Citation Networks
The Journal Impact Factor is a popular metric for assessing the quality of a journal in academia. The number of citations received by a journal is a crucial factor in determining the impact factor, which may be misused in multiple ways. It is therefore crucial to detect citation anomalies as a step toward identifying manipulation and inflation of the impact factor. A citation network models the citation relationships between journals as a directed graph. Detecting anomalies in the citation network is a challenging task with several applications, including spotting citation cartels and citation stacking and understanding the intentions behind citations. In this paper, we present a novel approach to detecting anomalies in a journal-level scientific citation network and compare the results with existing graph anomaly detection algorithms. Due to the lack of proper ground truth, we introduce a journal-level citation anomaly dataset that consists of synthetically injected citation anomalies and use it to evaluate our methodology. Our method is able to predict the anomalous citation pairs with a precision of 100% and an F1-score of 86%. We further categorize the detected anomalies into various types and reason out possible causes. We also analyze our model on the Microsoft Academic Search dataset, a real-world citation dataset, and interpret our results using a case study in which our results resemble the citations and SCImago Journal Rank (SJR) rating-change charts, indicating the usefulness of our method. We further design the 'Journal Citation Analysis Tool', an interactive web portal which, given a citation network as input, shows journal-level anomalous citation patterns and helps users analyze the citation patterns of a given journal over the years.