Prototype Selection Based on Clustering and Conformance Metrics for Model Discovery
Process discovery aims to automatically create process models from event data captured during the execution of business processes. Process discovery algorithms tend to use all of the event data to discover a process model. This approach sometimes yields imprecise and/or complex process models that may conceal important information about the processes. To address this problem, several techniques, from data filtering to model repair, have been proposed in the literature. In this paper, we introduce a new incremental prototype selection algorithm based on clustering of process instances. The method iteratively discovers a single process model, each time from a different set of selected prototypes, i.e., process instances that are representative of the whole event data, and stops when the conformance metrics decrease. The proposed method has been implemented in both the ProM and RapidProM platforms. We applied the proposed method to several real event data sets with state-of-the-art process discovery algorithms. The results show that the proposed method improves the general quality of the discovered process models.
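The incremental loop described in the abstract can be illustrated with a small Python sketch. It is not the ProM or RapidProM implementation, and every name in it is hypothetical. It rests on several simplifying assumptions: "discovery" is reduced to building a directly-follows relation, the clustering step is reduced to picking a single frequency-weighted medoid of the traces that do not yet fit, trace distance is taken from `difflib.SequenceMatcher`, and the conformance metric is an F-measure over a toy replay fitness and a simplified prefix-based precision.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher

START, END = "<start>", "<end>"

def discover(prototypes):
    """Toy stand-in for a discovery algorithm: a directly-follows relation."""
    model = defaultdict(set)
    for trace in prototypes:
        path = (START, *trace, END)
        for a, b in zip(path, path[1:]):
            model[a].add(b)
    return dict(model)

def replayable(model, path):
    """True if every consecutive step of the path is allowed by the model."""
    return all(b in model.get(a, ()) for a, b in zip(path, path[1:]))

def fitness(model, log):
    """Share of traces in the log that the model can replay completely."""
    return sum(replayable(model, (START, *t, END)) for t in log) / len(log)

def precision(model, log):
    """Simplified prefix-based precision: for each replayable prefix seen in
    the log, the share of model-allowed next steps observed after it."""
    observed = defaultdict(set)
    for trace in log:
        path = (START, *trace, END)
        for i in range(1, len(path)):
            observed[path[:i]].add(path[i])
    ratios = [len(seen & model[p[-1]]) / len(model[p[-1]])
              for p, seen in observed.items()
              if replayable(model, p) and model.get(p[-1])]
    return sum(ratios) / len(ratios) if ratios else 0.0

def f_measure(model, log):
    f, p = fitness(model, log), precision(model, log)
    return 2 * f * p / (f + p) if f + p else 0.0

def medoid(variants):
    """Assumed stand-in for clustering: the variant with the smallest
    frequency-weighted distance to all other variants."""
    def dist(s, t):
        return 1.0 - SequenceMatcher(None, s, t).ratio()
    return min(variants,
               key=lambda c: sum(n * dist(c, v) for v, n in variants.items()))

def select_prototypes(log, max_iter=10):
    """Add one prototype per iteration; stop when the score decreases."""
    prototypes, best_model, best_score = [], {}, 0.0
    for _ in range(max_iter):
        unfit = Counter(t for t in log
                        if not replayable(best_model, (START, *t, END)))
        if not unfit:
            break
        candidate = medoid(unfit)
        model = discover(prototypes + [candidate])
        score = f_measure(model, log)
        if score < best_score:  # conformance decreased: keep previous model
            break
        prototypes.append(candidate)
        best_model, best_score = model, score
    return prototypes, best_model, best_score

# Toy log: one frequent variant and one rare, noisy variant.
log = [("a", "b", "c")] * 9 + [("c", "b", "a")]
prototypes, model, score = select_prototypes(log)
print(prototypes, round(score, 3))
```

On this toy log the loop keeps only the frequent variant: adding the rare trace would raise fitness to 1 but lower the precision proxy enough that the F-measure drops, so the previous model is retained. A realistic setup would replace `discover` and the two metrics with an actual discovery algorithm and established conformance measures.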