A Guided FP-growth algorithm for multitude-targeted mining of big data

03/18/2018
by Lior Shabtay, et al.

In this paper we present GFP-growth (Guided FP-growth), a novel algorithm for multitude-targeted mining: finding the counts of a large, given list of itemsets in large data. GFP-growth focuses on the specific itemsets of interest, optimizing time and memory costs. We prove that GFP-growth yields the exact frequency counts for the required itemsets. We show that, for a number of different problems, a solution can be devised that takes advantage of an efficient implementation of multitude-targeted mining to boost performance. In particular, we study in detail the problem of generating minority-class rules from imbalanced data, a scenario that appears in many real-life domains such as medical applications, failure prediction, network and cyber security, and maintenance. We develop the Minority-Report Algorithm, which uses GFP-growth to boost performance. We prove some theoretical properties of the Minority-Report Algorithm and demonstrate its performance gain using simulations and real data.
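
To make the problem statement concrete, the following is a minimal sketch of what multitude-targeted mining computes: the exact count of each itemset in a given target list. It is a naive baseline written for illustration, not the GFP-growth algorithm itself, and the function name, the toy transactions, and the target list are all hypothetical. It checks every target against every transaction, which is the kind of cost a guided approach is meant to reduce.

```python
def count_target_itemsets(transactions, targets):
    """Naive baseline (NOT GFP-growth): exact count of each target itemset.

    A transaction supports a target itemset when it contains every item
    of that itemset. Cost grows with len(transactions) * len(targets).
    """
    target_sets = [frozenset(t) for t in targets]
    counts = {t: 0 for t in target_sets}
    for transaction in transactions:
        items = set(transaction)
        for target in target_sets:
            if target <= items:  # subset test: all target items are present
                counts[target] += 1
    return counts


# Hypothetical toy data, for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "d"],
    ["a", "b", "c", "d"],
]
targets = [("a", "c"), ("b", "d"), ("a", "d")]

counts = count_target_itemsets(transactions, targets)
for itemset, count in counts.items():
    print(sorted(itemset), count)
```

Any targeted-mining method that claims exact frequency counts should return the same numbers as this baseline on the same input, so a sketch like this can serve as a correctness check on small data.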
