HARPP: HARnessing the Power of Power sets for Mining Frequent Itemsets

Authors

  • Muhammad Yasir, University of Engineering and Technology Lahore, Faisalabad Campus
  • Muhammad Asif Habib National Textile University; Faisalabad
  • Shahzad Sarwar
  • Chaudhry Muhammad Nadeem Faisal
  • Mudassar Ahmad
  • Sohail Jabbar

DOI:

https://doi.org/10.5755/j01.itc.48.3.21137

Keywords:

Association Rules, Frequent Itemset Mining, Apriori, FP-Growth, Recommendation Systems

Abstract

Modern algorithms for mining frequent itemsets suffer a noticeable deterioration in performance as minimum support decreases, especially on sparse datasets. Long-tailed itemsets, the frequent itemsets found at lower minimum support, are significant for present-day applications such as recommender systems. In this study, we develop a novel power set based method named HARnessing the Power of Power sets (HARPP) for mining frequent itemsets. HARPP iteratively generates power sets to make combinations of overlapping, varying-sized subsets of I, where I is the set of items in a large database. The intrinsic nature of power set creation, together with the use of the set data structure, ensures the agility of HARPP because most of its operations take constant running time. HARPP scans the dataset only once, without storing it entirely in memory, and mines frequent itemsets on the fly. In contrast to the state of the art, the efficiency of HARPP increases as minimum support decreases, which makes it a viable technique for mining long-tailed itemsets. A performance study shows that HARPP is efficient and scalable, and is up to two orders of magnitude faster than the FP-Growth algorithm at lower minimum support, particularly when datasets are sparse.
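
The general idea of counting itemsets through power sets in a single pass can be illustrated with a minimal Python sketch. This is not the HARPP algorithm itself, whose details are given in the full paper; it is a naive illustration that enumerates the non-empty subsets of each transaction and counts them in a hash-based structure. The function name powerset_frequent_itemsets and the max_size parameter are hypothetical, and min_support is assumed to be a fraction of the number of transactions.

```python
from collections import Counter
from itertools import combinations


def powerset_frequent_itemsets(transactions, min_support, max_size=None):
    """Naive single-pass illustration (NOT the authors' HARPP algorithm).

    transactions: any iterable of item collections; it is consumed once,
                  so it may be a generator that streams from disk.
    min_support:  assumed to be a fraction in (0, 1] of all transactions.
    max_size:     hypothetical cap on itemset size, because the power set
                  of a transaction grows exponentially with its length.
    """
    counts = Counter()
    n_transactions = 0
    for transaction in transactions:
        n_transactions += 1
        items = sorted(set(transaction))  # drop duplicate items
        limit = len(items) if max_size is None else min(max_size, len(items))
        # Enumerate the non-empty subsets of this transaction by size.
        for k in range(1, limit + 1):
            for subset in combinations(items, k):
                # frozenset keys give average constant-time counter updates.
                counts[frozenset(subset)] += 1
    threshold = min_support * n_transactions
    return {itemset: c for itemset, c in counts.items() if c >= threshold}
```

For example, with the four transactions {a, b, c}, {a, b}, {a, c} and {b} and a minimum support of 0.5, the sketch returns the itemsets {a}, {b}, {c}, {a, b} and {a, c}. The dataset is read only once and never held in memory as a whole, but the subset enumeration is exponential in transaction length, so the sketch is practical only for short transactions or a small max_size.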

Author Biographies

Muhammad Yasir, University of Engineering and Technology Lahore, Faisalabad Campus

Working as Assistant Professor (Computer Science) at the University of Engineering and Technology Lahore, Faisalabad Campus, Pakistan

Muhammad Asif Habib, National Textile University; Faisalabad

Working as Assistant Professor in the Computer Science Department, National Textile University, Faisalabad, Pakistan

Published

2019-09-24

Issue

Section

Articles