Frequent itemset mining optimization and computational complexity analysis based on an improved Apriori algorithm

Abstract

The Apriori algorithm is a classic frequent itemset mining algorithm, but it suffers from a time-consuming self-join process and a high overhead of data exchange between memories. To improve the frequent itemset mining performance of Apriori, this paper first improves an existing adaptive genetic algorithm by using the average population fitness and discretized fitness values, and then uses the optimized genetic algorithm to improve the Apriori algorithm for mining strong association rules. Compared with the traditional Apriori algorithm, the proposed algorithm has a lower time overhead and improves recall, accuracy, and F1 value by 2.4%, 2.4%, and 2.7% on average, respectively. On the Accidents and Retail datasets, the improved Apriori algorithm is on average 6.12% and 13.52% faster than the NSFI algorithm, respectively, which reduces the computational cost. Applying the improved algorithm to the characteristics of cross-provincial migrants shows that the migrant population is relatively young, has a lower education level, mostly holds agricultural household registration, and is mostly of Han nationality, which demonstrates the practical application value of the improved algorithm.

Keywords: genetic algorithm; Apriori algorithm improvement; frequent itemset mining; computational complexity analysis
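
For orientation, the traditional Apriori baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the classic procedure only, not the improved genetic-algorithm-based method proposed in this paper. The function name `apriori`, the parameter `min_support` (taken here as an absolute transaction count), and the toy transactions are assumptions introduced for illustration. The sketch shows the two costs the abstract refers to: the self-join that builds candidate k-itemsets from frequent (k-1)-itemsets, and the full pass over the transactions needed at every level to count support.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption made for this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items with one pass over the data.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Self-join: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune: every (k-1)-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Support counting: one full pass over the transactions per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


# Toy data (hypothetical, not taken from the Accidents or Retail datasets).
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_itemsets = apriori(toy, min_support=3)
```

In this sketch the pairwise join grows quadratically with the number of frequent (k-1)-itemsets, and the data are rescanned once per level, which is where the time overhead of the traditional algorithm comes from.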