Research on Optimization Methods of Knowledge Discovery Algorithms for Library Data Mining

doi:10.61091/jcmcc127b-140

Journal of Combinatorial Mathematics and Combinatorial Computing

In Press
Volume 127b
Pages: 2471-2488

Research article

^¹, ^²

¹Library (Archives), Zhejiang College of Security Technology, Wenzhou, Zhejiang, 325000, China

²School of Physical Education and Health, Wenzhou University, Wenzhou, Zhejiang, 325000, China

Received: 25/02/2024
Revised: 10/06/2024
Accepted: 05/11/2024
Published Online: 16/04/2025

Abstract

This study addresses library data mining and proposes an optimization method that targets the shortcomings of existing knowledge discovery algorithms in efficiency, accuracy, and interpretability. The method first applies principal component analysis (PCA) to reduce the dimensionality of high-dimensional library data, extracting the main features and improving mining efficiency. A fuzzy clustering algorithm then clusters the dimensionality-reduced data to identify user groups, resource categories, and other implicit knowledge more accurately. The clustering results are interpreted and analyzed to provide data support for knowledge discovery in library data mining. In the dimensionality reduction stage, the proposed algorithm performs better in terms of memory usage and time consumption, and it identifies three principal components whose cumulative contribution exceeds 80%. For book data clustering, the algorithm achieves an average purity of 95.45% and a clustering time of 3.47 s at a data stream size of 300 (unit: k), both better than the comparison algorithms. For one university's book resources, comprehensiveness receives the highest weight, 0.17, followed by practicality and standardization at 0.155 and 0.152, respectively. The clustering shows that science and technology is the book category with the highest borrowing rate and literature the lowest, which reflects users' demand for knowledge in specific fields.

Keywords: data mining, knowledge discovery algorithm, principal component analysis, fuzzy clustering, library
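
The two-stage pipeline described in the abstract (PCA down to the components covering at least 80% of the variance, then fuzzy clustering scored by purity) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the synthetic data, the function names `pca_reduce`, `fuzzy_c_means` and `purity`, the fuzzifier `m = 2`, and the farthest-point initialisation are all assumptions made for the example, and standard fuzzy c-means stands in for whatever fuzzy clustering variant the paper uses.

```python
import numpy as np

def pca_reduce(X, threshold=0.80):
    """Keep the fewest principal components whose cumulative
    explained-variance ratio reaches `threshold`."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred data: rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    k = min(int(np.searchsorted(np.cumsum(ratio), threshold)) + 1, len(s))
    return Xc @ Vt[:k].T, ratio[:k]

def fuzzy_c_means(X, c, m=2.0, n_iter=200, tol=1e-6):
    """Standard fuzzy c-means; returns (centers, membership matrix U)."""
    # Farthest-point initialisation (an assumption of this sketch).
    centers = [X[0]]
    for _ in range(c - 1):
        d = np.min([np.linalg.norm(X - ctr, axis=1) for ctr in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        inv = np.fmax(d, 1e-12) ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the points weighted by u_ij^m.
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, U

def purity(labels_true, labels_pred):
    """Fraction of points that belong to the majority true class of their cluster."""
    hits = 0
    for cluster in np.unique(labels_pred):
        members = labels_true[labels_pred == cluster]
        hits += np.bincount(members).max()
    return hits / len(labels_true)

# Synthetic stand-in for library records: three groups living in a
# 3-dimensional latent space, embedded in 10 observed features.
rng = np.random.default_rng(42)
group_means = np.array([[0.0, 0.0, 0.0], [12.0, 0.0, 0.0], [0.0, 12.0, 0.0]])
y = np.repeat(np.arange(3), 100)
latent = group_means[y] + rng.normal(size=(300, 3))
Q, _ = np.linalg.qr(rng.normal(size=(10, 3)))  # orthonormal embedding
X = latent @ Q.T + 0.1 * rng.normal(size=(300, 10))

Z, ratio = pca_reduce(X, threshold=0.80)
centers, U = fuzzy_c_means(Z, c=3)
score = purity(y, U.argmax(axis=1))
print("components kept:", Z.shape[1])
print("cumulative contribution:", ratio.sum())
print("purity:", score)
```

Each row of `U` holds one record's degree of membership in every cluster, which is what allows a borrower or a book to sit partly in several groups; taking the `argmax` gives the hard labels needed for a purity score. The figures printed here describe only the synthetic data and are unrelated to the results reported in the abstract.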
