This study addresses library data mining and proposes an optimization method that targets the shortcomings of existing knowledge discovery algorithms in efficiency, accuracy, and interpretability. The method first applies principal component analysis to reduce the dimensionality of high-dimensional library data, extracting the main features and improving mining efficiency. A fuzzy clustering algorithm is then applied to the dimensionality-reduced data to identify user groups, resource categories, and other implicit knowledge more accurately. The clustering results are interpreted and analyzed to provide data support for knowledge discovery in library data mining. In the dimensionality reduction stage, the proposed algorithm outperforms the comparison algorithms in both memory usage and time consumption, and it identifies three principal components whose cumulative contribution exceeds 80%. In the clustering stage, the algorithm achieves an average purity of 95.45% on book data and a clustering time of 3.47 s on a data stream of 300k units, both better than the comparison algorithms. For the book resources of one university, comprehensiveness receives the highest weight (0.17), followed by practicality (0.155) and standardization (0.152). The clustering results also show that science and technology is the book category with the highest borrowing rate and literature the one with the lowest, which reflects users' demand for knowledge in specific fields.
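The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes standard fuzzy c-means as the fuzzy clustering algorithm (the abstract does not name the variant), uses a fuzzifier of m = 2, and runs on synthetic data rather than library records. The function names `pca_reduce`, `fuzzy_c_means`, and `purity` are hypothetical, and the 80% threshold mirrors the cumulative contribution reported above.

```python
import numpy as np

def pca_reduce(X, var_threshold=0.80):
    """Keep the fewest principal components whose cumulative
    explained-variance ratio reaches var_threshold."""
    Xc = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional
    # to the variance captured by each component.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = S**2 / np.sum(S**2)
    k = min(int(np.searchsorted(np.cumsum(ratio), var_threshold)) + 1, len(S))
    return Xc @ Vt[:k].T, ratio[:k]

def fuzzy_c_means(Z, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means (assumed variant); returns centers and memberships."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U**m
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def purity(labels_pred, labels_true):
    """Fraction of samples that belong to the majority true class of their cluster."""
    total = 0
    for c in np.unique(labels_pred):
        total += np.bincount(labels_true[labels_pred == c]).max()
    return total / len(labels_true)

# Synthetic stand-in for high-dimensional library data (an assumption):
# 300 records, 10 features, 3 latent groups.
rng = np.random.default_rng(42)
true_centers = rng.normal(scale=5.0, size=(3, 10))
labels_true = np.repeat(np.arange(3), 100)
X = true_centers[labels_true] + rng.normal(scale=0.5, size=(300, 10))

Z, ratio = pca_reduce(X, var_threshold=0.80)
centers, U = fuzzy_c_means(Z, n_clusters=3)
labels = U.argmax(axis=1)                        # harden the fuzzy memberships

print("components kept:", Z.shape[1])
print("cumulative contribution:", ratio.sum())
print("purity:", purity(labels, labels_true))
```

The membership matrix `U` is what makes the result interpretable in the sense used above: each row gives the degree to which a record belongs to every cluster, so borderline users or titles can be inspected rather than forced into a single group.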