In recent years, with the deepening of reform and opening up and the growing socialization of college management, students' thinking has become increasingly diversified, leading to the frequent occurrence of behavioral problems among college students. This paper uses a parallel H-mine algorithm on a Spark cluster to mine frequent itemsets from student behavioral feature data. A K-Means clustering algorithm optimized by information entropy and density is then used to cluster and classify students according to the central values of the obtained behavioral features. On this basis, a class model of student behavioral features is constructed, student behavior is predicted with the K-nearest neighbor algorithm, and a Spark-cluster-based early warning model for student behavior prediction is built. The clustering results show that the average number of breakfast meals for the first, second, and third classes of students is 120.07, 107.66, and 118.25, respectively; the first class eats breakfast most often, which suggests that this class of students has better eating habits. The K-nearest-neighbor-based model is used to predict the number of students studying on March 24, 2023, and the predicted values are compared with the real values. Predictions with a relative error of less than 0.2 account for 86.42% of the total, indicating that the model predicts the number of students well overall.
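The K-nearest neighbor prediction and the relative-error criterion mentioned above can be illustrated with a minimal sketch. It is not the paper's Spark implementation: it assumes a plain averaging form of K-nearest neighbor regression with Euclidean distance, and the function names `knn_predict` and `share_within_relative_error`, the single "hour of day" feature, and all sample numbers are hypothetical and chosen only for illustration.

```python
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest training points.

    Assumption: unweighted averaging with Euclidean distance; the paper may use
    a different distance measure or weighting scheme.
    """
    # Rank training samples by their distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def share_within_relative_error(actual, predicted, threshold=0.2):
    """Fraction of predictions with |predicted - actual| / |actual| below `threshold`.

    Samples whose actual value is 0 have no defined relative error and are
    counted as misses here (an assumption of this sketch).
    """
    hits = sum(
        1
        for a, p in zip(actual, predicted)
        if a != 0 and abs(p - a) / abs(a) < threshold
    )
    return hits / len(actual)


# Hypothetical data: (hour of day,) -> number of students studying.
train_x = [(8,), (9,), (10,), (14,), (15,), (20,)]
train_y = [40, 55, 60, 80, 75, 30]

print(knn_predict(train_x, train_y, (9,), k=3))

# Hypothetical real and predicted counts for four time slots.
real = [100, 50, 200, 80]
pred = [110, 70, 190, 80]
print(share_within_relative_error(real, pred))  # 0.75
```

Under these assumptions, a reported figure such as 86.42% corresponds to the value returned by `share_within_relative_error` with `threshold=0.2`, expressed as a percentage.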