Analysis and evaluation of college entrance examination questions based on clustering algorithm

Yuqing Mo¹
¹Hunan College of Information, Changsha, Hunan, 410200, China

Abstract

This paper analyzes and evaluates high school examination questions using machine learning. The study first introduces Bloom's taxonomy and constructs a categorized dataset of high school exam questions in three steps: data collection, data annotation, and data analysis. An automatic assessment model (WoBERT-CNN) based on WoBERT and Text-CNN is then designed. Semantic similarity between word-vector mappings is used to determine the label of each test question, an improved WoBERT encoder represents the text as word vectors, and Text-CNN serves as the text classifier, extracting semantic features from the text and then integrating and screening them, so that test questions are automatically classified according to Bloom's taxonomy. Finally, building on a deep representation framework, the text of the test questions is mined to establish the relationship between question text and actual difficulty, which enables difficulty prediction. The classification accuracy of the WoBERT-CNN model exceeds 92%. The error of the difficulty prediction model (H-MIDP) in predicting the score rate of the test questions ranges from 1.3% to 3.2%, a small deviation from the actual values. In conclusion, the automatic assessment model and the difficulty prediction model designed in this paper can be applied to the analysis and evaluation of high school test questions, supporting test paper design and talent cultivation strategies in high schools.

Keywords: Machine learning; Bloom taxonomy; WoBERT-CNN; deep characterization; high school examination questions