In artificial intelligence education, emotion in teaching serves as a main basis for teaching evaluation and strongly affects teachers' methods, the classroom atmosphere, and teaching outcomes. This paper proposes a combined network structure, CRNN, that draws on the strength of CNNs in extracting speech emotion features and of RNNs in sequence modeling. Emotion recognition of classroom discourse is realized with a DenseNet, whose connections link each layer with the other layers, and an LSTM network, which completes the speech emotion classification task. On this basis, sentiment analysis is performed on an open classroom video of a sixth-grade elementary school class, and the speech emotion recognition model is applied in teaching practice to study its effect on the classroom atmosphere in elementary school. The overall sentiment value of the classroom interaction video ranges from 0 to 1.9 and first increases and then decreases, which indicates that applying the proposed speech emotion recognition model to classroom sentiment analysis is feasible. In the teaching experiment, the experimental group shows more pronounced positive emotion than the control group, and 95.46% of the students agree that applying the model improves classroom interaction and the overall atmosphere. The speech emotion recognition model studied here can enliven the classroom atmosphere and has practical significance for classroom guidance and application.
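As a rough illustration of the pipeline summarized above, the layer-to-layer connectivity of a DenseNet and the LSTM classification step can be written as follows. This is a sketch under stated assumptions, not the exact formulation used in this paper: the symbols $H_\ell$, $f_t$, $W$, and $b$ are introduced here for illustration only, and classifying from the final hidden state $h_T$ is an assumed design choice.

```latex
% Dense connectivity: layer \ell receives the concatenation of all earlier feature maps.
% H_\ell is a composite function (e.g. normalization, activation, convolution).
\begin{align}
  x_\ell &= H_\ell\bigl([x_0, x_1, \dots, x_{\ell-1}]\bigr), \\
  % Assumed: f_t is the frame-level feature vector that the dense block produces at time step t.
  (h_t, c_t) &= \mathrm{LSTM}\bigl(f_t, h_{t-1}, c_{t-1}\bigr), \qquad t = 1, \dots, T, \\
  % Assumed: the emotion class distribution is predicted from the last hidden state.
  \hat{y} &= \mathrm{softmax}\bigl(W h_T + b\bigr).
\end{align}
```

Here $[\,\cdot\,]$ denotes channel-wise concatenation, so each convolutional layer can reuse the features of the layers before it, while the LSTM summarizes the resulting feature sequence over the $T$ time steps of an utterance.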