Digital teaching strategies can significantly stimulate students’ interest in learning and provide personalized learning pathways. This paper proposes a multimodal action recognition method that incorporates the word vector method, and designs a teaching decision optimization strategy based on it. First, we compare the information carried by different modalities, construct a multimodal action recognition network that processes image information and optical flow information, and use the word vector method to guide the semantic learning of students’ actions. We then describe the design and implementation of a teaching decision aid system. The system uses the proposed action recognition method, trained on collected data of students’ classroom behavior, and consists of four modules: model training, classroom data collection, behavior recognition, and data presentation. After classroom data are collected, the system recognizes student actions and presents the resulting behavior information to teachers as feedback to assist them in making teaching decisions. Both the algorithm and the system are verified experimentally. Comparison with other algorithms verifies that the proposed multimodal action recognition method, which incorporates the word vector method, achieves high accuracy. In the comparison of the overall quality of instructional design decisions, the teaching decision aid system obtains an average score of 17.35, which is higher than the average score of excellent human teachers, indicating that the system optimizes instructional decisions and reaches the level of excellent decision-making.