Computational study of multimodal action and identity recognition systems for bandwidth-constrained scenarios

Ziyang Guo1
1Department of Information Science and Technology, Shanghai Ocean University, Shanghai, 201316, China

Abstract

With the rapid development of video surveillance and multimedia applications, video data places increasingly high bandwidth demands on transmission, storage, and retrieval. This paper presents a novel approach to video processing based on skeletal information and identity recognition. The method extracts skeletal features from video frames and integrates them with identity recognition, so that the video data is segmented into skeletal data, identity information, and other relevant data. This multimodal approach accommodates a broad range of data transmission volumes, optimizes bandwidth use, significantly improves storage efficiency, and increases retrieval speed. Experimental results verify that the proposed method transmits information effectively even in complex scenarios and significantly improves the accuracy and speed of storage and retrieval tasks. These improvements make the method an effective, robust, and adaptable solution for real-time monitoring, behavior analysis, and identity recognition applications.

Keywords: Multi-Modal Fusion, Action Recognition, Identity Recognition, Skeletal Data, Frame Interval Extraction, Data Compression, Dual-Stream Architecture, Text Processing