With the rapid development of video surveillance and multimedia applications, video data places ever higher demands on bandwidth for transmission, storage, and retrieval. This paper presents a novel approach to video processing based on skeletal information and identity recognition. The method extracts skeletal features from video frames and integrates them with identity recognition, so that the video data is segmented into skeletal data, identity information, and other relevant data. This multimodal approach reduces the volume of data transmitted, optimizes bandwidth use, and significantly improves storage efficiency and retrieval speed. Experimental results verify that the proposed method transmits information effectively even in complex scenarios and significantly improves the accuracy and speed of storage and retrieval tasks. These improvements make the method an effective solution for real-time monitoring, behavior analysis, and identity recognition applications, with strong robustness and adaptability.