Optimization Design of Massive Data Storage System Based on Distributed Computing Model

Xiang Li 1
1Image and Text Information Center, Jiangsu Province Nantong Industry & Trade Technician College, Nantong, Jiangsu, 226010, China

Abstract

With the arrival of the big data era, the demand for massive data storage keeps growing, and distributed storage systems have become a key technology for meeting it. The traditional HDFS system, which replicates data for fault tolerance, incurs a large storage overhead. To improve the storage efficiency of massive data, this paper introduces erasure coding based on Reed-Solomon (RS) codes, which significantly reduces storage cost while ensuring data reliability. In addition, to address the low encoding efficiency and high repair overhead of RS codes in practical applications, this paper further introduces local repair code (LRC) technology, which reduces the data repair overhead, and comparatively analyzes the application effect of the resulting optimized model (RS-LRC-HDFS). The experimental results show that after RS-LRC optimization, the time overhead of the HDFS storage system is reduced by 81.12% in the write process and 93.01% in the read process compared with the pre-optimization system, and the repair time for massive file data is reduced by 87.25%. The scheme thus provides an efficient and reliable solution for massive data storage.
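The storage-versus-reliability trade-off behind erasure coding can be sketched with a toy example. The snippet below (illustrative only, not the paper's implementation) uses a single XOR parity block, the simplest erasure code; the RS-LRC scheme described above generalizes this with full Reed-Solomon parities plus local parities for cheaper repair.

```python
# Illustrative sketch of the erasure-coding idea: split data into k blocks
# and store parity so a lost block can be rebuilt instead of being fully
# replicated. A single XOR parity is used here for simplicity; real RS
# codes tolerate multiple losses using Galois-field arithmetic.
from functools import reduce

def encode(blocks: list[bytes]) -> bytes:
    """Compute one XOR parity block over k equal-sized data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def repair(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from survivors plus parity."""
    return encode(surviving + [parity])

data = [b"hdfs", b"rslr", b"code"]   # k = 3 data blocks
p = encode(data)                     # 1 parity block -> (3+1)/3 = 1.33x storage
rebuilt = repair([data[0], data[2]], p)   # block 1 was "lost"
assert rebuilt == data[1]

# Contrast with replication: 3-way replication stores 3x the data, while an
# RS(k=6, m=3) layout stores only (6+3)/6 = 1.5x with comparable reliability.
```

The comment at the end shows the overhead arithmetic that motivates replacing HDFS's default replication with erasure coding.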

Keywords: HDFS system, erasure code, local repair code, RS-LRC-HDFS, data storage