UltraCDC: A Fast and Stable Content-Defined Chunking Algorithm for Deduplication-based Backup Storage Systems
Peng Zhou, Zhenyu Wang, Wen Xia, et al.

Published: Oct. 12, 2022

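Content-Defined Chunking (CDC) is the key stage of data deduplication, since it has a significant impact on a deduplication system's throughput and deduplication efficiency. However, existing CDC algorithms suffer from high computation overhead, weak stability, and a poor ability to handle low-entropy strings. In this paper, we propose UltraCDC, a fast and stable CDC algorithm for deduplication-based backup storage systems that deals with low-entropy strings with high efficiency. There are four key techniques behind UltraCDC, namely, using a rolling sliding window to compute boundary conditions, skipping the sub-minimum chunk size, normalized chunking, and jumping chunking to detect low-entropy strings. Using a rolling sliding window to compute boundary conditions not only accelerates chunking but also makes chunking more resistant to byte shift; skipping the sub-minimum chunk size and normalized chunking can complement each other to speed up chunking without sacrificing the deduplication ratio too much; and jumping chunking has a higher detection ratio of low-entropy strings than AE-opt2 without affecting chunking speed. We implemented UltraCDC in Destor, and experimental results show that, using the above techniques, UltraCDC is 1.5–10× faster than the state-of-the-art CDC approaches, while achieving a comparable or even higher deduplication ratio than the classic Rabin-based CDC. In terms of the capability of detecting low-entropy strings, our approach has the highest detection ratio, 10× and 2× higher than Rabin-based CDC and AE-opt2, respectively.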
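To make two of the techniques named in the abstract concrete, here is a minimal Python sketch of content-defined chunking with sub-minimum skipping and normalized chunking. It is not UltraCDC's boundary condition: it assumes a generic Gear-style rolling hash, and every size, mask, and function name (`MIN_SIZE`, `NORMAL_SIZE`, `MAX_SIZE`, `MASK_STRICT`, `MASK_LOOSE`, `next_cut`, `chunk`) is a hypothetical choice made only for illustration.

```python
import random
from typing import List

# Hypothetical parameters for illustration; none are taken from the paper.
MIN_SIZE = 2 * 1024       # no boundary is searched for before this offset
NORMAL_SIZE = 8 * 1024    # target size used by normalized chunking
MAX_SIZE = 64 * 1024      # a cut is forced at this size
MASK_STRICT = (1 << 15) - 1  # harder to satisfy, used before NORMAL_SIZE
MASK_LOOSE = (1 << 11) - 1   # easier to satisfy, used after NORMAL_SIZE

# Fixed random table for a Gear-style hash (assumption: not UltraCDC's scheme).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]


def next_cut(data: bytes, start: int) -> int:
    """Return the exclusive end offset of the chunk that begins at `start`."""
    if len(data) - start <= MIN_SIZE:
        return len(data)
    end = min(start + MAX_SIZE, len(data))
    normal = min(start + NORMAL_SIZE, end)
    h = 0
    # Sub-minimum skipping: start scanning only after MIN_SIZE bytes.
    for i in range(start + MIN_SIZE, end):
        # The low k bits of h depend only on the last k bytes, so the
        # masked test behaves like a small sliding window over the content.
        h = ((h << 1) + GEAR[data[i]]) & 0xFFFFFFFFFFFFFFFF
        # Normalized chunking: strict mask first, loose mask afterwards,
        # which pulls chunk sizes toward NORMAL_SIZE.
        mask = MASK_STRICT if i < normal else MASK_LOOSE
        if (h & mask) == 0:
            return i + 1
    return end  # no boundary found: cut at MAX_SIZE or at end of data


def chunk(data: bytes) -> List[bytes]:
    """Split `data` into content-defined chunks."""
    chunks = []
    pos = 0
    while pos < len(data):
        cut = next_cut(data, pos)
        chunks.append(data[pos:cut])
        pos = cut
    return chunks
```

Because cut points depend on local content rather than on absolute offsets, prepending a byte to the input typically changes only the first chunk or two, which is the property that lets a deduplicating store recognize the remaining chunks as duplicates.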

Language: English

Exploring Query Processing on CPU-GPU Integrated Edge Device
Jiesong Liu, Feng Zhang, Hourun Li, et al.

IEEE Transactions on Parallel and Distributed Systems, Journal Year: 2022, Volume and Issue: 33(12), P. 4057 - 4070

Published: May 26, 2022

Huge amounts of data are generated on edge devices every day, which requires efficient data analytics and management. However, due to the limited computing capacity of these devices, query processing at the edge faces tremendous pressure. Fortunately, in recent years, hardware vendors have integrated heterogeneous coprocessors, such as GPUs, into the edge device, which can provide much more computing power. Furthermore, the CPU-GPU integrated edge device has shown significant benefits in a variety of situations. Therefore, the exploration of query processing on such devices becomes an urgent need. In this article, we develop a fine-grained query processing engine, called FineQuery, to perform efficient query processing on CPU-GPU integrated edge devices. Particularly, FineQuery can take advantage of both the architectural features of edge devices and query characteristics by performing fine-grained workload scheduling between the CPU and the GPU. Experiments show that, on TPC-H workloads, FineQuery reduces latency by 42.81% and improves bandwidth utilization by 2.39× on average compared to the implementation using only the GPU or only the CPU. Furthermore, FineQuery can bring significant performance-per-cost and energy efficiency benefits. On average, FineQuery brings a 21× performance-per-cost ratio and 4× energy efficiency compared with the discrete GPU platform.

Language: English

Citations: 13

Certificateless integrity auditing scheme for sensitive information protection in cloud storage
Jian Wen, Lunzhi Deng

Journal of Systems Architecture, Journal Year: 2024, Volume and Issue: 156, P. 103267 - 103267

Published: Aug. 30, 2024

Language: English

Citations: 1

Efficient Container Image Updating in Low-bandwidth Networks with Delta Encoding
Naoki Matsumoto, Daisuke Kotani, Yasuo Okabe, et al.

Published: Sept. 25, 2023

Containers are the technology for Linux to isolate execution environments. By distributing a container image, which is a collection of the files contained in a container, users can use an execution environment that includes the necessary files and libraries. However, container images of tens to hundreds of megabytes in size require many network resources to be transferred. Especially in low-bandwidth environments like edge computing, frequent container image updating can be difficult and can affect other services' communication. In this paper, we propose a method to reduce the data size required for container image updates using delta encoding. We can use delta encoding to finish updates quickly, but generating and applying deltas is a time-consuming operation. Our method proposes DeltaMerging, which enables faster delta generation by merging existing deltas, and Di3FS, which applies deltas lazily. The proposed method reduces the data size required for updates from 5 to 40% of existing methods. Also, the time to generate and apply deltas is greatly reduced with Di3FS. Furthermore, the performance degradation of the application was almost negligible.

Language: English

Citations: 1

Emerging Research Trends in Data Deduplication: A Bibliometric Analysis from 2010 to 2023
Anjuli Goel, Chander Prabha, Preeti Sharma, et al.

Archives of Computational Methods in Engineering, Journal Year: 2024, Volume and Issue: 31(6), P. 3313 - 3330

Published: Feb. 26, 2024

Language: English

Citations: 0

UltraCDC: A Fast and Stable Content-Defined Chunking Algorithm for Deduplication-based Backup Storage Systems
Peng Zhou, Zhenyu Wang, Wen Xia, et al.

Published: Oct. 12, 2022

Language: English

Citations: 2