Abstract
We propose a high-throughput file-level deduplication method for primary file servers, called partial data background prefetch (PDBP). To achieve high deduplication throughput, the method reduces the number of disk I/Os issued during the deduplication process. Before running the deduplication process, the proposed method prefetches part of the data of the shared files referenced by deduplicated files. It then processes only the files that are larger than a file-size threshold defined by administrators. In this paper, we evaluate the deduplication processing time by using a simulation model of PDBP. The results confirm that PDBP reduces the processing time by about 50% compared to a conventional file deduplication method when the threshold is set to 4 KB.
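The idea of prefetching partial data and applying a size threshold can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how the prefetched data is used, so the sketch assumes it serves as an in-memory filter that avoids reading a shared file from disk unless its leading bytes match the candidate. The function names, the `PREFIX_BYTES` value, and the use of in-memory dictionaries in place of on-disk files are all hypothetical.

```python
import hashlib

PREFIX_BYTES = 4096  # assumed size of the prefetched part; not from the paper


def prefetch_prefixes(shared_files, prefix_bytes=PREFIX_BYTES):
    """Map the digest of each shared file's leading bytes to its name(s)."""
    cache = {}
    for name, data in shared_files.items():
        digest = hashlib.sha256(data[:prefix_bytes]).digest()
        cache.setdefault(digest, []).append(name)
    return cache


def deduplicate(candidates, shared_files, size_threshold,
                prefix_bytes=PREFIX_BYTES):
    """Return (links, full_reads) for candidates above the size threshold.

    links maps a candidate name to the shared file it duplicates.
    full_reads counts full comparisons against a shared file, which
    stand in for disk I/Os in this sketch.
    """
    cache = prefetch_prefixes(shared_files, prefix_bytes)
    links = {}
    full_reads = 0
    for name, data in candidates.items():
        if len(data) <= size_threshold:
            continue  # files at or below the threshold are not processed
        digest = hashlib.sha256(data[:prefix_bytes]).digest()
        # Only shared files whose prefetched prefix matches are read in full.
        for shared_name in cache.get(digest, []):
            full_reads += 1
            if shared_files[shared_name] == data:
                links[name] = shared_name
                break
    return links, full_reads
```

In this sketch a candidate whose prefix matches no prefetched entry costs no full read at all, which is one plausible way a prefetch step could reduce disk I/Os. The reported 50% reduction comes from the authors' simulation model, not from this code.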
Original language | English |
---|---|
Pages (from-to) | 54-64 |
Number of pages | 11 |
Journal | Electronics and Communications in Japan |
Volume | 99 |
Issue number | 12 |
DOIs | |
Publication status | Published - 2016 Dec 1 |
Keywords
- file cache
- file system
- file-level deduplication
ASJC Scopus subject areas
- Signal Processing
- Physics and Astronomy (all)
- Computer Networks and Communications
- Electrical and Electronic Engineering
- Applied Mathematics