A Method for High-Throughput Deduplication for Primary File Server by Using Prefetch Cache

Hitoshi Kamei, Takaki Nakamura

    Research output: Contribution to journal (Article)

    1 Citation (Scopus)

    Abstract

    We propose a method of high-throughput file-level deduplication for primary file servers, called partial data background prefetch (PDBP). To achieve high deduplication throughput, the method reduces the number of disk I/Os issued during the deduplication process. Before running the deduplication process, the proposed method prefetches part of the data of the shared files referenced by deduplicated files. It then processes the files that are larger than a file-size threshold defined by administrators. In this paper, we evaluate the deduplication processing time by using a simulation model of PDBP. The results confirm that PDBP reduces the processing time by about 50% compared with a conventional file deduplication method when the threshold is set to 4 KB.
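
The abstract does not spell out the algorithm, so the following is only a rough sketch of how the two steps it names (prefetching part of each shared file, then processing files above a size threshold) might fit together. Everything here is an assumption for illustration: the function name `pdbp_dedup`, the in-memory dictionaries standing in for files on disk, the `prefix_len` parameter, and the idea that a cached prefix is used to skip full reads of shared files that cannot match. It is not the authors' implementation or their simulation model.

```python
def pdbp_dedup(candidates, shared_files, threshold=4096, prefix_len=4096):
    """Toy, in-memory illustration of a prefetch-assisted file-level dedup pass.

    candidates:   dict mapping file name -> bytes (files not yet deduplicated)
    shared_files: dict mapping shared-file id -> bytes (already deduplicated data)
    Returns (matches, full_reads), where matches maps a candidate name to the
    shared-file id it duplicates, and full_reads counts simulated full reads
    of shared files (a stand-in for disk I/O).
    """
    # Step 1 (assumed): prefetch the first prefix_len bytes of every shared
    # file into a cache, indexed by that prefix.
    prefetch_cache = {}
    for shared_id, data in shared_files.items():
        prefetch_cache.setdefault(data[:prefix_len], []).append(shared_id)

    matches = {}
    full_reads = 0

    # Step 2: process only candidates larger than the file-size threshold.
    for name, data in candidates.items():
        if len(data) <= threshold:
            continue
        # A cache miss on the prefix means no shared file can match,
        # so no shared file is read at all for this candidate.
        for shared_id in prefetch_cache.get(data[:prefix_len], []):
            full_reads += 1  # simulated read of the whole shared file
            if shared_files[shared_id] == data:
                matches[name] = shared_id
                break

    return matches, full_reads
```

In this sketch, a candidate whose prefix is absent from the cache costs no reads of shared files, which is one plausible way a prefetch cache could reduce disk I/Os during deduplication.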

    Original language: English
    Pages (from-to): 54-64
    Number of pages: 11
    Journal: Electronics and Communications in Japan
    Volume: 99
    Issue number: 12
    Publication status: Published - 2016 Dec 1

    Keywords

    • file cache
    • file system
    • file-level deduplication

    ASJC Scopus subject areas

    • Signal Processing
    • Physics and Astronomy (all)
    • Computer Networks and Communications
    • Electrical and Electronic Engineering
    • Applied Mathematics
