Large-scale storage systems often attempt to achieve two seemingly conflicting goals: (1) the systems need to reduce the copies of redundant data to save space, a process called deduplication; and (2) users demand encryption of their data to ensure privacy. Conventional encryption makes deduplication on ciphertexts ineffective, as it destroys data redundancy. A line of work, originated from Convergent
Encryption [28], and evolved into Message Locked Encryption [12], strives to solve this problem. The latest work, DupLESS [11], proposes a server-aided architecture that provides the strongest privacy. The DupLESS architecture relies on a key server to help the clients generate encryption keys that result in convergent ciphertexts. In this paper, we first provide a rigorous proof of security, in the random oracle model, for the DupLESS architecture which is lacking in the original paper. Our proof shows that using additional secret, other than the data itself, for generating encryption keys achieves the best possible security under current deduplication paradigm.We then introduce a distributed protocol that eliminates the need for a key server and allows less managed systems such as P2P systems to enjoy the high security level. Implementation and evaluation show that the scheme is both robust and practical.
↧