Abstract:

Enterprise storage clusters increasingly adopt erasure coding to protect stored data against transient and permanent failures. Existing erasure code designs not only introduce extra parity information in a storage-inefficient manner, but also consume substantial cross-rack bandwidth during recovery. To relieve both the storage and recovery burdens of erasure coding, we adapt our previously proposed STAIR codes into recovery-oriented STAIR (R-STAIR) codes, which achieve storage efficiency, recovery efficiency, and configuration generality while tolerating a mix of node and rack failures. We evaluate R-STAIR codes through analysis and Hadoop experiments. We show that, by supporting mixed fault tolerance, R-STAIR codes can significantly reduce both the storage and recovery burdens in storage clusters.