Abstract

While commodity file systems trust disks to either work or fail completely, modern disks exhibit complex failure modes such as latent sector faults and block corruptions.

In this thesis, we focus on understanding the failure policies of file systems and improving their robustness to disk failures. We suggest a new fail-partial failure model for disks, which incorporates realistic localized faults such as latent sector faults and block corruption. We then develop and apply a novel semantic failure analysis technique, which uses file system block type knowledge and transactional semantics, to inject interesting faults and investigate how commodity file systems react to a range of more realistic disk failures.

We apply our technique to five important journaling file systems: Linux ext3, ReiserFS, JFS, XFS, and Windows NTFS. We classify their failure policies in a new taxonomy that measures their Internal RObustNess (IRON), which includes both failure detection and recovery techniques. Our analysis results show that commodity file systems store little or no redundant information, and contain failure policies that are often inconsistent, sometimes buggy, and generally inadequate in their ability to recover from partial disk failures.

We remedy the reliability shortcomings in commodity file systems by addressing two issues. First, we design new low-level redundancy techniques that a file system can use to handle disk faults. After qualitatively and quantitatively evaluating various redundancy machineries, we propose a new probabilistic model to account for spatially correlated faults. We also develop new techniques to update data and parity atomically without NVRAM support. Overall, we show that low-level redundant information can greatly enhance file system robustness while incurring modest time and space overheads.
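The abstract does not spell out the redundancy techniques themselves, so the following is only a generic sketch of parity-based redundancy, not the thesis's scheme. It assumes fixed-size blocks and a single lost block; the function names `xor_blocks` and `recover_block` are hypothetical and chosen for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_block(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data blocks protected by one parity block (illustrative data only).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing data[1] to a latent sector fault and rebuild it.
rebuilt = recover_block([data[0], data[2]], parity)
```

A single parity block can rebuild only one missing block per group, and it cannot by itself tell which block is corrupt; detecting corruption typically requires an additional check such as a checksum.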

Second, to remedy the problem of failure handling diffusion, we develop a modified ext3 (called ext3c) that unifies all failure handling in a Centralized Failure Handler (CFH). We then showcase the power of centralized failure handling in ext3c by demonstrating its support for flexible, consistent, and fine-grained policies. By carefully separating policy from mechanism, ext3c demonstrates how a file system can provide a thorough, comprehensive, and easily understandable failure-handling policy.
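To make the idea of separating policy from mechanism concrete, here is a minimal Python sketch of a centralized handler driven by a policy table. It is an illustration under assumptions, not the ext3c design: the class `CentralFailureHandler`, the `Action` names, the block types, and the fault names are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical recovery actions, chosen only for illustration.
    RETRY = "retry"
    PROPAGATE = "propagate"
    STOP = "stop"


class CentralFailureHandler:
    """Single place where every detected fault is mapped to an action."""

    def __init__(self, policy, default):
        self.policy = dict(policy)  # (block_type, fault) -> Action
        self.default = default      # used when no specific rule exists
        self.log = []               # record of every decision made

    def handle(self, block_type, fault):
        action = self.policy.get((block_type, fault), self.default)
        self.log.append((block_type, fault, action))
        return action


# The policy is plain data, so it can be changed without touching the
# code paths that detect faults (assumed block types and fault names).
policy = {
    ("inode", "read_error"): Action.RETRY,
    ("data", "read_error"): Action.PROPAGATE,
    ("journal", "write_error"): Action.STOP,
}
handler = CentralFailureHandler(policy, default=Action.STOP)

decision = handler.handle("inode", "read_error")
fallback = handler.handle("bitmap", "corruption")
```

Because every fault passes through one `handle` call, the policy can be read in one table, applied consistently across block types, and refined per block type without touching the detection code.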

Details

Title: Iron file systems
Author: Prabhakaran, Vijayan
Year: 2006
Publisher: ProQuest Dissertations & Theses
ISBN: 978-0-542-88520-4
Source type: Dissertation or Thesis
Language of publication: English
ProQuest document ID: 304978637
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.