Transient error handling (2)
Linux is paranoid with respect to transients
- stops using affected disk (and reconstructs) on any error, transient or not
- fragile: system is more vulnerable to multiple faults
- disk-inefficient: wastes two disks per transient
- but no chance of a slowly-failing disk impacting performance
Solaris and Windows are more forgiving
- both ignore most benign/transient faults
- robust: less likely to lose data, more disk-efficient
- but less likely to catch slowly-failing disks and remove them
Neither policy is ideal!
- need a hybrid that detects streams of transients
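
One way such a hybrid might look is sketched below. This is a minimal illustration, not the policy of any of the three systems: the class name `HybridTransientPolicy`, the parameters `max_transients` and `window_seconds`, their default values, and the `"retry"`/`"eject"` return values are all assumptions made for the example. The idea is to tolerate isolated transients (as Solaris and Windows do) but stop using a disk once it produces a stream of them within a short time window (approaching the Linux behavior).

```python
from collections import deque


class HybridTransientPolicy:
    """Tolerate isolated transients; eject a disk that produces a stream of them.

    Hypothetical sketch: thresholds and names are illustrative assumptions.
    """

    def __init__(self, max_transients=3, window_seconds=60.0):
        self.max_transients = max_transients
        self.window_seconds = window_seconds
        self._recent = {}  # disk id -> deque of recent transient timestamps

    def on_error(self, disk, timestamp, transient):
        """Return "retry" to keep using the disk, or "eject" to stop and reconstruct."""
        if not transient:
            return "eject"  # non-transient errors are never tolerated

        events = self._recent.setdefault(disk, deque())
        events.append(timestamp)

        # Forget transients that have aged out of the sliding window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()

        # More than max_transients inside the window counts as a stream.
        if len(events) > self.max_transients:
            return "eject"
        return "retry"
```

With the default thresholds, transients on one disk at t = 0, 10 and 20 seconds are retried, and a fourth at t = 30 ejects the disk; transients spaced 100 seconds apart are retried indefinitely. Suitable values for the window and threshold would have to be tuned against real disk behavior.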