A survey of approaches to fault detection in storage systems

System Analysis and Control

In this paper, we present a comparative analysis of existing software for health monitoring in enterprise-level storage systems and describe commonly used approaches to the collection, processing, and storage of monitoring data, as well as fault detection methods. Based on this analysis, we propose criteria for classifying and comparing monitoring software and a generalized monitoring software architecture, including its modules and their interaction. We also survey recent publications on anomaly detection and fault diagnosis in data storage and computing systems and describe commonly used algorithms, including clustering and classification methods, statistical analysis, support vector machines (SVM), isolation forest, artificial immune systems, and invariant networks.