Turning Privacy Constraints into Syslog Analysis Advantage
The mean time between failures (MTBF) of HPC systems is rapidly decreasing, and current failure recovery mechanisms, e.g., checkpoint-restart, will no longer be able to recover systems from failures. Early failure detection is a new class of failure recovery methods that can benefit HPC systems with short MTBF. System logs (syslogs) are an invaluable source of information that gives deep insight into system behavior and makes early failure detection possible. Besides normal information, syslogs contain sensitive data that might endanger users' privacy. Even though analyzing various syslogs is necessary for creating a general failure detection/prediction method, privacy concerns discourage system administrators from publishing syslogs. Herein, we ensure user privacy by de-identifying syslogs, and then turn the constraint applied to protect users' privacy into an advantage for system behavior analysis. Results indicate a significant reduction in required storage space and 3 times shorter processing time.
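The abstract does not say how the syslogs are de-identified. As a rough illustration only, the sketch below shows one common approach: replacing sensitive fields with short salted-hash pseudonyms, so that the same user or address always maps to the same token and log entries can still be correlated. The regular expressions, the `user <name>` message layout, the salt value, and the function names `pseudonym` and `deidentify` are all assumptions made for this example, not details taken from the paper.

```python
import hashlib
import re

# Hypothetical patterns for sensitive fields; a real deployment would need
# a broader, site-specific set.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
USER = re.compile(r"\buser (\S+)")


def pseudonym(value: str, salt: str = "site-secret", length: int = 8) -> str:
    """Return a short, deterministic hex token for a sensitive value."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:length]


def deidentify(line: str, salt: str = "site-secret") -> str:
    """Replace IPv4 addresses and user names in one syslog line."""
    line = IPV4.sub(lambda m: "ip_" + pseudonym(m.group(0), salt), line)
    line = USER.sub(lambda m: "user u_" + pseudonym(m.group(1), salt), line)
    return line


if __name__ == "__main__":
    raw = "Jan 1 00:00:01 node1 sshd[123]: Accepted password for user alice from 10.0.0.5"
    print(deidentify(raw))
```

Because the mapping is deterministic for a given salt, repeated values collapse to the same fixed-length token. Truncating the digest keeps tokens short but raises the chance of collisions, so the token length is a trade-off to tune for the data set.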