Channel: Intel Communities : All Content - Servers

RAID still reporting predictive failure after drive replacement


I have a machine running CentOS 5.3 with a 6-disk RAID 5 array. According to the Intel RAID Web Console 2, the RAID card appears to be an SRCSASBB81.

About a week ago, I started receiving this predictive failure warning once per day:

 

 

Controller ID: 0  PD Predictive failure: --:--:4
Generated on:Mon Sep 16 08:29:57 2013

SYSTEM DETAILS---
IP Address: REDACTED
OS Name: Linux
OS Version: 2.6
Driver Name: megaraid_sas
Driver Version: 00.00.04.01-RH1

IMAGE DETAILS---
BIOS Version: 1.12.122-0393
Firmware Package Version: 8.0.1-0029
Firmware Version: NT16
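
For anyone who wants to compare these daily mails programmatically (for example, to check whether the "Generated on" timestamp changes from day to day), here is a minimal Python sketch that splits the alert text into fields. The function names `parse_alert` and `generated_at` are hypothetical, and the parsing assumes the mail always has the "key: value" layout shown above, which I have only seen in this one alert.

```python
from datetime import datetime

# Alert text copied from the mail above (abbreviated to a few fields).
ALERT_TEXT = """\
Controller ID: 0  PD Predictive failure: --:--:4
Generated on:Mon Sep 16 08:29:57 2013
Driver Name: megaraid_sas
Driver Version: 00.00.04.01-RH1
Firmware Package Version: 8.0.1-0029
"""


def parse_alert(text):
    """Split the alert mail into a dict of fields (assumed 'key: value' layout)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Controller ID:"):
            # The first line carries the controller ID and the event text,
            # separated by two spaces in the mail shown above.
            ctrl, _, event = line.partition("  ")
            fields["Controller ID"] = ctrl.split(":", 1)[1].strip()
            fields["Event"] = event.strip()
        elif ":" in line:
            # Split on the first colon only, so timestamps stay intact.
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def generated_at(fields):
    """Parse the 'Generated on' field into a datetime (English day/month names)."""
    return datetime.strptime(fields["Generated on"], "%a %b %d %H:%M:%S %Y")


if __name__ == "__main__":
    alert = parse_alert(ALERT_TEXT)
    print(alert["Event"])
    print(generated_at(alert).isoformat())
```

Running this over several days of mail would show whether each alert is a freshly generated event or the same one being repeated.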

 

So I started the Intel RAID Web Console, looked at all the drives, and saw that drive 4 had a "pred fail count" of 1. All the other disks had 0 in that field. I figured that's what the "--:--:4" in the warning was referring to.

I backed up everything on the RAID and identified the physical location of all the drives. Then, using the RAID Web Console, I took drive 4 offline (putting the RAID into a degraded state). The light on the physical drive in the expected location turned orange, as expected. I removed the disk and replaced it with a new one. The RAID rebuilt and came back to optimal with the new disk. All went as planned. Yay!
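
To make my reading of the warning explicit: I am assuming the last of the three colon-separated fields in "--:--:4" is the drive slot. The mail itself does not say what the fields mean, so that is only my interpretation. A minimal Python sketch of that reading (the function name `predictive_failure_slot` is hypothetical):

```python
import re

# Matches the event text from the alert mail, e.g. "PD Predictive failure: --:--:4".
# Assumption: three colon-separated fields, the last being the drive slot number.
PD_PATTERN = re.compile(r"PD Predictive failure:\s*([^:\s]*):([^:\s]*):(\d+)")


def predictive_failure_slot(alert_line):
    """Return the trailing number of the PD identifier, or None if no match."""
    match = PD_PATTERN.search(alert_line)
    if match is None:
        return None
    return int(match.group(3))


if __name__ == "__main__":
    line = "Controller ID: 0  PD Predictive failure: --:--:4"
    print(predictive_failure_slot(line))
```

If that reading is right, the alert I still receive points at the same slot as before the replacement.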

 

However, every morning at 7:30 AM, I still get this same predictive failure warning. The "pred fail count" on the new disk, like on all the others, is now 0, and everything looks fine. Is there some file where I have to manually reset a failure count? I can't see anything in the UI that indicates there is something else I need to do.

 

Please help me understand what's going on and what further steps I should take.

 

Thanks.

 

-J 

