320 / 600 GB in ProLiant DL380 G7 - shows as overheating

idata
Esteemed Contributor III

Just installed 5 of these in a RAID configuration on a test server. Working perfectly, except:

The server is reporting that the drives are overheating (they're not). It appears I may be able to turn off DIPM (device-initiated power management) on these drives so that the SMART info is reported correctly.

How do I do that?

FYI, running Windows Server 2008 R2.

Thanks,

Rob

46 REPLIES

idata
Esteemed Contributor III

I also have the same results.

HP DL380 G7, Intel SSD 320 600GB.

idata
Esteemed Contributor III

For those technical: I can speculate exactly what the problem is, and in my opinion HP needs to deal with the issue, not Intel.

Chances are, whatever monitoring software HP provides -- or, if it's the system BIOS complaining, then HP's BIOS -- is making a horrible assumption that the disk attached is a classic MHDD. Most MHDDs -- but not all -- provide SMART attribute capability that includes SMART attribute 194 (0xc2), which is commonly used for temperature. This is the only way I know of to obtain hard disk temperature (and I'm quite familiar with hardware monitoring, see http://bsdhwmon.parodius.com/, and quite familiar with SMART, see http://jdc.parodius.com/freebsd/atacontrol/). Some drive vendors also track minimum, maximum, and average temperatures seen (possibly they tie this into hard disk SCT capability; unknown).

Anyway, Intel SSDs don't provide any data for SMART attribute 194; specifically, the attribute doesn't exist on Intel SSDs. A non-existent attribute is 100% normal (it's not the same thing as "it has the attribute but related data is zero").

So, the HP software (or BIOS) may therefore be buggy/broken; it seems they may be making the assumption that attribute 194 exists on all drives installed in their ProLiant systems, but that simply isn't true. If this is true, then their code is simply wrong and needs to be fixed. Again: if this is an error that either happens during or immediately after system POST, then HP needs to fix their BIOS.
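To make the "missing attribute vs. zero data" distinction concrete, here is a minimal Python sketch of how monitoring code could handle it. None of this is HP's or Intel's actual code. It assumes the de facto vendor layout of the 512-byte SMART READ DATA sector (a 2-byte revision field followed by 30 attribute entries of 12 bytes each), which is a widely used convention rather than something the ATA spec mandates, and it assumes the temperature in degrees C sits in the low byte of attribute 194's raw value, which is common but not universal. The function names are made up for illustration.

```python
import struct
from typing import Dict, Optional

# Assumed de facto layout of the SMART READ DATA sector (vendor convention).
ATTR_TABLE_OFFSET = 2      # 2-byte revision number comes first
ATTR_ENTRY_SIZE = 12       # id, flags(2), current, worst, raw(6), reserved
ATTR_ENTRY_COUNT = 30
TEMPERATURE_ATTR_ID = 194  # 0xc2


def parse_smart_attributes(sector: bytes) -> Dict[int, dict]:
    """Return {attribute_id: fields} for every attribute the drive reports."""
    needed = ATTR_TABLE_OFFSET + ATTR_ENTRY_SIZE * ATTR_ENTRY_COUNT
    if len(sector) < needed:
        raise ValueError("SMART data sector is too short")
    attrs: Dict[int, dict] = {}
    for i in range(ATTR_ENTRY_COUNT):
        start = ATTR_TABLE_OFFSET + i * ATTR_ENTRY_SIZE
        attr_id, flags, current, worst = struct.unpack_from("<BHBB", sector, start)
        if attr_id == 0:
            continue  # unused table slot: the attribute simply does not exist
        raw = int.from_bytes(sector[start + 5:start + 11], "little")
        attrs[attr_id] = {"flags": flags, "current": current,
                          "worst": worst, "raw": raw}
    return attrs


def drive_temperature_celsius(attrs: Dict[int, dict]) -> Optional[int]:
    """Return the temperature, or None when the drive has no attribute 194."""
    entry = attrs.get(TEMPERATURE_ATTR_ID)
    if entry is None:
        return None  # "no sensor", NOT "overheating" and NOT "0 degrees"
    return entry["raw"] & 0xFF  # assumption: low byte of raw value is degrees C
```

The point of returning None is that the caller has to handle "no temperature available" explicitly instead of feeding a made-up number into an overheating check.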

I think it's a safe assumption HP *is not* doing something like using a Winbond or LM-series chip to monitor hard drive bay enclosure temperature (nothing would report 250C if that were the case). My above theory is much more plausible.

If someone technical (read: not someone saying "derk derk my system says the temperature is high!! fix it!!" -- I'm talking about an engineer) can provide full details of **what** is reporting an excessive drive temperature, maybe some light can be shed on **how** the software is obtaining that information. I sure hope HP provides full technical documentation of whatever their software is monitoring. In the case it's their BIOS, then they should be ashamed.

idata
Esteemed Contributor III

While this may be true, why did prior generations of Intel SSDs not have this problem reported, and why do other SSD manufacturers not have this problem?

Is it not possible, ever, for an Intel SSD to overheat?

To maintain consistency and to get a fix sooner, I would hope that Intel would choose to report this data point, and either report it correctly, or at least report it with a "neutral" value.

Otherwise, HP would have to maintain a table of SSDs, and which one reports "what".

Rob

idata
Esteemed Contributor III

To All,

The Intel 320 and 510 Series drives do not contain a temperature sensor. They are responding within the proper limits of the SATA specification for drives without temperature sensing capabilities. We are working closely with HP to help them better understand the issue.

Thanks,

Scott, Intel Corporation

idata
Esteemed Contributor III

I state the below under the assumption that HP requires SMART attribute 194 to determine the temperature of the drive, and incorrectly assumes that the attribute always exists:

As a senior UNIX system administrator (read: I work in an enterprise environment) who's also well-versed in ATA specs, I strongly disagree with the opinion that Intel should violate specification and return "fake SMART attributes" just to keep some particular model of chassis/mainboard or piece of OS software happy. There is *absolutely nothing* that requires a manufacturer to implement SMART -- and if they do, there is *absolutely nothing* that mandates a prerequisite list of SMART attributes that must be implemented. Period. This isn't hearsay, it's fact. If anyone would like to read the ATA-7 (production) and ATA-8 ACS-2 (working draft) specifications, head on over to t13.org and read 'em. SMART is 100% optional, and which attributes a drive vendor implements is completely and entirely up to them.

If my theory is correct, then HP *requiring* a drive to provide a thermistor tied into SMART attribute 194 is absolutely 100% the fault of HP and not the fault of Intel. I can absolutely see a company that has a long-term history of mandating use of their "own hard disks" (Sun, HP, Compaq, IBM, etc. -- all of whom use rebranded Seagate, Hitachi, or Fujitsu drives) making this sort of assumption. Furthermore, most SSDs (truly!) do not provide a temperature thermistor -- it's not limited to just Intel.

I'm very glad Intel is working with HP directly to solve this problem, and I look forward to seeing what the root cause is. In my opinion, customers should be happy that Intel is assisting HP to rectify this issue. My advice to ProLiant customers would be to put serious pressure on your sales reps to get something done about this. You have support contracts for this exact reason.