11-29-2017 02:06 AM
Hi,
We have a 4-node cluster running Storage Spaces Direct, using P3700 NVMe drives as caching devices. There seems to be a problem with this combination:
https://support.microsoft.com/en-gb/help/4052341/slow-performance-or-lost-communication-io-error-det...
This can lead to a BSOD, which, as you can imagine, is horrible in a production cluster.
I could not find any information from Intel on this topic. Can anyone help me out with this? I don't know if this is a Microsoft problem, or if a new driver/firmware could help out.
I have also posted this in Microsoft TechNet.
Any help is highly appreciated!
Best regards,
Andreas
11-29-2017 10:01 AM
Hello AFurtenbacher,
Thanks for being part of Intel® communities.
Reading the article you shared, the problem seems to be related to Windows* Server 2016 Datacenter.
I checked the change notes of the latest firmware releases, but there was no mention of fixes or updates regarding Storage Spaces Direct (S2D). You can check these notes in the Release Notes for Intel® SSD Data Center Tool (found at the bottom of the download page) under "Product Firmware Revision History":
https://downloadcenter.intel.com/product/87278/Intel-SSD-Data-Center-Tool
Please let us know if there's anything else we can help you with.
Best regards,
Eugenio F.
11-29-2017 11:58 PM
Hi,
In the release notes of the latest version, there is no hint of any fix relevant to our problem.
Pointing at Microsoft is, to be honest, a bit too simple, because the problem only occurs on Intel P3x00 drives. Drives from all other vendors cause no problems. I can't imagine that Microsoft handles Intel NVMe drives differently from those of other vendors. So I still think a change in the driver or the firmware could make a big difference.
Best regards,
Andreas
P.S. I am not sure whether the problem is connected to S2D. In our area, NVMe drives are only coming into use now that S2D is available, because we can now use the servers' internal drives and no longer need SAS disks in SANs. Before this, I didn't know anyone who used those drives in servers, because we couldn't build an HA solution with them.
11-30-2017 03:11 PM
Hello AFurtenbacher,
Thanks for your observations.
Our engineers are already working with Microsoft* to try to find a solution for this issue. For now, we recommend keeping the drives' firmware and your operating system up to date, in addition to applying the changes described in the Microsoft* article you shared.
Best regards,
Eugenio F.
11-30-2017 10:24 PM
Hi Eugenio,
Thank you for the information. Is there any estimate of when this might be solved? As I said, people are losing their production clusters, so it is quite urgent.
Concerning the firmware, I have one more question: the Intel SSD Tool shows that my disks are on firmware 8DV1FJP5. As far as I have seen, 8DV1FJP9 is already available, so I tried to update my drives. The problem was that ISDCT always tells me the drives already have the latest version that ships with the software. I have tried this with ISDCT 3.0.3 and 3.0.9, and I also manually downloaded the firmware and put it into the "FirmwareModules" folder.
How do I flash the disks?
Best regards,
Andreas