01-12-2017 04:25 PM
About 6 months ago I bought the Intel SSD 750 400GB, and have been using it for various database-related benchmarking tasks and such. It was working fine until this week, when the kernel suddenly started reporting strange issues about aborted commands:
Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 0 QID 12 timeout, aborting
Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 1 QID 12 timeout, aborting
Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 2 QID 12 timeout, aborting
Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 3 QID 12 timeout, aborting
Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:27 bench2 kernel: nvme nvme0: Abort status: 0x0
Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
...
Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 196 QID 12 timeout, aborting
Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 212 QID 12 timeout, aborting
Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 273 QID 12 timeout, aborting
Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 275 QID 12 timeout, aborting
Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
...
Jan 12 13:16:59 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:16:59 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:16:59 bench2 kernel: nvme nvme0: completing aborted command with status: 0000
Jan 12 13:17:00 bench2 kernel: nvme nvme0: completing aborted command with status: fffffffc
Jan 12 13:17:00 bench2 kernel: blk_update_request: I/O error, dev nvme0n1, sector 422162944
Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770079, lost async page write
Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770080, lost async page write
Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770081, lost async page write
Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770082, lost async page write
Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770083, lost async page write
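For anyone who wants to tally messages like the ones above, here is a minimal sketch in Python. It is not part of the original report: the function name `summarize` and the regular expressions are assumptions based only on the message formats visible in this excerpt, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns follow the message formats in the log excerpt above (assumption:
# other kernel versions may phrase these messages differently).
TIMEOUT_RE = re.compile(r"nvme (\S+): I/O (\d+) QID (\d+) timeout, aborting")
STATUS_RE = re.compile(r"completing aborted command with status: ([0-9a-fA-F]+)")
IO_ERROR_RE = re.compile(r"I/O error, dev (\S+), sector (\d+)")


def summarize(lines):
    """Tally timeouts per (controller, queue ID), abort statuses, and I/O errors."""
    timeouts = Counter()   # (controller, queue ID) -> number of timed-out commands
    statuses = Counter()   # status string -> number of completed aborts
    io_errors = []         # (device, sector) for each block-layer I/O error
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[(m.group(1), int(m.group(3)))] += 1
            continue
        m = STATUS_RE.search(line)
        if m:
            statuses[m.group(1)] += 1
            continue
        m = IO_ERROR_RE.search(line)
        if m:
            io_errors.append((m.group(1), int(m.group(2))))
    return timeouts, statuses, io_errors


if __name__ == "__main__":
    # Three lines copied from the excerpt above.
    sample = [
        "Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 0 QID 12 timeout, aborting",
        "Jan 12 13:17:00 bench2 kernel: nvme nvme0: completing aborted command with status: fffffffc",
        "Jan 12 13:17:00 bench2 kernel: blk_update_request: I/O error, dev nvme0n1, sector 422162944",
    ]
    timeouts, statuses, io_errors = summarize(sample)
    print(dict(timeouts), dict(statuses), io_errors)
```

To run it over a real log, pass `summarize` an open file or any other iterable of lines, such as saved `journalctl -k` output.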
I'm regularly testing new kernels / distributions, so at first I thought it was a bug in one of these, but after a lot of experiments I doubt that - I can reproduce the same issue even with older kernels that I had previously used without any problem.
Interestingly enough, this only affects writes - reads seem to be working just fine (easily >2GB/s in a sequential workload), while writes reach only about 2MB/s. It's not a filesystem issue either - this happens even with a simple dd writing to /dev/nvme0n1 directly.
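A rough way to repeat this kind of write-speed check is sketched below in Python. It is not the dd invocation used here (that command was not posted): the function name `sequential_write_mb_per_s`, the block size, and the total size are assumptions chosen for illustration. The sketch writes to a regular file so it is safe to run, which means it goes through the filesystem; pointing it at a raw device such as /dev/nvme0n1 would destroy the data on it.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(path, total_mb=16, block_kb=1024):
    """Write about total_mb of zeros to path sequentially, fsync, and return MiB/s."""
    block = memoryview(b"\0" * (block_kb * 1024))
    blocks = max(1, (total_mb * 1024) // block_kb)
    written_total = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(blocks):
            offset = 0
            # os.write may write fewer bytes than requested, so loop until done.
            while offset < len(block):
                offset += os.write(fd, block[offset:])
            written_total += len(block)
        # Include the flush in the timing; otherwise the page cache can hide
        # a slow device completely.
        os.fsync(fd)
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (written_total / (1024 * 1024)) / elapsed


if __name__ == "__main__":
    # Hypothetical target: a scratch file in a temporary directory.
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "write_probe.bin")
        print(f"{sequential_write_mb_per_s(target):.1f} MiB/s")
```

The number is only meaningful if the scratch file lives on the drive under test, and a small total size mostly measures caches, so a real check would use a much larger `total_mb`.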
I've tried to install the newest firmware using the isdct tool (v 3.0.0), and `isdct show` now reports this:
[root@bench2 ~]# isdct show -a -intelssd 0
- Intel SSD 750 Series CVCQ55020067400AGN -
AggregationThreshold : 0
AggregationTime : 0
ArbitrationBurst : 0
Bootloader : 8B1B0131
CoalescingDisable : 1
DevicePath : /dev/nvme0n1
Device...
01-13-2017 07:34 AM
Hello Tomas_V,
Thanks for posting in our forum. We would like to review the information you've sent us and try to replicate the situation.
In the meantime, we recommend installing the latest firmware update, which you can find here: https://downloadcenter.intel.com/download/26491/Intel-SSD-Firmware-Update-Tool
The other program says you have the latest version, but that is because it does not include the latest firmware: FW 8EV101F0 with Bootloader 8B1B0133.
Let us know whether the firmware update fixes the issue; if not, we will check the information provided.
Regards,
NC
01-13-2017 12:16 PM
OK, I've managed to update the firmware to 8EV101F0. I had actually tried that yesterday before asking the question here, but I didn't manage to create a USB stick from the ISO, and I didn't realize that the "isdct" tool does not include the latest firmware.
That being said, after the update the drive again does ~1GB/s in writes, and so far I haven't seen any error messages in the kernel log. I'll do some more tests over the next couple of days to see if it lasts.
01-13-2017 01:28 PM
Hi Tomas_V,
Great to hear that the firmware update was performed successfully and that your write speeds are back to normal.
We will follow up with you next week to confirm that the situation is fully fixed.
Regards,
NC
01-18-2017 12:54 PM
Hi Tomas_V,
We would like to know if you have noticed any other weird behavior with the SSD, or if you can confirm this situation as resolved.
We will be waiting for your response.
Regards,
NC