SSD 760P get error on Linux with nvme PCI

jji5
New Contributor

I have a problem and would like some help.

Linux version: 4.9.37

The hardware is an SoC with an ARM CPU inside.

When it boots up, the following error is reported:

irq 45: nobody cared (try booting with the "irqpoll" option)

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.37 #9

Hardware name: xxxDEMO Board (DT)

Call trace:

[] dump_backtrace+0x0/0x198

[] show_stack+0x14/0x20

[] dump_stack+0x94/0xb8

[] __report_bad_irq+0x38/0xe8

[] note_interrupt+0x20c/0x2e0

[] handle_irq_event_percpu+0x44/0x58

[] handle_irq_event+0x44/0x78

[] handle_fasteoi_irq+0xb4/0x1c0

[] generic_handle_irq+0x24/0x38

[] __handle_domain_irq+0x5c/0xb0

[] gic_handle_irq+0x64/0xc0

Exception stack(0xffffffc023b7ee10 to 0xffffffc023b7ef40)

ee00: ffffffc023b7ee40 0000007fffffffff

ee20: ffffffc023b7ef70 ffffff800809fe3c 0000000040000005 ffffff8008880000

ee40: 0000000000000000 0000000000000000 00000000fffb6edd ffffff800851d318

ee60: 000000000ccccccd 0000000000000020 0000000003687eb1 0000000000000066

ee80: 0000000000000008 0000000200000000 0000000000000002 7fffffffffffffff

eea0: 0000000000000000 0000000029a9e4c0 000000000000c350 0000000000000033

eec0: 0000000000000019 0000000000000001 0000000000000007 ffffff8008858000

eee0: ffffff8008856b08 0000000000000000 ffffff80088dc600 ffffffc022408000

ef00: ffffff8008880000 00000000fffb6edb ffffffc023b7f090 ffffff800889a136

ef20: 0000000000000082 ffffffc023b7ef70 ffffff80080a025c ffffffc023b7ef70

[] el1_irq+0xac/0x140

[] irq_exit+0x94/0xb8

[] __handle_domain_irq+0x60/0xb0

[] gic_handle_irq+0x64/0xc0

Exception stack(0xffffff8008883df0 to 0xffffff8008883f20)

3de0: 0000000000000000 0000000000000000

3e00: ffffffc023b7fbcc 000000401b329000 0000000000000080 0100000000000000

3e20: 0000000000000155 00000000fffb6ed5 ffffff800888d300 ffffff8008880000

3e40: 0000000000000820 ffffff800885a000 ffffffc021c90080 0000000000000002

3e60: 0000000000000001 dead000000000100 0000000000000019 0000000000000001

3e80: ffffff80088e4578 ffffff8008880000 ffffff8008887240 ffffff80088871a8

3ea0: 0000000000000001 ffffff8008880000 ffffff8008880000 0000000000000001

3ec0: ffffff8008887000 ffffff800889a136 0000000044820018 ffffff8008883f20

3ee0: ffffff8008084eac ffffff8008883f20 ffffff8008084eb0 0000000060000005

3f00: ffffff8008883f20 ffffff8008653cbc ffffffffffffffff ffffff80080d3cf4

[] el1_irq+0xac/0x140

[] arch_cpu_idle+0x10/0x18

[] cpu_startup_entry+0xd0/0x140

[] rest_init+0x6c/0x78

[] start_kernel+0x2dc/0x2f0

[] __primary_switched+0x5c/0x64

handlers:

[] nvme_irq

Disabling IRQ #45
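
For readers unfamiliar with this message, here is a simplified model of how the kernel decides that "nobody cared" about an interrupt line. It is only a sketch: the thresholds (a window of 100,000 interrupts, more than 99,900 of them unhandled) are what I recall from kernel/irq/spurious.c and should be checked against the 4.9.37 source, the real code also ages the unhandled counter over time, and the names `irq_stats`, `note_interrupt_model` and `interrupts_until_disabled` are made up for this example.

```c
#include <stdbool.h>

/* Assumed thresholds, recalled from kernel/irq/spurious.c (verify in your tree). */
#define IRQ_WINDOW          100000u
#define IRQ_UNHANDLED_LIMIT  99900u

/* Hypothetical stand-in for the counters the kernel keeps per IRQ line. */
struct irq_stats {
    unsigned int irq_count;      /* interrupts seen in the current window */
    unsigned int irqs_unhandled; /* of those, how many no handler claimed */
    bool disabled;               /* set once the line has been shut off   */
};

/*
 * Record one interrupt. 'handled' is false when every registered handler
 * (here that would be nvme_irq) reported that the interrupt was not its own.
 * Returns true once the line has been disabled.
 */
static bool note_interrupt_model(struct irq_stats *s, bool handled)
{
    if (s->disabled)
        return true;
    if (!handled)
        s->irqs_unhandled++;
    if (++s->irq_count < IRQ_WINDOW)
        return false;

    /* End of a window: decide, then start counting again. */
    s->irq_count = 0;
    if (s->irqs_unhandled > IRQ_UNHANDLED_LIMIT)
        s->disabled = true;      /* "Disabling IRQ #NN" */
    s->irqs_unhandled = 0;
    return s->disabled;
}

/* Number of interrupts it takes until the line is disabled, or 0 if never. */
static unsigned int interrupts_until_disabled(bool handler_claims_irq,
                                              unsigned int max)
{
    struct irq_stats s = {0};

    for (unsigned int i = 1; i <= max; i++)
        if (note_interrupt_model(&s, handler_claims_irq))
            return i;
    return 0;
}
```

In this model a handler that never claims its interrupt gets the line disabled at the end of the first window, while a handler that always claims it never does. That matches the trace above, where nvme_irq is the only handler listed for the line.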

However, an SSD from another brand, 'KingBand', works well on this platform.

Any reply will be appreciated.

Regards,

JiaGang

7 REPLIES

idata
Esteemed Contributor III

Hello Jijiagang.

Thank you for contacting Intel Technical Support. As we understand, you are requesting support for your Intel® SSD 760p Series. To begin diagnosis and troubleshooting, we would appreciate it if you could reply to this post with the following basic information:
  • System Integration (describe how your system is integrated; please include the manufacturer and model of all the components and the operating system)
  • Troubleshooting you have already done.
  • Steps to reproduce your issue (BIOS settings, specific OS configuration)
We look forward to your reply.

Best regards,
Josh B.
Intel Customer Support.

jji5
New Contributor

Hi Josh,

Thanks for your reply.

Yes, it's the 128GB model of the SSD 760P series.

  • System Integration (describe how your system is integrated; please, include the manufacturer and model of all the components and the operating system)

It's an embedded system. The SoC is designed by ourselves and has a PCIe controller. We run Linux 4.9.37 on this platform, and we selected these options in the kernel configuration:

<*> NVM Express block device

[*] SCSI emulation for NVMe device nodes

<*> NVMe Target support

<*> NVMe loopback device support

  • Troubleshooting you have already done.

By adding debug code, we found that the driver never reads the expected value from one status field, which then leads to the interrupt error.

Below is the debug output for the Intel SSD (ERROR) and the KingBand SSD (TRUE).

ERROR:

# nvmeq->cqes[0].status = 0, phase = 1

# nvmeq->cqes[0].status = 0, phase = 1

# (le16_to_cpu(nvmeq->cqes[0].status) & 1) = 0

# (le16_to_cpu(nvmeq->cqes[0].status) & 1) = 0

------------------__nvme_process_cq,731--------------------

TRUE:

# nvmeq->cqes[0].status = 0, phase = 1

# nvmeq->cqes[0].status = 1, phase = 1

# (le16_to_cpu(nvmeq->cqes[0].status) & 1) = 1

# (le16_to_cpu(nvmeq->cqes[0].status) & 1) = 1

The position of the debug code:

/* We read the CQE phase first to check if the rest of the entry is valid */
static inline bool nvme_cqe_valid(struct nvme_queue *nvmeq, u16 head,
		u16 phase)
{
	printk("# nvmeq->cqes[%hd].status = %hd, phase = %hd\n",head, nvmeq->cqes[head].status, phase);
	asm("nop");
	printk("# nvmeq->cqes[%hd].status = %hd, phase = %hd\n",head, nvmeq->cqes[head].status, phase);
	asm("nop");
	asm("nop");
	printk("# (le16_to_cpu(nvmeq->cqes[%hd].status) & 1) = %hd\n",head,((le16_to_cpu(nvmeq->cqes[head].status)) & 1));
	asm("nop");
	printk("# (le16_to_cpu(nvmeq->cqes[%hd].status) & 1) = %hd\n",head,((le16_to_cpu(nvmeq->cqes[head].status)) & 1));
	return (le16_to_cpu(nvmeq->cqes[head].status) & 1) == phase;
}

volatile struct nvme_completion *cqes;

According to the log, with the Intel SSD the phase bit (bit 0) of nvmeq->cqes[head].status never reads 1, and this is the status reported by the SSD.
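
To make the check in nvme_cqe_valid easier to follow, here is a small user-space sketch of the phase-bit logic. It is not the driver code: `cqe_model`, `cq_model`, `cqe_valid` and `process_cq` are names invented for this example, the entry layout is only meant to resemble a 16-byte NVMe completion entry, and it assumes a little-endian host so that le16_to_cpu can be left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative completion entry; only 'status' matters for this sketch. */
struct cqe_model {
    uint32_t result;
    uint32_t rsvd;
    uint16_t sq_head;
    uint16_t sq_id;
    uint16_t command_id;
    uint16_t status;     /* bit 0 is the phase tag written by the device */
};

/* Illustrative host-side view of one completion queue. */
struct cq_model {
    volatile struct cqe_model *cqes; /* ring the device writes into     */
    uint16_t depth;                  /* number of entries in the ring   */
    uint16_t head;                   /* next entry the host will look at */
    uint8_t  phase;                  /* phase value the host expects     */
};

/* An entry is new only if its phase bit equals the expected phase. */
static bool cqe_valid(const struct cq_model *q)
{
    return (q->cqes[q->head].status & 1) == q->phase;
}

/* Consume every valid entry and return how many were found. */
static unsigned int process_cq(struct cq_model *q)
{
    unsigned int seen = 0;

    assert(q->depth > 0);
    while (cqe_valid(q)) {
        seen++;
        if (++q->head == q->depth) {
            q->head = 0;
            q->phase ^= 1;   /* expected phase flips on every wrap */
        }
    }
    return seen;
}
```

With a zeroed ring and an expected phase of 1, `process_cq` finds nothing, which corresponds to the ERROR log (status = 0, phase = 1). Once entry 0 has status bit 0 set, one completion is consumed, which corresponds to the TRUE log.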

  • Steps to reproduce your issue (BIOS settings, specific OS configuration)

On our platform, this error is reported every time the system boots. The KingBand SSD works well, so I think the Intel SSD may need a quirk, like this code in drivers/nvme/host/pci.c:

static const struct pci_device_id nvme_id_table[] = {
	{ PCI_VDEVICE(INTEL, 0x0953),
		.driver_data = NVME_QUIRK_STRIPE_SIZE |
				NVME_QUIRK_DISCARD_ZEROES, },
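
For context, the following stand-alone sketch shows how a table like this attaches quirk flags to a specific device. It is a simplified model, not the kernel's matching code: `id_entry`, `id_table` and `lookup_quirks` are hypothetical names, the two `QUIRK_*` bit values are illustrative rather than copied from drivers/nvme/host/nvme.h, and the only real data is the Intel vendor ID 0x8086 together with the device ID 0x0953 quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits; the real values are defined by the kernel. */
#define QUIRK_STRIPE_SIZE    (1ul << 0)
#define QUIRK_DISCARD_ZEROES (1ul << 2)

/* Hypothetical, cut-down version of a PCI ID table entry. */
struct id_entry {
    uint16_t vendor;
    uint16_t device;
    unsigned long driver_data;   /* quirk flags for this device */
};

static const struct id_entry id_table[] = {
    { 0x8086, 0x0953, QUIRK_STRIPE_SIZE | QUIRK_DISCARD_ZEROES },
    { 0, 0, 0 }                  /* terminator */
};

/* Return the quirk flags for a device, or 0 if it has no table entry. */
static unsigned long lookup_quirks(uint16_t vendor, uint16_t device)
{
    for (size_t i = 0; id_table[i].vendor != 0; i++) {
        if (id_table[i].vendor == vendor && id_table[i].device == device)
            return id_table[i].driver_data;
    }
    return 0;
}
```

A device without its own entry simply gets no quirk flags in this model. As far as I know, the real driver still binds to such devices through a generic NVMe class match, so a missing entry by itself does not stop the driver from loading.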

please give us a hand, thanks.

Best regards,

jijiagang

idata
Esteemed Contributor III

Hello Jijiagang.

Thank you for your reply. As we understand, you are trying to use your Intel® SSD 760p Series 128GB, M.2 80mm, PCIe NVMe 3.1 x4, 3D2, TLC on an embedded system whose SoC was custom designed by you and your team, and this platform is running Linux 4.9.37. Please take the following information into consideration:
  • Please refer to the Intel® Product Compatibility Tool at http://compatibleproducts.intel.com/ to verify the list of motherboards and systems tested and validated by us to work with your SSD 760p Series.
  • Where can I find drivers for my Intel® SSD 760p / Pro 7600p Series with PCI Express* NVMe*? For Windows*: Client NVMe* Microsoft Windows* Driver for Intel® SSDs. For Linux*: Consult your OS vendor for more information about available NVMe drivers and support.
  • Can I install Linux on the Intel SSD 760p / Pro 7600p Series? We have tested boot functionality on a subset of Linux operating systems. Functionality on all Linux-based OS is not guaranteed. If you encounter an issue, refer to the Linux vendor support page or contact support.
Thank you for your patience and understanding.

Best regards,
Josh B.
Intel Customer Support.

jji5
New Contributor

Hi Josh,

Thanks for your reply.

However, I still don't know how to resolve my problem.

We took the Linux kernel from the open source community and ported it to our platform. The KingBand SSD works, but the 760P does not.

Could you please give us some advice, such as code to add or registers to check?

The website you pointed me to doesn't describe how to add a Linux driver to support this SSD.

I would like further help from you.

Best regards.

Jijiagang