[Devel] RE: dpt_i2o: cycle with interrupts disabled

Salyzyn, Mark mark_salyzyn at adaptec.com
Tue Oct 3 10:17:50 PDT 2006


This is a sign of a serious hardware problem. The timeout of this loop
was set to 30 seconds back when the NMI watchdog was set to 60 seconds.
Now that the watchdog is 30 seconds, I recommend you drop the loop
timeout to 20 seconds so that it expires before the watchdog fires.
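
The timing argument above can be sketched as a small userspace model. This
is not the driver code: struct fake_hba, fake_readl() and poll_for_frame()
are hypothetical stand-ins, the HZ value and the EMPTY_QUEUE value are
assumptions made for the model, and each poll is assumed to cost one tick
of a simulated clock. It only illustrates that a 20 second deadline gives
up before a 30 second watchdog window has elapsed.

```c
#include <errno.h>
#include <stdint.h>

#define EMPTY_QUEUE       0xffffffffu /* assumed value for this model */
#define HZ                100         /* simulated ticks per second (assumption) */
#define WATCHDOG_SECS     30          /* NMI watchdog window from the discussion */
#define POST_TIMEOUT_SECS 20          /* suggested loop timeout */
#define NEVER_READY       (~0UL)      /* models an adapter that never answers */

/* Hypothetical stand-in for the adapter and the kernel clock. */
struct fake_hba {
    unsigned long now;      /* simulated jiffies */
    unsigned long ready_at; /* tick at which a message frame becomes available */
    uint32_t frame;         /* frame offset returned once ready */
};

/* Stand-in for readl(pHba->post_port); every poll advances the clock by one tick. */
static uint32_t fake_readl(struct fake_hba *hba)
{
    hba->now++;
    return hba->now >= hba->ready_at ? hba->frame : EMPTY_QUEUE;
}

/*
 * Poll until a frame is available or timeout_secs have passed.
 * Returns 0 and stores the frame in *out, or -ETIMEDOUT.
 */
int poll_for_frame(struct fake_hba *hba, unsigned int timeout_secs, uint32_t *out)
{
    unsigned long deadline = hba->now + (unsigned long)timeout_secs * HZ;

    for (;;) {
        uint32_t m = fake_readl(hba);

        if (m != EMPTY_QUEUE) {
            *out = m;
            return 0;
        }
        /* Wrap-safe "now is after deadline" comparison. */
        if ((long)(hba->now - deadline) > 0)
            return -ETIMEDOUT;
    }
}
```

In this model, an adapter with ready_at set to NEVER_READY makes
poll_for_frame() return -ETIMEDOUT while the simulated clock is still below
WATCHDOG_SECS * HZ, which is the point of keeping the loop timeout shorter
than the watchdog period.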

The serious hardware problem will then turn from a panic into a slightly
more graceful failure: the driver can no longer talk to the adapter
because it is unresponsive, and the devices will all go offline. I
suggest you look into why the adapter is failing on your system; check
the power supply, PCI bridge, motherboard, or the card itself. I have no
body of experience as to why you may see this failure, but you may wish
to contact Adaptec Technical Support, as they may have some sage advice.

Sincerely -- Mark Salyzyn


> -----Original Message-----
> From: Vasily Averin [mailto:vvs at sw.ru] 
> Sent: Tuesday, October 03, 2006 5:29 AM
> To: Salyzyn, Mark
> Cc: devel at openvz.org
> Subject: dpt_i2o: cycle with interrupts disabled
> 
> 
> Mark,
> 
> I would like to tell you that we have included your driver into our
> kernels. Unfortunately it does not work well, and our customers who
> tried to use it instead of the i2o_block driver complain about node
> lockups. The error messages we have received show that the NMI
> watchdog detected your driver looping in the following cycle for up
> to 30 sec with interrupts disabled:
> 
> scsi_dispatch_cmd()    (spin_lock_irqsave(host->host_lock, flags);)
>  host->hostt->queuecommand() == adpt_queue()
>     adpt_scsi_to_i2o()
>       adpt_i2o_post_this():
> ...
>         ulong timeout = jiffies + 30*HZ;
>         do {
>                 rmb();
>                 m = readl(pHba->post_port);
>                 if (m != EMPTY_QUEUE) {
>                         break;
>                 }
>                 if(time_after(jiffies,timeout)){
>                         printk(KERN_WARNING"dpti%d: Timeout waiting for message frame!\n", pHba->unit);
>                         return -ETIMEDOUT;
>                 }
>         } while(m == EMPTY_QUEUE);
> ...
> 
> Do you have any ideas on how to fix this issue properly?
> 
> Thank you,
> 	Vasily Averin
> 
> SWsoft Virtuozzo/OpenVZ Linux kernel team
> 



