[Devel] Re: dpt_i2o: cycle with interrupts disabled

Vasily Averin vvs at sw.ru
Wed Oct 4 02:16:14 PDT 2006


Hello Mark,

Of course I cannot exclude hardware issues. However, I would note that your
driver is incorrect.
First of all, the NMI watchdog has a 5-second timeout, and it correctly detects
the busy loop in your driver. OK, we could make the timeout in your driver
shorter than 5 seconds, but it seems to me that having such loops at all is a
bad idea. If you find that the hardware is not ready, you can return an error to
the SCSI midlayer. If you want to wait for some time, you can release the locks,
enable interrupts, and go to sleep, freeing the CPU for other processes.
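
To illustrate the first option, here is a minimal user-space sketch, not
driver code. It assumes a simulated post port (fake_hba, fake_readl) and
hypothetical status codes POST_OK and POST_BUSY; the value of EMPTY_QUEUE is
also an assumption made for the sketch. The poll is bounded by a small
iteration count instead of a 30-second deadline, and the caller gets a "busy"
result that it can hand back to the layer above for a later retry.

```c
#include <stdint.h>

/* Assumed value for this sketch; the real definition lives in the driver. */
#define EMPTY_QUEUE 0xffffffffu

/* Hypothetical status codes, not taken from the driver. In a real
 * queuecommand() path, POST_BUSY would correspond to a "host busy" style
 * return so that the SCSI midlayer requeues the command. */
#define POST_OK   0
#define POST_BUSY 1

/* Simulated adapter: the post port reports EMPTY_QUEUE until it has been
 * read more than ready_after times. */
struct fake_hba {
    int reads;        /* number of reads performed so far */
    int ready_after;  /* reads that still return EMPTY_QUEUE */
};

/* Stand-in for readl(pHba->post_port). */
static uint32_t fake_readl(struct fake_hba *hba)
{
    hba->reads++;
    return (hba->reads > hba->ready_after) ? 0x1000u : EMPTY_QUEUE;
}

/* Poll the post port at most max_polls times. On success, store the message
 * frame offset in *frame and return POST_OK. Otherwise return POST_BUSY
 * immediately instead of spinning until a long timeout expires. */
static int post_frame_bounded(struct fake_hba *hba, int max_polls,
                              uint32_t *frame)
{
    for (int i = 0; i < max_polls; i++) {
        uint32_t m = fake_readl(hba);
        if (m != EMPTY_QUEUE) {
            *frame = m;
            return POST_OK;
        }
    }
    return POST_BUSY;
}
```

With this shape, the time spent with interrupts disabled is limited by
max_polls register reads, and the decision about when to retry is left to the
caller rather than to a loop inside the locked section.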

Thank you,
	Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team

Salyzyn, Mark wrote:
> This is a sign of a serious hardware problem. The timeout of this loop
> was set to 30 seconds when the NMI watchdog used to be set to 60
> seconds. Now that it is 30 seconds, I recommend you drop the loop
> timeout to 20 seconds so that it times out before the system notices.
> 
> The serious hardware problem will then turn from a panic into a slightly
> more graceful failure to talk to the adapter as it is no longer
> responsive and the devices will all go offline. I suggest you look into
> why the Adapter is failing on your system; look into Power Supply, PCI
> bridge, Motherboard or the Card itself. I have no body of experience as
> to why you may see this failure, but you may wish to contact Adaptec
> Technical support as they may have some sage advice.
> 
> Sincerely -- Mark Salyzyn
> 
> 
>> -----Original Message-----
>> From: Vasily Averin [mailto:vvs at sw.ru] 
>> Sent: Tuesday, October 03, 2006 5:29 AM
>> To: Salyzyn, Mark
>> Cc: devel at openvz.org
>> Subject: dpt_i2o: cycle with interrupts disabled
>>
>>
>> Mark,
>>
>> I would like to tell you that we have included your driver into our kernels.
>> Unfortunately it does not work well, and our customers who tried to use it
>> instead of the i2o_block driver complain about node lockups.
>> The error messages we have received show that the NMI watchdog detected your
>> driver looping in the following cycle for up to 30 sec with interrupts disabled:
>>
>> scsi_dispatch_cmd()    (spin_lock_irqsave(host->host_lock, flags);)
>>  host->hostt->queuecommand() == adpt_queue()
>>     adpt_scsi_to_i2o()
>>       adpt_i2o_post_this():
>> ...
>>         ulong timeout = jiffies + 30*HZ;
>>         do {
>>                 rmb();
>>                 m = readl(pHba->post_port);
>>                 if (m != EMPTY_QUEUE) {
>>                         break;
>>                 }
>>                 if(time_after(jiffies,timeout)){
>>                         printk(KERN_WARNING"dpti%d: Timeout waiting for message frame!\n", pHba->unit);
>>                         return -ETIMEDOUT;
>>                 }
>>         } while(m == EMPTY_QUEUE);
>> ...
>>
>> Do you perhaps have any ideas on how to fix this issue properly?
>>
>> Thank you,
>> 	Vasily Averin
>>
>> SWsoft Virtuozzo/OpenVZ Linux kernel team
>>
> 
> 



