[Devel] [PATCH rh7 2/2] block: suppress hard lockup warning in elv_drain_elevator

Dmitry Monakhov dmonakhov at openvz.org
Sat Mar 19 05:12:08 PDT 2016


Vladimir Davydov <vdavydov at virtuozzo.com> writes:

> Under heavy I/O, sg_io() might keep busy-looping in elv_drain_elevator()
> with queue_lock held and IRQs disabled for quite a while, trying to
> insert a request at the tail of the queue, resulting in a hard lockup:
ACK
>
>   Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 1
>   CPU: 1 PID: 642 Comm: smartd ve: 0 Not tainted 3.10.0-327.3.1.vz7.10.13 #1 10.13
>   Hardware name:   Intel Corporation Blackford & ESB2 Chipset/Blackford CRB, BIOS F8  01/24/2007
>    ffff88042fc45c40 00000000b0b67a0e ffff88042fc45af0 ffffffff81630786
>    ffff88042fc45b70 ffffffff8162ac09 0000000000000010 ffff88042fc45b80
>    ffff88042fc45b20 00000000b0b67a0e ffffffff8101cc99 0000000000000001
>   Call Trace:
>    <NMI>  [<ffffffff81630786>] dump_stack+0x19/0x1b
>    [<ffffffff8162ac09>] panic+0xd8/0x1e7
>    [<ffffffff8101cc99>] ? sched_clock+0x9/0x10
>    [<ffffffff8112c610>] ? restart_watchdog_hrtimer+0x50/0x50
>    [<ffffffff8112c6d2>] watchdog_overflow_callback+0xc2/0xd0
>    [<ffffffff81171411>] __perf_event_overflow+0xa1/0x250
>    [<ffffffff81171ee4>] perf_event_overflow+0x14/0x20
>    [<ffffffff81032e98>] intel_pmu_handle_irq+0x1e8/0x470
>    [<ffffffff8101cc45>] ? native_sched_clock+0x35/0x80
>    [<ffffffff810bf30d>] ? sched_clock_local+0x1d/0x80
>    [<ffffffff8163a18b>] perf_event_nmi_handler+0x2b/0x50
>    [<ffffffff816398d9>] nmi_handle.isra.0+0x69/0xb0
>    [<ffffffff816399f0>] do_nmi+0xd0/0x340
>    [<ffffffff81638cb1>] end_repeat_nmi+0x1e/0x2e
>    [<ffffffff812bec77>] ? elv_dispatch_sort+0x87/0xe0
>    [<ffffffff812bec77>] ? elv_dispatch_sort+0x87/0xe0
>    [<ffffffff812bec77>] ? elv_dispatch_sort+0x87/0xe0
>    <<EOE>>  [<ffffffff812e74d8>] cfq_dispatch_insert+0x158/0x280
>    [<ffffffff8101002f>] ? perf_trace_xen_mmu_set_pud+0x19f/0x1b0
>    [<ffffffff810b7960>] ? finish_task_switch+0xe0/0x180
>    [<ffffffff812ea427>] cfq_dispatch_requests+0xae7/0xc20
>    [<ffffffff812e9252>] ? cfq_set_request+0xa2/0x440
>    [<ffffffff811db350>] ? kmem_cache_alloc+0xf0/0x220
>    [<ffffffff8117e545>] ? mempool_alloc_slab+0x15/0x20
>    [<ffffffff812bf1f2>] elv_drain_elevator+0x22/0x70
>    [<ffffffff812bf2fb>] __elv_add_request+0xbb/0x2d0
>    [<ffffffff812cb82d>] blk_execute_rq_nowait+0xad/0x180
>    [<ffffffff812c544a>] ? get_request+0x39a/0x780
>    [<ffffffff812cb98b>] blk_execute_rq+0x8b/0x150
>    [<ffffffff810a8720>] ? wake_up_atomic_t+0x30/0x30
>    [<ffffffff8108951e>] ? ns_capable+0x2e/0x60
>    [<ffffffff810895a7>] ? capable+0x17/0x20
>    [<ffffffff812d774b>] sg_io+0x27b/0x450
>    [<ffffffff812d8017>] scsi_cmd_ioctl+0x337/0x4d0
>    [<ffffffff811e8bbc>] ? __memcg_kmem_get_cache+0x4c/0x150
>    [<ffffffff812d81f2>] scsi_cmd_blk_ioctl+0x42/0x50
>    [<ffffffffa00c4a5e>] sd_ioctl+0xbe/0x140 [sd_mod]
>    [<ffffffff812d480f>] blkdev_ioctl+0x2df/0x770
>    [<ffffffff81236ff1>] block_ioctl+0x41/0x50
>    [<ffffffff8120de15>] do_vfs_ioctl+0x255/0x4f0
>    [<ffffffff81218e17>] ? __fd_install+0x47/0x60
>    [<ffffffff8120e104>] SyS_ioctl+0x54/0xa0
>    [<ffffffff81640ec9>] system_call_fastpath+0x16/0x1b
>
> There's nothing critical about it - the system should recover sooner or
> later - so let's suppress the watchdog there.
>
> https://jira.sw.ru/browse/PSBM-44480
>
> Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
> ---
>  block/cfq-iosched.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index af66fadb9270..85e091cf2f68 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -14,6 +14,7 @@
>  #include <linux/rbtree.h>
>  #include <linux/ioprio.h>
>  #include <linux/blktrace_api.h>
> +#include <linux/nmi.h>
>  #include <bc/io_acct.h>
>  
>  #include "blk.h"
> @@ -3237,6 +3238,7 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
>  	while ((cfqq = cfq_get_next_queue_forced(cfqd)) != NULL) {
>  		__cfq_set_active_queue(cfqd, cfqq);
>  		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
> +		touch_nmi_watchdog();
>  	}
>  
>  	BUG_ON(cfqd->busy_queues);
> -- 
> 2.1.4
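
The effect of calling touch_nmi_watchdog() once per loop iteration can be
modeled in userspace. What follows is a minimal sketch, not kernel code:
struct sim_watchdog, sim_touch(), sim_check() and sim_drain() are
hypothetical names, and the tick-based timing is an assumption that greatly
simplifies how the real hard lockup detector works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a lockup watchdog (illustration only). */
struct sim_watchdog {
	uint64_t last_touch;	/* simulated tick of the most recent touch */
	uint64_t threshold;	/* ticks allowed without a touch */
	bool fired;		/* set once the threshold is exceeded */
};

/* Stand-in for touch_nmi_watchdog(): record that progress was made. */
static void sim_touch(struct sim_watchdog *wd, uint64_t now)
{
	wd->last_touch = now;
}

/* Stand-in for the periodic NMI check: fire if nobody touched in time. */
static void sim_check(struct sim_watchdog *wd, uint64_t now)
{
	if (now - wd->last_touch > wd->threshold)
		wd->fired = true;
}

/*
 * Model a forced dispatch loop that drains nr_queues queues, each costing
 * cost_per_queue ticks, with the check running on every tick.
 * Returns true if the simulated watchdog fired.
 */
static bool sim_drain(uint64_t nr_queues, uint64_t cost_per_queue,
		      uint64_t threshold, bool touch_in_loop)
{
	struct sim_watchdog wd = { .last_touch = 0, .threshold = threshold,
				   .fired = false };
	uint64_t now = 0;

	assert(cost_per_queue > 0);

	for (uint64_t q = 0; q < nr_queues; q++) {
		/* Work for one queue; the watchdog check runs meanwhile. */
		for (uint64_t t = 0; t < cost_per_queue; t++) {
			now++;
			sim_check(&wd, now);
		}
		/* Mirrors the patch: touch once per drained queue. */
		if (touch_in_loop)
			sim_touch(&wd, now);
	}
	return wd.fired;
}
```

In this model, sim_drain(100, 5, 10, false) reports a fired watchdog, while
sim_drain(100, 5, 10, true) does not. A single queue costing 50 ticks,
sim_drain(1, 50, 10, true), still fires: the touch only helps when each
iteration of the loop is itself bounded.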