[Devel] [PATCH RHEL7 COMMIT] ms/RDMA/rdma_cm: Fix use after free race with process_one_req
Konstantin Khorenko
khorenko at virtuozzo.com
Tue Nov 27 11:55:37 MSK 2018
The commit is pushed to "branch-rh7-3.10.0-862.20.2.vz7.73.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-862.20.2.vz7.73.10
------>
commit 4e1c878a6b9232968352fb4cf14d3b13e3e24dfa
Author: Jason Gunthorpe <jgg at mellanox.com>
Date: Thu Mar 22 14:04:23 2018 -0600
ms/RDMA/rdma_cm: Fix use after free race with process_one_req
process_one_req() can race with rdma_addr_cancel():
CPU0 CPU1
==== ====
process_one_work()
debug_work_deactivate(work);
process_one_req()
rdma_addr_cancel()
mutex_lock(&lock);
set_timeout(&req->work,..);
__queue_work()
debug_work_activate(work);
mutex_unlock(&lock);
mutex_lock(&lock);
[..]
list_del(&req->list);
mutex_unlock(&lock);
[..]
// ODEBUG explodes since the work is still queued.
kfree(req);
Causing ODEBUG to detect the use after free:
ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165
WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288
kvm: emulating exchange as write
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ib_addr process_one_req
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x1f4/0x2b0 lib/bug.c:186
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
RSP: 0000:ffff8801d966f210 EFLAGS: 00010086
RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff815acd6e
RDX: 0000000000000000 RSI: 1ffff1003b2cddf2 RDI: 0000000000000000
RBP: ffff8801d966f250 R08: 0000000000000000 R09: 1ffff1003b2cddc8
R10: ffffed003b2cde71 R11: ffffffff86f39a98 R12: 0000000000000001
R13: ffffffff86f15540 R14: ffffffff86408700 R15: ffffffff8147c0a0
__debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
kfree+0xc7/0x260 mm/slab.c:3799
process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592
process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Fixes: 5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC")
Reported-by: <syzbot+3b4acab09b6463472d0a at syzkaller.appspotmail.com>
Signed-off-by: Jason Gunthorpe <jgg at mellanox.com>
https://pmc.acronis.com/browse/VSTOR-18115
(cherry picked from commit 9137108cc3d64ade13e753108ec611a0daed16a0)
Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
---
drivers/infiniband/core/addr.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 89a60ba20621..3c1b876a3458 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -597,6 +597,15 @@ static void process_one_req(struct work_struct *_work)
list_del(&req->list);
mutex_unlock(&lock);
+ /*
+ * Although the work will normally have been canceled by the
+ * workqueue, it can still be requeued as long as it is on the
+ * req_list, so it could have been requeued before we grabbed &lock.
+ * We need to cancel it after it is removed from req_list to really be
+ * sure it is safe to free.
+ */
+ cancel_delayed_work(&req->work);
+
req->callback(req->status, (struct sockaddr *)&req->src_addr,
req->addr, req->context);
put_client(req->client);
More information about the Devel
mailing list