[Devel] [PATCH RHEL7 COMMIT] ms/netfilter: ipset: Missing gc cancellations fixed

Konstantin Khorenko khorenko at virtuozzo.com
Mon Sep 30 15:21:54 MSK 2024


The commit is pushed to "branch-rh7-3.10.0-1160.119.1.vz7.224.x-ovz" and will appear at git at bitbucket.org:openvz/vzkernel.git
after rh7-3.10.0-1160.119.1.vz7.224.3
------>
commit d2765c1d349cd27eb5122089fc5018289dd27f42
Author: Jozsef Kadlecsik <kadlec at netfilter.org>
Date:   Wed Sep 25 17:35:42 2024 +0800

    ms/netfilter: ipset: Missing gc cancellations fixed
    
    The patch fdb8e12cc2cc ("netfilter: ipset: fix performance regression
    in swap operation") missed to add the calls to gc cancellations
    at the error path of create operations and at module unload. Also,
    because the half of the destroy operations now executed by a
    function registered by call_rcu(), neither NFNL_SUBSYS_IPSET mutex
    or rcu read lock is held and therefore the checking of them results
    false warnings.
    
    Fixes: 97f7cf1cd80e ("netfilter: ipset: fix performance regression in swap operation")
    Reported-by: syzbot+52bbc0ad036f6f0d4a25 at syzkaller.appspotmail.com
    Reported-by: Brad Spengler <spender at grsecurity.net>
    Reported-by: Стас Ничипорович <stasn77 at gmail.com>
    Tested-by: Brad Spengler <spender at grsecurity.net>
    Tested-by: Стас Ничипорович <stasn77 at gmail.com>
    Signed-off-by: Jozsef Kadlecsik <kadlec at netfilter.org>
    Signed-off-by: Pablo Neira Ayuso <pablo at netfilter.org>
    
    https://virtuozzo.atlassian.net/browse/PSBM-155867
    (cherry picked from commit 27c5a095e2518975e20a10102908ae8231699879)
    Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    =================
    Patchset description:
    netfilter: ipset: Fix possible cause of memory corruption
    
    Patch [1] fixes possible race between swap/destroy and add/del/test.
    
    Here is is possible order of events when this race can lead to double
    free with kfree_rcu on already freed hash bucket:
    
     # Thread 1
      +-> ip_set_add
        +-> set = ip_set_rcu_get(xt_net(par), index)
        < pause >
    
     # Thread 2
    ipset swap
    ipset destroy
      +-> mtype_destroy
        +-> mtype_ahash_destroy
          +-> n = __ipset_dereference(hbucket(t, i))
          +-> kfree(n)
    
     # Thread 1
        < unpause >
        +-> ip_set_lock(set)
        +-> hash_net4_kadt
          +-> mtype_variant->adt[adt]
            +-> net4_kadt_add (mtype_add)
              +-> n = rcu_dereference_bh(hbucket(t, key))
              +-> if (n->pos >= n->size)
                +-> old = n
              +-> if (old != ERR_PTR(-ENOENT))
                +-> kfree_rcu(old, rcu)
    
    That can in it's turn lead to possible rcu free list corruption if this
    double fried memory is reused just after rcu_free and before actual rcu
    callback.
    
    note1: The patch [1] has a reproducer but, sadly, I was unable to
    reproduce the situation, even adding mdelay(100) to ip_set_add/del/test.
    
    note2: All other patches are fixups to the original fixing patch.
    
    note3: We don't have proof that this a fix to original issue, we only
    know that ipset hash buckets were several times seen double freed just
    before crash, which might indirectly indicate that the original problem
    is related to ipset hash buckets.
    
    Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    https://virtuozzo.atlassian.net/browse/PSBM-155867
    
    Alexander Maltsev (1):
      netfilter: ipset: Add list flush to cancel_gc
    
    Eric Dumazet (1):
      netns: add pre_exit method to struct pernet_operations
    
    Jozsef Kadlecsik (5):
      netfilter: ipset: fix race condition between swap/destroy and kernel
        side add/del/test [1]
      netfilter: ipset: fix performance regression in swap operation
      netfilter: ipset: Missing gc cancellations fixed
      netfilter: ipset: Fix race between namespace cleanup and gc in the
        list:set type
      netfilter: ipset: Fix suspicious rcu_dereference_protected()
---
 net/netfilter/ipset/ip_set_core.c     | 2 ++
 net/netfilter/ipset/ip_set_hash_gen.h | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 7cf9da0881d5..41fb34167a15 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1010,6 +1010,7 @@ ip_set_create(struct sock *ctnl, struct sk_buff *skb,
 	return ret;
 
 cleanup:
+	set->variant->cancel_gc(set);
 	set->variant->destroy(set);
 put_out:
 	module_put(set->type->me);
@@ -2253,6 +2254,7 @@ ip_set_net_exit(struct net *net)
 		set = ip_set(inst, i);
 		if (set) {
 			ip_set(inst, i) = NULL;
+			set->variant->cancel_gc(set);
 			ip_set_destroy_set(set);
 		}
 	}
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 9c6383592c20..a6a6719a24c5 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -464,7 +464,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
 	u32 i;
 
 	for (i = 0; i < jhash_size(t->htable_bits); i++) {
-		n = __ipset_dereference(hbucket(t, i));
+		n = (__force struct hbucket *)hbucket(t, i);
 		if (!n)
 			continue;
 		if (set->extensions & IPSET_EXT_DESTROY && ext_destroy)
@@ -484,7 +484,7 @@ mtype_destroy(struct ip_set *set)
 	struct htype *h = set->data;
 	struct list_head *l, *lt;
 
-	mtype_ahash_destroy(set, ipset_dereference_nfnl(h->table), true);
+	mtype_ahash_destroy(set, (__force struct htable *)h->table, true);
 	list_for_each_safe(l, lt, &h->ad) {
 		list_del(l);
 		kfree(l);


More information about the Devel mailing list