[Devel] [PATCH RHEL9 COMMIT] net: Primitives to enable conntrack allocation

Konstantin Khorenko khorenko at virtuozzo.com
Wed Oct 20 11:39:35 MSK 2021


The commit is pushed to "branch-rh9-5.14.vz9.1.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh9-5.14.0-4.vz9.10.12
------>
commit 0471849583d7ba6caf6556032550581f76e8c3cf
Author: Stanislav Kinsburskiy <skinsbursky at virtuozzo.com>
Date:   Wed Oct 20 11:39:34 2021 +0300

    net: Primitives to enable conntrack allocation
    
    Patchset description:
    
    Create conntrack structures only if they are really needed
    
    Allocate conntracks only after there is a rule which uses them.
    
    v2: Allow after there is a rule and never prohibit.
    
    khorenko@: the idea behind all of this:
    we want to let Containers use iptables rules which require conntracks.
    At the same time we'd like to avoid the problems we currently have when
    conntrack allocation is simply enabled for all Containers and the
    Hardware Node by default:
    1) if conntracks are not actually used by a CT, the structures are still
       allocated, which decreases performance
    2) the number of conntracks in the system is limited => DDoS is possible
    
    So we decided to implement a feature:
    not to allocate conntracks until there are rules in the netspace which require
    them.
    
    Disadvantage: if a user on a live system loads an iptables rule which
    requires conntracks, connections which are already alive may be handled
    less precisely. I believe this is OK.
    
    Once conntracks allocation is enabled, it cannot be disabled until reboot/CT
    restart. This is done in order to:
    a) simplify the code
    b) have the possibility to unconditionally enable conntracks, for example for
       userspace conntrack users (http://conntrack-tools.netfilter.org/manual.html)
    c) adding a new iptables rule is implemented in the following way:
       - all rules are unloaded
       - new rule is added to the bunch of rules
       - all rules (including the new one) are uploaded to the kernel
       => each new rule add results in conntrack allocation disable/enable =>
       race window for unhandled connections
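
Point (c) can be illustrated with a small userspace model. This is only a sketch and not kernel code: `struct ns_model`, `reload_rules()` and the two policy helpers are hypothetical names invented here. It assumes a rule update is "unload everything, then load everything", and compares a policy that follows the current rule count with the one-way flag described above.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one network namespace; not kernel code. */
struct ns_model {
	int  ct_rules;   /* conntrack-using rules currently loaded */
	bool can_alloc;  /* one-way flag: set once, never cleared */
};

/* Policy A (rejected): allocation is allowed only while rules are loaded. */
static bool allowed_by_rule_count(const struct ns_model *ns)
{
	return ns->ct_rules > 0;
}

/* Policy B (adopted): allocation is allowed once any rule was ever loaded. */
static bool allowed_by_latch(const struct ns_model *ns)
{
	return ns->can_alloc;
}

static void load_rules(struct ns_model *ns, int n)
{
	ns->ct_rules += n;
	if (n > 0)
		ns->can_alloc = true;
}

static void unload_all_rules(struct ns_model *ns)
{
	ns->ct_rules = 0;        /* can_alloc is deliberately left untouched */
}

/*
 * Replace the rule set with new_count rules the way (c) describes, and
 * report what each policy would answer for a packet that arrives in the
 * middle of the reload, after the unload and before the load.
 */
static void reload_rules(struct ns_model *ns, int new_count,
			 bool *mid_rule_count, bool *mid_latch)
{
	unload_all_rules(ns);
	*mid_rule_count = allowed_by_rule_count(ns);
	*mid_latch = allowed_by_latch(ns);
	load_rules(ns, new_count);
}
```

In this model, a reload performed after at least one rule has been loaded reports `false` for the rule-count policy and `true` for the latch in the middle of the reload, which is the race window the one-way flag avoids.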
    
    =======================
    This patch description:
    
    Allocations are allowed only when there are conntrack users.
    By default they are prohibited.
    
    https://jira.sw.ru/browse/PSBM-51050
    
    Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
    
    Reviewed-by: Andrei Vagin <avagin at virtuozzo.com>
    
    +++
    ve/net: Move net->ct.can_alloc check up to resolve_normal_ct()
    
    Move it up the stack to stop creation of a CT earlier.
    This spares us a search in the CT hashes and speeds things up.
    
    So now nf_conntrack_alloc() certainly creates a CT,
    __nf_conntrack_alloc() doesn't return NULL, and it does not
    need to be external.
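
The effect of moving the check up can be sketched in a userspace model. Everything below is an assumption made for illustration: `struct net_model`, `hash_find()`, `ct_alloc()` and the two `resolve_*` functions are hypothetical stand-ins, not the kernel functions named above. The model only counts how many hash lookups happen when allocation is prohibited.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; the counters exist only for illustration. */
struct net_model {
	bool can_alloc;
	unsigned long hash_lookups;
	unsigned long allocs;
};

struct ct_model {
	int unused;
};

static struct ct_model the_ct;

/* Stand-in for the CT hash search; it always misses in this model. */
static struct ct_model *hash_find(struct net_model *net)
{
	net->hash_lookups++;
	return NULL;
}

/* Stand-in for allocation; in this model it never returns NULL. */
static struct ct_model *ct_alloc(struct net_model *net)
{
	net->allocs++;
	return &the_ct;
}

/* Before: the flag is tested only on the allocation path, after the lookup. */
static struct ct_model *resolve_late_check(struct net_model *net)
{
	struct ct_model *ct = hash_find(net);

	if (ct)
		return ct;
	if (!net->can_alloc)
		return NULL;
	return ct_alloc(net);
}

/* After: the flag is tested first, so a prohibited netns skips the lookup. */
static struct ct_model *resolve_early_check(struct net_model *net)
{
	struct ct_model *ct;

	if (!net->can_alloc)
		return NULL;
	ct = hash_find(net);
	if (ct)
		return ct;
	return ct_alloc(net);
}
```

With `can_alloc` unset, the late variant still performs one hash lookup per packet, while the early variant performs none; once the flag is set, both behave the same.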
    
    Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
    
    Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    To be merged to commit 874e7b5c6eb9
    "net: Primitives to enable conntrack allocation"
    
    https://jira.sw.ru/browse/PSBM-54823
    
    Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
    
    +++
    ve/net: Do not initialize netns_ct::can_alloc twice
    
    It's already initialized to zero during net creation
    in net_alloc(), so do not do that twice.
    
    Also, some modules that allow conntrack allocation do not depend
    on nf_conntrack.ko, so if nf_conntrack.ko is loaded later, its
    initialization rewrites can_alloc to zero.
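
The ordering problem can be reproduced in a few lines of userspace C. This is a sketch under stated assumptions: `net_alloc_model()`, `rule_module_init()` and the two `conntrack_init_*` functions are hypothetical names that only mimic the load order described above.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model of the per-netns state; not kernel code. */
struct net_model {
	bool can_alloc;
};

/* Models net creation: the whole structure starts out zeroed. */
static struct net_model *net_alloc_model(void)
{
	return calloc(1, sizeof(struct net_model));
}

/* Models a module that enables conntrack allocation when it is loaded. */
static void rule_module_init(struct net_model *net)
{
	net->can_alloc = true;
}

/* Buggy late init: zeroes the flag again and clobbers the earlier enable. */
static void conntrack_init_buggy(struct net_model *net)
{
	net->can_alloc = false;
}

/* Fixed late init: relies on the zeroing already done at net creation. */
static void conntrack_init_fixed(struct net_model *net)
{
	(void)net;
}
```

Calling `rule_module_init()` followed by `conntrack_init_buggy()` leaves `can_alloc` false, while the same sequence with `conntrack_init_fixed()` keeps it true.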
    
    (This may be merged with commit af2b974e4755 "net: Primitives to enable conntrack allocation")
    
    https://jira.sw.ru/browse/PSBM-56500
    
    Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
    
    =======================
    
    net: Do not allow conntrack if netlink conntrack is requested
    
    The scheme for allowing conntracks suggests allowing conntrack
    only after a rule is inserted. But this place does not insert
    a rule; it is a manual conntrack creation.
    
    Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
    
    Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    (cherry picked from vz7 commit ("550b98d291cb net: Primitives to enable
    conntrack allocation"))
    
    VZ 8 rebase part https://jira.sw.ru/browse/PSBM-127783
    
    Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
    
    Port vz8 commit 7071bc7fdbab ("net: Primitives to enable conntrack
    allocation").
    - move can_alloc flag to struct nf_conntrack_net
    - introduce conntrack_allocation_allowed() - symmetric function to
      allow_conntrack_allocation()
    - place both these functions to nf_conntrack.h
    - remove barrier pair, there are no SMP ordering requirements regarding
      set/use of can_alloc flag
    
    Signed-off-by: Nikita Yushchenko <nikita.yushchenko at virtuozzo.com>
---
 include/net/netfilter/nf_conntrack.h | 13 +++++++++++++
 net/netfilter/nf_conntrack_core.c    |  3 +++
 net/netfilter/nf_synproxy_core.c     |  2 ++
 3 files changed, 18 insertions(+)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 81983ac7e28a..5ea56d272c19 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -64,6 +64,7 @@ struct nf_conntrack_net {
 	struct delayed_work ecache_dwork;
 	struct netns_ct *ct_net;
 #endif
+	bool can_alloc;
 };
 
 #include <linux/types.h>
@@ -363,4 +364,16 @@ static inline struct nf_conntrack_net *nf_ct_pernet(const struct net *net)
 #define MODULE_ALIAS_NFCT_HELPER(helper) \
         MODULE_ALIAS("nfct-helper-" helper)
 
+static inline void allow_conntrack_allocation(struct net *net)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+	nf_ct_pernet(net)->can_alloc = true;
+#endif
+}
+
+static inline bool conntrack_allocation_allowed(struct net *net)
+{
+	return nf_ct_pernet(net)->can_alloc;
+}
+
 #endif /* _NF_CONNTRACK_H */
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 8dc77131f2bc..c0fb21c87721 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1675,6 +1675,9 @@ resolve_normal_ct(struct nf_conn *tmpl,
 	struct nf_conn *ct;
 	u32 hash;
 
+	if (!conntrack_allocation_allowed(state->net))
+		return 0;
+
 	if (!nf_ct_get_tuple(skb, skb_network_offset(skb),
 			     dataoff, state->pf, protonum, state->net,
 			     &tuple)) {
diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
index 5759f146a24f..0255821cf375 100644
--- a/net/netfilter/nf_synproxy_core.c
+++ b/net/netfilter/nf_synproxy_core.c
@@ -339,6 +339,8 @@ static int __net_init synproxy_net_init(struct net *net)
 	struct nf_conn *ct;
 	int err = -ENOMEM;
 
+	allow_conntrack_allocation(net);
+
 	ct = nf_ct_tmpl_alloc(net, &nf_ct_zone_dflt, GFP_KERNEL);
 	if (!ct)
 		goto err1;

