[Devel] [PATCH RH7] netfilter: core: fix NAT hooks collision check
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Mon Nov 7 15:24:54 MSK 2022
Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
On 03.11.2022 21:23, Konstantin Khorenko wrote:
> Pasha,
>
> please review this patch as well.
>
> Thank you.
>
> --
> Best regards,
>
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
>
> On 31.10.2022 10:51, Alexander Mikhalitsyn wrote:
>> In commit ("nat: allow nft NAT and iptables NAT work on the same
>> node") we are
>> trying to prevent simultaneous nft and nf NAT hooks registration. But
>> in fact,
>> it affects not only NAT-related hooks but all hooks!
>>
>> Reproducer:
>> -#!/usr/sbin/nft -f
>> flush ruleset
>>
>> table inet filter {
>> chain input {
>> type filter hook input priority 0;
>> }
>> }
>>
>> This simple script should work fine if we run it more than one time in
>> a row,
>> but in fact it breaks with -EBUSY error.
>>
>> This is because we not checking hook type in our code at all!
>>
>> But this bug is a little bit deeper. Consider another reproducer:
>> -#!/usr/sbin/nft -f
>> flush ruleset
>>
>> table inet filter {
>> chain input {
>> type filter hook input priority 0;
>> }
>> }
>>
>> table ip nat {
>> chain postrouting {
>> type nat hook postrouting priority 0; policy accept;
>> }
>> }
>>
>> In this case we have nat hook and we have to allow nat hook collision
>> during nft transaction execution. See analogical mainstream commit:
>> ae6153b50f ("netfilter: nf_tables: permit second nat hook if colliding
>> hook is going away")
>>
>> Our mainstream colleagues introduce nat_hook field in struct
>> nf_hook_ops, but we don't need that,
>> because we only handling hooks from nft side and can easily detect if
>> hook is nat related or not
>> by checking basechain type (basechain->type->type == NFT_CHAIN_T_NAT).
>>
>> I'm addressing both problems here and adding other small cleanups for
>> safety sake.
>>
>> https://jira.sw.ru/browse/PSBM-142895
>>
>> Fixes: d3a05a0552d7 ("nat: allow nft NAT and iptables NAT work on the
>> same node")
>> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at openvz.org>
>> ---
>> net/netfilter/core.c | 83 ++++++++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 81 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
>> index 74dee8c1623c..6628d73ec5b8 100644
>> --- a/net/netfilter/core.c
>> +++ b/net/netfilter/core.c
>> @@ -66,6 +66,67 @@ EXPORT_SYMBOL(nf_hooks_needed);
>> static DEFINE_MUTEX(nf_hook_mutex);
>> #include <net/netfilter/nf_tables.h>
>> #include <net/netfilter/nf_nat.h>
>> +
>> +/* removal requests are queued in the commit_list, but not acted upon
>> + * until after all new rules are in place.
>> + *
>> + * Therefore, nf_register_net_hook(net, &nat_hook) runs before pending
>> + * nf_unregister_net_hook().
>> + *
>> + * nf_register_net_hook thus fails if a nat hook is already in place
>> + * even if the conflicting hook is about to be removed.
>> + *
>> + * If collision is detected, search commit_log for DELCHAIN matching
>> + * the new nat hooknum; if we find one collision is temporary:
>> + *
>> + * Either transaction is aborted (new/colliding hook is removed), or
>> + * transaction is committed (old hook is removed).
>> + *
>> + * -- OpenVZ specific:
>> + * - reworked not to use struct nf_hook_ops "nat_hook" field which is
>> absent
>> + * in our kernels.
>> + * - rebased to RHEL7 kernel
>> + *
>> + * Please refer to original commit:
>> + *
>> https://github.com/torvalds/linux/commit/ae6153b50f9bf75a4952050f32fe168f68cdd657
>> + * ("netfilter: nf_tables: permit second nat hook if colliding hook
>> is going away")
>> + */
>> +static bool nf_tables_allow_nat_conflict(const struct net *net,
>> + const struct nft_base_chain *basechain)
>> +{
>> + const struct nft_trans *trans;
>> + const struct nf_hook_ops *ops = &basechain->ops[0];
>> + bool ret = false;
>> +
>> + list_for_each_entry(trans, &net->nft.commit_list, list) {
>> + const struct nf_hook_ops *pending_ops;
>> + struct nft_base_chain *pending_chain;
>> + const struct nft_chain *pending;
>> +
>> + if (trans->msg_type != NFT_MSG_NEWCHAIN &&
>> + trans->msg_type != NFT_MSG_DELCHAIN)
>> + continue;
>> +
>> + pending = trans->ctx.chain;
>> + if (!(pending->flags & NFT_BASE_CHAIN))
>> + continue;
>> +
>> + pending_chain = nft_base_chain(pending);
>> + pending_ops = &pending_chain->ops[0];
>> + if ((pending_chain->type->type == NFT_CHAIN_T_NAT) &&
>> + pending_ops->pf == ops->pf &&
>> + pending_ops->hooknum == ops->hooknum) {
>> + /* other hook registration already pending? */
>> + if (trans->msg_type == NFT_MSG_NEWCHAIN)
>> + return false;
>> +
>> + ret = true;
>> + }
>> + }
>> +
>> + return ret;
>> +}
>> +
>> int nf_register_hook(struct nf_hook_ops *reg)
>> {
>> struct nf_hook_ops *elem;
>> @@ -75,11 +136,29 @@ int nf_register_hook(struct nf_hook_ops *reg)
>> if (reg->priority < elem->priority)
>> break;
>> else if ((reg->priority == elem->priority) &&
>> reg->is_nft_ops) {
>> - const struct nft_chain *c = reg->priv;
>> - struct net *net = read_pnet(&nft_base_chain(c)->pnet);
>> + const struct nft_chain *c;
>> + struct nft_base_chain *basechain;
>> + struct net *net;
>> +
>> + /* reg->is_nft_ops is true */
>> + c = reg->priv;
>> +
>> + /* to access nft_base_chain(c) in a safe way */
>> + if (!(c->flags & NFT_BASE_CHAIN))
>> + continue;
>> +
>> + basechain = nft_base_chain(c);
>> + net = read_pnet(&basechain->pnet);
>> +
>> + /* we only interested in nat hooks */
>> + if (basechain->type->type != NFT_CHAIN_T_NAT)
>> + continue;
>> /* fail if netns already have enabled nft/ipt nat */
>> if (netns_nat_check(elem, reg->pf, net)) {
>> + if (elem->is_nft_ops &&
>> nf_tables_allow_nat_conflict(net, basechain))
>> + continue;
>> +
>> mutex_unlock(&nf_hook_mutex);
>> return -EBUSY;
>> }
--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.
More information about the Devel
mailing list