[Devel] [PATCH vz7 v2] net: neigh: decrement the family specific qlen

Alexander Atanasov alexander.atanasov at virtuozzo.com
Tue Jan 16 11:37:10 MSK 2024


On 16.01.24 9:52, Pavel Tikhomirov wrote:
> 
> 
> On 16/01/2024 15:10, Alexander Atanasov wrote:
>> On 16.01.24 5:12, Pavel Tikhomirov wrote:
>>> I'd prefer to have two separate clean patches ported instead of 
>>> porting a merge of one into another (without a change explaining 
>>> comment).
>>
>> commit 8207f253a097fe15c93d85ac15ebb73c5e39e1e1
>> Author: Thomas Zeitlhofer <thomas.zeitlhofer+lkml at ze-it.at>
>> Date:   Tue Nov 15 23:09:41 2022 +0100
>>
>> @@ -409,7 +427,8 @@ static int __neigh_ifdown(struct neigh_table *tbl, 
>> struct net_device *dev,
>>          write_lock_bh(&tbl->lock);
>>          neigh_flush_dev(tbl, dev, skip_perm);
>>          pneigh_ifdown_and_unlock(tbl, dev);
>> -       pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL);
>> +       pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL,
>> +                          tbl->family);
>>          if (skb_queue_empty_lockless(&tbl->proxy_queue))
>>                  del_timer_sync(&tbl->proxy_timer);
>>          return 0;
>>
>>
>> The hook about device is in the commit i am backporting to vz7.
>> i made a mistake during the merge so it was not in the v1 patch.
>>
>> The other commit you talk about is:
>> commit f8017317cb0b279b8ab98b0f3901a2e0ac880dad
>> Author: Chen Zhongjin <chenzhongjin at huawei.com>
>> Date:   Tue Nov 1 20:15:52 2022 +0800
>>
>>
>> Which brings the fix about NULL dev - predates the one i am backporting.
>>
>> Both logs are from mainstream tree.
>>
>>
> 
> You ported 8207f253 and merged f8017317 into it, I don't like it. I 
> would've ported both patches sequentially instead, that's all I try to 
> say here.
> 
> When you port something cleanly, git-range-diff will be clean:
> 
> [snorch at turmoil vzkernel-vz7]$ git range-diff 
> 8207f253a097fe15c93d85ac15ebb73c5e39e1e1^- HEAD^- | grep "++\|--\|-+\|+-"
> [snorch at turmoil vzkernel-vz7]$
> 
> in your case it is:
> 
> [snorch at turmoil vzkernel-vz7]$ git range-diff 
> 8207f253a097fe15c93d85ac15ebb73c5e39e1e1^- aatanasov-neigh-qlen-v2^- | 
> grep "++\|--\|-+\|+-"
>      --    pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : 
> NULL);
>      +-    pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev));
> 


No, i did not. Please, see:

git log -p 8207f253a097fe15c93d85ac15ebb73c5e39e1e1

in mainstream tree.



>>
>>
>>>
>>> On 15/01/2024 22:56, Alexander Atanasov wrote:
>>>> From: Thomas Zeitlhofer <thomas.zeitlhofer+lkml at ze-it.at>
>>>>
>>>> Commit 0ff4eb3d5ebb ("neighbour: make proxy_queue.qlen limit
>>>> per-device") introduced the length counter qlen in struct neigh_parms.
>>>> There are separate neigh_parms instances for IPv4/ARP and IPv6/ND, and
>>>> while the family specific qlen is incremented in pneigh_enqueue(), the
>>>> mentioned commit decrements always the IPv4/ARP specific qlen,
>>>> regardless of the currently processed family, in pneigh_queue_purge()
>>>> and neigh_proxy_process().
>>>>
>>>> As a result, with IPv6/ND, the family specific qlen is only incremented
>>>> (and never decremented) until it exceeds PROXY_QLEN, and then, 
>>>> according
>>>> to the check in pneigh_enqueue(), neighbor solicitations are not
>>>> answered anymore. As an example, this is noted when using the
>>>> subnet-router anycast address to access a Linux router. After a certain
>>>> amount of time (in the observed case, qlen exceeded PROXY_QLEN after 
>>>> two
>>>> days), the Linux router stops answering neighbor solicitations for its
>>>> subnet-router anycast address and effectively becomes unreachable.
>>>>
>>>> Another result with IPv6/ND is that the IPv4/ARP specific qlen is
>>>> decremented more often than incremented. This leads to negative qlen
>>>> values, as a signed integer has been used for the length counter qlen,
>>>> and potentially to an integer overflow.
>>>>
>>>> Fix this by introducing the helper function neigh_parms_qlen_dec(),
>>>> which decrements the family specific qlen. Thereby, make use of the
>>>> existing helper function neigh_get_dev_parms_rcu(), whose definition
>>>> therefore needs to be placed earlier in neighbour.c. Take the family
>>>> member from struct neigh_table to determine the currently processed
>>>> family and appropriately call neigh_parms_qlen_dec() from
>>>> pneigh_queue_purge() and neigh_proxy_process().
>>>>
>>>> Additionally, use an unsigned integer for the length counter qlen.
>>>>
>>>> Fixes: 0ff4eb3d5ebb ("neighbour: make proxy_queue.qlen limit 
>>>> per-device")
>>>> Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml at ze-it.at>
>>>> Signed-off-by: David S. Miller <davem at davemloft.net>
>>>> (cherry picked from commit 8207f253a097fe15c93d85ac15ebb73c5e39e1e1)
>>>> https://virtuozzo.atlassian.net/browse/PSBM-153018
>>>> https://bugs.openvz.org/browse/OVZ-7410
>>>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>>>> ---
>>>>   include/net/neighbour.h |  2 +-
>>>>   net/core/neighbour.c    | 58 
>>>> +++++++++++++++++++++--------------------
>>>>   2 files changed, 31 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/include/net/neighbour.h b/include/net/neighbour.h
>>>> index 2326044d5294..c63016cc4b27 100644
>>>> --- a/include/net/neighbour.h
>>>> +++ b/include/net/neighbour.h
>>>> @@ -79,7 +79,7 @@ struct neigh_parms {
>>>>       struct rcu_head rcu_head;
>>>>       int    reachable_time;
>>>> -    int    qlen;
>>>> +    u32    qlen;
>>>>       int    data[NEIGH_VAR_DATA_MAX];
>>>>       DECLARE_BITMAP(data_state, NEIGH_VAR_DATA_MAX);
>>>>   };
>>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>>> index d5b28c158144..fa9da27c0676 100644
>>>> --- a/net/core/neighbour.c
>>>> +++ b/net/core/neighbour.c
>>>> @@ -223,7 +223,31 @@ static int neigh_del_timer(struct neighbour *n)
>>>>       return 0;
>>>>   }
>>>> -static void pneigh_queue_purge(struct sk_buff_head *list, struct 
>>>> net *net)
>>>> +static struct neigh_parms *neigh_get_dev_parms_rcu(struct 
>>>> net_device *dev,
>>>> +                           int family)
>>>> +{
>>>> +    switch (family) {
>>>> +    case AF_INET:
>>>> +        return __in_dev_arp_parms_get_rcu(dev);
>>>> +    case AF_INET6:
>>>> +        return __in6_dev_nd_parms_get_rcu(dev);
>>>> +    }
>>>> +    return NULL;
>>>> +}
>>>> +
>>>> +static void neigh_parms_qlen_dec(struct net_device *dev, int family)
>>>> +{
>>>> +    struct neigh_parms *p;
>>>> +
>>>> +    rcu_read_lock();
>>>> +    p = neigh_get_dev_parms_rcu(dev, family);
>>>> +    if (p)
>>>> +        p->qlen--;
>>>> +    rcu_read_unlock();
>>>> +}
>>>> +
>>>> +static void pneigh_queue_purge(struct sk_buff_head *list, struct 
>>>> net *net,
>>>> +                   int family)
>>>>   {
>>>>       struct sk_buff_head tmp;
>>>>       unsigned long flags;
>>>> @@ -237,13 +261,7 @@ static void pneigh_queue_purge(struct 
>>>> sk_buff_head *list, struct net *net)
>>>>           struct net_device *dev = skb->dev;
>>>>           if (net == NULL || net_eq(dev_net(dev), net)) {
>>>> -            struct in_device *in_dev;
>>>> -
>>>> -            rcu_read_lock();
>>>> -            in_dev = __in_dev_get_rcu(dev);
>>>> -            if (in_dev)
>>>> -                in_dev->arp_parms->qlen--;
>>>> -            rcu_read_unlock();
>>>> +            neigh_parms_qlen_dec(dev, family);
>>>>               __skb_unlink(skb, list);
>>>>               __skb_queue_tail(&tmp, skb);
>>>>           }
>>>> @@ -321,7 +339,8 @@ int neigh_ifdown(struct neigh_table *tbl, struct 
>>>> net_device *dev)
>>>>       neigh_flush_dev(tbl, dev);
>>>>       pneigh_ifdown(tbl, dev);
>>>>       write_unlock_bh(&tbl->lock);
>>>> -    pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev));
>>>> +    pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL,
>>>> +               tbl->family);
>>>>       if (skb_queue_empty(&tbl->proxy_queue))
>>>>           del_timer_sync(&tbl->proxy_timer);
>>>>       return 0;
>>>> @@ -1504,13 +1523,8 @@ static void neigh_proxy_process(unsigned long 
>>>> arg)
>>>>           if (tdif <= 0) {
>>>>               struct net_device *dev = skb->dev;
>>>> -            struct in_device *in_dev;
>>>> -            rcu_read_lock();
>>>> -            in_dev = __in_dev_get_rcu(dev);
>>>> -            if (in_dev)
>>>> -                in_dev->arp_parms->qlen--;
>>>> -            rcu_read_unlock();
>>>> +            neigh_parms_qlen_dec(dev, tbl->family);
>>>>               __skb_unlink(skb, &tbl->proxy_queue);
>>>>               if (tbl->proxy_redo && netif_running(dev)) {
>>>> @@ -1706,7 +1720,7 @@ int neigh_table_clear(int index, struct 
>>>> neigh_table *tbl)
>>>>       /* It is not clean... Fix it to unload IPv6 module safely */
>>>>       cancel_delayed_work_sync(&tbl->gc_work);
>>>>       del_timer_sync(&tbl->proxy_timer);
>>>> -    pneigh_queue_purge(&tbl->proxy_queue, NULL);
>>>> +    pneigh_queue_purge(&tbl->proxy_queue, NULL, tbl->family);
>>>>       neigh_ifdown(tbl, NULL);
>>>>       if (atomic_read(&tbl->entries))
>>>>           pr_crit("neighbour leakage\n");
>>>> @@ -2964,18 +2978,6 @@ static int proc_unres_qlen(struct ctl_table 
>>>> *ctl, int write,
>>>>       return ret;
>>>>   }
>>>> -static struct neigh_parms *neigh_get_dev_parms_rcu(struct 
>>>> net_device *dev,
>>>> -                           int family)
>>>> -{
>>>> -    switch (family) {
>>>> -    case AF_INET:
>>>> -        return __in_dev_arp_parms_get_rcu(dev);
>>>> -    case AF_INET6:
>>>> -        return __in6_dev_nd_parms_get_rcu(dev);
>>>> -    }
>>>> -    return NULL;
>>>> -}
>>>> -
>>>>   static void neigh_copy_dflt_parms(struct net *net, struct 
>>>> neigh_parms *p,
>>>>                     int index)
>>>>   {
>>>
>>
> 

-- 
Regards,
Alexander Atanasov



More information about the Devel mailing list