[Devel] [PATCH vz7 v2] net: neigh: decrement the family specific qlen
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Tue Jan 16 10:52:37 MSK 2024
On 16/01/2024 15:10, Alexander Atanasov wrote:
> On 16.01.24 5:12, Pavel Tikhomirov wrote:
>> I'd prefer to have two separate clean patches ported instead of
>> porting a merge of one into another (without a change explaining
>> comment).
>
> commit 8207f253a097fe15c93d85ac15ebb73c5e39e1e1
> Author: Thomas Zeitlhofer <thomas.zeitlhofer+lkml at ze-it.at>
> Date: Tue Nov 15 23:09:41 2022 +0100
>
> @@ -409,7 +427,8 @@ static int __neigh_ifdown(struct neigh_table *tbl,
> struct net_device *dev,
> write_lock_bh(&tbl->lock);
> neigh_flush_dev(tbl, dev, skip_perm);
> pneigh_ifdown_and_unlock(tbl, dev);
> - pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL);
> + pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL,
> + tbl->family);
> if (skb_queue_empty_lockless(&tbl->proxy_queue))
> del_timer_sync(&tbl->proxy_timer);
> return 0;
>
>
> The hook about device is in the commit i am backporting to vz7.
> i made a mistake during the merge so it was not in the v1 patch.
>
> The other commit you talk about is:
> commit f8017317cb0b279b8ab98b0f3901a2e0ac880dad
> Author: Chen Zhongjin <chenzhongjin at huawei.com>
> Date: Tue Nov 1 20:15:52 2022 +0800
>
>
> Which brings the fix about NULL dev - predates the one i am backporting.
>
> Both logs are from mainstream tree.
>
>
You ported 8207f253 and merged f8017317 into it, I don't like it. I
would've ported both patches sequentially instead, that's all I try to
say here.
When you port something cleanly, git-range-diff will be clean:
[snorch at turmoil vzkernel-vz7]$ git range-diff
8207f253a097fe15c93d85ac15ebb73c5e39e1e1^- HEAD^- | grep "++\|--\|-+\|+-"
[snorch at turmoil vzkernel-vz7]$
in your case it is:
[snorch at turmoil vzkernel-vz7]$ git range-diff
8207f253a097fe15c93d85ac15ebb73c5e39e1e1^- aatanasov-neigh-qlen-v2^- |
grep "++\|--\|-+\|+-"
-- pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL);
+- pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev));
>
>
>>
>> On 15/01/2024 22:56, Alexander Atanasov wrote:
>>> From: Thomas Zeitlhofer <thomas.zeitlhofer+lkml at ze-it.at>
>>>
>>> Commit 0ff4eb3d5ebb ("neighbour: make proxy_queue.qlen limit
>>> per-device") introduced the length counter qlen in struct neigh_parms.
>>> There are separate neigh_parms instances for IPv4/ARP and IPv6/ND, and
>>> while the family specific qlen is incremented in pneigh_enqueue(), the
>>> mentioned commit decrements always the IPv4/ARP specific qlen,
>>> regardless of the currently processed family, in pneigh_queue_purge()
>>> and neigh_proxy_process().
>>>
>>> As a result, with IPv6/ND, the family specific qlen is only incremented
>>> (and never decremented) until it exceeds PROXY_QLEN, and then, according
>>> to the check in pneigh_enqueue(), neighbor solicitations are not
>>> answered anymore. As an example, this is noted when using the
>>> subnet-router anycast address to access a Linux router. After a certain
>>> amount of time (in the observed case, qlen exceeded PROXY_QLEN after two
>>> days), the Linux router stops answering neighbor solicitations for its
>>> subnet-router anycast address and effectively becomes unreachable.
>>>
>>> Another result with IPv6/ND is that the IPv4/ARP specific qlen is
>>> decremented more often than incremented. This leads to negative qlen
>>> values, as a signed integer has been used for the length counter qlen,
>>> and potentially to an integer overflow.
>>>
>>> Fix this by introducing the helper function neigh_parms_qlen_dec(),
>>> which decrements the family specific qlen. Thereby, make use of the
>>> existing helper function neigh_get_dev_parms_rcu(), whose definition
>>> therefore needs to be placed earlier in neighbour.c. Take the family
>>> member from struct neigh_table to determine the currently processed
>>> family and appropriately call neigh_parms_qlen_dec() from
>>> pneigh_queue_purge() and neigh_proxy_process().
>>>
>>> Additionally, use an unsigned integer for the length counter qlen.
>>>
>>> Fixes: 0ff4eb3d5ebb ("neighbour: make proxy_queue.qlen limit
>>> per-device")
>>> Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml at ze-it.at>
>>> Signed-off-by: David S. Miller <davem at davemloft.net>
>>> (cherry picked from commit 8207f253a097fe15c93d85ac15ebb73c5e39e1e1)
>>> https://virtuozzo.atlassian.net/browse/PSBM-153018
>>> https://bugs.openvz.org/browse/OVZ-7410
>>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>>> ---
>>> include/net/neighbour.h | 2 +-
>>> net/core/neighbour.c | 58 +++++++++++++++++++++--------------------
>>> 2 files changed, 31 insertions(+), 29 deletions(-)
>>>
>>> diff --git a/include/net/neighbour.h b/include/net/neighbour.h
>>> index 2326044d5294..c63016cc4b27 100644
>>> --- a/include/net/neighbour.h
>>> +++ b/include/net/neighbour.h
>>> @@ -79,7 +79,7 @@ struct neigh_parms {
>>> struct rcu_head rcu_head;
>>> int reachable_time;
>>> - int qlen;
>>> + u32 qlen;
>>> int data[NEIGH_VAR_DATA_MAX];
>>> DECLARE_BITMAP(data_state, NEIGH_VAR_DATA_MAX);
>>> };
>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>> index d5b28c158144..fa9da27c0676 100644
>>> --- a/net/core/neighbour.c
>>> +++ b/net/core/neighbour.c
>>> @@ -223,7 +223,31 @@ static int neigh_del_timer(struct neighbour *n)
>>> return 0;
>>> }
>>> -static void pneigh_queue_purge(struct sk_buff_head *list, struct net
>>> *net)
>>> +static struct neigh_parms *neigh_get_dev_parms_rcu(struct net_device
>>> *dev,
>>> + int family)
>>> +{
>>> + switch (family) {
>>> + case AF_INET:
>>> + return __in_dev_arp_parms_get_rcu(dev);
>>> + case AF_INET6:
>>> + return __in6_dev_nd_parms_get_rcu(dev);
>>> + }
>>> + return NULL;
>>> +}
>>> +
>>> +static void neigh_parms_qlen_dec(struct net_device *dev, int family)
>>> +{
>>> + struct neigh_parms *p;
>>> +
>>> + rcu_read_lock();
>>> + p = neigh_get_dev_parms_rcu(dev, family);
>>> + if (p)
>>> + p->qlen--;
>>> + rcu_read_unlock();
>>> +}
>>> +
>>> +static void pneigh_queue_purge(struct sk_buff_head *list, struct net
>>> *net,
>>> + int family)
>>> {
>>> struct sk_buff_head tmp;
>>> unsigned long flags;
>>> @@ -237,13 +261,7 @@ static void pneigh_queue_purge(struct
>>> sk_buff_head *list, struct net *net)
>>> struct net_device *dev = skb->dev;
>>> if (net == NULL || net_eq(dev_net(dev), net)) {
>>> - struct in_device *in_dev;
>>> -
>>> - rcu_read_lock();
>>> - in_dev = __in_dev_get_rcu(dev);
>>> - if (in_dev)
>>> - in_dev->arp_parms->qlen--;
>>> - rcu_read_unlock();
>>> + neigh_parms_qlen_dec(dev, family);
>>> __skb_unlink(skb, list);
>>> __skb_queue_tail(&tmp, skb);
>>> }
>>> @@ -321,7 +339,8 @@ int neigh_ifdown(struct neigh_table *tbl, struct
>>> net_device *dev)
>>> neigh_flush_dev(tbl, dev);
>>> pneigh_ifdown(tbl, dev);
>>> write_unlock_bh(&tbl->lock);
>>> - pneigh_queue_purge(&tbl->proxy_queue, dev_net(dev));
>>> + pneigh_queue_purge(&tbl->proxy_queue, dev ? dev_net(dev) : NULL,
>>> + tbl->family);
>>> if (skb_queue_empty(&tbl->proxy_queue))
>>> del_timer_sync(&tbl->proxy_timer);
>>> return 0;
>>> @@ -1504,13 +1523,8 @@ static void neigh_proxy_process(unsigned long
>>> arg)
>>> if (tdif <= 0) {
>>> struct net_device *dev = skb->dev;
>>> - struct in_device *in_dev;
>>> - rcu_read_lock();
>>> - in_dev = __in_dev_get_rcu(dev);
>>> - if (in_dev)
>>> - in_dev->arp_parms->qlen--;
>>> - rcu_read_unlock();
>>> + neigh_parms_qlen_dec(dev, tbl->family);
>>> __skb_unlink(skb, &tbl->proxy_queue);
>>> if (tbl->proxy_redo && netif_running(dev)) {
>>> @@ -1706,7 +1720,7 @@ int neigh_table_clear(int index, struct
>>> neigh_table *tbl)
>>> /* It is not clean... Fix it to unload IPv6 module safely */
>>> cancel_delayed_work_sync(&tbl->gc_work);
>>> del_timer_sync(&tbl->proxy_timer);
>>> - pneigh_queue_purge(&tbl->proxy_queue, NULL);
>>> + pneigh_queue_purge(&tbl->proxy_queue, NULL, tbl->family);
>>> neigh_ifdown(tbl, NULL);
>>> if (atomic_read(&tbl->entries))
>>> pr_crit("neighbour leakage\n");
>>> @@ -2964,18 +2978,6 @@ static int proc_unres_qlen(struct ctl_table
>>> *ctl, int write,
>>> return ret;
>>> }
>>> -static struct neigh_parms *neigh_get_dev_parms_rcu(struct net_device
>>> *dev,
>>> - int family)
>>> -{
>>> - switch (family) {
>>> - case AF_INET:
>>> - return __in_dev_arp_parms_get_rcu(dev);
>>> - case AF_INET6:
>>> - return __in6_dev_nd_parms_get_rcu(dev);
>>> - }
>>> - return NULL;
>>> -}
>>> -
>>> static void neigh_copy_dflt_parms(struct net *net, struct
>>> neigh_parms *p,
>>> int index)
>>> {
>>
>
--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.
More information about the Devel
mailing list