[Devel] [PATCH RH7] net: skip generating uevents for network namespaces that are exiting
Kirill Tkhai
ktkhai at virtuozzo.com
Wed May 27 15:42:59 MSK 2020
From: Andrey Vagin <avagin at openvz.org>
ms commit 002d8a1a6c11
No one can see these events, because a network namespace cannot be
destroyed while it still has sockets.
Unlike those for other devices, uevents for network devices are
generated only inside their own network namespace. They are filtered in
kobj_bcast_filter().
My experiments show that net namespaces are destroyed more than 30%
faster with this optimization.
Here is the perf output for destroying network namespaces without this
patch:
-   94.76%     0.02%  kworker/u48:1  [kernel.kallsyms]  [k] cleanup_net
   - 94.74% cleanup_net
      - 94.64% ops_exit_list.isra.4
         - 41.61% default_device_exit_batch
            - 41.47% unregister_netdevice_many
               - rollback_registered_many
                  - 40.36% netdev_unregister_kobject
                     - 14.55% device_del
                        + 13.71% kobject_uevent
                     - 13.04% netdev_queue_update_kobjects
                        + 12.96% kobject_put
                     - 12.72% net_rx_queue_update_kobjects
                          kobject_put
                        - kobject_release
                           + 12.69% kobject_uevent
                  + 0.80% call_netdevice_notifiers_info
         + 19.57% nfsd_exit_net
         + 11.15% tcp_net_metrics_exit
         + 8.25% rpcsec_gss_exit_net
It is critical to optimize the exit path for network namespaces,
because they are destroyed under net_mutex and many namespaces can be
destroyed in one iteration.
v2: use dev_set_uevent_suppress()
Cc: Cong Wang <xiyou.wangcong at gmail.com>
Cc: "David S. Miller" <davem at davemloft.net>
Cc: Eric W. Biederman <ebiederm at xmission.com>
Signed-off-by: Andrei Vagin <avagin at openvz.org>
Signed-off-by: David S. Miller <davem at davemloft.net>
[ktkhai: Added missed exit_list initialization]
Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
---
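A note on the exit_list initialization mentioned above. The checks added
by this patch rely on list_empty(&net->exit_list) being true for a
namespace that is not being torn down, which only holds once the list
head points at itself. Below is a minimal userspace sketch of that
reasoning, not kernel code: struct fake_net and netns_is_exiting() are
hypothetical names, and the list helpers are simplified stand-ins for the
ones in include/linux/list.h. It assumes the cleanup path links
net->exit_list into a list of namespaces pending cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized, unlinked head points at itself. */
static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* True only when the head still points at itself. */
static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Link 'entry' right after 'head'. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical, trimmed-down model of struct net (illustration only). */
struct fake_net {
	struct list_head exit_list;
};

/* Mirrors the condition used in the patch: !list_empty(&net->exit_list). */
static bool netns_is_exiting(const struct fake_net *net)
{
	assert(net != NULL);
	return !list_empty(&net->exit_list);
}
```

With a zero-filled head, next is NULL rather than the head's own address,
so list_empty() reports false and such a check would treat a live
namespace as exiting. After INIT_LIST_HEAD() it reports true until the
entry is linked into another list.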
net/core/net-sysfs.c | 16 ++++++++++++++--
net/core/net_namespace.c | 2 ++
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 37fbab2bdd40..2580de8ebfc8 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -917,8 +917,14 @@ net_rx_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 		}
 	}
-	while (--i >= new_num)
-		kobject_put(&net->_rx[i].kobj);
+	while (--i >= new_num) {
+		struct kobject *kobj = &net->_rx[i].kobj;
+
+		if (!list_empty(&dev_net(net)->exit_list))
+			kobj->uevent_suppress = 1;
+
+		kobject_put(kobj);
+	}
 	return error;
 #else
@@ -1331,6 +1337,9 @@ netdev_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 	while (--i >= new_num) {
 		struct netdev_queue *queue = net->_tx + i;
+		if (!list_empty(&dev_net(net)->exit_list))
+			queue->kobj.uevent_suppress = 1;
+
 #ifdef CONFIG_BQL
 		sysfs_remove_group(&queue->kobj, &dql_group);
 #endif
@@ -1489,6 +1498,9 @@ void netdev_unregister_kobject(struct net_device * net)
 {
 	struct device *dev = &(net->dev);
+	if (!list_empty(&dev_net(net)->exit_list))
+		dev_set_uevent_suppress(dev, 1);
+
 	kobject_get(&dev->kobj);
 	remove_queue_kobjects(net);
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 8076eb163ffc..314a84e3f0ff 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -42,6 +42,7 @@ EXPORT_SYMBOL_GPL(net_namespace_list);
 struct net init_net = {
 	.dev_base_head = LIST_HEAD_INIT(init_net.dev_base_head),
+	.exit_list = LIST_HEAD_INIT(init_net.exit_list),
 #ifdef CONFIG_VE
 	.owner_ve = &ve0,
 #ifdef CONFIG_VE_IPTABLES
@@ -321,6 +322,7 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 	net->user_ns = user_ns;
 	idr_init(&net->netns_ids);
 	spin_lock_init(&net->nsid_lock);
+	INIT_LIST_HEAD(&net->exit_list);
 	list_for_each_entry(ops, &pernet_list, list) {
 		error = ops_init(ops, net);