[Devel] [PATCH RHEL7 COMMIT] ms/net: skip generating uevents for network namespaces that are exiting

Konstantin Khorenko khorenko at virtuozzo.com
Wed May 27 19:13:35 MSK 2020


The commit is pushed to "branch-rh7-3.10.0-1127.8.2.vz7.161.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1127.8.2.vz7.161.2
------>
commit 06b221fdc235f48a1e93a87a20e49be4b0ed9cf7
Author: Andrey Vagin <avagin at openvz.org>
Date:   Mon Oct 24 19:09:53 2016 -0700

    ms/net: skip generating uevents for network namespaces that are exiting
    
    No one can see these events, because a network namespace cannot be
    destroyed while it still has sockets.
    
    Unlike other devices, uevents for network devices are generated
    only inside their network namespaces. They are filtered in
    kobj_bcast_filter().
    
    My experiments show that net namespaces are destroyed more than 30%
    faster with this optimization.
    
    Here is the perf output for destroying network namespaces without
    this patch:
    
    -   94.76%     0.02%  kworker/u48:1  [kernel.kallsyms]     [k] cleanup_net
       - 94.74% cleanup_net
          - 94.64% ops_exit_list.isra.4
             - 41.61% default_device_exit_batch
                - 41.47% unregister_netdevice_many
                   - rollback_registered_many
                      - 40.36% netdev_unregister_kobject
                         - 14.55% device_del
                            + 13.71% kobject_uevent
                         - 13.04% netdev_queue_update_kobjects
                            + 12.96% kobject_put
                         - 12.72% net_rx_queue_update_kobjects
                              kobject_put
                            - kobject_release
                               + 12.69% kobject_uevent
                      + 0.80% call_netdevice_notifiers_info
             + 19.57% nfsd_exit_net
             + 11.15% tcp_net_metrics_exit
             + 8.25% rpcsec_gss_exit_net
    
    It is critical to optimize the exit path for network namespaces,
    because they are destroyed under net_mutex and many namespaces can be
    destroyed in one iteration.
    
    v2: use dev_set_uevent_suppress()
    
    Cc: Cong Wang <xiyou.wangcong at gmail.com>
    Cc: "David S. Miller" <davem at davemloft.net>
    Cc: Eric W. Biederman <ebiederm at xmission.com>
    Signed-off-by: Andrei Vagin <avagin at openvz.org>
    Signed-off-by: David S. Miller <davem at davemloft.net>
    
    (cherry picked from commit 002d8a1a6c11b9b2a8ac615095589111dd52749b)
    Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
    Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
---
 net/core/net-sysfs.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 37fbab2bdd401..52199cb62bd22 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -917,8 +917,14 @@ net_rx_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 		}
 	}
 
-	while (--i >= new_num)
-		kobject_put(&net->_rx[i].kobj);
+	while (--i >= new_num) {
+		struct kobject *kobj = &net->_rx[i].kobj;
+
+		if (!list_empty(&dev_net(net)->exit_list))
+			kobj->uevent_suppress = 1;
+
+		kobject_put(kobj);
+	}
 
 	return error;
 #else
@@ -1331,6 +1337,8 @@ netdev_queue_update_kobjects(struct net_device *net, int old_num, int new_num)
 	while (--i >= new_num) {
 		struct netdev_queue *queue = net->_tx + i;
 
+		if (!list_empty(&dev_net(net)->exit_list))
+			queue->kobj.uevent_suppress = 1;
 #ifdef CONFIG_BQL
 		sysfs_remove_group(&queue->kobj, &dql_group);
 #endif
@@ -1489,6 +1497,9 @@ void netdev_unregister_kobject(struct net_device * net)
 {
 	struct device *dev = &(net->dev);
 
+	if (!list_empty(&dev_net(net)->exit_list))
+		dev_set_uevent_suppress(dev, 1);
+
 	kobject_get(&dev->kobj);
 
 	remove_queue_kobjects(net);
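
All three hunks apply the same check: if the namespace's exit_list is non-empty, the namespace is already queued for cleanup, so the kobject is marked to suppress its uevent. Below is a minimal userspace sketch of that pattern, not kernel code. The names net_model, kobject_model, netns_is_exiting(), suppress_uevent_if_exiting() and would_send_uevent() are hypothetical, the list helpers are simplified stand-ins for the kernel's <linux/list.h>, and the sketch assumes exit_list is an initialized, empty list head for as long as the namespace is alive.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

static inline void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* A list head that points at itself has no entries linked to it. */
static inline bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Link a new entry right after head. */
static inline void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical models of struct net and struct kobject. */
struct net_model {
	struct list_head exit_list;	/* linked into a cleanup list on exit */
};

struct kobject_model {
	unsigned int uevent_suppress:1;
};

/* The namespace is exiting once its exit_list is linked somewhere. */
static inline bool netns_is_exiting(const struct net_model *net)
{
	return !list_empty(&net->exit_list);
}

/* Mirrors the check the patch adds before dropping each kobject. */
static inline void suppress_uevent_if_exiting(const struct net_model *net,
					       struct kobject_model *kobj)
{
	if (netns_is_exiting(net))
		kobj->uevent_suppress = 1;
}

/* Models the early return a suppressed kobject takes in the uevent path. */
static inline bool would_send_uevent(const struct kobject_model *kobj)
{
	return !kobj->uevent_suppress;
}
```

A caller would initialize exit_list with INIT_LIST_HEAD(), call suppress_uevent_if_exiting() before releasing the object, and see would_send_uevent() turn false only after exit_list has been added to a cleanup list.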

