[Devel] [PATCH RHEL7 COMMIT] Revert "unix: Charge outgoing buffers into cg memory"

Mon Jun 29 04:00:19 PDT 2015

The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.21
------>
commit c2993da9b126396025f8485004fad02f9a5c9525
Author: Vladimir Davydov <vdavydov at parallels.com>
Date:   Mon Jun 29 15:00:19 2015 +0400

    Revert "unix: Charge outgoing buffers into cg memory"
    
    This reverts commit f22980954a2d765ca6ca03c11b2eac8f3fe1d105.
    
    This commit is badly broken: it frees pages allocated with
    alloc_kmem_pages using put_page instead of free_kmem_pages. As a result,
    the kmem counter of the cgroup the page is charged to is never uncharged,
    so the charge is leaked. Worse, if such a page is later reused for a
    thread info or slab page, the order of the new page may be greater than
    it used to be, in which case freeing the page uncharges more than was
    charged and the mem_cgroup->kmem counter underflows:
    
      WARNING: at kernel/res_counter.c:91 res_counter_uncharge_locked+0x2f/0x40()
      CPU: 0 PID: 19 Comm: rcuos/1 ve: 0 Tainted: G        W   --------------   3.10.0-123.1.2.vz7.5.18 #1 5.18
      ffffffff817e8e14 00000000a31483c6 ffff8804090c1c98 ffffffff815ca9ea
      ffff8804090c1cd0 ffffffff8105e091 ffff8800ceadf150 0000000000000000
      ffff8800ceadf150 0000000000000000 ffff8800ceadf178 ffff8804090c1ce0
      Call Trace:
      [<ffffffff815ca9ea>] dump_stack+0x19/0x1b
      [<ffffffff8105e091>] warn_slowpath_common+0x61/0x80
      [<ffffffff8105e1ba>] warn_slowpath_null+0x1a/0x20
      [<ffffffff810edfdf>] res_counter_uncharge_locked+0x2f/0x40
      [<ffffffff810ee1e5>] res_counter_uncharge_until+0x55/0xb0
      [<ffffffff810ee253>] res_counter_uncharge+0x13/0x20
      [<ffffffff811b2ba4>] memcg_uncharge_kmem+0x34/0x80
      [<ffffffff811b2ebd>] __memcg_kmem_uncharge_pages+0x5d/0x70
      [<ffffffff8114fe28>] free_kmem_pages+0x68/0x80
      [<ffffffff8105aed2>] free_task+0x32/0x60
      [<ffffffff8105af9b>] __put_task_struct+0x9b/0x140
      [<ffffffff81062b6c>] delayed_put_task_struct+0x3c/0x80
      [<ffffffff811034d9>] rcu_nocb_kthread+0x229/0x370
      [<ffffffff810883a0>] ? wake_up_bit+0x30/0x30
      [<ffffffff811032b0>] ? rcu_start_gp+0x40/0x40
      [<ffffffff8108723f>] kthread+0xcf/0xe0
      [<ffffffff81087170>] ? create_kthread+0x60/0x60
      [<ffffffff815db0ac>] ret_from_fork+0x7c/0xb0
      [<ffffffff81087170>] ? create_kthread+0x60/0x60
    
    This will probably lead eventually to the cgroup being freed while there
    are still active objects in one or more of its kmem caches:
    
      BUG buffer_head(39:101) (Tainted: G        W   --------------  ): Objects remaining in buffer_head(39:101) on kmem_cache_close()
    
      kernel BUG at mm/slab_common.c:493!
    
    This patch therefore reverts the above-mentioned commit. We will rework
    it later.
    
    https://jira.sw.ru/browse/PSBM-34492
    
    Signed-off-by: Vladimir Davydov <vdavydov at parallels.com>
    
    Conflicts:
    	net/core/sock.c
---
 net/core/sock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
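
Note (not part of the commit message): the accounting imbalance described
above can be illustrated with a small user-space model. This is only a
sketch under stated assumptions, not kernel code. All toy_* names are
hypothetical, the counter is a plain integer counted in pages, and the
"reuse" case assumes the page is handed out again without a fresh charge, so
that the stale cgroup association survives until the next free_kmem_pages.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for mem_cgroup->kmem; usage is counted in pages. */
struct toy_counter {
	long usage;     /* pages currently charged */
	int underflows; /* uncharges that asked for more than usage */
};

/* Toy stand-in for a page and its (possibly stale) cgroup association. */
struct toy_page {
	struct toy_counter *charged_to; /* NULL when no charge is recorded */
};

/* Models the uncharge step: an oversized request is counted and clamped. */
static void toy_uncharge(struct toy_counter *c, long nr_pages)
{
	if (c->usage < nr_pages) {
		c->underflows++;
		nr_pages = c->usage;
	}
	c->usage -= nr_pages;
}

/* Models alloc_kmem_pages: charge 2^order pages, remember the counter. */
static void toy_alloc_kmem_pages(struct toy_page *p, struct toy_counter *c,
				 int order)
{
	assert(order >= 0);
	c->usage += 1L << order;
	p->charged_to = c;
}

/* Models free_kmem_pages: uncharge 2^order pages from the recorded counter. */
static void toy_free_kmem_pages(struct toy_page *p, int order)
{
	if (p->charged_to) {
		toy_uncharge(p->charged_to, 1L << order);
		p->charged_to = NULL;
	}
}

/* Models put_page: the page goes away, the accounting is left untouched. */
static void toy_put_page(struct toy_page *p)
{
	(void)p;
}

/* Pages still charged after alloc_kmem_pages + put_page (the leak). */
static long toy_leaked_pages(int order)
{
	struct toy_counter cg = { 0, 0 };
	struct toy_page page = { NULL };

	toy_alloc_kmem_pages(&page, &cg, order);
	toy_put_page(&page);
	return cg.usage;
}

/* Pages still charged after the matching alloc/free pair. */
static long toy_balanced_pages(int order)
{
	struct toy_counter cg = { 0, 0 };
	struct toy_page page = { NULL };

	toy_alloc_kmem_pages(&page, &cg, order);
	toy_free_kmem_pages(&page, order);
	return cg.usage;
}

/* Underflow count when a leaked page is later freed at new_order. */
static int toy_underflows_after_reuse(int old_order, int new_order)
{
	struct toy_counter cg = { 0, 0 };
	struct toy_page page = { NULL };

	toy_alloc_kmem_pages(&page, &cg, old_order);
	toy_put_page(&page);                   /* charge is leaked here */
	toy_free_kmem_pages(&page, new_order); /* reused, then freed */
	return cg.underflows;
}
```

In this model toy_leaked_pages(0) returns 1 and toy_balanced_pages(0) returns
0, while toy_underflows_after_reuse(0, 2) returns 1: the page was charged as
one page but is uncharged as four, which corresponds to the situation the
res_counter warning above reports.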

diff --git a/net/core/sock.c b/net/core/sock.c
index 10b4362..03f4b23 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1780,7 +1780,7 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 
 			while (order) {
 				if (npages >= 1 << order) {
-					page = alloc_kmem_pages(sk->sk_allocation |
+					page = alloc_pages(sk->sk_allocation |
 							   __GFP_COMP | __GFP_NOWARN,
 							   order);
 					if (page)
@@ -1788,7 +1788,7 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 				}
 				order--;
 			}
-			page = alloc_kmem_pages(sk->sk_allocation, 0);
+			page = alloc_page(sk->sk_allocation);
 			if (!page)
 				goto failure;
 fill_page: