[Devel] Re: [PATCH 07/10] memcg: Stop res_counter underflows.

Wed Feb 29 09:05:52 PST 2012

On 02/28/2012 08:07 PM, Suleiman Souhlal wrote:
> On Tue, Feb 28, 2012 at 5:31 AM, Glauber Costa<glommer at parallels.com>  wrote:
>> I don't fully understand this.
>> To me, the whole purpose of having a cache tied to a memcg, is that we know
>> all allocations from that particular cache should be billed to a specific
>> memcg. So after a cache is created, and has an assigned memcg,
>> what's the point in bypassing it to root?
>>
>> It smells like you're just using this to circumvent something...
>
> In the vast majority of the cases, we will be able to account to the cgroup.
> However, there are cases when __mem_cgroup_try_charge() is not able to
> do so, like when the task is being killed.
> When this happens, the allocation will not get accounted to the
> cgroup, but the slab accounting code will still think the page belongs
> to the memcg's kmem_cache.
> So, when we go to free the page, we assume that the page belongs to
> the memcg and uncharge it, even though it was never charged to us in
> the first place.
>
> This is the situation this patch is trying to address, by keeping a
> counter of how much memory has been bypassed like this, and uncharging
> from the root if we have any outstanding bypassed memory.
>
> Does that make sense?
>
Yes, but how about the following:

I had a similar problem in tcp accounting, and solved that by adding 
res_counter_charge_nofail().

I actually implemented something very similar to your bypass (now that I 
understand it better...) and gave up in favor of this.

The tcp code has its particularities, but still, that could work okay 
for the general slab.

Reason being:

Consider you have a limit of X, and is currently at X-1. You bypassed a 
page.

So in reality, you should fail the next allocation, but you will not - 
(unless you start considering the bypassed memory at allocation time as 
well).

If you use res_counter_charge_nofail(), you will:

  1) Still proceed with the allocations that shouldn't fail - so no
     difference here
  2) fail the normal allocations if you have "bypassed" memory filling
     up your limit
  3) all that without coupling something alien to the res_counter API.