[Devel] Re: [PATCH 1/2] Adds a read-only "procs" file similar to "tasks" that shows only unique tgids
Benjamin Blum
bblum at google.com
Thu Jul 2 18:17:56 PDT 2009
On Thu, Jul 2, 2009 at 6:08 PM, Paul Menage<menage at google.com> wrote:
> On Thu, Jul 2, 2009 at 5:53 PM, Andrew Morton<akpm at linux-foundation.org> wrote:
>>> In the first snippet, count will be at most equal to length. As length
>>> is determined from cgroup_task_count, it can be no greater than the
>>> total number of pids on the system.
>>
>> Well that's a problem, because there can be tens or hundreds of
>> thousands of pids, and there's a fairly low maximum size for kmalloc()s
>> (include/linux/kmalloc_sizes.h).
>>
>> And even if this allocation attempt doesn't exceed KMALLOC_MAX_SIZE,
>> large allocations are less reliable. There is a large break point at
>> 8*PAGE_SIZE (PAGE_ALLOC_COSTLY_ORDER).
>
> This has been a long-standing problem with the tasks file, ever since
> the cpusets days.
>
> There are ways around it - Lai Jiangshan <laijs at cn.fujitsu.com> posted
> a patch that allocated an array of pages to store pids in, with a
> custom sorting function that let you specify indirection rather than
> assuming everything was in one contiguous array. This was technically
> the right approach in terms of not needing vmalloc and never doing
> large allocations, but it was very complex; an alternative that was
> mooted was to use kmalloc for small cgroups and vmalloc for large
> ones, so the vmalloc penalty wouldn't be paid generally. The thread
> fizzled AFAICS.
As it is currently, the kmalloc call will simply fail if there are too
many pids, correct? Do we prefer not being able to read the file in
this case, or would we rather use vmalloc?
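To make the kmalloc-for-small, vmalloc-for-large option concrete, here is a
rough userspace sketch of the size check only. Everything in it is an
assumption for illustration: the names pidlist_backing_for, pidlist_alloc and
pidlist_free are hypothetical, 4 KiB pages are assumed, the threshold is the
8*PAGE_SIZE break point mentioned above, and malloc()/free() stand in for the
kernel allocators. It is not the code from either patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/* Assumed for illustration: 4 KiB pages and the 8-page break point. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_COSTLY_SIZE (8 * SKETCH_PAGE_SIZE)

enum pidlist_backing { PIDLIST_KMALLOC, PIDLIST_VMALLOC };

/* Hypothetical helper: pick the allocator for an array of npids pids. */
static enum pidlist_backing pidlist_backing_for(size_t npids)
{
	if (npids * sizeof(pid_t) > SKETCH_COSTLY_SIZE)
		return PIDLIST_VMALLOC;
	return PIDLIST_KMALLOC;
}

/* Hypothetical helper: allocate the array and record how it was backed. */
static void *pidlist_alloc(size_t npids, enum pidlist_backing *backing)
{
	assert(backing != NULL);
	*backing = pidlist_backing_for(npids);
	/* Kernel equivalent would be kmalloc(size, GFP_KERNEL) or
	 * vmalloc(size); malloc() stands in for both in this sketch. */
	return malloc(npids * sizeof(pid_t));
}

/* Hypothetical helper: the free path must match the allocation path. */
static void pidlist_free(void *p, enum pidlist_backing backing)
{
	/* Kernel equivalent: vfree(p) for PIDLIST_VMALLOC, else kfree(p). */
	(void)backing;
	free(p);
}
```

With a 4-byte pid_t and these assumed values, the switch to vmalloc would
happen above 8192 pids, so small cgroups would never pay the vmalloc cost.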
>
>>
>> One could perhaps create an alias (symlink?) and leave that in place
>> for a few kernel releases and then remove the old names. The trick to
>> doing this politely is to arrange for a friendly printk to come out
>> when userspace uses the old filename, so people know to change their
>> tools. That printk should come out once-per-boot, not once-per-access.
>
> Personally, I feel that a bit of ugliness in the naming inconsistency
> is less painful than trying to deprecate something that people might
> be using.
That's what the people who designed x86 said :P
> If we could just flip the names without breaking anyone,
> that would be great, but this is just a style issue rather than a
> functional issue. My experience of such printk() statements scattered
> around in code is that no-one takes much notice of them.
Whether or not we get rid of the old ones, it would be good to put in
aliases with the new style now so there's the option of removing the
old style ones later.
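For reference, the alias-plus-warning scheme suggested above could look
roughly like the userspace sketch below. The file names "old_name" and
"new_name" and the function resolve_file_name are placeholders, not the real
cgroup file names or any existing kernel function, and the flag is not
thread-safe. In the kernel, printk_once() gives the once-per-boot behaviour
without a hand-rolled flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Placeholder names, assumed purely for illustration. */
#define OLD_NAME "old_name"
#define NEW_NAME "new_name"

static bool warned_old_name;          /* set after the first warning */
static unsigned int warnings_emitted; /* kept only so the sketch is checkable */

/*
 * Hypothetical lookup: both names resolve to the same file, and the
 * first use of the old name produces a single deprecation message.
 * Returns NULL for a name that matches neither.
 */
static const char *resolve_file_name(const char *name)
{
	assert(name != NULL);

	if (strcmp(name, NEW_NAME) == 0)
		return NEW_NAME;

	if (strcmp(name, OLD_NAME) == 0) {
		if (!warned_old_name) {
			warned_old_name = true;
			warnings_emitted++;
			fprintf(stderr, "'%s' is deprecated, please use '%s'\n",
				OLD_NAME, NEW_NAME);
		}
		return NEW_NAME;
	}

	return NULL;
}
```

Old tools keep working through the alias, and the message appears once no
matter how many times the old name is opened.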
>
> Paul
>
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers