[Devel] Re: [PATCH 1/1] cr: define CHECKPOINT_SUBTREE flag and sysctl
Oren Laadan
orenl at cs.columbia.edu
Mon Apr 27 15:13:49 PDT 2009
Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl at cs.columbia.edu):
>>
>> Serge E. Hallyn wrote:
>>> Quoting Nathan Lynch (ntl at pobox.com):
>>>> "Serge E. Hallyn" <serge at hallyn.com> writes:
>>>>> Quoting Nathan Lynch (ntl at pobox.com):
>>>>>> "Serge E. Hallyn" <serue at us.ibm.com> writes:
>>>>>>> + cnt = ref->users + 1;
>>>>>>> + switch (ref->type) {
>>>>>>> + case CR_OBJ_UTSNS:
>>>>>>> + utsns = ref->ptr;
>>>>>>> + cnt2 = (unsigned long) atomic_read(&utsns->kref.refcount);
>>>>>>> + if (cnt != cnt2) {
>>>>>>> + cr_debug("uts namespace leak\n");
>>>>>> I'm struggling to understand what guarantee a check such as this is
>>>>>> supposed to be making. I see that it will catch *some* undesirable
>>>>>> cases. But "current refcount equals old refcount" does not imply that
>>>>>> "refcount has not changed in the meantime".
>>>>> It's got nothing to do with the refcounts changing.
>>>>>
>>>>> It ensures that, at the end of the checkpoint, the resources (utsns
>>>>> in this case) had no users not accounted for by a checkpointed task.
>>>>> In other words, there was no information leak.
>>>> Okay, I had mistakenly believed this code was running in the
>>>> subtree/non-container case. I reread your patch description and it
>>>> indicates that these checks are made only in the case of container
>>>> checkpoint. If I'm (finally) understanding the patch correctly, my
>>>> concern is lessened. Comparing refcounts is still... unconventional.
>>> Yes, and there are cases where it won't be usable - for instance if
>>> opening a procfile increments the resource->use count. That should
>>> not be an issue for utsns, ipcns, files, or vmas, afaik.
>> Actually, one such case is if you have a FIFO - and a task outside the
>> "container" (for whatever definition we choose) opens that FIFO because
>> the right thingie is mounted in its (distinct) mounts namespace.
>
> That'll affect the CR_OBJ_INODE object, right? (Not the CR_OBJ_FILE
> one).
Yes. And the point is that this leak cannot be reliably detected.
... unless you hide FIFO's in the network namespace :o
(no ! I'm no suggesting...)
Oren.
>
>> Also, unsure if unix domain sockets (those visible through the file
>> system, not the "abstract" type) are otherwise isolated as well ?
>
> Yes, they are isolated by network namespace, to the chagrin of some
> people.
>
> -serge
>
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
More information about the Devel
mailing list