[CRIU] problem dumping some kinds of lxc containers

Pavel Emelyanov xemul at parallels.com
Wed Aug 27 07:27:25 PDT 2014


On 08/27/2014 05:43 PM, Tycho Andersen wrote:
> On Wed, Aug 27, 2014 at 05:11:39PM +0400, Pavel Emelyanov wrote:
>> On 08/27/2014 05:08 PM, Tycho Andersen wrote:
>>> Hi Pavel,
>>>
>>> On Wed, Aug 27, 2014 at 12:35:07PM +0400, Pavel Emelyanov wrote:
>>>> On 08/27/2014 03:18 AM, Tycho Andersen wrote:
>>>>> Hi all,
>>>>>
>>>>> I'm trying to dump an lxc container (created with the ubuntu-cloud
>>>>> template). I get:
>>>>>
>>>>> (00.563988) Error (files-reg.c:457): Can't link remap to /proc/20/mountinfo: No such file or directory
>>>>>
>>>>> /proc/20 doesn't exist, and when this happens there is no pid in the
>>>>> container with pid 20. This is a little confusing, though, since
>>>>> fill_fdlink() takes a struct fd_parms with a pid in the host pid ns,
>>>>> but gives back the path in the container pid ns.
>>>>>
>>>>> After a bit of debugging, I found that the process that is causing
>>>>> this problem is:
>>>>>
>>>>> root     17593  0.3  0.0  26052  1340 ?        S    17:49   0:00 \_ mountall --daemon
>>>>>
>>>>> If I try to checkpoint the container after mountall has exited, it all
>>>>> works fine.
>>>>>
>>>>> Any ideas what is going on here?
>>>>
>>>> Yes. CRIU finds an open file that cannot be opened by the path the kernel provides.
>>>> In your case this is because task 20 has died. At the same time, stat() reports
>>>> that the link count on that file is not 0 (this is due to how proc works), which
>>>> for a disk file would mean that the file "should exist" and we just have to
>>>> create some other name for it. This is called a "link remap". For disk files CRIU
>>>> handles it by creating a hard link to the file. For proc this will obviously not
>>>> work, so we have to invent something else.
>>>
>>> Thanks for the explanation. Any ideas on what the proper solution is?
>>
>> I was thinking that when we meet an open file of a dead task, we could create
>> a "fake" task on restore with the desired pid (it can be a light-weight task with FS,
>> VM, FILES, etc. shared with the parent), wait for its /proc/pid/smth to get opened,
>> then kill it.
>>
>> We have the TASK_HELPER state for that in CRIU; such helpers let us restore orphaned
>> pgrps and sessions. Probably these helpers can help here too :)
> 
> Just to repeat it in my own words so I understand: if we see a
> /proc/$pid that doesn't exist, we write it down somewhere (a new
> protobuf, or would it fit somewhere currently?), and then on restore
> we create a task helper for each pid we found. The rest of the restore
> process opens /proc/$pid/whatever, and then it is ok for the task
> helper to exit immediately once the restore completes and we are
> running the actual processes.

Exactly.
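
To make the sequence concrete, here is a minimal sketch (Python, not CRIU
code) of the helper idea: start a short-lived task, open one of its
/proc/<pid> files, then kill it and keep the descriptor. The function name
open_proc_file_via_helper is made up for illustration. The sketch assumes
Linux with /proc mounted, and it takes whatever pid fork() hands out; a real
restore would also have to give the helper the recorded pid, which is not
shown here.

```python
import os
import signal


def open_proc_file_via_helper(name="mountinfo"):
    """Open /proc/<helper-pid>/<name>, then kill the helper.

    Hypothetical illustration only: the helper gets an arbitrary pid,
    whereas a restore would need it to have the pid seen at dump time.
    """
    pid = os.fork()
    if pid == 0:
        # Helper task: do nothing, just exist until the parent kills it.
        try:
            signal.pause()
        finally:
            os._exit(0)

    try:
        # While the helper is alive its /proc entries can be opened.
        fd = os.open(f"/proc/{pid}/{name}", os.O_RDONLY)
    finally:
        # The helper is no longer needed once the file has been opened.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    return pid, fd


helper_pid, fd = open_proc_file_via_helper()
# The descriptor is still open although /proc/<helper_pid> is gone.
print(helper_pid, fd, os.path.exists(f"/proc/{helper_pid}"))
os.close(fd)
```

After the helper is reaped, the descriptor stays open while its /proc/<pid>
directory has disappeared, which is the state the dump found for mountall.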

> Does that sound about right? Any ideas where the "fake" pid list would
> go?

We have a (not extremely elegant) way of telling the file-restoring
engine that "the path of a file doesn't exist, so additional steps are
needed before opening it". It is called a "remap".

Currently we have 2 types of remaps -- ghost and link. A ghost remap
is a file that doesn't have any name on disk and is only alive
because some process has it open. Ghosts are taken with us into the
images. Link remaps are files for which the path we see has been
unlinked, but which still have some other name. For these files we
create hard links on disk.
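
As a rough illustration of how these cases differ (a simplified Python
sketch, not how files-reg.c actually decides), the choice can be seen as a
function of the path, the link count from stat(), and whether the path still
leads to the file. The names Remap, classify and DEAD_PROC are made up here;
DEAD_PROC stands for the /proc/<pid> file of a dead task, which today falls
into the link case and fails.

```python
import re
from enum import Enum

# Matches paths such as /proc/20/mountinfo.
PROC_PID_RE = re.compile(r"^/proc/(\d+)/")


class Remap(Enum):
    NONE = "none"            # path still leads to the file, nothing to do
    GHOST = "ghost"          # no name left on disk, file goes into the image
    LINK = "link"            # path is gone but another name exists: hard-link
    DEAD_PROC = "dead-proc"  # hypothetical: /proc/<pid>/... of a dead task


def classify(path, nlink, path_resolves):
    """Pick a remap type for an open file (simplified assumption)."""
    if path_resolves:
        return Remap.NONE
    # proc reports a non-zero link count even for dead tasks, so this
    # check has to come before the nlink-based ones.
    if PROC_PID_RE.match(path):
        return Remap.DEAD_PROC
    if nlink == 0:
        return Remap.GHOST
    return Remap.LINK


print(classify("/proc/20/mountinfo", 1, False).value)
```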

I think it's worth trying to introduce a 3rd type of remap,
for proc files that belong to dead tasks.

Thanks,
Pavel


