[CRIU] Error CRIU restore because pid not matched

Pavel Emelyanov xemul at parallels.com
Tue Jan 13 07:28:19 PST 2015


On 01/13/2015 06:06 PM, Christopher Covington wrote:
> On 01/13/2015 08:36 AM, Pavel Emelyanov wrote:
>> On 01/12/2015 11:20 PM, Aris Setyawan wrote:
>>>> Do I get you right, when you restore the processes in pid namespace
>>>> you need to know its new PID in the initial one to dump it again
>>>> some time soon?
>>>
>>> Yes.
>>>
>>>> If yes, then the --pidfile option would help you. When used on restore
>>>> it makes CRIU write the real pid of the root task restored into a file
>>>> specified.
>>>
>>> But, you have said in earlier comment:
>>>
>>> "In theory we can let process live with whatever PID kernel allocates
>>> for it, but our knowledge of glibc says that most likely there will
>>> be BUGs."
>>>
>>> Does that "--pidfile" option still work? I just want to work around
>>> the PID mismatch error.
>>
>> Ah, sorry for the confusion. The option I mentioned only helps when
>> the task you restore lives in a pid namespace and criu knows about it.
>> When you create a new pidns yourself and run criu in it, criu still
>> thinks the task is not in a pidns and just restores it, so --pidfile
>> would write the virtual pid into the file.
>>
>> So the current state of things is -- if tasks didn't live in pid namespace
>> on dump, there's no handy way to restore them into new pidns and keep
>> tracking their new pids.
>>
>> How about teaching CRIU to restore tasks into a new pid namespace, even
>> if the tasks weren't in one on dump? I was thinking about such a feature
>> some time ago, but didn't think there would be a user for it.
>>
>> There's one thing about the feature I don't know how to handle. If we
>> dump a task with pid, e.g., 42, and then ask criu to restore it into a
>> pid namespace, criu would create a pidns, fork a task in it and restore
>> it from the images. This new task will have virtual pid 42 and some
>> other real pid (written into the pid file with the option). The
>> question is -- who should be the init of the new pid namespace,
>> i.e. the task with virtual pid 1?
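To make the "virtual pid vs. real pid" distinction concrete, here is a small Python sketch. It is not part of CRIU. It assumes a kernel that exposes an NSpid line in /proc/<pid>/status (Linux 4.1 and later, so newer than this thread), which lists the task's pid in each pid namespace, outermost first. The function name and the sample values are made up for illustration.

```python
def parse_nspid(status_text):
    """Return the pids on the NSpid line of a /proc/<pid>/status dump.

    The first entry is the pid as seen from the namespace that mounted
    procfs; later entries are the pids in successively nested namespaces.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line (older kernel)


# Made-up sample: a task restored with virtual pid 42 inside a new pidns,
# as it might look from the initial namespace.
sample = "Name:\tsleep\nPid:\t31337\nNSpid:\t31337\t42\n"
real_pid, virtual_pid = parse_nspid(sample)
print(real_pid, virtual_pid)  # prints "31337 42"
```

On a live system the input would be the contents of /proc/<pid>/status, read from the initial namespace.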
> 
> I might use the feature for restoring multiple copies of the same process. I
> don't really have a good reason for not using a namespace to begin with--just
> a little paranoia about it affecting performance and the probably unused
> opportunity to reuse existing checkpoints. The following is what I settled on
> for no namespace when dumping, but using a namespace on restore.
> 
> unshare -fp -- criu restore
> 
> So a waiting/non-detaching criu process is pid 1.

But this is ... nasty. Look, imagine some non-root process dies. All its
kids get reparented to init (i.e. criu), and then one of those kids dies.
CRIU's call to wait() returns and criu exits, thus bringing the whole
namespace down.
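
For illustration only, here is a minimal Python sketch of what a pid-1 waiter would have to do to avoid that: keep reaping whatever dies, and finish only when the root task itself exits. This is not what criu does today, and the helper names are hypothetical. In the demo both processes are direct children; inside a real pid namespace the first one would instead be an orphan reparented to init.

```python
import os
import time


def reap_until_root_exits(root_pid):
    """Reap every child, but return only once root_pid itself has exited."""
    while True:
        try:
            pid, status = os.wait()  # blocks until *any* child exits
        except ChildProcessError:
            return None  # no children left and the root task was not seen
        if pid == root_pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)  # killed by a signal
        # Some other child died (e.g. a reparented orphan): it has been
        # reaped, so just keep waiting instead of exiting.


def spawn(delay, code):
    """Hypothetical stand-in for a restored task: sleep, then exit."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(code)
    return pid


if __name__ == "__main__":
    spawn(0.1, 3)         # plays the process that dies first
    root = spawn(0.5, 7)  # plays the root task of the restored tree
    print(reap_until_root_exits(root))  # prints 7, not 3
```

A waiter that returned after the first os.wait() would have stopped when the short-lived process died, which is the failure described above.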

> This seems to work for
> trivial tests, but I haven't really tried to stress it. A shortcoming I noted
> at one point was that --shell-job didn't work in this configuration and I had
> to launch the dumpee with setsid (this was with summer 2014 code, it could
> have been fixed too).

:)

> I had to start building util-linux-ng for my embedded style root filesystems
> in order to get unshare. If criu supported this mode of operation itself (or
> busybox got unshare or I switched to toybox which I think has unshare), then
> that'd be one less dependency for me to manage, but that's not bubbled to the
> top of my list yet.

So we have one more reason for having this feature. That's great! I've
added it to the http://criu.org/Todo list.

Thanks,
Pavel



