[CRIU] Error CRIU restore because pid not matched
Christopher Covington
cov at codeaurora.org
Tue Jan 13 07:06:58 PST 2015
On 01/13/2015 08:36 AM, Pavel Emelyanov wrote:
> On 01/12/2015 11:20 PM, Aris Setyawan wrote:
>>> Do I get you right, when you restore the processes in pid namespace
>>> you need to know its new PID in the initial one to dump it again
>>> some time soon?
>>
>> Yes.
>>
>>> If yes, then the --pidfile option would help you. When used on restore
>>> it makes CRIU write the real pid of the root task restored into a file
>>> specified.
>>
>> But, you have said in earlier comment:
>>
>> "In theory we can let process live with whatever PID kernel allocates
>> for it, but our knowledge of glibc says that most likely there will
>> be BUGs."
>>
>> Is that "--pidfile" option still working? I just want to workaround
>> with PID mismatch error.
>
> Ah, sorry for confusion. The option I mentioned would only help when
> the task you restore lives in the pid namespace and criu knows it. When
> you create new pidns and run criu in it, it would still think task is
> not in pidns and will just restore one, thus the --pidfile would write
> the virtual pid in it.
>
> So the current state of things is -- if tasks didn't live in pid namespace
> on dump, there's no handy way to restore them into new pidns and keep
> tracking their new pids.
>
> How about teaching CRIU restore tasks into the new pid namespace, even
> if the tasks weren't in it on dump? I was thinking about such a feature
> some time ago, but didn't think there would be a user for it.
>
> There's one thing with the feature I don't know what to do about, here
> it is. If we dump a task with pid, e.g. 42, then ask criu to restore one
> into a pid namespace, then criu would create a pidns, fork a task in it
> and restore one from images. This new task will have virtual pid being
> 42 and real one being some other value (written into pid file with the
> option). The question is -- who should be the init of the new pid namespace,
> i.e. the task with virtual pid 1?

I might use the feature to restore multiple copies of the same process. I
don't really have a good reason for not using a namespace to begin with, just
a little paranoia about it affecting performance, plus the probably unused
opportunity to reuse existing checkpoints. The following is what I settled on
to dump without a namespace but restore into one:

    unshare -fp -- criu restore

So a waiting, non-detaching criu process is pid 1 in the new namespace. This
seems to work for trivial tests, but I haven't really tried to stress it. One
shortcoming I noted at one point was that --shell-job didn't work in this
configuration, so I had to launch the dumpee with setsid (this was with code
from summer 2014, so it may have been fixed since).
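
To make the two halves of that workflow concrete, here is a rough sketch of
how they could be wrapped up. The function names, the DRY_RUN switch and the
/tmp/ckpt path are made up for illustration and are not part of criu or
util-linux; only the criu -t and -D options and unshare -f -p are real. It
assumes the dumpee was started with setsid, as described above, so
--shell-job is left out.

```shell
#!/bin/sh
# Sketch only: run, dump_tree, restore_in_pidns and DRY_RUN are
# hypothetical names used for illustration.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Dump the process tree rooted at pid $1 into images directory $2.
# No pid namespace is involved at dump time.
dump_tree() {
    run criu dump -t "$1" -D "$2"
}

# Restore the images in directory $1 inside a new pid namespace.
# unshare -f -p forks the command as a child in the new namespace,
# so the non-detaching criu process ends up as pid 1 there.
restore_in_pidns() {
    run unshare -f -p -- criu restore -D "$1"
}
```

With DRY_RUN=1 set, "restore_in_pidns /tmp/ckpt" only prints
"unshare -f -p -- criu restore -D /tmp/ckpt" rather than running it, which is
handy for checking the command line on a box without criu installed.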

I had to start building util-linux-ng for my embedded-style root filesystems
in order to get unshare. If criu supported this mode of operation itself (or
busybox got unshare, or I switched to toybox, which I think has unshare), then
that'd be one less dependency for me to manage, but that hasn't bubbled to the
top of my list yet.

Chris
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project