[CRIU] Error CRIU restore because pid not matched
Pavel Emelyanov
xemul at parallels.com
Wed Dec 31 05:58:35 PST 2014
On 12/31/2014 04:50 PM, Aris Setyawan wrote:
> Hi,
>
> I still get many PID mismatches when the restored process was
> checkpointed a long time ago (more than one hour). Please note that I
> run this on a busy system, where many processes are started and killed
> very often.
>
> About your suggestion, I still can understand:
>
>> How to prevent this?
>> So it can't be fixed?
>
> In theory we can let the process live with whatever PID the kernel
> allocates for it, but our knowledge of glibc says that most likely
> there will be BUGs.
>
> One way to work around this is to unshare the pid namespace with
> unshare -p, then call restore. But in this case you may suffer from
> /proc being the proc from former pid namespace, not the new one. This,
> in turn, can be solved by unsharing the mount namespace too and
> re-mounting the /proc.
>
> The most viable solution for this type of use case is to checkpoint
> and restore tasks living in namespaces from the very beginning, i.e.
> start them in this or that form of container.
>
>> Btw, the error caused by "pid mismatch" still can occur. Is this an
>> expected behavior?
>
> Yes, some possibility to re-use the PID still exists. On a running
> system, doing C/R is only "safe" for containers.
>
>
> My questions:
> Is the PID mismatch error "guaranteed" to be impossible if I do C/R for a container?
Yes. When you C/R a whole container (even just a pid namespace) the "pid
mismatch" error is guaranteed NOT to happen.
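For reference, the workaround quoted above (unshare -p plus a private /proc) could be sketched roughly as below. This is only an illustration, not a tested recipe: the helper name restore_in_new_pidns and the image path are made up, and --fork and --mount-proc are my additions on top of plain "unshare -p" so that the command runs as the first process of the new pid namespace and sees a /proc that matches it. Note that in such a setup the restored tasks live only as long as the first process of the namespace does.

```python
import shlex


def restore_in_new_pidns(images_dir, extra_criu_args=()):
    """Build an argv that runs `criu restore` in fresh pid and mount namespaces.

    Hypothetical helper: it only assembles the command line, it does not run it.
    """
    return [
        "unshare",
        "--pid",         # new pid namespace, so the dumped PIDs are free again
        "--mount",       # new mount namespace, so /proc can be re-mounted privately
        "--fork",        # assumption: run the command as a child, i.e. PID 1 in the new namespace
        "--mount-proc",  # mount a /proc that belongs to the new pid namespace
        "criu", "restore",
        "--images-dir", images_dir,
        *extra_criu_args,
    ]


if __name__ == "__main__":
    # "/path/to/images" is a placeholder for the directory given at dump time.
    print(shlex.join(restore_in_new_pidns("/path/to/images")))
```

Running the resulting command needs root, and whether the restore then succeeds still depends on how the tasks were dumped.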
> Is there any documentation about this?
Not yet, but you've asked a great question :) I've created a wiki page [1]
that will eventually be filled with typical C/R failures and descriptions
of why they happen and what to do next.
[1] http://criu.org/When_C/R_fails
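As a rough illustration of the "why" for the pid mismatch case outside of a container: a PID recorded at dump time may simply belong to some other process by the time of restore. A minimal pre-flight check, assuming the caller already knows the list of dumped PIDs, might look like the sketch below. The names pid_in_use and busy_pids are hypothetical, this is not what CRIU does internally, and the check is racy on a busy system since a PID that is free now can be taken a moment later.

```python
import os


def pid_in_use(pid):
    """Return True if some process in the current pid namespace holds `pid`."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only probes for existence
    except ProcessLookupError:
        return False     # no such process, the PID is free right now
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True


def busy_pids(wanted):
    """Return the subset of `wanted` PIDs that are currently taken, sorted."""
    return sorted(p for p in wanted if pid_in_use(p))


if __name__ == "__main__":
    # The PID of this very process is always in use, so it is always reported.
    print(busy_pids([os.getpid()]))
```

An empty result only means the PIDs were free at the moment of the check, so it cannot replace restoring into a separate pid namespace.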
Thanks,
Pavel
P.S. I'd appreciate it if any discussion about CRIU happened with the mailing
list in Cc. My responsiveness throughput is limited :) but on the mailing
list there are quite a lot of other people who can help.