[CRIU] [PATCH] criu: Add exec-cmd option (v2)

Pavel Emelyanov xemul at parallels.com
Fri Mar 21 03:02:43 PDT 2014


On 03/21/2014 01:39 PM, Deyan Doychev wrote:
> 
> On 03/21/2014 11:14 AM, Pavel Emelyanov wrote:
>>>> ...
>>>>>>>>  		wait(NULL);
>>>>>>>>  
>>>>>>>>  	return 0;
>>>>>>>> @@ -1600,6 +1600,9 @@ int cr_restore_tasks(void)
>>>>>>>>  {
>>>>>>>>  	int ret = -1;
>>>>>>>>  
>>>>>>>> +	if (opts.exec_cmd && opts.restore_detach && daemon(1, 0))
>>>>>> In that case we will lose the ability to get error code from failed restore.
>>>>>> Can we move the daemonizing to the very end of the restore procedure?
>>>>
>>>> We want the new process to be the father of the restored processes so
>>>> the last place I see is right before the fork_with_pid(init); call in
>>>> restore_root_task.
>> Hm... Indeed. daemon() will fork a child process and make the parent exit.
>> Thus doing it after restore is not what we want :(
>>
>>>> Is this ok or do you have something else in mind?
>> This is OK, but this brings a problem -- we lose the ability to check whether
>> the restore failed or not.
>>
>> I think that making --exec-cmd work together with --restore-detached is quite
>> tricky. We have to fork another task, make this child do restore, then execv()
>> and then the parent should somehow check that restore succeeded and exit. Or
>> propagate the error code upwards.
> 
> I have an idea of how to implement this check. We can use two signal
> handlers in the parent process to catch CHLD and some other signal (USR1
> seems OK). The child process can then send USR1 before calling execvp.
> This way, receiving CHLD means the restore failed and USR1 means the
> restore succeeded. However, this has the following two problems:
> 
> 1. we cannot check the exec call

This is hardly required, since even if it fails, we cannot abort the
restored tree.

> 2. there is some chance of a race if the restored process finishes
> too fast

Yes. But the program we execve() should be prepared for that. We can,
by the way, execute it with SIGCHLD blocked, so that even if the restored
tree fails, the new parent binary gets SIGCHLD only when it's ready for
that.

> I am open to other ideas.

We can create a pipe and mark it with close-on-exec. If exec succeeds, the
pipe gets closed and the parent has a chance to catch this fact (its read()
on the other end returns EOF).
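
Something along these lines, as a minimal sketch only. The function
spawn_and_check_exec() is a hypothetical name, not something in the
tree, and here the child goes straight to exec; in our case it would
run the restore first and could report a restore failure over the same
pipe:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that execs argv[0].
 * Returns 0 if the exec succeeded, the child's errno (> 0) if the exec
 * failed, and -1 on a setup error in the parent.
 */
int spawn_and_check_exec(char *const argv[])
{
	int p[2], err = 0;
	ssize_t n;
	pid_t pid;

	assert(argv && argv[0]);

	if (pipe(p) < 0)
		return -1;

	/* The write end disappears by itself on a successful exec. */
	if (fcntl(p[1], F_SETFD, FD_CLOEXEC) < 0)
		goto err_close;

	pid = fork();
	if (pid < 0)
		goto err_close;

	if (pid == 0) {
		close(p[0]);
		execvp(argv[0], argv);
		/* Only reached on failure: tell the parent why. */
		err = errno;
		if (write(p[1], &err, sizeof(err)) != sizeof(err))
			_exit(126);
		_exit(127);
	}

	/* The parent must drop its own write end, or read() never sees EOF. */
	close(p[1]);
	do {
		n = read(p[0], &err, sizeof(err));
	} while (n < 0 && errno == EINTR);
	close(p[0]);

	/* Reaped here for the demo; a detaching parent would just exit. */
	while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
		;

	if (n == 0)
		return 0;		/* EOF: exec succeeded */
	if (n == (ssize_t)sizeof(err))
		return err;		/* exec failed, errno from the child */
	return -1;

err_close:
	close(p[0]);
	close(p[1]);
	return -1;
}
```

With this the parent blocks in read() until the child either execs or
reports an error, so there is no signal race to worry about.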

>>
>> Deyan, can we do it in two steps -- first you make --exec-cmd work without the -d
>> option, i.e. -- the crtools process will just call execv at the end and that's
>> it. Its parent will wait for it to finish. And with a check that --exec-cmd and
>> -d are not used together (for sanity). I will commit this patch.
>>
>> Then you implement the support for -d and --exec-cmd together, so that we correctly
>> and synchronously handle the failed restore code.
>>
>> Is that OK for you?
>>
> 
> Yes, it is OK.
> 
> I will have some time between the two patches to work on the LXC daemon
> we will execute to restore containers and the console plugin.

Cool! I'm looking forward to the 1st patch :)

Thanks,
Pavel

