[CRIU] [PATCH] criu: Add exec-cmd option.

Deyan Doychev deyan at 1h.com
Thu Mar 20 08:06:34 PDT 2014


On 03/20/2014 04:36 PM, Pavel Emelyanov wrote:
> On 03/20/2014 06:25 PM, Deyan Doychev wrote:
>> Hi Christopher,
>>
>> On 03/20/2014 04:00 PM, Christopher Covington wrote:
>>>>>>> In my testing, a non-zero exit code wasn't propagated to the command line on
>>>>>>> failure.
>>>>> This is because the restore process becomes a daemon prior to restoring
>>>>> the dumped process when --exec-cmd is used.
>>>>>
>>>>> I am not sure what the right action has to be if we fail to execute the
>>>>> command as we have already restored the processes.
>>>>> Should we consider this a full failure and if so - should we kill the
>>>>> processes we have restored?
> Good question. I really doubt we should kill them, since this creates a gap
> between restore and kill during which tasks may make progress that can make
> the images obsolete, e.g. if they have a live TCP connection.
>>>>> Maybe the right thing to do is to daemonize only when -d was given, instead
>>>>> of implying this option and always daemonizing. This way, if -d is not
>>>>> specified, we will exit with failure. But please advise what we should do
>>>>> with the restored processes.
>>> What do the options look like in the LXC context?
>>>
>>> For the perf use case killing would make retrying with a corrected exec-cmd
>>> string slightly easier. Letting it run would be fine too, though, since it's
>>> not that much work for the user to manually kill the process before retrying.
>>>
>>> In my use case it's not acceptable to have an orphaned restored process
>>> running in a separate PID namespace because it might alter system performance
>>> undesirably. However, I can imagine other workloads where extra copies,
>>> especially if they were sleeping, might not be much to worry about.
>>>
>>> Regards,
>>> Christopher
>> Killing the tasks seems to be better for us as well. If we leave them
>> running we have a running container that is absolutely ready for use but
>> LXC does not know about it and it looks to the outside LXC world like it
>> is not running.
> That's for the case of LXC container. OpenVZ container, for example, lives
> w/o task watching it, so we simply restore it with the --restore-detached
> option.

Yes, this seems like a good argument not to kill them: someone may try
to restore an OpenVZ container with some task watching it, and we could
make the images obsolete if we fail to execute the task. This may happen
the way you described above, with the TCP connection.

>> I currently can't imagine a use case where it will be a good idea to
>> restore the tasks without executing the command. Anyone else?
> If we're talking about restoring w/o executing _and_ w/o detaching, then two
> use cases I know are:

Sorry, I think I was not clear enough. I am talking about restoring with
executing and detaching in the case when execve fails.
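
To make the failure mode concrete, here is a minimal sketch of why the exit
code gets lost. It is not CRIU code: run_command and its detach parameter are
hypothetical names, and the double fork only stands in for the daemonizing
step discussed above. With detach=True the process the shell waits on exits 0
before the exec is attempted, so a failed exec is invisible to the caller.
With detach=False the failure reaches the caller as a non-zero status.

```python
import os


def run_command(argv, detach):
    """Return the exit status the invoking shell would observe.

    Hypothetical model of the thread's scenario, not CRIU's implementation.
    """
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the restore process.
        if detach:
            # Daemonize first: the process the caller waits on exits 0
            # before anyone knows whether the exec will succeed.
            if os.fork() > 0:
                os._exit(0)
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # Exec failed. When detached, nobody is left to see this status.
            os._exit(1)
    # Parent: stands in for the shell that invoked the restore.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1
```

Calling run_command(["/nonexistent/cmd"], detach=False) reports the failure,
while the same call with detach=True reports success even though nothing was
executed. That is the case I am asking about.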

Regards,
Deyan

