[CRIU] criu_restore() in Open MPI problems

Pavel Emelyanov xemul at parallels.com
Thu Apr 10 10:13:46 PDT 2014


On 04/10/2014 07:36 PM, Adrian Reber wrote:
> On Wed, Apr 09, 2014 at 07:17:15PM +0400, Pavel Emelyanov wrote:
>>>> I think the restore scheme should look like this:
>>>> We run orterun, which prepares pipes and executes "CRIU restore".
>>>> The OpenMPI plugin takes the prepared pipes and restores them onto the
>>>> proper file descriptors.
>>>
>>> It took me a while, but I have now tried to restore a process from
>>> orterun/mpirun by exec()ing 'criu restore'.
>>
>> Thanks for doing this! :)
>>
>>> I still get this error:
>>>
>>> (00.030652)   4277: tty: open type pts id 0x2 index 14 (master 0 sid 0 pgrp 0 inherit 1)
>>> (00.030654)   4277: Error (tty.c:541): tty: Can't dup SELF_STDIN_OFF: Bad file descriptor
>>> (00.031052) Error (cr-restore.c:1035): 4277 exited, status=255
>>> (00.031072) Error (cr-restore.c:1577): Restoring FAILED.
>>>
>>> You are talking about a plugin which restores the missing pipes. What
>>> would such a plugin have to look like? Do you have any examples of
>>> re-creating pipes in a plugin?
>>
>> Well, we don't have any pipe-restoring plugins. But we have an example of
>> a plugin for unknown file descriptors at test/zdtm/live/static/criu-rtc.c.
>> This one sits on the cr_plugin_dump_file/cr_plugin_restore_file hooks. I
>> think it's perfectly possible to add what is needed for pipes.
>>
>> You might also be interested in this mailing list thread:
>> http://lists.openvz.org/pipermail/criu/2014-March/012929.html
>>
>> In it Cyrill tried to provide hooks for intercepting the dump of TCP sockets.
>> Intercepting the dump of pipes should probably look similar.
>>
>> BTW, since the pipes we're talking about are really external (one end sits
>> outside of our dump tree), I think we should detect this fact and call
>> plugins for external pipes.
>>
>> Hope that helps :) If you need more info -- feel free to ask.
> 
> Another thought. I am running mpirun, which starts opal-restart, which
> execvp()'s 'criu restore'. opal-restart's FDs (stdout/stderr) are
> already connected to the correct pipes of mpirun. Is there a way for
> criu to inherit those FDs, so that instead of re-opening stdout/stderr
> it just uses the FDs 'criu restore' already has? Maybe the plugin could
> detect that there is already a pipe at the same FD and just re-use
> that pipe?

I think the plugin should detect via the environment that we have such a
pipe and use it on restore. What do you think, would that be possible?
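
A minimal sketch in C of how such an environment-based check could look.
This is an illustration under stated assumptions, not a working plugin: the
variable name CRIU_PLUGIN_INHERIT_PIPES and the helpers advertise_fds(),
fd_is_pipe() and should_reuse_fd() are hypothetical and exist in neither
CRIU nor Open MPI, and the wiring into the plugin hooks is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical variable name; not defined by CRIU or Open MPI. */
#define INHERIT_ENV "CRIU_PLUGIN_INHERIT_PIPES"

/* Launcher side: store the descriptors as a comma-separated list in
 * env_name so that the exec()ed process can see them. Returns 0 on success. */
static int advertise_fds(const char *env_name, const int *fds, size_t n)
{
	char buf[256];
	size_t off = 0, i;

	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		int w = snprintf(buf + off, sizeof(buf) - off, "%s%d",
				 i ? "," : "", fds[i]);
		if (w < 0 || (size_t)w >= sizeof(buf) - off)
			return -1;	/* list does not fit into buf */
		off += (size_t)w;
	}
	return setenv(env_name, buf, 1);
}

/* Return 1 if fd is open and refers to a pipe (FIFO), 0 otherwise. */
static int fd_is_pipe(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return 0;	/* closed or invalid descriptor */
	return S_ISFIFO(st.st_mode) ? 1 : 0;
}

/* Restore side: return 1 only if fd is listed in env_name AND is
 * currently an open pipe in this process, 0 otherwise. */
static int should_reuse_fd(const char *env_name, int fd)
{
	const char *p;
	char *end;
	long val;

	assert(env_name != NULL);
	p = getenv(env_name);
	if (!p)
		return 0;

	while (*p) {
		errno = 0;
		val = strtol(p, &end, 10);
		if (end == p || errno)
			return 0;	/* malformed list: do not reuse */
		if (val == fd)
			return fd_is_pipe(fd);
		if (*end != ',')
			return 0;	/* end of list or unexpected character */
		p = end + 1;
	}
	return 0;
}
```

With something like this, opal-restart would call advertise_fds() for its
stdout/stderr before execvp()ing 'criu restore', and the restore side would
skip re-opening any descriptor for which should_reuse_fd() returns 1.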

Thanks,
Pavel

