[CRIU] Restore failed. Exit code: 43

Pavel Emelyanov xemul at parallels.com
Tue Jan 20 07:13:19 PST 2015


On 01/20/2015 05:57 PM, Paschalis Mpeis wrote:
> Thanks for your replies Pavel.
> Please see inline!
> 
> My 1st try on restore:
>  
> 
>     >
>     >     criu_set_log_file("restore.log");
>     >     pid = criu_restore_child();
>     >     if (pid <=0){ what_err_ret_mean(pid);
>     >     exit(-1);
>     >     }
>     >
>     >     if(waitpid(pid, &ret, 0)<0){
>     >     perror("Can't wait for restore");
>     >     kill(pid,SIGKILL);
>     >     exit(-1);
>     >     }
>     >     return chk_exit(ret,SUCC_DUMP_ECODE);
>     >
>     >
>     > chk_exit was printing the "exit 43" message. It is the function found here:
>     > https://github.com/xemul/criu/blob/master/test/libcriu/lib.c
> 
>     Ah, I see :) Then everything seems to be OK. Look, when you call the criu_dump()
>     with zero pid what gets dumped is the process that does this call in the state
>     when it has just sent the dump request to service. And this particular state
>     is written in the images.
> 
>     Thus, when you call criu_restore_child() the restored process gets restored in
>     a state where it "thinks" as if it has just received the response from the "dump"
>     request.
> 
>     Then, in the code above, this situation ends up in ret = SUCC_RSTR_ECODE (which
>     is 43) and exit(ret) which thus is exit(43). Then the parent wait()-s for the
>     kid, checks its exit code, sees that it is 43 and prints it on the screen.
> 
>     And this is what I see in your logs :) So, for now, everything looks to work as
>     expected. If you think it's not, let's discuss how you expect it to work, we'll
>     try to help.
> 
> 
> My second try on restore:
>  
> 
>     > Then I changed the restore code simply:
>     >
>     >     criu_set_log_file("restore.log");
>     >     criu_restore();
>     >
>     >
>     > This produces a similar restore.log, with success messages, but the program does not seem to continue.
> 
>     But according to the code you have shown the program continues with returning
>     from the criu_dump() call, then sets ret to SUCC_RSTR_ECODE == 43, then calls
>     exit().
>
> 
> On my 1st try, the restore function was calling "chk_exit".
> But this function never calls "exit". Where did the exit occur?

Inside the restored process.

> Also, on the second try, I simply call criu_restore().
> After the restore, I simply return 0, and then the execution flow goes to the main function,
> which then simply returns.

But criu_restore() creates a new process and puts it into a state where it
returns from criu_dump() and calls exit() anyway.
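
To illustrate where the exit happens, here is a minimal sketch that does not use
CRIU at all. It assumes plain POSIX fork() as a stand-in for the new process that
the restore creates, and the helper name wait_for_exit_code() is hypothetical.
The child is the one calling exit; the parent only reads that status back through
waitpid(), which is the same thing a chk_exit-style check reports.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SUCC_RSTR_ECODE 43 /* value quoted in this thread */

/*
 * Hypothetical helper, not part of libcriu: fork a child that exits
 * with `code`, then return the exit code the parent observes, or -1
 * if fork/waitpid failed or the child did not exit normally.
 */
int wait_for_exit_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)
        _exit(code); /* the exit happens here, inside the child */

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    if (!WIFEXITED(status))
        return -1;

    /* the parent never called exit; it only decodes the child's status */
    return WEXITSTATUS(status);
}
```

Calling wait_for_exit_code(SUCC_RSTR_ECODE) in the parent yields 43 even though
the parent itself never exits with that code.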

> But I do not see anything being replayed, e.g. the output of the initial execution
> (that I captured earlier) does not show up again. What am I doing wrong here?
> 
> My main function is very simple, like this:
> 
>         if(crMode==CAPTURE){
>             runLinpack();
>             // in the above function, at some point of execution,
>             // I initialise_criu, and then I dump the application

AFAIU you dump it w/o specifying the pid, so what you dump is this process itself.

>        }
>         else if(crMode==REPLAY){
>             initialise_criu(CRIU_IMG_DIR);
>             restoreApplication();

In this call you will create a new process, that would then be restored in the
state as if it has just called the criu_dump() method of the library. Then the
execution of _this_ _new_ process will continue, the task (according to the
dumping code you've shown) will call exit(43) and that's it.
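
The same idea as a rough sketch, again without CRIU. Here fake_snapshot() is a
hypothetical stand-in built on fork(), not the libcriu API: it returns 0 in the
original process and 1 in the copy, which is only an assumption made to mimic a
self-dump followed by a restore. The point it shows is that the copy resumes right
after the snapshot call, so whatever ran before that call is not run again.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SUCC_RSTR_ECODE 43 /* value quoted in this thread */

/*
 * Hypothetical stand-in for "self-dump, then restore" (NOT libcriu).
 * Returns 0 in the original process, 1 in the copy, -1 on error.
 */
static int fake_snapshot(pid_t *copy)
{
    *copy = fork();
    if (*copy < 0)
        return -1;
    return *copy == 0 ? 1 : 0;
}

/* Returns the exit code of the copy as seen by the original, or -1. */
int run_and_snapshot(void)
{
    pid_t copy;
    int status, ret;
    int work_done = 0;

    work_done++; /* work (and any output) that happens before the snapshot */

    ret = fake_snapshot(&copy);
    if (ret < 0)
        return -1;

    if (ret == 1) {
        /*
         * The copy continues from here. The line above was not
         * executed a second time, so work_done is still 1.
         */
        _exit(work_done == 1 ? SUCC_RSTR_ECODE : 1);
    }

    /* the original process waits for the copy and reads its exit code */
    if (waitpid(copy, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

In this sketch run_and_snapshot() returns 43 in the original process, and nothing
placed before the fake_snapshot() call is repeated by the copy.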

>         }
>         return 0;
> 
>  
> I run the main function two separate times. One to capture, but still continue playing the
> application, and another one to replay the application from the earlier snapshot!

I'm now confused, sorry. Can you post 

* the full program you run (either, but only one way of calling restore, not two)
* the way you run it (the shell commands you execute, one by one)
* the result you see (if dump and restore are OK, then logs are not required)
* and what you expect it to look like

Thanks,
Pavel


