[CRIU] Restore failed. Exit code: 43

Pavel Emelyanov xemul at parallels.com
Tue Jan 20 05:23:53 PST 2015


On 01/20/2015 03:45 PM, Paschalis Mpeis wrote:
> The initialisation code that runs before dump and restore is the following:
> 
>         int img_fd = open_img_dir("wdir/i/linpack_cr/");
> 
>         criu_init_opts();
>         criu_set_service_address("./wdir/s/cs.sk");
>         criu_set_images_dir_fd(img_fd);
>         criu_set_log_level(4);
> 
> 
> The code to dump (which seems to work okay) is this:
> 
>        int pid, ret;
>         // Create a child
>         pid = fork();
>         assert(pid>=0);
> 
>         if(!pid){     // The child will dump itself
>             close(0); close(1); close(2);
>             assert(setsid()>=0);
>             criu_set_log_file("dump.log");
>             criu_set_leave_running(true);
>             ret = criu_dump();
>             if (ret < 0){
>                 what_err_ret_mean(ret);
>                 exit(1);
>             }
>             if (ret ==0)
>                 ret = SUCC_DUMP_ECODE;
>             else if (ret ==1)
>                 ret = SUCC_RSTR_ECODE;
>             else
>                 ret =1;
>             exit(ret);
>         }// end-of child code
>         // Wait for the child to be captured
>         if(waitpid(pid,&ret,0)<0){
>             perror("Can't wait child");
>             kill(pid, SIGKILL);
>             exit(-1);
>         }
>         if(chk_exit(ret,SUCC_DUMP_ECODE)){
>             kill(pid,SIGKILL);
>             exit(-1);
>         }
> 
> 
> 
> Initially the restore code was taken from one of your tests.
> I was using:
> 
>     criu_set_log_file("restore.log");
>     pid = criu_restore_child();
>     if (pid <= 0) {
>         what_err_ret_mean(pid);
>         exit(-1);
>     }
> 
>     if (waitpid(pid, &ret, 0) < 0) {
>         perror("Can't wait for restore");
>         kill(pid, SIGKILL);
>         exit(-1);
>     }
>     return chk_exit(ret, SUCC_DUMP_ECODE);
> 
> 
> chk_exit was printing the "exit 43" message. It is the function found here:
> https://github.com/xemul/criu/blob/master/test/libcriu/lib.c

Ah, I see :) Then everything seems to be OK. Look, when you call criu_dump()
with a zero pid, what gets dumped is the process that makes this call, in the
state it is in just after sending the dump request to the service. And this
particular state is what is written to the images.

Thus, when you call criu_restore_child(), the process is restored in a state
where it "thinks" it has just received the response to the "dump" request.

Then, in the code above, this situation ends up in ret = SUCC_RSTR_ECODE (which
is 43) and exit(ret), which is thus exit(43). The parent then wait()-s for the
kid, checks its exit code, sees that it is 43 and prints it on the screen.

And this is what I see in your logs :) So, for now, everything seems to work as
expected. If you think it doesn't, let's discuss how you expect it to work, and
we'll try to help.
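
To make the flow above concrete, here is a minimal sketch that does not call
CRIU at all. It only reproduces the mapping from the criu_dump() return value
to the child's exit code, as in your dump code, and the way a parent reads that
code back after waitpid(). The helper names exit_code_for_dump_ret() and
observed_exit_code() are made up for illustration, and the value used for
SUCC_DUMP_ECODE is an assumption (only SUCC_RSTR_ECODE == 43 is established in
this thread).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed placeholder value; only SUCC_RSTR_ECODE == 43 is confirmed here. */
#define SUCC_DUMP_ECODE 41
#define SUCC_RSTR_ECODE 43

/*
 * Hypothetical helper: mirrors the if/else chain in the dump code.
 * ret == 0 is the path taken right after the dump,
 * ret == 1 is the path taken by the restored process,
 * anything else is treated as a failure (exit code 1).
 */
static int exit_code_for_dump_ret(int ret)
{
    if (ret == 0)
        return SUCC_DUMP_ECODE;
    if (ret == 1)
        return SUCC_RSTR_ECODE;
    return 1;
}

/*
 * Hypothetical helper: forks a child that pretends criu_dump() returned
 * simulated_ret, then reports the exit code the parent observes.
 * Returns -1 if the child could not be waited for or did not exit normally.
 */
static int observed_exit_code(int simulated_ret)
{
    int status;
    pid_t pid = fork();

    assert(pid >= 0);
    if (pid == 0)
        _exit(exit_code_for_dump_ret(simulated_ret));

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling observed_exit_code(1) corresponds to the restored process: the parent
sees SUCC_RSTR_ECODE, so a check on the restore side would presumably want to
compare against that code rather than SUCC_DUMP_ECODE.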

> Then I changed the restore code simply:
> 
>     criu_set_log_file("restore.log");
>     criu_restore();
> 
> 
> This produces a similar restore.log, with success messages, but the program does not seem to continue.

But according to the code you have shown, the program does continue: it returns
from the criu_dump() call, sets ret to SUCC_RSTR_ECODE == 43, and then calls
exit().
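
If the intent is for the restored program to keep working rather than exit,
one possible shape is to branch on the criu_dump() return value and run the
rest of the workload on the "restored" path. The sketch below is only an
illustration under that assumption: classify_dump_ret(), resume_after_dump()
and sample_workload() are hypothetical names, not part of libcriu, and no CRIU
call is made.

```c
#include <assert.h>

/* Hypothetical classification of what criu_dump() returned. */
enum after_dump {
    AFTER_DUMP_FAILED,       /* negative or unexpected value */
    AFTER_DUMP_CHECKPOINTED, /* 0: we are the original, just dumped */
    AFTER_DUMP_RESTORED      /* 1: we are the restored process */
};

static enum after_dump classify_dump_ret(int ret)
{
    if (ret == 0)
        return AFTER_DUMP_CHECKPOINTED;
    if (ret == 1)
        return AFTER_DUMP_RESTORED;
    return AFTER_DUMP_FAILED;
}

/* Hypothetical stand-in for whatever the program should do after restore. */
static int sample_workload(void)
{
    return 7;
}

/*
 * Decide what happens after criu_dump() returns:
 * the restored process runs the workload and returns its result,
 * the freshly dumped process returns 0, and a failure returns -1.
 */
static int resume_after_dump(int ret, int (*workload)(void))
{
    switch (classify_dump_ret(ret)) {
    case AFTER_DUMP_RESTORED:
        return workload();
    case AFTER_DUMP_CHECKPOINTED:
        return 0;
    default:
        return -1;
    }
}
```

Note that the dump code closes descriptors 0, 1 and 2 before dumping, so the
restored process would need some other channel (a log file, for instance) to
show that it is still running.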

> Thanks for your replies.
> I haven't found any examples other than the tests directory. That's why I based my code on them.

Well, yes :) From the very beginning we have treated the tests as the best code
examples, since they both provide the code and (!) work.

Thanks,
Pavel


