[CRIU] Restore error

Gabriel Southern southerngs at gmail.com
Fri Feb 26 23:39:52 PST 2016


Hi,

I'm using CRIU with Docker, via the Docker fork at
https://github.com/boucher/docker.git.

Sometimes my attempts to restore a container fail, and when I look in the
criu logs I see an error like the following (full restore log available
here: https://gist.github.com/southerngs/34d3ce928f35e24e3dbb):

(00.401627)      1: Restoring resources
(00.401633)     22: Restoring fd 0 (state -> prepare)
(00.401644)     22: Create transport fd /crtools-fd-22-0
(00.401644)      1: Opening fdinfo-s
(00.401652)      1: Restoring fd 0 (state -> prepare)
(00.401655)      1: Restoring fd 1 (state -> prepare)
(00.401658)      1: Restoring fd 2 (state -> prepare)
(00.401660)      1: Restoring fd 0 (state -> create)
(00.401663)     22: Error (files.c:840): Can't bind unix socket /crtools-fd-22-0: Address already in use
(00.401684)      1: Create fd for 0
(00.401687)      1: Wait fdinfo pid=22 fd=0
(00.403238)      1: Error (cr-restore.c:1302): 22 exited, status=1
(00.459682) Error (cr-restore.c:1304): 6804 killed by signal 9
(00.526451) Error (cr-restore.c:2130): Restoring FAILED.
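
For reference, "Address already in use" is the EADDRINUSE errno returned by
bind(). The sketch below is not CRIU's code; it only shows that binding two
unix sockets to the same name yields that errno. It assumes a
filesystem-path socket in a temporary directory, and the function name
bind_twice is hypothetical.

```python
import errno
import os
import socket
import tempfile


def bind_twice(path):
    """Bind two unix sockets to the same path.

    Returns the errno raised by the second bind, or 0 if it succeeded.
    """
    first = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        first.bind(path)
        try:
            # The name is already taken by `first`, so this should fail.
            second.bind(path)
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()
        if os.path.exists(path):
            os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The socket name mirrors the log above; the directory is an assumption.
        code = bind_twice(os.path.join(tmp, "crtools-fd-22-0"))
        print(errno.errorcode.get(code, code))
```

If the same thing is happening here, two restore tasks would be trying to
claim the same socket name at the same time, but that is only a guess from
the log.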

This error is not completely deterministic.  Usually, if a restore attempt
fails and I wait and retry the command, it succeeds the second time.  The
problem only occurs when there is a lot of checkpoint/restore activity
going on, but my use case involves restoring a lot of containers
simultaneously and letting them run for a short period of time.  I might be
able to work around the problem by catching the error from a failed restore
and then retrying, but reducing the number of failed restore attempts would
be helpful for me.
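
The retry workaround could look roughly like the sketch below. It is only an
illustration: the function name restore_with_retry, the attempt count, and
the delay values are assumptions, and the restore command itself is passed
in by the caller rather than hard-coded. Randomized backoff is used so that
many simultaneous restores do not all retry at the same instant.

```python
import random
import subprocess
import sys
import time


def restore_with_retry(cmd, attempts=5, base_delay=0.5, max_delay=8.0):
    """Run cmd until it exits with status 0 or the attempts run out.

    Returns the number of attempts used. Raises CalledProcessError if the
    last attempt also fails.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        if attempt == attempts:
            raise subprocess.CalledProcessError(
                result.returncode, cmd, result.stdout, result.stderr
            )
        # Exponential backoff with full jitter, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        time.sleep(random.uniform(0, delay))


if __name__ == "__main__":
    # Pass the real restore command on the command line; nothing is assumed here.
    if len(sys.argv) > 1:
        used = restore_with_retry(sys.argv[1:])
        print(f"succeeded after {used} attempt(s)")
```

This only hides the failure; it does not address whatever causes the bind
error in the first place.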

I'm working with criu from the github master branch.  I produced the error
with version:
Version: 2.0
GitID: v1.8-413-g2fd16c3

Unfortunately I don't know how criu works well enough to have good
troubleshooting ideas just from looking at this log.  So I thought I'd ask
here to see if there are any suggestions so I can understand the root cause
and what I might be able to change to prevent it.  Any advice is
appreciated.

Thanks,

-Gabriel