[CRIU] Checkpoint and restore application has established unix domain socket connections.

Nicolas Viennot Nicolas.Viennot at twosigma.com
Tue Sep 8 23:42:15 MSK 2020


I think we had the same need. Here’s the commit that solves this problem: https://github.com/twosigma/criu/commit/e562e97f29c98d155b02a871493533b24ecf2abb

Nico

---

From: criu-bounces at openvz.org <criu-bounces at openvz.org> On Behalf Of Jun Gan
Sent: Sunday, September 6, 2020 7:51 PM
To: criu at openvz.org
Subject: [CRIU] Checkpoint and restore application has established unix domain socket connections.

Hi,

I'm using CRIU to checkpoint and restore Redis, which has a redis-cli connected to it via a Unix domain socket. I cannot just use --ext-unix-sk, as it sets the socket type to STREAM. Here is the command I used for the dump:

criu dump -t 23253 -D criu_imgs/ -o dump.log --shell-job -v4 --external unix[2940132]

And here is the command I used for restore:

criu restore -D criu_imgs/ -o restore_.log --shell-job -v4 --ext-unix-sk --tcp-close

The restore then failed with a segmentation fault, as shown in the restore log:

(00.014810)  23253: unix: Opening standalone (stage 0 id 0x10 ino 2940132 peer 2940131)
(00.014823)  23253:             Create fd for 8
(00.014826)  23253: unix: Opening standalone (stage 1 id 0x10 ino 2940132 peer 2940131)
(00.014829)  23253: unix:       Connect 2940132 to 2940131
(00.015683) Error (criu/cr-restore.c:1417): 23253 killed by signal 11: Segmentation fault
(00.015780) Error (criu/cr-restore.c:2293): Restoring FAILED.


Checking the code, I found that CRIU still tries to restore this socket by connecting it back to its original peer, which may no longer be available. Is there an option to make CRIU skip restoring this socket and leave it closed, just like "--tcp-close" does for TCP connections? In my case I don't need the connection anymore and can simply ask the client to reconnect.
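
For illustration, the client-side reconnect mentioned above could look roughly like the sketch below. It assumes a stream-type Unix domain socket; the function name connect_with_retry and its parameters are hypothetical and are not part of redis-cli or CRIU.

```python
import socket
import time


def connect_with_retry(path, attempts=5, delay=0.2):
    """Connect to a Unix stream socket at `path`, retrying while the server is absent.

    Hypothetical helper: `path`, `attempts`, and `delay` are illustrative names.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock  # connected; the caller owns the socket
        except (FileNotFoundError, ConnectionRefusedError) as exc:
            # Socket file missing or nobody listening yet: close and retry.
            sock.close()
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"could not connect to {path}") from last_error
```

A client would call this with the path of the server's socket file once it notices that its old connection was closed after the restore.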

-- 
Jun Gan


