<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi Nicolas,<div><br></div><div>Thanks a lot for sharing your solution. I just tried it, and the restore failed with this error:</div><div><br></div><div><div>(00.165775) 773832: unix: Opening standalone (stage 0 id 0x10 ino 7105412 peer 7105411)</div><div>(00.165791) 773832: unix: bind id 0x10 ino 7105412 addr /tmp/redis.sock</div><div>(00.165798) 773832: Error (criu/sk-unix.c:1637): unix: Can't bind id 0x10 ino 7105412 addr /tmp/redis.sock: Address already in use</div><div>(00.165804) 773832: Error (criu/files.c:1211): Unable to open fd=8 id=0x10</div><div>(00.166031) Error (criu/cr-restore.c:1565): 773832 exited, status=1</div><div>(00.166044) Error (criu/cr-restore.c:2488): Restoring FAILED.</div><div><br></div></div><div>It seems CRIU treats the socket as a server socket in this case. I also tried skipping the bind when the state is TCP_CLOSE, but then the restore segfaults because it still tries to connect back to the previous peer.</div><div><div style="color:rgb(0,0,0)"><div><br></div><div>(00.133302) 773853: Create fd for 6</div><div>(00.133305) 773853: unix: Opening standalone (stage 0 id 0xf ino 7113121 peer 0)</div><div>(00.133329) 773853: unix: bind id 0xf ino 7113121 addr /tmp/redis.sock</div><div>(00.133420) 773853: unix: Putting 7113121 into listen state</div><div>(00.133425) 773853: sockets: 7 restore sndbuf 212992 rcv buf 212992</div><div>(00.133428) 773853: sockets: restore priority 0 for socket</div><div>(00.133430) 773853: sockets: restore rcvlowat 1 for socket</div><div>(00.133432) 773853: sockets: restore mark 0 for socket</div><div>(00.133436) 773853: Create fd for 7</div><div>(00.133438) 773853: unix: Opening standalone (stage 0 id 0x10 ino 7113153 peer 7113152)</div><div>(00.133451) 773853: Create fd for 8</div><div>(00.133455) 773853: unix: Opening standalone (stage 1 id 0x10 ino 7113153 peer 7113152)</div><div>(00.133457) 773853: unix: Connect 7113153 to 
7113152</div><div>(02.630685) Error: 773853 killed by signal 11: Segmentation fault</div><div>(02.630709) Error: Restoring FAILED.</div></div></div><div><br></div><div>Where do you handle the TCP_CLOSE state during restore? Also, I don't quite understand what fle->stage means here. </div><div><div><br></div><div><br></div><div><div></div></div></div><div>Thanks,</div><div>Jun Gan</div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Sep 8, 2020 at 1:42 PM Nicolas Viennot <<a href="mailto:Nicolas.Viennot@twosigma.com">Nicolas.Viennot@twosigma.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">I think we had the same need. Here’s the commit that solves this problem: <a href="https://github.com/twosigma/criu/commit/e562e97f29c98d155b02a871493533b24ecf2abb" rel="noreferrer" target="_blank">https://github.com/twosigma/criu/commit/e562e97f29c98d155b02a871493533b24ecf2abb</a><br>
<br>
Nico<br>
<br>
---<br>
<br>
From: <a href="mailto:criu-bounces@openvz.org" target="_blank">criu-bounces@openvz.org</a> <<a href="mailto:criu-bounces@openvz.org" target="_blank">criu-bounces@openvz.org</a>> On Behalf Of Jun Gan<br>
Sent: Sunday, September 6, 2020 7:51 PM<br>
To: <a href="mailto:criu@openvz.org" target="_blank">criu@openvz.org</a><br>
Subject: [CRIU] Checkpoint and restore application has established unix domain socket connections.<br>
<br>
Hi,<br>
<br>
I'm using CRIU to checkpoint and restore Redis while a redis-cli is connected to it over a Unix domain socket. I cannot simply use --ext-unix-sk, because it sets the socket type to STREAM. Here is the command I used for dump:<br>
<br>
criu dump -t 23253 -D criu_imgs/ -o dump.log --shell-job -v4 --external unix[2940132]<br>
<br>
And here is the command I used for restore:<br>
<br>
criu restore -D criu_imgs/ -o restore_.log --shell-job -v4 --ext-unix-sk --tcp-close<br>
<br>
The restore log then shows a segmentation fault:<br>
<br>
(00.014810) 23253: unix: Opening standalone (stage 0 id 0x10 ino 2940132 peer 2940131)<br>
(00.014823) 23253: Create fd for 8<br>
(00.014826) 23253: unix: Opening standalone (stage 1 id 0x10 ino 2940132 peer 2940131)<br>
(00.014829) 23253: unix: Connect 2940132 to 2940131<br>
(00.015683) Error (criu/cr-restore.c:1417): 23253 killed by signal 11: Segmentation fault<br>
(00.015780) Error (criu/cr-restore.c:2293): Restoring FAILED.<br>
<br>
<br>
Checking the code, I found that CRIU still tries to restore this socket by connecting it back to the original peer, which may no longer be available. Is there an option that lets CRIU skip restoring this socket and leave it closed, just as "--tcp-close" does for TCP connections? In my case I don't need the connection anymore and can just ask the client to reconnect.<br>
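<br>
To illustrate the client-side reconnect mentioned above, here is a rough sketch in Python of a retry loop for a Unix domain socket client. It is only an illustration: connect_with_retry and its parameters are hypothetical names, not part of CRIU or redis-cli, and the socket path is whatever the server listens on (/tmp/redis.sock in the logs from this thread).<br>
<pre>
```python
import socket
import time


def connect_with_retry(path, attempts=5, delay=0.2):
    """Return a connected AF_UNIX stream socket, retrying while the server is away.

    Hypothetical helper for illustration; not part of CRIU or redis-cli.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except (FileNotFoundError, ConnectionRefusedError) as exc:
            # The socket file is missing or nobody is listening yet:
            # drop this socket and try again after a short pause.
            sock.close()
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"could not connect to {path}") from last_error
```
</pre>
A client would call connect_with_retry("/tmp/redis.sock") once its old connection reports an error or end-of-file after the restore.<br>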
<br>
-- <br>
Jun Gan<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature">Jun Gan</div>