[CRIU] OVZ7 problems with check pointing cPanel VEs

Andrei Vagin avagin at virtuozzo.com
Wed Nov 2 22:02:05 PDT 2016


On Fri, Oct 28, 2016 at 11:10:16AM -0300, Jayme wrote:
> We run several OVZ host nodes with many cPanel VEs on them. We put up
> a few new OVZ7 host nodes with the intention of transitioning over to
> them at some point in the future. At first I used the ovztransfer.sh
> script to transfer a cPanel VE from old OVZ to OVZ7, which worked OK:
> the VE started and operated as expected. But when I went to
> snapshot/checkpoint the VE, I started running into problems. I cannot
> get OVZ7 to snapshot this cPanel VE.
> 
> OK, so I thought perhaps it was a botched copy or something related to
> the ovztransfer.sh script, since it's not technically an official
> script (but is mentioned in the OVZ documentation). I decided to
> create a brand new CentOS 7 VE on the OVZ7 host node, did a fresh
> cPanel install, then migrated the cPanel users from the old server to
> the new one in the typical cPanel fashion (transfer tool). Everything
> worked as expected, but then I tried to snapshot the new container and
> again it errored out. I have another container on the same host node
> that I can snapshot without a problem.
> 
> There definitely seems to be some sort of bug that I'm hitting when
> snapshotting this particular cPanel container under OVZ7. It is a
> total roadblock: I cannot continue the transition to OVZ7 until I know
> that I can checkpoint VEs reliably.
> 
> Here are some details:
> 
> Hostnode: Virtuozzo Linux release 7.2 / Linux 3.10.0-327.36.1.vz7.18.7
> 
> VE: CentOS Linux release 7.2.1511 (Core), running a brand new install
> of the latest version of cPanel with about 600 active users recently
> migrated to it using the cPanel transfer tool (~250 GB of data).
> 
> # prlctl snapshot 1035
> Creating the snapshot...
> PRL_ERR_VZCTL_OPERATION_FAILED (Details: Failed to checkpoint the Container
> All dump files and logs were saved to
> /vz/private/1035/dump/{ce2b2c58-00ac-4e2f-b4da-e0a0dc594ff4}.fail
> Failed tp dump the Container, status pipe unexpectedly closed
> Failed to dump Container
> Failed to resume Container
> Failed to create snapshot
> )
> Failed to create the snapshot: Unknown
> 
> 
> Here is what I think is the relevant info from the dump.log file:
> 
> (04.143824) Error (criu/sk-inet.c:158): In-flight connection (l) for 924d83
> (04.143831) Error (criu/sk-inet.c:160): In-flight connections can be
> ignored with the --skip-in-flight option.
> (04.143868) Error (criu/cr-dump.c:1322): Dump files (pid: 256864) failed with -1
> (04.168423) Error (criu/cr-dump.c:1634): Dumping FAILED.
> 
> 
> I ran the snapshot again and got a different error on the second pass,
> but it still seems to be related to sockets in some way:
> 
> (04.005642) fdinfo: type: 0x4 flags: 02000002/01 pos: 0 fd: 5
> (04.005657) 287702 fdinfo 6: pos: 0 flags: 2000002/0x1
> (04.005665) Searching for socket 9888dd (family 2.6)
> (04.005675) Error (criu/sk-inet.c:202): Name resolved on unconnected socket
> (04.005680) ----------------------------------------
> (04.005693) Error (criu/cr-dump.c:1322): Dump files (pid: 287702) failed with -1
> (04.005726) Waiting for 287702 to trap
> (04.005753) Daemon 287702 exited trapping
> (04.005761) Sent msg to daemon 5 0 0
> pie: 20365: __fetched msg: 5 0 0
> pie: 20365: 20365: new_sp=0x7f27094c8008 ip 0x7f271180c20e
> 
> I attached the log from the first dump attempt.
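
The error lines quoted above all share one shape: a timestamp in parentheses, the word "Error", a source file and line number, then the message. A minimal sketch of a filter that pulls just those lines out of a dump.log follows. The regular expression is inferred only from the lines shown in this message, not from any documented CRIU log format, and the names `extract_errors` and `CriuError` are hypothetical.

```python
import re
from typing import List, NamedTuple

# Pattern inferred from the log excerpts in this thread (an assumption,
# not a documented format): "(SS.UUUUUU) Error (file.c:LINE): message"
ERROR_RE = re.compile(
    r"^\((\d+\.\d+)\)\s+Error\s+\(([^():]+):(\d+)\):\s*(.*)$"
)


class CriuError(NamedTuple):
    timestamp: float  # seconds since the dump started, as printed in the log
    source: str       # source file reported by CRIU
    line: int         # line number in that source file
    message: str      # remainder of the log line


def extract_errors(log_text: str) -> List[CriuError]:
    """Return every line of log_text that matches the Error pattern."""
    errors = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            errors.append(
                CriuError(
                    float(match.group(1)),
                    match.group(2),
                    int(match.group(3)),
                    match.group(4),
                )
            )
    return errors


# Lines copied from the report above; the last one is not an error.
SAMPLE = """\
(04.143824) Error (criu/sk-inet.c:158): In-flight connection (l) for 924d83
(04.143868) Error (criu/cr-dump.c:1322): Dump files (pid: 256864) failed with -1
(04.168423) Error (criu/cr-dump.c:1634): Dumping FAILED.
(04.005726) Waiting for 287702 to trap
"""

if __name__ == "__main__":
    for err in extract_errors(SAMPLE):
        print(f"{err.source}:{err.line}: {err.message}")
```

The first error in a log is usually the one worth reading; the later "Dump files ... failed" and "Dumping FAILED" lines only report that the dump was aborted.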

CRIU doesn't support transitional states of TCP sockets yet. Support
for them is under development right now.

https://github.com/xemul/criu/issues/194
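
To see whether a container currently holds sockets of this kind before trying a snapshot, one option is to scan /proc/net/tcp from inside the container. The sketch below is only a heuristic under several assumptions: it treats every state other than ESTABLISHED, LISTEN and CLOSE as transitional (CRIU's real checks may differ), it assumes that for a listening socket the rx_queue column holds the number of connections waiting to be accepted, it handles IPv4 only, and it assumes a little-endian host such as x86. The function names are hypothetical.

```python
import socket
import struct
from typing import List, Tuple

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Assumption: states treated here as safe to checkpoint.
STABLE = {"ESTABLISHED", "LISTEN", "CLOSE"}


def decode_endpoint(field: str) -> str:
    """Turn '0100007F:0050' into '127.0.0.1:80' (little-endian host assumed)."""
    addr_hex, port_hex = field.split(":")
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return f"{addr}:{int(port_hex, 16)}"


def suspicious_sockets(proc_net_tcp: str) -> List[Tuple[str, str, str]]:
    """Return (local, remote, reason) for sockets that may block a dump."""
    found = []
    for line in proc_net_tcp.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a socket entry.
        if len(fields) < 5 or not fields[0].endswith(":"):
            continue
        state = TCP_STATES.get(fields[3].upper(), "UNKNOWN")
        # Column 5 is "tx_queue:rx_queue" in hex.
        rx_queue = int(fields[4].split(":")[1], 16)
        local = decode_endpoint(fields[1])
        remote = decode_endpoint(fields[2])
        if state == "LISTEN" and rx_queue > 0:
            found.append((local, remote, "LISTEN with pending connections"))
        elif state not in STABLE:
            found.append((local, remote, state))
    return found


# Hand-written sample in /proc/net/tcp layout (not taken from a real host).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0100007F:0050 0100007F:C350 01 00000000:00000000 00:00000000 00000000     0        0 1002
   2: 0100007F:0050 0100007F:C351 03 00000000:00000000 00:00000000 00000000     0        0 0
   3: 0100007F:0050 0100007F:C352 06 00000000:00000000 00:00000000 00000000     0        0 0
   4: 00000000:01BB 00000000:0000 0A 00000000:00000002 00:00000000 00000000     0        0 1003
"""

if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without procfs
    for local, remote, reason in suspicious_sockets(text):
        print(f"{local} -> {remote}: {reason}")
```

On a busy cPanel container this list will rarely be empty, so an empty result at one moment does not guarantee that a later dump succeeds.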

Thanks,
Andrei
