[CRIU] lxc-checkpoint 1.1.5 works with criu 1.6.1 but not master
Tycho Andersen
tycho.andersen at canonical.com
Wed Jan 6 06:15:25 PST 2016
Hi Adrian,
On Tue, Jan 05, 2016 at 06:47:55PM +0100, Adrian Reber wrote:
> Running lxc-checkpoint works with CRIU 1.6.1 but not with today's
> master.
>
> I get the following dump.log with today's master:
>
> (00.000250) Probing sock diag modules
> (00.000280) Done probing
> (00.000283) ========================================
> (00.000286) Dumping processes (pid: 10794)
> (00.000287) ========================================
> (00.000289) Running pre-dump scripts
> (00.000315) Found anon-shmem device at 4
> (00.000321) Reset 16059's dirty tracking
> (00.000329) Warn (mem.c:56): Can't reset 16059's dirty memory tracker (22)
> (00.000341) Unlock network
> (00.000347) Unfreezing tasks into 1
> (00.000352) Error (cr-dump.c:1578): Dumping FAILED.
I've not seen this before. Based on a quick glance through the source,
it looks like the write() to /proc/pid/clear_refs is failing with EINVAL
(the "(22)" in the warning), which probably means your kernel is too
old. It seems like this shouldn't be a fatal failure, though, as
lxc-checkpoint doesn't try to use any memory tracking features.
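
For anyone who wants to check this outside of CRIU, here is a minimal
sketch of a probe. It assumes the reset is the soft-dirty one described
in the kernel documentation (writing "4" to clear_refs, available since
Linux 3.11); the function name and return strings are made up for
illustration and this is not CRIU's own code.

```python
import errno


def probe_soft_dirty(clear_refs_path="/proc/self/clear_refs"):
    """Try the soft-dirty reset on a clear_refs file.

    Returns "supported" if the write succeeds, "unsupported" if the
    kernel rejects it with EINVAL, or "unknown: <ERRNO>" otherwise.
    Note: on success this really does clear the soft-dirty bits of the
    target process (by default, the calling process itself).
    """
    try:
        # "4" is the soft-dirty reset value per the kernel docs; kernels
        # without soft-dirty support reject it with EINVAL (errno 22).
        with open(clear_refs_path, "w") as f:
            f.write("4")
    except OSError as e:
        if e.errno == errno.EINVAL:
            return "unsupported"
        return "unknown: " + errno.errorcode.get(e.errno, str(e.errno))
    return "supported"
```

If this returns "unsupported" on the host, the warning in the log above
is most likely the same EINVAL coming from the kernel.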
>
> I get the following error with v1.8:
>
> (00.000969) cg: Set 1 is criu one
> (00.001276) Error (ptrace.c:64): suspending seccomp failed: Invalid argument
> (00.001281) Unfreezing tasks into 1
> (00.001282) Unseizing 10794 into 1
> (00.001284) Error (ptrace.c:54): Unable to detach from 10794: No such process
> (00.001293) Unlock network
> (00.001297) Unfreezing tasks into 1
> (00.001298) Unseizing 10794 into 1
> (00.001299) Error (ptrace.c:54): Unable to detach from 10794: No such process
> (00.001402) Error (cr-dump.c:1641): Dumping FAILED.
>
> and it works with 1.6.1. I am running on CentOS 7.2 which has experimental CRIU
> support but the kernel is not the newest: 3.10.0-327.4.4.el7.x86_64
Right, the error above is because LXC uses seccomp by default, but
your kernel isn't new enough to support c/r of seccomp. 1.6.1 "works",
but it doesn't restore all of the security features (e.g. there are no
seccomp filters on restored tasks).
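
As a rough way to see whether a kernel is likely too old, the sketch
below compares a release string against 4.3, which is the upstream
version where the ptrace option for suspending seccomp
(PTRACE_O_SUSPEND_SECCOMP) appeared. The helper name is hypothetical,
and a version check is only a heuristic: distribution kernels may
backport features, so treat the answer as a hint rather than proof.

```python
import re


def kernel_at_least(release, major, minor):
    """Return True if a release string like '3.10.0-327.4.4.el7.x86_64'
    is at least major.minor. Only the leading 'X.Y' is compared."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


# Typical use on the running host (heuristic only):
#   import platform
#   kernel_at_least(platform.release(), 4, 3)
```

For the CentOS 7.2 kernel quoted above, 3.10.0-327.4.4.el7.x86_64, this
check comes out false.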
A quick hack to disable seccomp is to simply put

    lxc.seccomp =

at the bottom of the container's config file and restart the container.
Then it won't have any seccomp filters, and you should be able to get
away with 1.8.
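
If you want to script that config change, something along these lines
should do it. This is only a sketch: the function name is made up, it
works on the config text rather than a file so that you choose the path
yourself, and the sample keys in the comment are illustrative values,
not taken from your setup.

```python
def with_seccomp_disabled(config_text):
    """Return config_text with an empty 'lxc.seccomp =' as its last line.

    The line goes at the bottom, as suggested above. If the last line is
    already an empty lxc.seccomp assignment, the text is returned as-is.
    """
    stripped = config_text.rstrip("\n")
    last = stripped.rsplit("\n", 1)[-1] if stripped else ""
    if "".join(last.split()) == "lxc.seccomp=":
        return config_text
    prefix = stripped + "\n" if stripped else ""
    return prefix + "lxc.seccomp =\n"


# Example with illustrative config content:
#   with_seccomp_disabled("lxc.utsname = c1\n")
#   gives "lxc.utsname = c1\nlxc.seccomp =\n"
```

The container still has to be restarted afterwards for the change to
take effect.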
> With lxc-checkpoint and criu 1.6.1 I can migrate lxc containers from one host
> to another. Unfortunately not back. The migration back to the original host
> with criu 1.6.1 fails with:
>
> (00.023469) 1: [./sys/devices/virtual/net](192->172)
> (00.023494) 1: [./sys/devices/virtual/net](193->192)
> (00.023496) 1: <--
> (00.023497) 1: <--
> (00.023498) 1: <--
> (00.023499) 1: <--
> (00.023500) 1: <--
> (00.023507) 1: Start with 164:./
> (00.023508) 1: Mounting unsupported @./ (0)
> (00.023509) 1: 164:./ private 0 shared 0 slave 1
> (00.023513) 1: Mounting tmpfs @./dev (0)
> (00.024140) 1: Error (mount.c:1860): Can't mount at ./dev: Invalid argument
> (00.024269) Error (cr-restore.c:1895): Restoring FAILED.
>
> The first migration (host1 -> host2) works. Only back to host1 fails.
> If I restore the image dumped on host2 also on host2 it also works.
>
> Any ideas what might be wrong with my setup?
IIRC there was some issue with this in older versions of CRIU and
mounts, but unfortunately I don't remember the exact details :(
Tycho