[CRIU] lxc-checkpoint 1.1.5 works with criu 1.6.1 but not master
Adrian Reber
adrian at lisas.de
Thu Jan 7 02:04:28 PST 2016
Hello Tycho,
thanks for your answers.
On Wed, Jan 06, 2016 at 07:15:25AM -0700, Tycho Andersen wrote:
> Hi Adrian,
>
> On Tue, Jan 05, 2016 at 06:47:55PM +0100, Adrian Reber wrote:
> > Running lxc-checkpoint works with CRIU 1.6.1 but not with today's
> > master.
> >
> > I get the following dump.log with today's master:
> >
> > (00.000250) Probing sock diag modules
> > (00.000280) Done probing
> > (00.000283) ========================================
> > (00.000286) Dumping processes (pid: 10794)
> > (00.000287) ========================================
> > (00.000289) Running pre-dump scripts
> > (00.000315) Found anon-shmem device at 4
> > (00.000321) Reset 16059's dirty tracking
> > (00.000329) Warn (mem.c:56): Can't reset 16059's dirty memory tracker (22)
> > (00.000341) Unlock network
> > (00.000347) Unfreezing tasks into 1
> > (00.000352) Error (cr-dump.c:1578): Dumping FAILED.
>
> I've not seen this before. Based on a quick glance through the source,
> looks like the write() to /proc/pid/clear_refs is failing with EINVAL,
> which probably means your kernel is too old. Seems like this shouldn't
> be a fatal failure, though, as lxc-checkpoint doesn't try to use any
> memory tracking features.
I have always seen the dirty memory tracking warning, but as it is a
warning it has never been a problem before. It seems, however, that with
the following commit:
commit d10835c4ee0d0b1881b926708dee9877f5fb294d
Author: Pavel Emelyanov <xemul at parallels.com>
Date:   Tue Dec 15 22:25:09 2015 +0300

    dump: Dont read prohibited kernel files
criu now just aborts. Reverting this commit 'fixes' the broken master
behaviour.
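
The "(22)" in the dirty memory tracker warning is errno 22, EINVAL. To check
whether the kernel itself rejects the soft-dirty reset, independent of CRIU,
a minimal sketch like the following could be used. It assumes that writing
"4" to /proc/<pid>/clear_refs is the soft-dirty reset request; the helper
name try_clear_refs is made up for illustration and is not CRIU code.

```python
import errno
import os


def try_clear_refs(path, value=b"4"):
    """Write `value` to `path`; return 0 on success, else the errno."""
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as exc:
        return exc.errno
    try:
        os.write(fd, value)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Assumption: probing our own process is enough to see whether the
    # running kernel accepts the soft-dirty reset at all.
    err = try_clear_refs("/proc/self/clear_refs")
    if err == 0:
        print("soft-dirty reset accepted")
    else:
        print("reset failed: %s (%d)" % (errno.errorcode.get(err, "?"), err))
```

If this reports EINVAL (22), the warning most likely comes from the kernel
and not from the container setup.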
> > I get the following error with v1.8:
> >
[...]
> A quick hack to disable seccomp is to simply put
>
> lxc.seccomp =
>
> at the bottom of the container's config file and restart it. Then it
> won't have any seccomp filters, and you should be able to get away
> with 1.8.
Perfect, that helps. Thanks.
> > With lxc-checkpoint and criu 1.6.1 I can migrate lxc containers from one host
> > to another. Unfortunately not back. The migration back to the original host
> > with criu 1.6.1 fails with:
> >
> > (00.023469) 1: [./sys/devices/virtual/net](192->172)
> > (00.023494) 1: [./sys/devices/virtual/net](193->192)
> > (00.023496) 1: <--
> > (00.023497) 1: <--
> > (00.023498) 1: <--
> > (00.023499) 1: <--
> > (00.023500) 1: <--
> > (00.023507) 1: Start with 164:./
> > (00.023508) 1: Mounting unsupported @./ (0)
> > (00.023509) 1: 164:./ private 0 shared 0 slave 1
> > (00.023513) 1: Mounting tmpfs @./dev (0)
> > (00.024140) 1: Error (mount.c:1860): Can't mount at ./dev: Invalid argument
> > (00.024269) Error (cr-restore.c:1895): Restoring FAILED.
> >
> > The first migration (host1 -> host2) works. Only back to host1 fails.
> > If I restore the image dumped on host2 also on host2 it also works.
> >
> > Any ideas what might be wrong with my setup?
>
> IIRC there was some issue with this in older versions of CRIU and
> mounts, but unfortunately I don't remember the exact details :(
No, that was SELinux. On one of the systems SELinux was enabled, on the
other it wasn't. On the SELinux system the partitions are mounted with
the option seclabel, and restoring the seclabel option on the non-SELinux
system failed with:
[ 462.628678] tmpfs: No value for mount option 'seclabel'
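
A quick way to spot this kind of mismatch before migrating is to list the
mounts that carry the seclabel option on each host and compare. The
following is only a sketch: it assumes the usual /proc/mounts layout
(device, mount point, fstype, options, dump, pass), and the function name
mounts_with_option is hypothetical. It is not something CRIU or LXC provide.

```python
def mounts_with_option(mounts_text, option="seclabel"):
    """Return mount points whose option list contains `option`.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            # Skip blank or malformed lines.
            continue
        if option in fields[3].split(","):
            found.append(fields[1])
    return found


if __name__ == "__main__":
    # Print every mount point on this host that is mounted with seclabel.
    with open("/proc/mounts") as f:
        for mount_point in mounts_with_option(f.read()):
            print(mount_point)
```

If one host prints mount points and the other prints nothing, SELinux is
probably enabled on only one of them.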
So, now, lxc-checkpoint works with SELinux enabled or disabled on both
machines, with all CRIU versions (if the above commit is reverted). Great stuff!
Adrian