[CRIU] lxc-checkpoint 1.1.5 works with criu 1.6.1 but not master

Tycho Andersen tycho.andersen at canonical.com
Wed Jan 6 06:15:25 PST 2016


Hi Adrian,

On Tue, Jan 05, 2016 at 06:47:55PM +0100, Adrian Reber wrote:
> Running lxc-checkpoint works with CRIU 1.6.1 but not with today's
> master.
> 
> I get the following dump.log with today's master:
> 
> (00.000250) Probing sock diag modules
> (00.000280) Done probing
> (00.000283) ========================================
> (00.000286) Dumping processes (pid: 10794)
> (00.000287) ========================================
> (00.000289) Running pre-dump scripts
> (00.000315) Found anon-shmem device at 4
> (00.000321) Reset 16059's dirty tracking
> (00.000329) Warn  (mem.c:56): Can't reset 16059's dirty memory tracker (22)
> (00.000341) Unlock network
> (00.000347) Unfreezing tasks into 1
> (00.000352) Error (cr-dump.c:1578): Dumping FAILED.

I've not seen this before. From a quick glance through the source, it
looks like the write() to /proc/<pid>/clear_refs is failing with EINVAL
(the 22 in the warning), which probably means your kernel is too old.
It seems like this shouldn't be a fatal failure, though, since
lxc-checkpoint doesn't try to use any memory tracking features.
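
If you want to check whether your kernel accepts that write at all,
something like the following rough sketch might help. It assumes the
reset is done by writing "4" to clear_refs (the soft-dirty reset value
described in the kernel's proc documentation); the function name and
its parameters are made up for illustration and are not CRIU code.

```python
import errno
import os


def try_clear_refs(path="/proc/self/clear_refs", value=b"4"):
    """Write `value` to `path`; return None on success, else the errno.

    Assumption: "4" asks the kernel to clear soft-dirty bits, and a
    kernel without that feature is expected to reject the write with
    EINVAL (22), which would match the warning in the log above.
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as e:
        return e.errno
    try:
        os.write(fd, value)
    except OSError as e:
        return e.errno
    finally:
        os.close(fd)
    return None
```

Calling try_clear_refs() with no arguments probes the calling process
itself; a return value equal to errno.EINVAL would be consistent with
the failure above, while None means the write was accepted.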

> 
> I get the following error with v1.8:
> 
> (00.000969) cg: Set 1 is criu one
> (00.001276) Error (ptrace.c:64): suspending seccomp failed: Invalid argument
> (00.001281) Unfreezing tasks into 1
> (00.001282)     Unseizing 10794 into 1
> (00.001284) Error (ptrace.c:54): Unable to detach from 10794: No such process
> (00.001293) Unlock network
> (00.001297) Unfreezing tasks into 1
> (00.001298)     Unseizing 10794 into 1
> (00.001299) Error (ptrace.c:54): Unable to detach from 10794: No such process
> (00.001402) Error (cr-dump.c:1641): Dumping FAILED.
> 
> and it works with 1.6.1. I am running on CentOS 7.2 which has experimental CRIU
> support but the kernel is not the newest: 3.10.0-327.4.4.el7.x86_64

Right, the error above is because LXC uses seccomp by default, but
your kernel isn't new enough to support c/r of seccomp. 1.6.1 "works",
but it doesn't handle all of the security features (e.g. there are no
seccomp filters on restored tasks).
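
As a rough way to guess whether a kernel is new enough, here is a small
sketch that compares the release string against a minimum version. The
4.3 threshold is an assumption on my part (that is where the ptrace
option for suspending seccomp appeared in mainline, as far as I know),
and the function names are hypothetical. Distro kernels backport
features, so treat the result as a hint only.

```python
import platform
import re

# Assumption: ptrace-based seccomp suspend first appeared in mainline
# Linux 4.3; this number does not come from the thread itself.
MIN_SECCOMP_SUSPEND = (4, 3)


def kernel_version(release):
    """Extract (major, minor) from e.g. '3.10.0-327.4.4.el7.x86_64'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(m.group(1)), int(m.group(2))


def may_support_seccomp_suspend(release=None):
    """Rough guess only; defaults to the running kernel's release."""
    if release is None:
        release = platform.release()
    return kernel_version(release) >= MIN_SECCOMP_SUSPEND
```

For the kernel quoted above, may_support_seccomp_suspend(
"3.10.0-327.4.4.el7.x86_64") evaluates to False under this assumption.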

A quick hack to disable seccomp is to simply put

lxc.seccomp =

at the bottom of the container's config file and restart it. Then it
won't have any seccomp filters, and you should be able to get away
with 1.8.
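
If you have several containers to change, the edit could be scripted
along these lines. This is only a sketch: the helper name is made up,
and the config path is whatever your setup uses (it is passed in rather
than guessed). The container still has to be restarted afterwards.

```python
def disable_seccomp(config_path):
    """Append an empty 'lxc.seccomp =' line to an LXC config file.

    Returns True if the line was added, or False if the last line of
    the file is already that override.
    """
    override = "lxc.seccomp ="
    with open(config_path, "r") as f:
        content = f.read()
    lines = content.splitlines()
    if lines and lines[-1].strip() == override:
        return False
    with open(config_path, "a") as f:
        # Make sure the override starts on its own line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(override + "\n")
    return True
```

Appending at the bottom matters here because the empty value is meant
to override any seccomp profile set earlier in the file.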

> With lxc-checkpoint and criu 1.6.1 I can migrate lxc containers from one host
> to another. Unfortunately not back. The migration back to the original host
> with criu 1.6.1 fails with:
> 
> (00.023469)      1:    [./sys/devices/virtual/net](192->172)
> (00.023494)      1:     [./sys/devices/virtual/net](193->192)
> (00.023496)      1:     <--
> (00.023497)      1:    <--
> (00.023498)      1:   <--
> (00.023499)      1:  <--
> (00.023500)      1: <--
> (00.023507)      1: Start with 164:./
> (00.023508)      1:     Mounting unsupported @./ (0)
> (00.023509)      1: 164:./ private 0 shared 0 slave 1
> (00.023513)      1:     Mounting tmpfs @./dev (0)
> (00.024140)      1: Error (mount.c:1860): Can't mount at ./dev: Invalid argument
> (00.024269) Error (cr-restore.c:1895): Restoring FAILED.
> 
> The first migration (host1 -> host2) works. Only back to host1 fails.
> If I restore the image dumped on host2 also on host2 it also works.
> 
> Any ideas what might be wrong with my setup?

IIRC there was some issue with this in older versions of CRIU and
mounts, but unfortunately I don't remember the exact details :(

Tycho

