[CRIU] lxc-checkpoint 1.1.5 works with criu 1.6.1 but not master
Pavel Emelyanov
xemul at parallels.com
Wed Jan 13 05:18:31 PST 2016
On 01/07/2016 01:04 PM, Adrian Reber wrote:
> Hello Tycho,
>
> thanks for your answers.
>
> On Wed, Jan 06, 2016 at 07:15:25AM -0700, Tycho Andersen wrote:
>> Hi Adrian,
>>
>> On Tue, Jan 05, 2016 at 06:47:55PM +0100, Adrian Reber wrote:
>>> Running lxc-checkpoint works with CRIU 1.6.1 but not with today's
>>> master.
>>>
>>> I get the following dump.log with today's master:
>>>
>>> (00.000250) Probing sock diag modules
>>> (00.000280) Done probing
>>> (00.000283) ========================================
>>> (00.000286) Dumping processes (pid: 10794)
>>> (00.000287) ========================================
>>> (00.000289) Running pre-dump scripts
>>> (00.000315) Found anon-shmem device at 4
>>> (00.000321) Reset 16059's dirty tracking
>>> (00.000329) Warn (mem.c:56): Can't reset 16059's dirty memory tracker (22)
>>> (00.000341) Unlock network
>>> (00.000347) Unfreezing tasks into 1
>>> (00.000352) Error (cr-dump.c:1578): Dumping FAILED.
>>
>> I've not seen this before. Based on a quick glance through the source,
>> looks like the write() to /proc/pid/clear_refs is failing with EINVAL,
>> which probably means your kernel is too old. Seems like this shouldn't
>> be a fatal failure, though, as lxc-checkpoint doesn't try to use any
>> memory tracking features.
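
One way to confirm the EINVAL guess outside of CRIU is to attempt the same kind of write by hand. The sketch below is not CRIU's code; `probe_soft_dirty_reset` is a hypothetical helper, and it assumes that writing "4" to clear_refs is the soft-dirty reset request that fails in the log above. The "(22)" in the warning matches errno 22, which is EINVAL on Linux.

```python
import errno
import os


def probe_soft_dirty_reset(pid="self"):
    """Try a soft-dirty reset for a process; return None on success, else the errno."""
    path = f"/proc/{pid}/clear_refs"
    try:
        # Writing "4" asks the kernel to clear the soft-dirty bits of the task.
        with open(path, "w") as f:
            f.write("4")
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    err = probe_soft_dirty_reset()
    if err is None:
        print("clear_refs accepted the soft-dirty reset")
    elif err == errno.EINVAL:
        print("EINVAL (22): this kernel does not seem to support soft-dirty tracking")
    else:
        print(f"write failed: {os.strerror(err)} ({err})")
```

If this prints the EINVAL line on the affected host, the warning in dump.log is most likely a kernel capability issue rather than an LXC one.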
>
> I have always seen the dirty memory tracking warning, but as it is a
> warning it has never been a problem before. It seems, however, that with
> the following commit:
>
> commit d10835c4ee0d0b1881b926708dee9877f5fb294d
> Author: Pavel Emelyanov <xemul at parallels.com>
> Date: Tue Dec 15 22:25:09 2015 +0300
>
> dump: Dont read prohibited kernel files
>
> criu now just aborts. Reverting this commit 'fixes' the broken master
> behaviour.
:(
Would you check whether the patch titled
"[PATCH] kdat: Handle pagemaps with zeroed pfns"
from the mailing list fixes it?
-- Pavel
>>> I get the following error with v1.8:
>>>
>
> [...]
>
>> A quick hack to disable seccomp is to simply put
>>
>> lxc.seccomp =
>>
>> at the bottom of the container's config file and restart it. Then it
>> won't have any seccomp filters, and you should be able to get away
>> with 1.8.
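
The config change described above can be scripted. This is only a sketch: `disable_seccomp` is a hypothetical helper that works on the config text, and it assumes that appending the empty entry at the bottom is enough, as suggested in the thread. The container still has to be restarted afterwards.

```python
def disable_seccomp(config_text):
    """Return config_text with an empty lxc.seccomp entry appended at the bottom."""
    # Keep existing lines untouched; only normalise the trailing newline.
    lines = config_text.rstrip("\n").split("\n") if config_text.strip() else []
    lines.append("lxc.seccomp =")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical one-line container config used purely for illustration.
    print(disable_seccomp("lxc.utsname = c1\n"), end="")
```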
>
> Perfect, that helps. Thanks.
>
>>> With lxc-checkpoint and criu 1.6.1 I can migrate lxc containers from one host
>>> to another. Unfortunately not back. The migration back to the original host
>>> with criu 1.6.1 fails with:
>>>
>>> (00.023469) 1: [./sys/devices/virtual/net](192->172)
>>> (00.023494) 1: [./sys/devices/virtual/net](193->192)
>>> (00.023496) 1: <--
>>> (00.023497) 1: <--
>>> (00.023498) 1: <--
>>> (00.023499) 1: <--
>>> (00.023500) 1: <--
>>> (00.023507) 1: Start with 164:./
>>> (00.023508) 1: Mounting unsupported @./ (0)
>>> (00.023509) 1: 164:./ private 0 shared 0 slave 1
>>> (00.023513) 1: Mounting tmpfs @./dev (0)
>>> (00.024140) 1: Error (mount.c:1860): Can't mount at ./dev: Invalid argument
>>> (00.024269) Error (cr-restore.c:1895): Restoring FAILED.
>>>
>>> The first migration (host1 -> host2) works. Only back to host1 fails.
>>> If I restore the image dumped on host2 also on host2 it also works.
>>>
>>> Any ideas what might be wrong with my setup?
>>
>> IIRC there was some issue with this in older versions of CRIU and
>> mounts, but unfortunately I don't remember the exact details :(
>
> No, that was SELinux. On one of the systems SELinux was enabled, on the
> other it wasn't. On the SELinux system the partitions are mounted with
> the option seclabel, and restoring the seclabel option on the non-SELinux
> system failed with:
>
> [ 462.628678] tmpfs: No value for mount option 'seclabel'
>
> So, now, lxc-checkpoint works with SELinux either enabled on both machines
> or disabled on both, with all CRIU versions (with the above commit
> reverted). Great stuff!
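
To spot this kind of mismatch before migrating, the mount options on both hosts can be compared. The sketch below is not how CRIU handles mount options; `strip_seclabel` and `mounts_with_seclabel` are hypothetical helpers that assume /proc/mounts-style input, where the fourth field holds the comma-separated options.

```python
def strip_seclabel(options):
    """Drop the 'seclabel' flag from a comma-separated mount option string."""
    kept = [opt for opt in options.split(",") if opt and opt != "seclabel"]
    return ",".join(kept)


def mounts_with_seclabel(mounts_text):
    """Return mount points whose options include 'seclabel'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and "seclabel" in fields[3].split(","):
            result.append(fields[1])
    return result


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mount_point in mounts_with_seclabel(f.read()):
            print(mount_point)
```

Running it on each host and comparing the output shows whether only one side mounts its filesystems with seclabel.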
>
> Adrian
> .
>