[CRIU] lxc-checkpoint 1.1.5 works with criu 1.6.1 but not master

Pavel Emelyanov xemul at parallels.com
Wed Jan 13 05:18:31 PST 2016


On 01/07/2016 01:04 PM, Adrian Reber wrote:
> Hello Tycho,
> 
> thanks for your answers.
> 
> On Wed, Jan 06, 2016 at 07:15:25AM -0700, Tycho Andersen wrote:
>> Hi Adrian,
>>
>> On Tue, Jan 05, 2016 at 06:47:55PM +0100, Adrian Reber wrote:
>>> Running lxc-checkpoint works with CRIU 1.6.1 but not with today's
>>> master.
>>>
>>> I get the following dump.log with today's master:
>>>
>>> (00.000250) Probing sock diag modules
>>> (00.000280) Done probing
>>> (00.000283) ========================================
>>> (00.000286) Dumping processes (pid: 10794)
>>> (00.000287) ========================================
>>> (00.000289) Running pre-dump scripts
>>> (00.000315) Found anon-shmem device at 4
>>> (00.000321) Reset 16059's dirty tracking
>>> (00.000329) Warn  (mem.c:56): Can't reset 16059's dirty memory tracker (22)
>>> (00.000341) Unlock network
>>> (00.000347) Unfreezing tasks into 1
>>> (00.000352) Error (cr-dump.c:1578): Dumping FAILED.
>>
>> I've not seen this before. Based on a quick glance through the source,
>> looks like the write() to /proc/pid/clear_refs is failing with EINVAL,
>> which probably means your kernel is too old. Seems like this shouldn't
>> be a fatal failure, though, as lxc-checkpoint doesn't try to use any
>> memory tracking features.
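
The "(22)" in the warning above is errno 22, EINVAL. Below is a minimal Python sketch of that kind of probe. It assumes the tracker reset is a write of "4" to /proc/<pid>/clear_refs, the value that clears soft-dirty bits on kernels that support it. The function name and the proc_root parameter are hypothetical and exist only for illustration; this is not CRIU code.

```python
import errno
import os


def reset_dirty_tracking(pid, proc_root="/proc"):
    """Try to clear soft-dirty bits for `pid` and return a short status string.

    Hypothetical helper, not CRIU code. `proc_root` exists only so the
    function can be pointed at a fake directory tree.
    """
    path = os.path.join(proc_root, str(pid), "clear_refs")
    try:
        with open(path, "w") as f:
            f.write("4")  # "4" asks the kernel to clear soft-dirty bits
    except OSError as e:
        if e.errno == errno.EINVAL:
            # The kernel rejected the value: treat as "no soft-dirty support".
            return "unsupported"
        return "error: " + errno.errorcode.get(e.errno, str(e.errno))
    return "ok"
```

A caller that does not need memory tracking could treat "unsupported" as a warning and carry on with the dump.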
> 
> I have always seen the dirty memory tracking warning, but as it is a
> warning it has never been a problem before. It seems, however, that with
> the following commit:
> 
> commit d10835c4ee0d0b1881b926708dee9877f5fb294d
> Author: Pavel Emelyanov <xemul at parallels.com>
> Date:   Tue Dec 15 22:25:09 2015 +0300
> 
>     dump: Dont read prohibited kernel files
> 
> criu now just aborts. Reverting this commit 'fixes' the broken master
> behaviour.

:(

Would you check whether the patch titled 
"[PATCH] kdat: Handle pagemaps with zeroed pfns"
from the mailing list fixes it?

-- Pavel

>>> I get the following error with v1.8:
>>>
> 
> [...]
> 
>> A quick hack to disable seccomp is to simply put
>>
>> lxc.seccomp =
>>
>> at the bottom of the container's config file and restart it. Then it
>> won't have any seccomp filters, and you should be able to get away
>> with 1.8.
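
The config change described above can be sketched as a small Python helper that works on the config text. The function name is hypothetical, and where the container's config file lives is left to the caller; this is an illustration of the edit, not an LXC tool.

```python
def disable_seccomp(config_text):
    """Return `config_text` with an empty `lxc.seccomp =` as its last line.

    Hypothetical helper for illustration. It only appends the line (unless
    it is already the last one) and leaves everything else untouched.
    """
    lines = config_text.splitlines()
    if not lines or lines[-1].strip() != "lxc.seccomp =":
        lines.append("lxc.seccomp =")
    return "\n".join(lines) + "\n"
```

The container still has to be restarted afterwards for the change to take effect.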
> 
> Perfect, that helps. Thanks.
> 
>>> With lxc-checkpoint and criu 1.6.1 I can migrate lxc containers from one host
>>> to another. Unfortunately not back. The migration back to the original host
>>> with criu 1.6.1 fails with:
>>>
>>> (00.023469)      1:    [./sys/devices/virtual/net](192->172)
>>> (00.023494)      1:     [./sys/devices/virtual/net](193->192)
>>> (00.023496)      1:     <--
>>> (00.023497)      1:    <--
>>> (00.023498)      1:   <--
>>> (00.023499)      1:  <--
>>> (00.023500)      1: <--
>>> (00.023507)      1: Start with 164:./
>>> (00.023508)      1:     Mounting unsupported @./ (0)
>>> (00.023509)      1: 164:./ private 0 shared 0 slave 1
>>> (00.023513)      1:     Mounting tmpfs @./dev (0)
>>> (00.024140)      1: Error (mount.c:1860): Can't mount at ./dev: Invalid argument
>>> (00.024269) Error (cr-restore.c:1895): Restoring FAILED.
>>>
>>> The first migration (host1 -> host2) works. Only back to host1 fails.
>>> Restoring the image dumped on host2 on host2 itself also works.
>>>
>>> Any ideas what might be wrong with my setup?
>>
>> IIRC there was some issue with this in older versions of CRIU and
>> mounts, but unfortunately I don't remember the exact details :(
> 
> No, that was SELinux. On one of the systems SELinux was enabled; on the
> other it wasn't. On the SELinux system the partitions are mounted with
> the option seclabel, and restoring the seclabel option on the non-SELinux
> system failed with:
> 
> [  462.628678] tmpfs: No value for mount option 'seclabel'
> 
> So, now, lxc-checkpoint works with all CRIU versions (if the above commit
> is reverted) as long as SELinux is either enabled on both machines or
> disabled on both. Great stuff!
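
One way to spot the seclabel mismatch described above before migrating is to look for the option in the mount table on each host. A minimal Python sketch, assuming input in the /proc/mounts format (device, mount point, filesystem type, comma-separated options, ...); the function name is hypothetical.

```python
def mounts_with_seclabel(mounts_text):
    """Return (mount_point, fs_type) pairs whose options include seclabel.

    Hypothetical helper for illustration; expects text in the
    /proc/mounts format.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        if "seclabel" in options:
            found.append((mount_point, fs_type))
    return found
```

On a real host the input would come from reading /proc/mounts. A non-empty result on one side and an empty one on the other suggests the restore side may reject the seclabel option.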
> 
> 		Adrian
> .
> 


