[CRIU] lxc-checkpoint 1.1.5 works with criu 1.6.1 but not master
Adrian Reber
adrian at lisas.de
Thu Jan 14 08:11:31 PST 2016
On Thu, Jan 14, 2016 at 06:20:04PM +0300, Pavel Emelyanov wrote:
> >>>>>> On Tue, Jan 05, 2016 at 06:47:55PM +0100, Adrian Reber wrote:
> >>>>>>> Running lxc-checkpoint works with CRIU 1.6.1 but not with today's
> >>>>>>> master.
> >>>>>>>
> >>>>>>> I get the following dump.log with today's master:
> >>>>>>>
> >>>>>>> (00.000250) Probing sock diag modules
> >>>>>>> (00.000280) Done probing
> >>>>>>> (00.000283) ========================================
> >>>>>>> (00.000286) Dumping processes (pid: 10794)
> >>>>>>> (00.000287) ========================================
> >>>>>>> (00.000289) Running pre-dump scripts
> >>>>>>> (00.000315) Found anon-shmem device at 4
> >>>>>>> (00.000321) Reset 16059's dirty tracking
> >>>>>>> (00.000329) Warn (mem.c:56): Can't reset 16059's dirty memory tracker (22)
> >>>>>>> (00.000341) Unlock network
> >>>>>>> (00.000347) Unfreezing tasks into 1
> >>>>>>> (00.000352) Error (cr-dump.c:1578): Dumping FAILED.
> >>>>>>
> >>>>>> I've not seen this before. Based on a quick glance through the source,
> >>>>>> looks like the write() to /proc/pid/clear_refs is failing with EINVAL,
> >>>>>> which probably means your kernel is too old. Seems like this shouldn't
> >>>>>> be a fatal failure, though, as lxc-checkpoint doesn't try to use any
> >>>>>> memory tracking features.
> >>>>>
> >>>>> I have always seen the dirty memory tracking warning, but as it is a
> >>>>> warning it has never been a problem before. It seems, however, that with
> >>>>> the following commit:
> >>>>>
> >>>>> commit d10835c4ee0d0b1881b926708dee9877f5fb294d
> >>>>> Author: Pavel Emelyanov <xemul at parallels.com>
> >>>>> Date: Tue Dec 15 22:25:09 2015 +0300
> >>>>>
> >>>>> dump: Dont read prohibited kernel files
> >>>>>
> >>>>> criu now just aborts. Reverting this commit 'fixes' the broken master
> >>>>> behaviour.
> >>>>
> >>>> :(
> >>>>
> >>>> Would you check whether the patch titled
> >>>> "[PATCH] kdat: Handle pagemaps with zeroed pfns"
> >>>> from the mailing list fixes it?
> >>>
> >>> Unfortunately not. I had to change the last line of context of that
> >>> patch to get it applied, but a simple dump still fails:
> >>>
> >>> # ./criu dump -D /tmp/3 -j -t `pidof minimal` -v -v -v -v
> >>> (00.000035) Probing sock diag modules
> >>> (00.000072) Done probing
> >>> (00.000074) ========================================
> >>> (00.000076) Dumping processes (pid: 18737)
> >>> (00.000078) ========================================
> >>> (00.000081) Running pre-dump scripts
> >>> (00.000106) Pagemap is fully functional
> >>> (00.000136) Found anon-shmem device at 4
> >>> (00.000141) Reset 20624's dirty tracking
> >>> (00.000147) Warn (mem.c:56): Can't reset 20624's dirty memory tracker (22)
> >>
> >> Hm... Does your kernel lack support for soft-dirty tracking at all?
> >
> > Yes. No soft-dirty tracking for me. But that was no problem until now.
>
> OK :) My fault, yes.
>
> Would you then apply the patch I've mentioned earlier, then this one:
>
> diff --git a/mem.c b/mem.c
> index 92e37f3..f23e6e9 100644
> --- a/mem.c
> +++ b/mem.c
> @@ -50,15 +50,20 @@ int do_task_reset_dirty_track(int pid)
>  		return errno == EACCES ? 1 : -1;
>
>  	ret = write(fd, cmd, sizeof(cmd));
> -	close(fd);
> -
>  	if (ret < 0) {
> -		pr_warn("Can't reset %d's dirty memory tracker (%d)\n", pid, errno);
> -		return -1;
> +		if (errno == EINVAL) /* No clear-soft-dirty in kernel */
> +			ret = 1;
> +		else {
> +			pr_perror("Can't reset %d's dirty memory tracker (%d)\n", pid, errno);
> +			ret = -1;
> +		}
> +	} else {
> +		pr_info(" ... done\n");
> +		ret = 0;
>  	}
>
> -	pr_info(" ... done\n");
> -	return 0;
> +	close(fd);
> +	return ret;
>  }
>
>  unsigned int dump_pages_args_size(struct vm_area_list *vmas)
>
> and check again?
Now it works correctly:
(00.000027) Probing sock diag modules
(00.000062) Done probing
(00.000065) ========================================
(00.000068) Dumping processes (pid: 22196)
(00.000070) ========================================
(00.000097) Pagemap is fully functional
(00.000126) Found anon-shmem device at 4
(00.000130) Reset 22198's dirty tracking
(00.000137) Dirty tracking support is OFF
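
The behaviour the patch introduces can be sketched in isolation as follows.
This is only a minimal sketch, not CRIU's code: the names
try_reset_soft_dirty() and classify_reset_result() are hypothetical, and it
assumes, as discussed in this thread, that writing "4" to
/proc/<pid>/clear_refs is the soft-dirty reset request and that EINVAL on
that write means the kernel has no soft-dirty support.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of CRIU): map the result of the write()
 * to a three-way status, mirroring the return values used in the patch.
 *   0  -> tracker was reset
 *   1  -> kernel lacks soft-dirty support (EINVAL), not a fatal error
 *  -1  -> any other failure
 */
static int classify_reset_result(ssize_t ret, int err)
{
	if (ret >= 0)
		return 0;
	return err == EINVAL ? 1 : -1;
}

/*
 * Hypothetical wrapper: try to reset soft-dirty tracking for one task.
 * An EACCES on open() is treated as "unavailable" (1), as in the context
 * line of the patch; other open() errors are fatal (-1).
 */
static int try_reset_soft_dirty(pid_t pid)
{
	static const char cmd[] = "4";	/* assumed: clear soft-dirty bits */
	char path[64];
	ssize_t n;
	int fd, saved_errno;

	snprintf(path, sizeof(path), "/proc/%d/clear_refs", (int)pid);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return errno == EACCES ? 1 : -1;

	n = write(fd, cmd, sizeof(cmd) - 1);
	saved_errno = errno;	/* close() may overwrite errno */
	close(fd);

	return classify_reset_result(n, saved_errno);
}
```

A caller would treat a return value of 1 as "dirty tracking is off" and carry
on with the dump, and abort only on -1.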
Adrian