[CRIU] Restoring lxc-1.1.5 centos 7 container with httpd fails
Adrian Reber
adrian at lisas.de
Wed May 11 00:35:54 PDT 2016
Hello Andrey,
I applied your last three patches from the CRIU ML:
mount: create a clean mount only if a sub directory is bind-mounted
mount: don't overmount a mount if it should be bind-mounted somewhere
mount: dump a file system only if a mount point isn't overmounted
and I can checkpoint and restore a lxc container with httpd, mongodb and
postgresql running in it. I haven't yet checked if all patches are
necessary for my problem, but I can look further into it if you want.
If I start mariadb I still get:
(00.142574) 294: Parsed 7f8d95799000-7f8d9579a000 vma
(00.169810) Error (cr-restore.c:1407): 6077 killed by signal 9: Killed
(00.169972) Switching to new ns to clean ghosts
(00.170019) Error (files-reg.c:515): `- XFail [.criu.mntns.3xE0jR/15var/tmp/ib8i8FJW.cr.1.ghost] ghost: No such file or directory
(00.170024) Error (files-reg.c:515): `- XFail [.criu.mntns.3xE0jR/15var/tmp/ibHo1ZB4.cr.2.ghost] ghost: No such file or directory
(00.170027) Error (files-reg.c:515): `- XFail [.criu.mntns.3xE0jR/15var/tmp/ib0eakuc.cr.3.ghost] ghost: No such file or directory
(00.170029) Error (files-reg.c:515): `- XFail [.criu.mntns.3xE0jR/15var/tmp/ibtHE6fs.cr.4.ghost] ghost: No such file or directory
(00.170031) Error (files-reg.c:515): `- XFail [.criu.mntns.3xE0jR/15var/tmp/ibmiNsaA.cr.5.ghost] ghost: No such file or directory
(00.170827) Error (cr-restore.c:2251): Restoring FAILED.
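To pick out only the failing steps from a restore log like the one above, something along these lines may help. It is a minimal sketch that assumes the line layout seen in this excerpt (timestamp in parentheses, an optional task id, then "Error (file:line):"). The name extract_errors is hypothetical and not part of CRIU.

```python
import re

# Matches lines such as:
#   (00.169810) Error (cr-restore.c:1407): 6077 killed by signal 9: Killed
#   (00.080585) 1: Error (mount.c:2479): mnt: Can't mount at ...
# The layout is assumed from the log excerpts in this thread.
ERROR_RE = re.compile(
    r"^\((?P<ts>\d+\.\d+)\)\s+(?:(?P<pid>\d+):\s+)?"
    r"Error \((?P<file>[^:]+):(?P<line>\d+)\):\s*(?P<msg>.*)$"
)


def extract_errors(log_text):
    """Return (timestamp, source_file, source_line, message) per error line."""
    errors = []
    for raw in log_text.splitlines():
        m = ERROR_RE.match(raw.strip())
        if m:
            errors.append(
                (float(m["ts"]), m["file"], int(m["line"]), m["msg"])
            )
    return errors
```

Applied to the log above, the first tuple would point at cr-restore.c:1407 (the task killed by signal 9), which comes before the ghost-file cleanup errors.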
Thanks so far.
Adrian
On Fri, May 06, 2016 at 04:01:16PM -0700, Andrey Vagin wrote:
> Hi Adrian,
>
> Can you create a kvm VM where I will be able to reproduce the problem?
>
> I tried to reproduce it by myself, but it works for me.
>
> 18435 pts/0 S 0:00 [lxc monitor] /var/lib/lxc centos
> 18441 ? Ss 0:00 \_ /sbin/init
> 18475 ? Ss 0:00 \_ /usr/lib/systemd/systemd-journald
> 18476 ? Ss 0:00 \_ /usr/lib/systemd/systemd-udevd
> 18477 ? Ss 0:00 \_ /usr/lib/systemd/systemd-logind
> 18478 ? Ss 0:00 \_ /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
> 18479 ? Ssl 0:00 \_ /usr/sbin/rsyslogd -n
> 18480 ? Ss 0:00 \_ /usr/sbin/sshd -D
> 18481 ? Ss 0:00 \_ /usr/sbin/httpd -DFOREGROUND
> 18482 ? S 0:00 \_ /usr/sbin/httpd -DFOREGROUND
> 18483 ? S 0:00 \_ /usr/sbin/httpd -DFOREGROUND
> 18484 ? S 0:00 \_ /usr/sbin/httpd -DFOREGROUND
> 18485 ? S 0:00 \_ /usr/sbin/httpd -DFOREGROUND
> 18486 ? S 0:00 \_ /usr/sbin/httpd -DFOREGROUND
> 19812 ? Ss 0:00 /usr/sbin/crond -n
> [root at localhost lxc]# LD_LIBRARY_PATH=/usr/local/lib64/ ./src/lxc/lxc-checkpoint -s -n centos -D /root/images -v
> [root at localhost lxc]# LD_LIBRARY_PATH=/usr/local/lib64/ ./src/lxc/lxc-checkpoint -r -n centos -D /root/images -v
> [root at localhost lxc]# echo $?
> 0
> [root at localhost lxc]# git describe HEAD
> lxc-2.0.0
>
>
> On Fri, May 06, 2016 at 10:25:12PM +0200, Adrian Reber wrote:
> > > > > We've discussed with Adrian in irc and he promised to give more
> > > > > info about this issue.
> > > > >
> > > > > To investigate this sort of bug I add sleep(1000) after pr_err() to
> > > > > freeze the processes at the moment of the error and try to find what is
> > > > > wrong via /proc/PID/root.
> > > >
> > > > With the sleep after the last pr_err() I see two criu processes:
> > > >
> > > > # ls -la /proc/10183/root
> > > > lrwxrwxrwx. 1 root root 0 May 6 08:47 /proc/10183/root -> /
> > > > # ls -la /proc/10188/root
> > > > lrwxrwxrwx. 1 root root 0 May 6 08:47 /proc/10188/root -> /
> > >
> > > I mean that you need to try to resolve the source and target arguments of
> > > the mount syscall which returns an error.
> >
> > I am not really sure what to do. The current restore fails with:
> >
> > (00.080566) 1: mnt: Bind /tmp/cr-tmpfs.KD4sxa/hugetlb to .criu.mntns.07hSc4/13/sys/fs/cgroup/hugetlb
> > (00.080585) 1: Error (mount.c:2479): mnt: Can't mount at .criu.mntns.07hSc4/13/sys/fs/cgroup/hugetlb: No such file or directory
> >
> > The directory /proc/4745/root/var/lib/lxc/c7/rootfs/.criu.mntns.07hSc4
> > is empty. So that seems to explain why the mount is not working.
> >
> > The other directory exists:
> >
> > # ls /proc/4745/root/tmp/cr-tmpfs.KD4sxa/hugetlb/
> > lxc
> >
> > So is the first empty directory the problem? As the path contains
> > 'mntns': does this need some special mount namespace support? This is
> > running on the CentOS 7 kernel, which might be missing some features.
> >
> >
> > Adrian
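
The suggestion quoted above, resolving the source and target of the failing mount through /proc/PID/root, can be sketched roughly as follows. This is only an illustration: both function names are hypothetical, and it assumes that the relative mount target in the log (.criu.mntns.*) is relative to the working directory of the frozen task, so it is looked up through /proc/PID/cwd instead of /proc/PID/root.

```python
import os


def path_in_task_root(pid, path, proc="/proc"):
    """Map a path as seen by task `pid` to a path the host can inspect.

    Absolute paths go through /proc/<pid>/root. Relative paths are assumed
    to be relative to the task's working directory and go through
    /proc/<pid>/cwd.
    """
    base = "root" if os.path.isabs(path) else "cwd"
    return os.path.join(proc, str(pid), base, path.lstrip("/"))


def first_missing_component(host_path):
    """Return the first prefix of host_path that does not exist, or None."""
    current = "/" if host_path.startswith("/") else ""
    for part in host_path.strip("/").split("/"):
        current = os.path.join(current, part)
        if not os.path.exists(current):
            return current
    return None
```

With the task frozen by the sleep(), calling first_missing_component(path_in_task_root(4745, ".criu.mntns.07hSc4/13/sys/fs/cgroup/hugetlb")) as root would report which directory of the mount target is the first one missing, and the same can be done for the source path /tmp/cr-tmpfs.KD4sxa/hugetlb.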