[CRIU] Dealing with other mount types

Tycho Andersen tycho.andersen at canonical.com
Thu Mar 26 08:26:11 PDT 2015


On Thu, Mar 26, 2015 at 06:11:15PM +0300, Andrew Vagin wrote:
> On Tue, Mar 24, 2015 at 12:57:29PM -0600, Tycho Andersen wrote:
> > Hi all,
> > 
> > [As a preface, I don't understand all the issues at play here, so any
> > input or corrections are very much welcome.]
> > 
> > Recent changes in Ubuntu and LXC mean that c/r of LXC containers no longer
> > works out of the box, so I'd like to fix that. The first step is to fix some of
> > the mount handling. When I start a container on Vivid with LXC 1.1, I get a
> > mountinfo that looks like:
> > 
> > 44 45 253:1 /usr/local/var/lib/lxc/u1/rootfs / rw,relatime master:1 - ext4 /dev/disk/by-uuid/6c5a78e0-95fa-49a8-aa91-a8093d295e58 rw,data=ordered
> > 78 44 0:36 / /dev rw,relatime - tmpfs none rw,size=100k,mode=755
> > 79 44 0:38 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
> > 80 81 0:38 /sys/net /proc/sys/net rw,nosuid,nodev,noexec,relatime - proc proc rw
> > 81 79 0:38 /sys /proc/sys ro,nosuid,nodev,noexec,relatime - proc proc rw
> > 82 79 0:38 /sysrq-trigger /proc/sysrq-trigger ro,nosuid,nodev,noexec,relatime - proc proc rw
> > 83 44 0:39 / /sys rw,nosuid,nodev,noexec,relatime - sysfs sysfs rw
> > 84 83 0:39 / /sys ro,nosuid,nodev,noexec,relatime - sysfs sysfs rw
> > 85 84 0:39 / /sys/devices/virtual/net rw,relatime - sysfs sysfs rw
> > 86 85 0:39 /devices/virtual/net /sys/devices/virtual/net rw,nosuid,nodev,noexec,relatime - sysfs sysfs rw
> > 87 84 0:34 / /sys/fs/fuse/connections rw,relatime master:23 - fusectl fusectl rw
> > 88 84 0:7 / /sys/kernel/debug rw,relatime master:25 - debugfs debugfs rw
> > 89 84 0:11 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime master:8 - securityfs securityfs rw
> > 90 84 0:23 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime master:11 - pstore pstore rw
> > 91 84 0:40 / /sys/fs/cgroup rw,relatime - tmpfs cgroup rw,size=12k,mode=755
> > 92 91 0:21 /cgmanager /sys/fs/cgroup/cgmanager rw - tmpfs tmpfs rw,mode=755
> > 46 78 0:41 / /dev/pts rw,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=666
> > 47 44 0:42 / /run rw,nosuid,noexec,relatime - tmpfs none rw,size=199952k,mode=755
> > 48 47 0:43 / /run/lock rw,nosuid,nodev,noexec,relatime - tmpfs none rw,size=5120k
> > 49 47 0:44 / /run/shm rw,nosuid,nodev,relatime - tmpfs none rw
> > 50 47 0:45 / /run/user rw,nosuid,nodev,noexec,relatime - tmpfs none rw,size=102400k,mode=755
> > 
> > First, several things (the rootfs, fuse, pstore, etc.) are mounted as slaves.
> > My understanding is that this happens because systemd remounts / as MS_SHARED
> > instead of MS_PRIVATE, but it means that we need some way of handling slave
> > mounts. One thought is to have an argument similar to --ext-mount-map which
> > tells criu which peer group a particular mount is a slave to. For e.g. pstore
> > above, this would look like:
> > 
> > 1. criu ... --slave-mount-map=/sys/fs/pstore:/sys/fs/pstore # source:target
> > 2. criu walks the mount tree as usual, and when it sees something in
> >    --slave-mount-map:
> >     1. criu bind mounts /sys/fs/pstore into $root_yard/sys/fs/pstore
> >     2. criu sets MS_SLAVE (by calling restore_shared_options())
> > 
> > Second, for e.g. /proc/sys, the root of the mount is a path that's relative to
> > its parent's mountpoint. I think (?) this just means that mount.c's
> > find_fsroot_mount_for() needs to be a little smarter when it resolves things,
> > so it should return /'s mountinfo when called for /proc/sys, instead of
> > complaining about a proper root mount. Is there something else here that I'm
> > missing?
> 
> I don't understand what you are talking about. Could you show
> restore.log?

Yep, sorry. This is the part I was confusing with another issue.

Tycho

> Look at my logs:
> 
> diff --git a/mount.c b/mount.c
> index 776ce6a..e31f9eb 100644
> --- a/mount.c
> +++ b/mount.c
> @@ -560,8 +560,11 @@ static struct mount_info *find_fsroot_mount_for(struct mount_info *bm)
>         list_for_each_entry(sm, &bm->mnt_bind, mnt_bind)
>                 if (fsroot_mounted(sm) ||
>                                 (sm->parent == NULL &&
> -                                strstartswith(bm->root, sm->root)))
> +                                strstartswith(bm->root, sm->root))) {
> +                       pr_debug("MP  : %s %s\n", bm->root, bm->mountpoint);
> +                       pr_debug("ROOT: %s %s\n", sm->root, sm->mountpoint);
>                         return sm;
> +               }
>  
>         return NULL;
>  }
> 
> 
> [root@avagin-fc19-cr criu]# ./criu show -f /root/git/criu/test/dump/ns/static/mntns_rw_ro_rw/12391/1/mountpoints-11.img
> fstype: 0x1 mnt_id: 0x6a root_dev: 0x24 parent_mnt_id: 0x69 flags: 0x200000 root: "/sys/net" mountpoint: "/proc/sys/net" source: "proc" options: "" shared_id: 0 master_id: 0 
> fstype: 0x1 mnt_id: 0x69 root_dev: 0x24 parent_mnt_id: 0x87 flags: 0x200001 root: "/sys" mountpoint: "/proc/sys" source: "proc" options: "" shared_id: 0 master_id: 0 
> fstype: 0x6 mnt_id: 0x68 root_dev: 0x25 parent_mnt_id: 0x86 flags: 0x200000 root: "/" mountpoint: "/dev/pts" source: "pts" options: "mode=666,ptmxmode=666,newinstance" shared_id: 0 master_id: 0 
> fstype: 0x1 mnt_id: 0x87 root_dev: 0x24 parent_mnt_id: 0x86 flags: 0x200000 root: "/" mountpoint: "/proc" source: "proc" options: "" shared_id: 0 master_id: 0 
> fstype: 0 mnt_id: 0x86 root_dev: 0x800003 parent_mnt_id: 0x67 flags: 0x200000 root: "/root/git/criu/test" mountpoint: "/" source: "/dev/sda3" options: "data=ordered" shared_id: 0 master_id: 0 
> 
> [root@avagin-fc19-cr criu]# cat /root/git/criu/test/dump/ns/static/mntns_rw_ro_rw/12391/1/restore.log | grep '\(MP\|ROOT\)'
> (00.007926)      1: MP  : /sys ./proc/sys
> (00.007928)      1: ROOT: / ./proc
> (00.007931)      1: MP  : /sys/net ./proc/sys/net
> (00.007934)      1: ROOT: / ./proc
> > 
> > Tycho
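
[For reference, the log above shows both proc bind mounts (/proc/sys and /proc/sys/net) resolving to the mount at ./proc, whose root is /. A very simplified, standalone sketch of that kind of lookup follows. struct mnt and find_fsroot() are hypothetical stand-ins, not criu's struct mount_info or find_fsroot_mount_for(), and the sketch ignores the sm->parent == NULL / strstartswith() branch shown in the diff.]

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one mountinfo entry. */
struct mnt {
	unsigned int dev;       /* device number of the filesystem */
	const char *root;       /* root of this mount within the filesystem */
	const char *mountpoint; /* where it is mounted */
};

/*
 * Return the first mount on the same device as bm whose root is "/",
 * i.e. a mount that exposes the whole filesystem and can serve as the
 * source for bind-mounting bm->root. Returns NULL if there is none.
 */
const struct mnt *find_fsroot(const struct mnt *all, size_t n,
			      const struct mnt *bm)
{
	for (size_t i = 0; i < n; i++)
		if (all[i].dev == bm->dev && strcmp(all[i].root, "/") == 0)
			return &all[i];
	return NULL;
}
```

[Fed the entries from the image dump above (root_dev 0x24 for the three proc mounts, 0x25 for /dev/pts), looking up the /proc/sys entry returns the /proc mount.]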

