[CRIU] --ext-mount-map auto likes MS_BIND too much

Oleg Nesterov oleg at redhat.com
Tue Apr 14 08:09:08 PDT 2015


Sorry for the delay.

On 04/13, Oleg Nesterov wrote:
>
> So I hit new problems with criu. I'll write another email,
> I believe the recent --ext-mount-map auto changes were not 100% correct.

Or I simply do not understand what it should do.

Let's start with a simplified "test case":

	# cat /proc/self/mountinfo
	17 38 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
	18 38 0:16 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
	19 38 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=16374292k,nr_inodes=4093573,mode=755
	21 19 0:17 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw,seclabel
	22 19 0:11 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
	23 38 0:18 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,seclabel,mode=755
	24 18 0:19 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:8 - tmpfs tmpfs rw,seclabel,mode=755
	25 24 0:20 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
	38 1 253:1 / / rw,relatime shared:1 - xfs /dev/mapper/rhel_ibm--x3650m4--02--vm--02-root rw,seclabel,attr2,inode64,noquota

	# unshare -m
	# cat /proc/self/mountinfo
	26 20 253:1 / / rw,relatime shared:1 - xfs /dev/mapper/rhel_ibm--x3650m4--02--vm--02-root rw,seclabel,attr2,inode64,noquota
	27 26 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=16374292k,nr_inodes=4093573,mode=755
	28 27 0:17 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw,seclabel
	29 27 0:11 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
	30 26 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
	31 26 0:16 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
	32 31 0:19 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:8 - tmpfs tmpfs rw,seclabel,mode=755
	33 32 0:20 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
	34 26 0:18 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,seclabel,mode=755

	# perl -e 'close STDIN; close STDOUT; close STDERR; sleep'

Now, on another console:

	# criu dump -D D/ -j -t `pidof perl`
	# criu restore -D D/ -j

Works.

But what if I pass "--ext-mount-map auto"? It should not do any harm, right?

	# criu dump -D D/ -j -t `pidof perl` --ext-mount-map auto --enable-external-sharing --enable-external-masters

Yes, this works. But!

	# criu restore -D D/ -j --ext-mount-map auto --enable-external-sharing --enable-external-masters

fails:

	Error (mount.c:1844): Can't mount at ./dev/shm: No such file or directory


The reason looks clear. Let's look at resolve_external_mounts(): it calls
find_best_external_match() unconditionally, and that always finds a match
in the "root" ns_id (the one which has ->pid == getpid(); I do not know the
right term).

And this basically means that "autodetected external mount" applies to every
mountpoint except "/". The relevant part of "dump -vvvvv" is:

	autodetected external mount /run/ for ./run
	autodetected external mount /sys/fs/cgroup/systemd/ for ./sys/fs/cgroup/systemd
	autodetected external mount /sys/fs/cgroup/ for ./sys/fs/cgroup
	autodetected external mount /sys/ for ./sys
	autodetected external mount /proc/ for ./proc
	autodetected external mount /dev/pts/ for ./dev/pts
	autodetected external mount /dev/shm/ for ./dev/shm
	autodetected external mount /dev/ for ./dev
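
To spell out why nothing escapes the match, here is a toy model in C. It is
not the CRIU code: struct mnt, find_host_match() and count_autodetected() are
names made up for illustration, and it assumes the match boils down to
comparing the device number and the root of the mount, which may well be
simpler than what find_best_external_match() really checks. The two tables are
copied from the mountinfo listings above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down stand-in for one parsed mountinfo line. */
struct mnt {
	int id;           /* mount ID (field 1) */
	const char *dev;  /* major:minor (field 3) */
	const char *root; /* root of the mount within its filesystem (field 4) */
	const char *path; /* mount point (field 5) */
};

/* Host namespace, from the first listing. */
static const struct mnt host_mounts[] = {
	{ 17, "0:3",   "/", "/proc" },
	{ 18, "0:16",  "/", "/sys" },
	{ 19, "0:5",   "/", "/dev" },
	{ 21, "0:17",  "/", "/dev/shm" },
	{ 22, "0:11",  "/", "/dev/pts" },
	{ 23, "0:18",  "/", "/run" },
	{ 24, "0:19",  "/", "/sys/fs/cgroup" },
	{ 25, "0:20",  "/", "/sys/fs/cgroup/systemd" },
	{ 38, "253:1", "/", "/" },
};

/* Namespace created by "unshare -m", from the second listing. */
static const struct mnt ns_mounts[] = {
	{ 26, "253:1", "/", "/" },
	{ 27, "0:5",   "/", "/dev" },
	{ 28, "0:17",  "/", "/dev/shm" },
	{ 29, "0:11",  "/", "/dev/pts" },
	{ 30, "0:3",   "/", "/proc" },
	{ 31, "0:16",  "/", "/sys" },
	{ 32, "0:19",  "/", "/sys/fs/cgroup" },
	{ 33, "0:20",  "/", "/sys/fs/cgroup/systemd" },
	{ 34, "0:18",  "/", "/run" },
};

#define NR(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed matching rule: same device and same root. Returns NULL if none. */
static const struct mnt *find_host_match(const struct mnt *m,
					 const struct mnt *host, size_t nr_host)
{
	size_t i;

	assert(m != NULL && host != NULL);
	for (i = 0; i < nr_host; i++)
		if (!strcmp(m->dev, host[i].dev) && !strcmp(m->root, host[i].root))
			return &host[i];
	return NULL;
}

/* Count the mounts of the namespace, except its root, that match the host. */
static size_t count_autodetected(const struct mnt *ns, size_t nr_ns,
				 const struct mnt *host, size_t nr_host)
{
	size_t i, hits = 0;

	for (i = 0; i < nr_ns; i++) {
		if (!strcmp(ns[i].path, "/"))
			continue; /* the namespace root is never considered */
		if (find_host_match(&ns[i], host, nr_host))
			hits++;
	}
	return hits;
}
```

With these tables, count_autodetected(ns_mounts, NR(ns_mounts), host_mounts,
NR(host_mounts)) gives 8, one per "autodetected external mount" line in the
dump log: after a plain "unshare -m" every mount is a copy of a host mount,
so under this rule every one of them matches.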

And of course "restore" can't work; -vvvv makes it clear:

	Start with 26:./
		Mounting unsupported @./ (0)
	26:./ private 0 shared 1 slave 0
		Mounting devtmpfs @./dev (0)
		Bind /dev/ to ./dev
	27:./dev private 1 shared 1 slave 0
		Mounting tmpfs @./dev/shm (0)
		Bind /dev/shm/ to ./dev/shm
	Error (mount.c:1844): Can't mount at ./dev/shm: No such file or directory

Surely, /dev/ was not remounted correctly.

Perhaps resolve_external_mounts() should at least skip the fsroot_mounted()
mount points? Although afaics this is not enough either.

Plus, if we skip something in resolve_external_mounts(), then I am not
sure that other m->external checks (say, in collect_shared()) will be
correct...
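
To show what such a skip would do, here is another toy model in C, again not
CRIU code. It assumes that fsroot_mounted(m) simply means "the root of m is
/", i.e. the whole filesystem is mounted; whole_fs_mounted(),
is_ext_candidate() and count_candidates() are made-up names, and the
/srv/data bind mount in the sample table is hypothetical, added only to have
one mount that is not a whole filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down stand-in for one parsed mountinfo line. */
struct mnt {
	const char *dev;  /* major:minor */
	const char *root; /* root of the mount within its filesystem */
	const char *path; /* mount point */
};

/* Three mounts from the test case, the namespace root, and one
 * hypothetical bind mount of a sub-directory (not in the real listing). */
static const struct mnt sample[] = {
	{ "0:5",   "/",         "/dev" },
	{ "0:17",  "/",         "/dev/shm" },
	{ "0:3",   "/",         "/proc" },
	{ "253:1", "/",         "/" },
	{ "253:1", "/srv/data", "/mnt/data" }, /* hypothetical */
};

/* Assumption: stands in for fsroot_mounted(), taken to mean root == "/". */
static int whole_fs_mounted(const struct mnt *m)
{
	assert(m != NULL);
	return strcmp(m->root, "/") == 0;
}

/* Would this mount still be handed to the external-mount resolution? */
static int is_ext_candidate(const struct mnt *m, int skip_fsroot)
{
	if (strcmp(m->path, "/") == 0)
		return 0; /* namespace root, skipped in any case */
	if (skip_fsroot && whole_fs_mounted(m))
		return 0; /* the proposed extra check */
	return 1;
}

/* Count the mounts that remain candidates, with or without the extra check. */
static size_t count_candidates(const struct mnt *v, size_t n, int skip_fsroot)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (is_ext_candidate(&v[i], skip_fsroot))
			hits++;
	return hits;
}
```

count_candidates(sample, 5, 0) is 4, and with skip_fsroot set it drops to 1:
only the made-up sub-directory bind is left, and none of the mounts from the
test case is treated as external any longer.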


So. This looks "obviously wrong" to me. Or do I simply not understand
what's going on?

Help!

The change below helps in this particular case, but as I said it is not
correct/enough.

Oleg.

--- x/./mount.c
+++ x/./mount.c
@@ -718,6 +718,9 @@ static int resolve_external_mounts(struc
 		if (m->parent == NULL || m->is_ns_root)
 			continue;
 
+		if (fsroot_mounted(m))
+			continue;
+
 		ret = try_resolve_ext_mount(m);
 		if (ret < 0 && ret != -ENOTSUP) {
 			return -1;


