[CRIU] --ext-mount-map auto likes MS_BIND too much
Oleg Nesterov
oleg at redhat.com
Tue Apr 14 08:09:08 PDT 2015
Sorry for the delay.
On 04/13, Oleg Nesterov wrote:
>
> So I hit new problems with criu. I'll write another email,
> I believe the recent --ext-mount-map auto changes were not 100% correct.
Or I simply do not understand what it should do.
Let's start with a simplified "test case":
# cat /proc/self/mountinfo
17 38 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
18 38 0:16 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
19 38 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=16374292k,nr_inodes=4093573,mode=755
21 19 0:17 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw,seclabel
22 19 0:11 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
23 38 0:18 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,seclabel,mode=755
24 18 0:19 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:8 - tmpfs tmpfs rw,seclabel,mode=755
25 24 0:20 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
38 1 253:1 / / rw,relatime shared:1 - xfs /dev/mapper/rhel_ibm--x3650m4--02--vm--02-root rw,seclabel,attr2,inode64,noquota
# unshare -m
26 20 253:1 / / rw,relatime shared:1 - xfs /dev/mapper/rhel_ibm--x3650m4--02--vm--02-root rw,seclabel,attr2,inode64,noquota
27 26 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=16374292k,nr_inodes=4093573,mode=755
28 27 0:17 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw,seclabel
29 27 0:11 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
30 26 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
31 26 0:16 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
32 31 0:19 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:8 - tmpfs tmpfs rw,seclabel,mode=755
33 32 0:20 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
34 26 0:18 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,seclabel,mode=755
# perl -e 'close STDIN; close STDOUT; close STDERR; sleep'
Now, on another console:
# criu dump -D D/ -j -t `pidof perl`
# criu restore -D D/ -j
Works.
But what if I pass "--ext-mount-map auto"? It should not do any harm, right?
# criu dump -D D/ -j -t `pidof perl` --ext-mount-map auto --enable-external-sharing --enable-external-masters
Yes, this works. But
# criu restore -D D/ -j --ext-mount-map auto --enable-external-sharing --enable-external-masters
fails:
Error (mount.c:1844): Can't mount at ./dev/shm: No such file or directory
The reason looks clear. Let's look at resolve_external_mounts(): it calls
find_best_external_match() unconditionally, and that always finds a match
in the "root" ns_id (the one which has ->pid == getpid(); I do not know the
right term).
And this basically means that "autodetected external mount" applies to every
mountpoint except "/". The relevant part of "dump -vvvvv" is:
autodetected external mount /run/ for ./run
autodetected external mount /sys/fs/cgroup/systemd/ for ./sys/fs/cgroup/systemd
autodetected external mount /sys/fs/cgroup/ for ./sys/fs/cgroup
autodetected external mount /sys/ for ./sys
autodetected external mount /proc/ for ./proc
autodetected external mount /dev/pts/ for ./dev/pts
autodetected external mount /dev/shm/ for ./dev/shm
autodetected external mount /dev/ for ./dev
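To illustrate why every mountpoint except "/" gets matched, here is a minimal stand-alone model of the matching. It is not the CRIU code: the struct, the function names, and the match criterion (same device number and same root) are assumptions made for illustration only. The input is the first five fields of the mountinfo lines shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_MNTS 32

/* Subset of one /proc/<pid>/mountinfo line; names are illustrative. */
struct mnt {
	int id, parent_id;
	unsigned int maj, min;
	char root[256];
	char mountpoint[256];
};

/* Parse the first five fields of a mountinfo line; 0 on success. */
int parse_mountinfo_line(const char *line, struct mnt *m)
{
	int n = sscanf(line, "%d %d %u:%u %255s %255s",
		       &m->id, &m->parent_id, &m->maj, &m->min,
		       m->root, m->mountpoint);
	return n == 6 ? 0 : -1;
}

/* Assumed criterion: same device and same root inside that filesystem. */
const struct mnt *find_host_match(const struct mnt *m,
				  const struct mnt *host, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (host[i].maj == m->maj && host[i].min == m->min &&
		    strcmp(host[i].root, m->root) == 0)
			return &host[i];
	}
	return NULL;
}

/* Count namespace mounts that would be treated as external bind mounts. */
size_t count_auto_external(const char *const *ns_lines, size_t ns_n,
			   const char *const *host_lines, size_t host_n)
{
	struct mnt host[MAX_MNTS], m;
	size_t parsed = 0, hits = 0;

	assert(host_n <= MAX_MNTS);
	for (size_t i = 0; i < host_n; i++)
		if (parse_mountinfo_line(host_lines[i], &host[parsed]) == 0)
			parsed++;

	for (size_t i = 0; i < ns_n; i++) {
		const struct mnt *h;

		if (parse_mountinfo_line(ns_lines[i], &m) != 0)
			continue;
		if (strcmp(m.mountpoint, "/") == 0)
			continue;	/* the namespace root is skipped */
		h = find_host_match(&m, host, parsed);
		if (h) {
			printf("match: host %s for .%s\n",
			       h->mountpoint, m.mountpoint);
			hits++;
		}
	}
	return hits;
}

/* First five fields of the host mountinfo lines from this mail. */
const char *const host_example[] = {
	"17 38 0:3 / /proc", "18 38 0:16 / /sys", "19 38 0:5 / /dev",
	"21 19 0:17 / /dev/shm", "22 19 0:11 / /dev/pts", "23 38 0:18 / /run",
	"24 18 0:19 / /sys/fs/cgroup", "25 24 0:20 / /sys/fs/cgroup/systemd",
	"38 1 253:1 / /",
};

/* Same fields as seen after "unshare -m". */
const char *const ns_example[] = {
	"26 20 253:1 / /", "27 26 0:5 / /dev", "28 27 0:17 / /dev/shm",
	"29 27 0:11 / /dev/pts", "30 26 0:3 / /proc", "31 26 0:16 / /sys",
	"32 31 0:19 / /sys/fs/cgroup", "33 32 0:20 / /sys/fs/cgroup/systemd",
	"34 26 0:18 / /run",
};
```

Calling count_auto_external(ns_example, 9, host_example, 9) reports a host match for each of the eight non-root mounts. A plain "unshare -m" copies the host mount tree, so under this criterion every copied mount looks like a bind mount of its host counterpart.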
And of course "restore" can't work; -vvvv makes it clear:
Start with 26:./
Mounting unsupported @./ (0)
26:./ private 0 shared 1 slave 0
Mounting devtmpfs @./dev (0)
Bind /dev/ to ./dev
27:./dev private 1 shared 1 slave 0
Mounting tmpfs @./dev/shm (0)
Bind /dev/shm/ to ./dev/shm
Error (mount.c:1844): Can't mount at ./dev/shm: No such file or directory
Surely, /dev/ was not remounted correctly.
Perhaps resolve_external_mounts() should at least skip the fsroot_mounted()
mount points? Although afaics this is not enough either.
Plus, if we skip something in resolve_external_mounts(), then I am not
sure that other m->external checks (say, in collect_shared()) will be
correct...
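A rough sketch of what such a skip would do to the test case above. It assumes that fsroot_mounted() is true exactly when the mount's root field in mountinfo is "/"; the struct and the helper names are made up for illustration, and the last entry is a hypothetical bind mount that does not appear in the test case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Only the two mountinfo fields this check needs; names are illustrative. */
struct mnt_view {
	const char *root;	/* root of the mount inside its filesystem */
	const char *mountpoint;	/* where it is mounted in the namespace */
};

/* Assumed meaning of fsroot_mounted(): the whole filesystem is mounted. */
int covers_fs_root(const struct mnt_view *m)
{
	return strcmp(m->root, "/") == 0;
}

/* Count the mounts still handed to external-mount resolution. */
size_t count_ext_candidates(const struct mnt_view *mnts, size_t n)
{
	size_t left = 0;

	assert(mnts != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(mnts[i].mountpoint, "/") == 0)
			continue;	/* namespace root, already skipped */
		if (covers_fs_root(&mnts[i]))
			continue;	/* the proposed extra check */
		left++;
	}
	return left;
}

/* Root and mountpoint of the mounts seen after "unshare -m". */
const struct mnt_view unshared_example[] = {
	{ "/", "/" }, { "/", "/dev" }, { "/", "/dev/shm" },
	{ "/", "/dev/pts" }, { "/", "/proc" }, { "/", "/sys" },
	{ "/", "/sys/fs/cgroup" }, { "/", "/sys/fs/cgroup/systemd" },
	{ "/", "/run" },
};

/* Hypothetical bind mount of a subdirectory; not part of the test case. */
const struct mnt_view bind_example = { "/subdir", "/mnt/data" };
```

With this check none of the nine mounts in the test case would be auto-detected as external, while a bind mount of a subdirectory still would. It does nothing about the other m->external users mentioned above.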
So. This looks "obviously wrong" to me. Or I simply do not understand
what's going on?
Help!
The change below helps in this particular case, but as I said it is not
correct/enough.
Oleg.
--- x/./mount.c
+++ x/./mount.c
@@ -718,6 +718,9 @@ static int resolve_external_mounts(struc
 		if (m->parent == NULL || m->is_ns_root)
 			continue;
+		if (fsroot_mounted(m))
+			continue;
+
 		ret = try_resolve_ext_mount(m);
 		if (ret < 0 && ret != -ENOTSUP) {
 			return -1;