[CRIU] Fake mount points in dump

Stanislav Kinsburskiy skinsbursky at odin.com
Tue Jan 19 15:18:10 PST 2016


On Jan 19, 2016 8:30 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
>
> On 01/19/2016 08:09 PM, Stanislav Kinsburskiy wrote: 
> > 
> > 
> > 19.01.2016 16:44, Pavel Emelyanov wrote: 
> >> On 01/19/2016 05:54 PM, Stanislav Kinsburskiy wrote: 
> >>> 
> >>> 19.01.2016 13:28, Pavel Emelyanov wrote: 
> >>>> On 01/19/2016 01:46 PM, Stanislav Kinsburskiy wrote: 
> >>>>> 18.01.2016 17:26, Pavel Emelyanov wrote: 
> >>>>>> On 01/18/2016 03:35 PM, Stanislav Kinsburskiy wrote: 
> >>>>>>> Hi, 
> >>>>>>> 
> >>>>>>> I'm trying to suspend a container with the following mount list in it: 
> >>>>>>> 
> >>>>>>> [root at centos-7-x86_64 ~]# cat /proc/mounts 
> >>>>>>> rootfs / rootfs rw 0 0 
> >>>>>>> /dev/ploop43992p1 / ext4 rw,relatime,data=ordered,balloon_ino=12 0 0 
> >>>>>>> none /sys sysfs rw,relatime,ve=102 0 0 
> >>>>>>> none /sys/fs/cgroup tmpfs rw,relatime,size=1931780k,nr_inodes=482945 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0 
> >>>>>>> cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0 
> >>>>>>> proc /proc proc rw,relatime 0 0 
> >>>>>>> devtmpfs /dev devtmpfs rw,nosuid,size=1931780k,nr_inodes=482945 0 0 
> >>>>>>> tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0 
> >>>>>>> devpts /dev/pts devpts 
> >>>>>>> rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0 
> >>>>>>> tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0 
> >>>>>>> mqueue /dev/mqueue mqueue rw,relatime 0 0 
> >>>>>>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0 
> >>>>>>> 
> >>>>>>> and get the following error (persistent): 
> >>>>>>> 
> >>>>>>> (00.221054) mnt: <-- 
> >>>>>>> (00.221489)     type proc source proc mnt_id 20 s_dev 0x3 / @ ./proc flags 0x30000e options 
> >>>>>>> (00.221651)     type sysfs source sysfs mnt_id 21 s_dev 0x13 / @ ./sys flags 0x30000e options 
> >>>>>>> (00.221754)     type devtmpfs source devtmpfs mnt_id 22 s_dev 0x5 / @ ./dev flags 0x1100000 options size=1922508k,nr_inodes=480627,mode=755 
> >>>>>>> (00.221842)     type securityfs source securityfs mnt_id 23 s_dev 0x14 / @ ./sys/kernel/security flags 0x30000e options 
> >>>>>>> (00.221927)     type tmpfs source tmpfs mnt_id 24 s_dev 0x15 / @ ./dev/shm flags 0x1100000 options 
> >>>>>>> (00.222015)     type devpts source devpts mnt_id 25 s_dev 0xb / @ ./dev/pts flags 0x30000a options gid=5,mode=620,ptmxmode=000 
> >>>>>>> (00.222163)     type tmpfs source tmpfs mnt_id 26 s_dev 0x16 / @ ./run flags 0x1100000 options mode=755 
> >>>>>>> (00.222624)     type tmpfs source tmpfs mnt_id 27 s_dev 0x17 / @ ./sys/fs/cgroup flags 0x1100000 options mode=755 
> >>>>>>> (00.222716)     type cgroup source cgroup mnt_id 28 s_dev 0x18 / @ ./sys/fs/cgroup/systemd flags 0x30000e options xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 
> >>>>>>> (00.222803)     type pstore source pstore mnt_id 29 s_dev 0x19 / @ ./sys/fs/pstore flags 0x30000e options 
> >>>>>>> (00.222917)     type cgroup source cgroup mnt_id 30 s_dev 0x12 / @ ./sys/fs/cgroup/cpuset flags 0x30000e options cpuset 
> >>>>>>> (00.223022)     type cgroup source cgroup mnt_id 31 s_dev 0x11 / @ ./sys/fs/cgroup/cpu,cpuacct flags 0x30000e options cpuacct,cpu 
> >>>>>>> (00.223147)     type cgroup source cgroup mnt_id 32 s_dev 0xf / @ ./sys/fs/cgroup/memory flags 0x30000e options memory 
> >>>>>>> (00.223303)     type cgroup source cgroup mnt_id 33 s_dev 0x1a / @ ./sys/fs/cgroup/devices flags 0x30000e options devices 
> >>>>>>> (00.223425)     type cgroup source cgroup mnt_id 34 s_dev 0x1b / @ ./sys/fs/cgroup/freezer flags 0x30000e options freezer 
> >>>>>>> (00.223514)     type cgroup source cgroup mnt_id 35 s_dev 0x1c / @ ./sys/fs/cgroup/net_cls flags 0x30000e options net_cls 
> >>>>>>> (00.223602)     type cgroup source cgroup mnt_id 36 s_dev 0xe / @ ./sys/fs/cgroup/blkio flags 0x30000e options blkio 
> >>>>>>> (00.223688)     type cgroup source cgroup mnt_id 37 s_dev 0x1d / @ ./sys/fs/cgroup/perf_event flags 0x30000e options perf_event 
> >>>>>>> (00.223773)     type cgroup source cgroup mnt_id 38 s_dev 0x1e / @ ./sys/fs/cgroup/hugetlb flags 0x30000e options hugetlb 
> >>>>>>> (00.223858)     type cgroup source cgroup mnt_id 39 s_dev 0x1f / @ ./sys/fs/cgroup/ve flags 0x30000e options ve 
> >>>>>>> (00.224002)     type cgroup source cgroup mnt_id 40 s_dev 0x10 / @ ./sys/fs/cgroup/beancounter flags 0x30000e options beancounter 
> >>>>>>> (00.224116)     type configfs source configfs mnt_id 43 s_dev 0x20 / @ ./sys/kernel/config flags 0x300000 options 
> >>>>>>> (00.224274)     type ext4 source /dev/mapper/vz_skinsbursky--vz7-root mnt_id 44 s_dev 0xfd00001 / @ ./ flags 0x300000 options quota,usrquota,grpquota,data=ordered 
> >>>>>>> (00.224387)     type autofs source systemd-1 mnt_id 45 s_dev 0x21 / @ ./proc/sys/fs/binfmt_misc flags 0x300000 options fd=32,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 
> >>>>>>> (00.224485) Error (autofs.c:220): Failed to find pipe_ino option (old kernel?) 
> >>>>>>> (00.224565) Error (proc_parse.c:1385): Failed to parse FS specific data on ./proc/sys/fs/binfmt_misc 
> >>>>>>> (00.224686) Error (mount.c:1908): mnt: Can't parse 83507's mountinfo 
> >>>>>>> (00.224770) Error (mount.c:824): mnt: Failed to find criu pid's mount ns 
> >>>>>> This error only appears if you use --ext-mount-map auto. Do you? 
> >>>>> Yes, above is correct. 
> >>>>> What does it mean? 
> >>>> It means that CRIU tries to auto-resolve any external bind-mounts. 
> >>>> Your error means that either a) we don't have any, but they are erroneously 
> >>>> detected, or b) we do have some, and they are auto-detected by parsing the 
> >>>> host's mount points, which contain unsupported entries. In the latter case 
> >>>> we can fix it by relaxing the requirements for the host's mount points, e.g. 
> >>>> we can ignore the options, since we don't need them. 
> >>> This is probably the former case. 
> >>> If I shut down the autofs service on the host, the dump succeeds. 
> >>> One more question: what makes you think that it's the former case? 
> >> I didn't say I thought this was the former case. 
> >> 
> >>> Who or what can bind-mount a service mount into a container? 
> >> This can be done by vzctl or by kernel propagation, but I don't 
> >> know whether this is the case. 
> >> 
> >> Looking at resolve_external_mounts(), I see that we parse the host's 
> >> mount points regardless of whether we have ext mounts or not. So I 
> >> withdraw my previous comment and make a new statement: the host's mount 
> >> points contain unsupported entries, but we "fail in advance" without 
> >> actually checking whether we need them at all. 
> > 
> > This message: 
> > 
> > (00.122918)     type autofs source systemd-1 mnt_id 86 s_dev 0x2e / @ ./proc/sys/fs/binfmt_misc flags 0x300000 options fd=43,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 
> > 
> > is printed in parse_mountinfo(), so this information comes from the kernel. 
> > It looks strange to me that the mount_info of the criu process contains the 
> > container's mount points plus autofs from the host. 
> > Isn't it strange? 
>
> It is. Check the /proc/pid/mountinfo files themselves; maybe the mount point 
> just got propagated from the host to the container or vice versa. 
>

I've already done that, and there is no autofs mount in the mountinfo of the container's processes.
CRIU checks the mountinfo of a process that exists only during the dump stage. It looks like a CRIU child that joined the container's mount namespace.
This is reproducible on the rhel7 kernel.
I'll continue investigating.
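
For anyone who wants to repeat the mountinfo check, here is a minimal Python sketch. It is not CRIU code: the names MountEntry, parse_mountinfo, read_mountinfo, autofs_without_pipe_ino and only_in_first are made up for illustration, and it assumes the standard /proc/<pid>/mountinfo field layout described in proc(5). It lists autofs entries that lack a pipe_ino= option (the condition behind the autofs.c error above) and the mounts that one process sees but another does not.

```python
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    """One line of /proc/<pid>/mountinfo (hypothetical helper type)."""
    mnt_id: int
    parent_id: int
    root: str
    mountpoint: str
    fstype: str
    source: str
    super_opts: str


def parse_mountinfo(text: str) -> List[MountEntry]:
    """Parse mountinfo text; fields before ' - ' are per-mount, after it per-superblock."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Spaces inside paths are escaped as \040, so ' - ' is a safe separator.
        left, sep, right = line.partition(" - ")
        if not sep:
            raise ValueError(f"no ' - ' separator in: {line!r}")
        pre, post = left.split(), right.split()
        if len(pre) < 6 or len(post) < 2:
            raise ValueError(f"too few fields in: {line!r}")
        entries.append(MountEntry(
            mnt_id=int(pre[0]),
            parent_id=int(pre[1]),
            root=pre[3],
            mountpoint=pre[4],
            fstype=post[0],
            source=post[1],
            super_opts=post[2] if len(post) > 2 else "",
        ))
    return entries


def read_mountinfo(pid) -> List[MountEntry]:
    """Read and parse /proc/<pid>/mountinfo; pid may be an int or 'self'."""
    with open(f"/proc/{pid}/mountinfo") as f:
        return parse_mountinfo(f.read())


def autofs_without_pipe_ino(entries: List[MountEntry]) -> List[MountEntry]:
    """Return autofs mounts whose superblock options carry no pipe_ino=."""
    return [
        e for e in entries
        if e.fstype == "autofs"
        and not any(o.startswith("pipe_ino=") for o in e.super_opts.split(","))
    ]


def only_in_first(a: List[MountEntry], b: List[MountEntry]) -> List[MountEntry]:
    """Return mounts of `a` whose (mountpoint, fstype) pair is absent from `b`."""
    seen = {(e.mountpoint, e.fstype) for e in b}
    return [e for e in a if (e.mountpoint, e.fstype) not in seen]
```

Calling only_in_first(read_mountinfo("self"), read_mountinfo(pid)) for a pid inside the container would show what is visible only from the caller's side. The comparison key is deliberately loose (mount point and type only), since mount IDs differ between namespaces.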


> -- Pavel 
>


