[CRIU] Fake mount points in dump
Stanislav Kinsburskiy
skinsbursky at odin.com
Tue Jan 19 15:18:10 PST 2016
On Jan 19, 2016, at 8:30 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
>
> On 01/19/2016 08:09 PM, Stanislav Kinsburskiy wrote:
> >
> >
> > 19.01.2016 16:44, Pavel Emelyanov wrote:
> >> On 01/19/2016 05:54 PM, Stanislav Kinsburskiy wrote:
> >>>
> >>> 19.01.2016 13:28, Pavel Emelyanov wrote:
> >>>> On 01/19/2016 01:46 PM, Stanislav Kinsburskiy wrote:
> >>>>> 18.01.2016 17:26, Pavel Emelyanov wrote:
> >>>>>> On 01/18/2016 03:35 PM, Stanislav Kinsburskiy wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I'm trying to suspend a container with the following mount list in it:
> >>>>>>>
> >>>>>>> [root@centos-7-x86_64 ~]# cat /proc/mounts
> >>>>>>> rootfs / rootfs rw 0 0
> >>>>>>> /dev/ploop43992p1 / ext4 rw,relatime,data=ordered,balloon_ino=12 0 0
> >>>>>>> none /sys sysfs rw,relatime,ve=102 0 0
> >>>>>>> none /sys/fs/cgroup tmpfs rw,relatime,size=1931780k,nr_inodes=482945 0 0
> >>>>>>> cgroup /sys/fs/cgroup/cpuset cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,cpuset 0 0
> >>>>>>> cgroup /sys/fs/cgroup/cpu,cpuacct cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
> >>>>>>> cgroup /sys/fs/cgroup/memory cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,memory 0 0
> >>>>>>> cgroup /sys/fs/cgroup/devices cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,devices 0 0
> >>>>>>> cgroup /sys/fs/cgroup/freezer cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,freezer 0 0
> >>>>>>> cgroup /sys/fs/cgroup/net_cls cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,net_cls 0 0
> >>>>>>> cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
> >>>>>>> cgroup /sys/fs/cgroup/perf_event cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,perf_event 0 0
> >>>>>>> cgroup /sys/fs/cgroup/hugetlb cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
> >>>>>>> cgroup /sys/fs/cgroup/systemd cgroup
> >>>>>>> rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> >>>>>>> 0 0
> >>>>>>> proc /proc proc rw,relatime 0 0
> >>>>>>> devtmpfs /dev devtmpfs rw,nosuid,size=1931780k,nr_inodes=482945 0 0
> >>>>>>> tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
> >>>>>>> devpts /dev/pts devpts
> >>>>>>> rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
> >>>>>>> tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
> >>>>>>> mqueue /dev/mqueue mqueue rw,relatime 0 0
> >>>>>>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> >>>>>>>
> >>>>>>> and get the following error (persistent):
> >>>>>>>
> >>>>>>> (00.221054) mnt: <--
> >>>>>>> (00.221489) type proc source proc mnt_id 20 s_dev 0x3 / @ ./proc
> >>>>>>> flags 0x30000e options
> >>>>>>> (00.221651) type sysfs source sysfs mnt_id 21 s_dev 0x13 / @ ./sys
> >>>>>>> flags 0x30000e options
> >>>>>>> (00.221754) type devtmpfs source devtmpfs mnt_id 22 s_dev 0x5 / @
> >>>>>>> ./dev flags 0x1100000 options size=1922508k,nr_inodes=480627,mode=755
> >>>>>>> (00.221842) type securityfs source securityfs mnt_id 23 s_dev 0x14 /
> >>>>>>> @ ./sys/kernel/security flags 0x30000e options
> >>>>>>> (00.221927) type tmpfs source tmpfs mnt_id 24 s_dev 0x15 / @
> >>>>>>> ./dev/shm flags 0x1100000 options
> >>>>>>> (00.222015) type devpts source devpts mnt_id 25 s_dev 0xb / @
> >>>>>>> ./dev/pts flags 0x30000a options gid=5,mode=620,ptmxmode=000
> >>>>>>> (00.222163) type tmpfs source tmpfs mnt_id 26 s_dev 0x16 / @ ./run
> >>>>>>> flags 0x1100000 options mode=755
> >>>>>>> (00.222624) type tmpfs source tmpfs mnt_id 27 s_dev 0x17 / @
> >>>>>>> ./sys/fs/cgroup flags 0x1100000 options mode=755
> >>>>>>> (00.222716) type cgroup source cgroup mnt_id 28 s_dev 0x18 / @
> >>>>>>> ./sys/fs/cgroup/systemd flags 0x30000e options
> >>>>>>> xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> >>>>>>> (00.222803) type pstore source pstore mnt_id 29 s_dev 0x19 / @
> >>>>>>> ./sys/fs/pstore flags 0x30000e options
> >>>>>>> (00.222917) type cgroup source cgroup mnt_id 30 s_dev 0x12 / @
> >>>>>>> ./sys/fs/cgroup/cpuset flags 0x30000e options cpuset
> >>>>>>> (00.223022) type cgroup source cgroup mnt_id 31 s_dev 0x11 / @
> >>>>>>> ./sys/fs/cgroup/cpu,cpuacct flags 0x30000e options cpuacct,cpu
> >>>>>>> (00.223147) type cgroup source cgroup mnt_id 32 s_dev 0xf / @
> >>>>>>> ./sys/fs/cgroup/memory flags 0x30000e options memory
> >>>>>>> (00.223303) type cgroup source cgroup mnt_id 33 s_dev 0x1a / @
> >>>>>>> ./sys/fs/cgroup/devices flags 0x30000e options devices
> >>>>>>> (00.223425) type cgroup source cgroup mnt_id 34 s_dev 0x1b / @
> >>>>>>> ./sys/fs/cgroup/freezer flags 0x30000e options freezer
> >>>>>>> (00.223514) type cgroup source cgroup mnt_id 35 s_dev 0x1c / @
> >>>>>>> ./sys/fs/cgroup/net_cls flags 0x30000e options net_cls
> >>>>>>> (00.223602) type cgroup source cgroup mnt_id 36 s_dev 0xe / @
> >>>>>>> ./sys/fs/cgroup/blkio flags 0x30000e options blkio
> >>>>>>> (00.223688) type cgroup source cgroup mnt_id 37 s_dev 0x1d / @
> >>>>>>> ./sys/fs/cgroup/perf_event flags 0x30000e options perf_event
> >>>>>>> (00.223773) type cgroup source cgroup mnt_id 38 s_dev 0x1e / @
> >>>>>>> ./sys/fs/cgroup/hugetlb flags 0x30000e options hugetlb
> >>>>>>> (00.223858) type cgroup source cgroup mnt_id 39 s_dev 0x1f / @
> >>>>>>> ./sys/fs/cgroup/ve flags 0x30000e options ve
> >>>>>>> (00.224002) type cgroup source cgroup mnt_id 40 s_dev 0x10 / @
> >>>>>>> ./sys/fs/cgroup/beancounter flags 0x30000e options beancounter
> >>>>>>> (00.224116) type configfs source configfs mnt_id 43 s_dev 0x20 / @
> >>>>>>> ./sys/kernel/config flags 0x300000 options
> >>>>>>> (00.224274) type ext4 source /dev/mapper/vz_skinsbursky--vz7-root
> >>>>>>> mnt_id 44 s_dev 0xfd00001 / @ ./ flags 0x300000 options
> >>>>>>> quota,usrquota,grpquota,data=ordered
> >>>>>>> (00.224387) type autofs source systemd-1 mnt_id 45 s_dev 0x21 / @
> >>>>>>> ./proc/sys/fs/binfmt_misc flags 0x300000 options
> >>>>>>> fd=32,pgrp=1,timeout=300,minproto=5,maxproto=5,direct
> >>>>>>> (00.224485) Error (autofs.c:220): Failed to find pipe_ino option (old
> >>>>>>> kernel?)
> >>>>>>> (00.224565) Error (proc_parse.c:1385): Failed to parse FS specific data
> >>>>>>> on ./proc/sys/fs/binfmt_misc
> >>>>>>> (00.224686) Error (mount.c:1908): mnt: Can't parse 83507's mountinfo
> >>>>>>> (00.224770) Error (mount.c:824): mnt: Failed to find criu pid's mount ns
> >>>>>> This error only appears if you use --ext-mount-map auto. Do you?
> >>>>> Yes, that is correct.
> >>>>> What does it mean?
> >>>> It means that CRIU tries to auto-resolve any external bind-mounts.
> >>>> Your error means that either a) we don't have any, but they are erroneously
> >>>> detected, or b) we do have some, they are auto-detected by parsing the host's
> >>>> mount points, and the latter contain unsupported entries. In the latter case
> >>>> we can fix it by relaxing the requirements on the host's mount points, e.g.
> >>>> we can ignore options, since we don't need them.
> >>> This is probably the former case.
> >>> If I shut down the autofs service on the host, the dump succeeds.
> >>> One more question: what makes you think that it's the former case?
> >> I didn't say I thought this was the former case.
> >>
> >>> Who or what can bind-mount a service mount into a container?
> >> This can be done by vzctl or by kernel propagation, but I don't
> >> know whether this is the case.
> >>
> >> Looking at resolve_external_mounts(), I see that we parse the host's
> >> mount points regardless of whether we have ext mounts or not. So I
> >> withdraw my previous comment and make a new statement: the host's mount
> >> points contain unsupported entries, but we "fail in advance" w/o
> >> actually checking whether we need them at all.
> >
> > This message:
> >
> > (00.122918) type autofs source systemd-1 mnt_id 86 s_dev 0x2e / @
> > ./proc/sys/fs/binfmt_misc flags 0x300000 options
> > fd=43,pgrp=1,timeout=300,minproto=5,maxproto=5,direct
> >
> > is printed in parse_mountinfo(). Thus this information comes from the kernel.
> > It looks strange to me that the mount_info of the criu process contains
> > the container's mount points plus autofs from the host.
> > Isn't it?
>
> It is. Check the /proc/pid/mountinfo files themselves; maybe the mount point just
> got propagated from host to container or vice versa.
>
I've done that already, and there is no autofs mount in the mountinfo of the container's processes.
CRIU checks the mountinfo of a process that exists only during the dump stage. It looks like a CRIU child that has joined the container's mount namespace.
And it's reproducible on the rhel7 kernel.
I'll continue investigating.
> -- Pavel
>
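
For anyone who wants to repeat the mountinfo check described above, here is a minimal sketch in Python. It is not CRIU code. It assumes the usual /proc/<pid>/mountinfo layout documented in proc(5), and the function names (parse_mountinfo, autofs_mounts, has_pipe_ino) are made up for illustration. It lists the autofs mounts visible to a process and reports whether each one carries the pipe_ino option that the autofs.c error complains about.

```python
def parse_mountinfo(text):
    """Parse the text of a /proc/<pid>/mountinfo file into a list of dicts."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The optional fields end at a lone "-" separator (see proc(5)).
        left, _, right = line.partition(" - ")
        head = left.split()
        tail = right.split()
        if len(head) < 6 or len(tail) < 2:
            continue  # malformed line, skip it
        entries.append({
            "mnt_id": int(head[0]),
            "parent_id": int(head[1]),
            "mountpoint": head[4],
            "fstype": tail[0],
            "source": tail[1],
            # Per-superblock options, e.g. fd=..,pgrp=..,pipe_ino=..
            "options": tail[2].split(",") if len(tail) > 2 else [],
        })
    return entries


def autofs_mounts(entries):
    """Return only the entries whose filesystem type is autofs."""
    return [e for e in entries if e["fstype"] == "autofs"]


def has_pipe_ino(entry):
    """True if the mount options include a pipe_ino=<n> item."""
    return any(opt.startswith("pipe_ino=") for opt in entry["options"])


def report(pid="self"):
    """Print one line per autofs mount seen by the given pid."""
    with open(f"/proc/{pid}/mountinfo") as f:
        for e in autofs_mounts(parse_mountinfo(f.read())):
            status = "ok" if has_pipe_ino(e) else "no pipe_ino"
            print(e["mnt_id"], e["mountpoint"], status)
```

Calling report() for a container process and then for the short-lived process that CRIU inspects during dump would show whether the autofs entry is present in only one of them.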