[CRIU] overmount confusion
Tycho Andersen
tycho.andersen at canonical.com
Wed Apr 1 13:23:23 PDT 2015
Hi Pavel,
On Wed, Apr 01, 2015 at 10:58:10PM +0300, Pavel Emelyanov wrote:
> On 04/01/2015 08:50 PM, Tycho Andersen wrote:
> > Hi Pavel,
> >
> > On Wed, Apr 01, 2015 at 06:00:33PM +0300, Pavel Emelyanov wrote:
> >>
> >>>>>>> I patched it to allow overmounts (i.e. skip this warning if
> >>>>>>> a flag is passed), but then it fails to open mount 122 with:
> >>>>>>>
> >>>>>>> (00.139107) Error (mount.c:762): The file system 0x29 (0x2a) tmpfs ./sys/fs/cgroup is inaccessible
> >>>>>>>
> >>>>>>> so it seems that the current overmount detection code is not
> >>>>>>> aggressive enough, since it only checks the sibling mounts instead of
> >>>>>>> the whole mount tree.
> >>>>>>
> >>>>>> I think the code is correct. We look for overmounts on m's parent only
> >>>>>> because if m is overmounted by something higher, then m's parent
> >>>>>> will be overmounted too and CRIU will detect this when checking m's
> >>>>>> parent itself.
> >>>>>
> >>>>> But shouldn't it detect the /sys/fs/cgroup case above?
> >>>>
> >>>> Well, I believe it does. You get the "is overmounted" message on unmodified
> >>>> CRIU sources, don't you?
> >>>
> >>> Only for /sys/fs/cgroup/cgmanager (and I set a flag there to avoid
> >>> trying to do any work when dumping it later). For this one it doesn't
> >>> detect that it is overmounted, so that flag isn't set and my code
> >>> doesn't mount it.
> >>
> >> Hm... The 22:/sys/fs/cgroup is not overmounted according to mountinfo.
> >> The directory itself has another mount on top of it (the 128th one), but
> >> that doesn't count as an overmount.
> >
> > Oh, ok. I thought these were the same thing. Doesn't it at least have
> > the same problem (i.e. the underlying mount being inaccessible) and
> > potential set of solutions?
>
> Yes, it has :) If someone opens /sys/fs/cgroup/mem/tasks, then mounts
> tmpfs on top of /sys/fs/cgroup, then the opened file becomes overmounted
> and we have to do the moving/diving tricks we're discussing below.
I see, ok. Thanks for the explanation.
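
To make the distinction above concrete, here is a minimal sketch of a
sibling-only overmount check over mountinfo-style text. It is not CRIU's
mount.c code; the function names and the sample mount IDs and devices are
invented for illustration, and octal escapes in paths (e.g. \040) are not
decoded.

```python
# Hypothetical mountinfo excerpt (IDs and devices are made up):
#   31 was mounted under /sys/fs/cgroup first, then 30 covered it (siblings),
#   32 was later stacked on top of 30 at the same path (child of 30).
SAMPLE_MOUNTINFO = """
20 1 8:1 / / rw - ext4 /dev/sda1 rw
30 20 0:40 / /sys/fs/cgroup rw - tmpfs tmpfs rw
31 20 0:41 / /sys/fs/cgroup/cgmanager rw - tmpfs tmpfs rw
32 30 0:42 / /sys/fs/cgroup rw - tmpfs tmpfs rw
"""


def parse_mountinfo(text):
    """Parse mountinfo lines into dicts with id, parent, mountpoint, fstype."""
    mounts = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Fixed fields come before " - "; fs type and source come after it.
        left, _, right = line.partition(" - ")
        fields = left.split()
        mounts.append({
            "id": int(fields[0]),
            "parent": int(fields[1]),
            "mountpoint": fields[4],
            "fstype": right.split()[0],
        })
    return mounts


def is_subpath(path, prefix):
    """True if path equals prefix or lies below it."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def sibling_overmounted(mounts):
    """Return IDs of mounts whose mountpoint lies under a sibling's mountpoint."""
    hidden = []
    for m in mounts:
        for s in mounts:
            if s is m or s["parent"] != m["parent"]:
                continue
            if is_subpath(m["mountpoint"], s["mountpoint"]):
                hidden.append(m["id"])
                break
    return hidden
```

On the sample, only mount 31 is reported: it sits under its sibling 30. Mount
30 is not reported even though 32 is stacked on the same directory, because 32
is a child of 30 rather than a sibling. Two siblings sharing one mountpoint
would flag each other in this sketch.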
> >> Well, I mean on dump we only need the mounts tree that can be obtained from
> >> /proc/pid/mountinfo. We don't need to mess with the FS-s themselves. Even
> >> if we dump a file that is opened and then overmounted we don't need to
> >> "dive under" the hiding mountpoint or move it aside -- we just check that
> >> the path we think this file has (by readlink-ing the /proc/pid/fd link)
> >> resolves to the wrong one (by stat()-ing this path), then we compare the
> >> mount-id of this file (got from /proc/pid/fdinfo/fd) with the information
> >> of mounts tree we have and see that the file is indeed overmounted. That's
> >> (should be) enough for dump. On restore we'll have to open this file "under"
> >> the overmounting mount and for this we would need to "dive under" or move
> >> mounts.
> >
> > I see (unless the underlying mount is a tmpfs, I guess, then we still
> > have to "dive under" to tar it, right?).
>
> Yes, to tar tmpfs we have to somehow get the whole tree. For non-overmounted
> tmpfs we bind-mount its root to a temporary location and tar it. For overmounted
> tmpfs we can't simply do that.
I understand, thanks.
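
The dump-time check Pavel describes above (readlink the fd link, stat() the
resulting path, read the mount ID from fdinfo) could be sketched roughly as
follows. This is an illustration only, not CRIU code: the function names are
hypothetical, it inspects only the calling process so that paths resolve in
the same mount namespace, and it relies on the mnt_id field that Linux exposes
in /proc/<pid>/fdinfo/<fd>.

```python
import os


def fd_mnt_id(fd):
    """Return the mount ID recorded for fd in /proc/self/fdinfo, or None."""
    with open(f"/proc/self/fdinfo/{fd}") as info:
        for line in info:
            if line.startswith("mnt_id:"):
                return int(line.split()[1])
    return None


def path_resolves_elsewhere(fd):
    """True if the path the fd claims to have no longer leads to the same file.

    That is the first hint of an overmount. An unlinked or renamed file also
    triggers it, so the mount ID still has to be compared against the mount
    tree before concluding anything.
    """
    path = os.readlink(f"/proc/self/fd/{fd}")
    try:
        by_path = os.stat(path)
    except OSError:
        return True
    by_fd = os.fstat(fd)
    return (by_path.st_dev, by_path.st_ino) != (by_fd.st_dev, by_fd.st_ino)
```

A caller would combine the two: when path_resolves_elsewhere() is true, look
up fd_mnt_id() in the parsed mount tree to find which mount the file really
lives on.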
> > It does seem nicer to do
> > *at() everywhere. Do we need a mount/bind-at system call if we mount
> > things in the right order (but do an open(O_DIR) before mounting
> > things on top of it, as you suggested above)?
>
> Almost. If we have / mountpoint and /foo mountpoint and want to do something
> with the /foo/bar file which is not on /foo's fs, but the fs that used to be
> on / before we mounted /foo, then we need fd pointing to /foo _before_ mounting
> on it to openat() on it (and effectively dive under /foo fs). Having an fd on
> /foo's fs root (i.e. -- /foo after mountpoint was created) wouldn't help much :)
Ah, I see. Can we solve this by just keeping a dirfd for each directory
we mount over, opened before the mount, in case we need it later?
Tycho
> -- Pavel
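
The dirfd idea from the end of this message can be illustrated without root
privileges. In the sketch below a rename plus a fresh directory stands in for
mounting something over /foo; that substitution is an assumption made only so
the example runs unprivileged, and the helper names are hypothetical. The
point it shows is the one made above: a directory fd opened beforehand keeps
referring to the original directory, so an openat()-style lookup through it
reaches the file that the path no longer does.

```python
import os
import shutil
import tempfile


def open_under(dirfd, name, flags=os.O_RDONLY):
    """openat()-style open of name relative to a previously saved dirfd."""
    return os.open(name, flags, dir_fd=dirfd)


def demo():
    base = tempfile.mkdtemp()
    try:
        foo = os.path.join(base, "foo")
        os.mkdir(foo)
        with open(os.path.join(foo, "bar"), "w") as f:
            f.write("underlying")

        # Save a dirfd *before* the directory gets covered.
        dirfd = os.open(foo, os.O_RDONLY | os.O_DIRECTORY)

        # Unprivileged stand-in for mounting over foo: move the original
        # directory away and put a different one at the same path.
        os.rename(foo, os.path.join(base, "foo.hidden"))
        os.mkdir(foo)
        with open(os.path.join(foo, "bar"), "w") as f:
            f.write("covering")

        # Lookup by path sees the new content.
        with open(os.path.join(foo, "bar")) as f:
            by_path = f.read()

        # Lookup through the saved dirfd still sees the original file.
        fd = open_under(dirfd, "bar")
        try:
            by_dirfd = os.read(fd, 64).decode()
        finally:
            os.close(fd)
            os.close(dirfd)
        return by_path, by_dirfd
    finally:
        shutil.rmtree(base)
```

With a real mount the same ordering matters: the dirfd has to be opened before
the mount is placed on the directory, otherwise it refers to the root of the
new filesystem instead of the directory underneath.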