[CRIU] [PATCH] mnt: add support for external shared mounts

Mon Apr 6 08:40:43 PDT 2015

On Mon, Apr 06, 2015 at 06:19:48PM +0300, Pavel Emelyanov wrote:
> On 04/06/2015 06:06 PM, Tycho Andersen wrote:
> > Hi Pavel,
> > 
> > On Mon, Apr 06, 2015 at 05:56:41PM +0300, Pavel Emelyanov wrote:
> >> On 04/06/2015 05:48 PM, Tycho Andersen wrote:
> >>> On Mon, Apr 06, 2015 at 11:26:41AM +0300, Pavel Emelyanov wrote:
> >>>> On 04/03/2015 07:20 PM, Tycho Andersen wrote:
> >>>>> Hi Andrey,
> >>>>>
> >>>>> On Fri, Apr 03, 2015 at 07:00:43PM +0300, Andrew Vagin wrote:
> >>>>>> On Thu, Apr 02, 2015 at 08:22:33PM +0000, Tycho Andersen wrote:
> >>>>>>> Add support for external shared mounts via the --enable-external-masters flag.
> >>>>>>> This flag assumes that the mount is visible to criu via /proc/self/mountinfo
> >>>>>>> and that it will be present at the same location on restore. No attempt to dump
> >>>>>>> or restore the content is made; the external master is bind mounted into the
> >>>>>>> mount ns and whatever is there is what the container sees.
> >>>>>>
> >>>>>> I need time to think about the idea. At the first glance it should work.
> >>>>>>
> >>>>>> But we (I and Pavel) are not sure that we understand what you do:).
> >>>>>>
> >>>>>> Could you show /proc/PID/mountinfo for a container? Why can you not use
> >>>>>> external mounts?
> >>>>>
> >>>>> Because criu dies with "couldn't find sharing" before the external
> >>>>> mount map is consulted. I could use ext-mount-map, but we'd still need
> >>>>> some patch to skip this check, and since we can resolve these
> >>>>> automatically anyway it seems like we should just do that instead.
> >>>>
> >>>> Ouch, so the slavery is set up for external mount points? I've missed that, sorry :(
> >>>> But taking into account my new insight :) doesn't the patch falls logically into
> >>>> two pieces -- the first is auto-assignments for ext-mount-map-s and the second --
> >>>> enabling for slave external bind mounts?
> >>>
> >>> I guess it depends on how you want the implementation to work.
> >>
> >> :) I want it to suit all the parties' requirements, though not duplicate
> >> the functionality and API.
> >>
> >>> Currently it doesn't use the ext-mount-map piece, although I think it
> >>> could.
> >>
> >> I was under impression that the whole effort is to support internal
> >> mountpoints that happened to be slave to something from the outside.
> >> Now it seems that the idea is to support external slave mounts only.
> > 
> > Sorry, I don't understand the distinction you're making here.
> 
> In the validate_mounts() we have call for try_resolve_ext_mount() (it's
> the wrong place to call, but nonetheless). Any mountpoint whose root
> cannot be found withing the container's root gets there and we try to
> treat this mountpoint as "external".
> 
> AFAIU you try to fix only mountpoints that are such.

Ah, I understand. You want to handle all external mounts (so nobody
would have to pass anything besides --ext-mount-map auto) here? How do
we detect where the bind mount came from without the sharing numbers?

> >> Am I right with this? If yes, then we should definitely re-use the
> >> external mounts code we have.
> >>
> >>> I think we'd still need an extra flag in the protobufs to
> >>> indicate that the mount was an external sharing and would not be
> >>> provided by the user from the CLI on restore, so I'm not sure using
> >>> ext-mount-map really buys us much, but I can make those changes if you
> >>> want.
> >>
> >> We already have the ext_mount bit in the mnt_entry. If I get it right,
> >> you will need a) sign that the mount point in question is slave and
> >> b) the path to mountpoint on dump (to use it on restore).
> >>
> >> While "a" seem to be auto-assigned on restore, the "b" can be achieved
> >> by introducing the "--ext-mount-map auto" mode. In this mode all the
> >> mountpoints that are thought to be "external" would be treated as if
> >> the called has specifies --ext-mount-map $mountpoint:$root. I.e. -- the
> >> root path will be written in the image just as you suggested in the
> >> your patch.
> >>
> >> Does the above work to you?
> > 
> > It works, but how do I tell the difference between something where the
> > ext-mount-map was provided by the user and where it was autogenerated?
> 
> You mean on restore? All "external" mount entries are marked. And, for
> such entries, the --ext-mount-map's VALUE is written into the image too.
> 
> How typically --ext-mount-map is used is like this.
> 
>    On dump:     --ext-mount-map mount-point:some-value
> 
>    On restore:  --ext-mount-map some-value:path-to-root-on-host
> 
> If instead of some-value we use some path-to-root-on-host from the
> very beginning, and this path is calculated the way you propose:
> 
>              snprintf(p, size+1, "%s%s", pm->mountpoint + 1, m->root);
> 
> then we get the "auto" mode. As you told, for standard C/R this would
> work, so we save many --ext-mount-map options and just use one -- the
> "--ext-mount-map auto".
> 
> > That's the part I think we need the extra bit for. We could do
> > something like "if the mount name isn't found in the ext-mount-map on
> > restore and the user provided auto as one of the options, then assume
> > the mount had external sharing", but it seems much cleaner to be to be
> > explicit.
> 
> Sure, but my point is that we have the external-mount bit in the image
> already. So we effectively do "if the mountpoint is external and the
> --ext-mount-map auto is specified (and we haven't found the explicit
> --ext-mount-map KEY:VAL mapping), then use the source path from image".

I see, ok. Take a look at the patches I just sent and see if you have
any more comments, and I will re-work them again to have the flags
look like this. I'm not sure how to do the bind mount detection
without sharing as above, though.

> -- Pavel
>