[CRIU] AUFS Support in CRIU

Saied Kazemi saied at google.com
Tue Aug 19 23:21:54 PDT 2014


On Tue, Aug 19, 2014 at 3:03 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 08/18/2014 08:43 PM, Saied Kazemi wrote:
>
> >     > @@ -197,6 +199,20 @@ int fill_fdlink(int lfd, const struct
> fd_parms *p, struct fd_link *link)
> >     >               return -1;
> >     >       }
> >     >
> >     > +     /*
> >     > +      * If the link file descriptor (lfd) points to a
> >     > +      * file in an AUFS branch, the pathname should be
> >     > +      * replaced with a pathname from root.
> >     > +      */
> >     > +     if (opts.aufs) {
> >     > +             int n = sizeof link->name - 1;
> >     > +             n = fixup_aufs_path(&link->name[1], n, true);
> >
> >     Since you say, that we only see spoiled paths in map_files, do we
> need
> >     the path fixup here? Isn't the one performed in parsing the mappings
> >     not enough?
> >
> >
> > We need this logic here because the link->name is in an AUFS branch.
> For example, these are the paths that are passed to fill_fdlink() during my
> test:
> >
> > /dev/null
> > /var/lib/docker/aufs/diff/<ID>/bin/busybox
> > /var/lib/docker/aufs/diff/<ID>/lib/libuClibc-0.9.33.2.so <
> http://libuClibc-0.9.33.2.so>
> > /var/lib/docker/aufs/diff/<ID>/lib/ld64-uClibc-0.9.33.2.so <
> http://ld64-uClibc-0.9.33.2.so>
> > /
>
> OK, the only way why spoiled paths leak there is through the
> dump_filemap(),
> and the file we want to dump there was already seen in parse_smaps(). Can
> we
> make path fixup only once? I think we can
>
> * fixup the path in parse_smaps()
> * put one on the vma_area() (can be unione-ed with page_bitmap, it's
> restore field)
> * then in dump_filemap() put one as the link pointer on fdparms
>
>
> after this the if (!p->link) check in dump_one_reg_file() would avoid
> calling
> the fill_fdlink.
>
> What do you think?
>

It's a good idea and I reworked the code accordingly.  Please see the
attached patch to make sure that it's how you meant.



>
> >
> >
> >     > +             if (n < 0)
> >     > +                     return -1;
> >     > +             if (n > 0)
> >     > +                     len = n;
> >     > +     }
> >     > +
> >     >       link->len = len + 1;
> >     >       return 0;
> >     >  }
> >
> >     > +int fixup_aufs_path(char *path, int size, bool chop)
> >     > +{
> >     > +     char rpath[PATH_MAX];
> >     > +     int n;
> >     > +     int blen;
> >     > +
> >     > +     if (aufs_branches == NULL) {
> >     > +             pr_err("No aufs branches to search for %s\n", path);
> >     > +             return -1;
> >
> >     Presumably, if we don't have AUFS we shouldn't fail :)
> >
> >
> > If we don't have AUFS, we should not specify --aufs and, therefore, we
> should not be calling
> > fixup_aufs_path().  So, if we get to fixup_aufs_path() but don't have
> branch information, it's
> > an error.
> >
> > Alternatively, we can decide not to specify --aufs and call AUFS code
> regardless, returning
> > without error if there's no AUFS.  But I added --aufs for performance
> reasons.  It's used in
> > 3 places:
> >
> > 1. cr-dump.c/dump_one_task()
> >    - to collect AUFS branch pathnames and free them when done
> > 2. files.c/fill_fdlink()
> >    - to replace pathnames in branches with pathnames from root
> > 3. proc_parse.c/parse_smaps()
> >    - to get the path for stat() instead of fstat()
> >
>
> Ah, I see. I was thinking that we can make CRIU automatically detect, that
> it
> should do paths fixups, but since we'd have to specify --aufs-root anyway,
> then OK.
>

When I have a bit more time, I will work on this and send a patch to
automatically detect AUFS.



> >
> >     > +     }
> >     > +
> >
> >     > +int parse_mountinfo_aufs_sbinfo(pid_t pid, char *sbinfo, int len)
> >     > +{
> >     > +     char line[PATH_MAX];
> >     > +     char *fstype = NULL;
> >     > +     char *opt = NULL;
> >     > +     char *cp;
> >     > +     int n, ret;
> >     > +
> >     > +     n = get_mntent_by_mountpoint(pid, "/", line, sizeof line);
> >     > +     if (n < 0)
> >     > +             return -1;
> >     > +
> >
> >     Presumably, this would get fixed by rework of mountinfo's parser,
> right?
> >
> >
> > I already reworked the callback in mountinfo's parser which was in the
> previous patch (see
> > proc_parse.c/aufs_parse()).  This is different as it's not changing
> anything in the mountinfo
> > entry.  It's just getting the sbinfo string from mountinfo root entry to
> collect AUFS branch
> > information.
> >
> > dump_one_task()
> >    if (opts.aufs) parse_aufs_branches()
> >       parse_mountinf_aufs_sbinfo()
> >          get_mntent_by_mountpoint()
> >
> > It may be possible to add another call back, .post_parse, and rework
> mountinfo parsing.  But
> > .post_parse will only be used by used by AUFS and this code is temporary
> anyway.  It will go
> > away as soon as AUFS bug is fixed.  So, if you're OK, let's keep it like
> this.  For non-AUFS
> > filesystems, this code isn't even called.
>
> Yes, I agree. What bothers me is that we do mountinfo-s parsing multiple
> times and introduce two independent parser codes for that. I'd prefer if
> we make one-pass scanning.
>

I reworked this part too via the .parse callback so there are no additional
mountinfo parsings :)



>
> >
> >
> >     > +int aufs_parse(struct mount_info *new)
> >     > +{
> >     > +     char *cp;
> >     > +
> >     > +     if (!opts.aufs_root || strcmp(new->mountpoint, "./"))
> >
> >     If we've met AUFS, but no --aufs-root specified, shouldn't we fail
> the dump?
> >
> >
> > The reason it doesn't return an error is that as soon as the mountinfo
> entry for root is
> > fixed (i.e., shows the root pathname instead of just /), we won't have
> to specify --aufs-root
> > anymore and shouldn't fail in aufs_parse().
> >
> >
> >
> >
> >     > +             return 0;
> >     > +
> >     > +     cp = malloc(strlen(opts.aufs_root) + 1);
> >     > +     if (!cp) {
> >     > +             pr_err("Cannot allocate memory for %s\n",
> opts.aufs_root);
> >     > +             return -1;
> >     > +     }
> >     > +     strcpy(cp, opts.aufs_root);
> >     > +
> >     > +     pr_debug("Replacing %s with %s\n", new->root,
> opts.aufs_root);
> >     > +     free(new->root);
> >     > +     new->root = cp;
> >
> >     What if we have more than one AUFS mountpoint with different roots?
> I'm
> >     OK if we only support one, but then we should fail the 2nd
> mountpoint met.
> >
> >
> > Here, we're replacing the root of the container which by definition is
> only 1.  But if different
> > processes in the process tree have different AUFS mounts, their
> respective branch pathnames will
> > be handed one by one in parse_aufs_branches() when called from
> dump_one_task().
>
> I meant -- what if one process sees two different AUFS mounts.
>

I think it should work because the branches are set for each nsid.  This is
another thing that we can address as time allows going forward.



>
> >     > +
> >     > +     return 0;
> >     > +}
> >
> >
> >     Thanks,
> >     Pavel
> >
> >
> > Thanks again for your feedback.  I hope the answers are clear.  If it's
> easier or more productive,
> > I am available on video chat to discuss in greater detail.  My schedule
> is flexible, so time
> > difference shouldn't be a problem.
>
> OK, I'll keep that in mind :) Your timezone is US Pacific one, right?
>

Yes, I am in Mountain View.

Looking forward to your comments on this patch which I think has addressed
everything above.

Cheers!

--Saied
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20140819/b487d033/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Added-AUFS-support.patch
Type: application/octet-stream
Size: 16573 bytes
Desc: not available
URL: <http://lists.openvz.org/pipermail/criu/attachments/20140819/b487d033/attachment-0001.obj>


More information about the CRIU mailing list