[CRIU] AUFS Support in CRIU

Saied Kazemi saied at google.com
Wed Aug 20 15:32:34 PDT 2014


OK, I used ppage_bitmap as aufs_fpath during dump to get rid of
fixup_aufs_fd_path().  So, we scan the branches only once.

I also got rid of --aufs command line option but kept opts.aufs which is
automatically set in the .parse callback.  Since opts.aufs is now used in
only one place, it'll be easier to get rid of it in a future patch.

I rebased my code to the head and made sure all comments correctly reflect
the current state.  Please review the attached patch file and let me know
what you think.

Cheers!

--Saied




On Wed, Aug 20, 2014 at 11:00 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 08/20/2014 08:54 PM, Saied Kazemi wrote:
>
> >
> >     > @@ -240,6 +244,31 @@ static int vma_get_mapfile(struct vma_area
> *vma, DIR *mfd,
> >     >                       vma->vm_socket_id = buf.st_ino;
> >     >               } else if (errno != ENOENT)
> >     >                       return -1;
> >     > +     } else if (opts.aufs) {
> >     > +             /*
> >     > +              * AUFS support to compensate for the kernel bug
> >     > +              * exposing branch pathnames in map_files.
> >     > +              *
> >     > +              * If the link points inside a branch, replace it
> >     > +              * with a pathname from the root for later use in
> >     > +              * dump_filemap().
> >     > +              */
> >     > +             char p[PATH_MAX];
> >     > +             int n;
> >     > +
> >     > +             p[0] = '.';
> >     > +             n = read_fd_link(vma->vm_file_fd, &p[1], sizeof p -
> 1);
> >     > +             if (n < 0)
> >     > +                     return -1;
> >     > +             n = fixup_aufs_path(&p[1], sizeof p - 1, true);
> >     > +             if (n < 0)
> >     > +                     return -1;
> >     > +             if (n > 0) {
> >     > +                     vma->aufs_rpath = xmalloc(n + 2);
> >     > +                     if (!vma->aufs_rpath)
> >     > +                             return -1;
> >     > +                     strcpy(vma->aufs_rpath, p);
> >     > +             }
> >     >       }
> >     >
> >     >       return 0;
> >     > @@ -450,12 +479,24 @@ int parse_smaps(pid_t pid, struct
> vm_area_list *vma_area_list, bool use_map_file
> >     >                       vma_area->st = prev->st;
> >     >               } else if (vma_area->vm_file_fd >= 0) {
> >     >                       struct stat *st_buf;
> >     > +                     char *f;
> >     >
> >     >                       st_buf = vma_area->st =
> xmalloc(sizeof(*st_buf));
> >     >                       if (!st_buf)
> >     >                               goto err;
> >     >
> >     > -                     if (fstat(vma_area->vm_file_fd, st_buf) < 0)
> {
> >     > +                     /*
> >     > +                      * For AUFS support, we cannot fstat() a
> file descriptor that
> >     > +                      * is a symbolic link to a branch.  Instead,
> we obtain the
> >     > +                      * pathname of the file from the root and
> use stat().
> >     > +                      */
> >     > +                     if (opts.aufs && (f =
> fixup_aufs_fd_path(vma_area->vm_file_fd))) {
> >     > +                             if (stat(f, st_buf) < 0) {
> >     > +                                     pr_perror("Failed stat on
> %d's map %lu (%s)",
> >     > +                                             pid, start, f);
> >     > +                                     goto err;
> >     > +                             }
> >     > +                     } else if (fstat(vma_area->vm_file_fd,
> st_buf) < 0) {
> >     >                               pr_perror("Failed fstat on %d's map
> %lu", pid, start);
> >     >                               goto err;
> >     >                       }
> >
> >     Why do we need the 2nd call to fixup branch name? Can't we just stat
> >     the vma->aufs_rpath here?
> >
> >
> > That's because vma->aufs_rpath is relative from the root of the mount
> namespace (e.g., ./bin/busybox) but we are not in that directory (we're in
> /).  To avoid calling fixup we can construct the full path by prepending
> aufs_root as follows:
> >
> >     -                       if (opts.aufs && (f =
> fixup_aufs_fd_path(vma_area->vm_file_fd))) {
> >     -                               if (stat(f, st_buf) < 0) {
> >     +                       if (opts.aufs && vma_area->aufs_rpath &&
> opts.aufs_root) {
> >     +                               char path[PATH_MAX];
> >     +                               int n;
> >     +
> >     +                               n = snprintf(path, PATH_MAX,
> "%s/%s", opts.aufs_root, vma_area->aufs_rpath);
> >     +                               if (n >= PATH_MAX) {
> >     +                                       pr_err("Path %s/%s too
> long\n", opts.aufs_root, vma_area->aufs_rpath);
> >     +                                       goto err;
> >     +                               }
> >     +                               if (stat(path, st_buf) < 0) {
> >
> >
> > If you prefer this approach, I will send you a new patch with this
> change and removal of fixup_aufs_fd_path().
>
> Well yes :) I was confused by the 2nd call to fixup_aufs_path from inside
> fixup_aufs_fd_path. It would call read_fd_link and walk the array of
> branches
> again, which is not necessary, since we did it already in parse_smaps(). So
> it's better, from my perspective, not to do extra work.
>
> Just an optimization suggestion -- since we see the full path in
> parse_smaps(),
> we can save one on vma->ppage_bitmap (it's also restore-only field) and
> just
> re-use one here.
>
> Thanks,
> Pavel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20140820/c92303ca/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Added-AUFS-support.patch
Type: application/octet-stream
Size: 16991 bytes
Desc: not available
URL: <http://lists.openvz.org/pipermail/criu/attachments/20140820/c92303ca/attachment-0001.obj>


More information about the CRIU mailing list