[CRIU] AUFS Support in CRIU

Saied Kazemi saied at google.com
Thu Aug 21 07:16:49 PDT 2014


On Thu, Aug 21, 2014 at 4:54 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 08/21/2014 02:32 AM, Saied Kazemi wrote:
> > OK, I used ppage_bitmap as aufs_fpath during dump to get rid of
> fixup_aufs_fd_path().  So, we scan the branches only once.
> >
> > I also got rid of --aufs command line option but kept opts.aufs which is
> automatically set in the .parse callback.
> > Since opts.aufs is now used in only one place, it'll be easier to get
> rid of it in a future patch.
> >
> > I rebased my code to the head and made sure all comments correctly
> reflect the current state.  Please review the attached
> > patch file and let me know what you think.
>
> I have no more concerns, thanks :) I plan to make 1.3 release at the end
> of the
> next week. If this is the final version of the patch, I can merge it in 1.3
>
> Thanks,
> Pavel
>

Great!  1.3 will be a feature-rich release.

I will let you know by the end of this week if there are any changes as I
plan to test UnionFS which is similar to AUFS but not identical.

Also, I hope to find time before the end of next week to write an
instruction page for criu.org on how to dump and restore Docker containers.

Cheers!

--Saied



> On Wed, Aug 20, 2014 at 11:00 AM, Pavel Emelyanov <xemul at parallels.com
> <mailto:xemul at parallels.com>> wrote:
> >
> >     On 08/20/2014 08:54 PM, Saied Kazemi wrote:
> >
> >     >
> >     >     > @@ -240,6 +244,31 @@ static int vma_get_mapfile(struct
> vma_area *vma, DIR *mfd,
> >     >     >                       vma->vm_socket_id = buf.st_ino;
> >     >     >               } else if (errno != ENOENT)
> >     >     >                       return -1;
> >     >     > +     } else if (opts.aufs) {
> >     >     > +             /*
> >     >     > +              * AUFS support to compensate for the kernel
> bug
> >     >     > +              * exposing branch pathnames in map_files.
> >     >     > +              *
> >     >     > +              * If the link points inside a branch, replace
> it
> >     >     > +              * with a pathname from the root for later use
> in
> >     >     > +              * dump_filemap().
> >     >     > +              */
> >     >     > +             char p[PATH_MAX];
> >     >     > +             int n;
> >     >     > +
> >     >     > +             p[0] = '.';
> >     >     > +             n = read_fd_link(vma->vm_file_fd, &p[1],
> sizeof p - 1);
> >     >     > +             if (n < 0)
> >     >     > +                     return -1;
> >     >     > +             n = fixup_aufs_path(&p[1], sizeof p - 1, true);
> >     >     > +             if (n < 0)
> >     >     > +                     return -1;
> >     >     > +             if (n > 0) {
> >     >     > +                     vma->aufs_rpath = xmalloc(n + 2);
> >     >     > +                     if (!vma->aufs_rpath)
> >     >     > +                             return -1;
> >     >     > +                     strcpy(vma->aufs_rpath, p);
> >     >     > +             }
> >     >     >       }
> >     >     >
> >     >     >       return 0;
> >     >     > @@ -450,12 +479,24 @@ int parse_smaps(pid_t pid, struct
> vm_area_list *vma_area_list, bool use_map_file
> >     >     >                       vma_area->st = prev->st;
> >     >     >               } else if (vma_area->vm_file_fd >= 0) {
> >     >     >                       struct stat *st_buf;
> >     >     > +                     char *f;
> >     >     >
> >     >     >                       st_buf = vma_area->st =
> xmalloc(sizeof(*st_buf));
> >     >     >                       if (!st_buf)
> >     >     >                               goto err;
> >     >     >
> >     >     > -                     if (fstat(vma_area->vm_file_fd,
> st_buf) < 0) {
> >     >     > +                     /*
> >     >     > +                      * For AUFS support, we cannot fstat()
> a file descriptor that
> >     >     > +                      * is a symbolic link to a branch.
> Instead, we obtain the
> >     >     > +                      * pathname of the file from the root
> and use stat().
> >     >     > +                      */
> >     >     > +                     if (opts.aufs && (f =
> fixup_aufs_fd_path(vma_area->vm_file_fd))) {
> >     >     > +                             if (stat(f, st_buf) < 0) {
> >     >     > +                                     pr_perror("Failed stat
> on %d's map %lu (%s)",
> >     >     > +                                             pid, start, f);
> >     >     > +                                     goto err;
> >     >     > +                             }
> >     >     > +                     } else if (fstat(vma_area->vm_file_fd,
> st_buf) < 0) {
> >     >     >                               pr_perror("Failed fstat on
> %d's map %lu", pid, start);
> >     >     >                               goto err;
> >     >     >                       }
> >     >
> >     >     Why do we need the 2nd call to fixup branch name? Can't we
> just stat
> >     >     the vma->aufs_rpath here?
> >     >
> >     >
> >     > That's because vma->aufs_rpath is relative from the root of the
> mount namespace (e.g., ./bin/busybox) but we are not in that directory
> (we're in /).  To avoid calling fixup we can construct the full path by
> prepending aufs_root as follows:
> >     >
> >     >     -                       if (opts.aufs && (f =
> fixup_aufs_fd_path(vma_area->vm_file_fd))) {
> >     >     -                               if (stat(f, st_buf) < 0) {
> >     >     +                       if (opts.aufs && vma_area->aufs_rpath
> && opts.aufs_root) {
> >     >     +                               char path[PATH_MAX];
> >     >     +                               int n;
> >     >     +
> >     >     +                               n = snprintf(path, PATH_MAX,
> "%s/%s", opts.aufs_root, vma_area->aufs_rpath);
> >     >     +                               if (n >= PATH_MAX) {
> >     >     +                                       pr_err("Path %s/%s too
> long\n", opts.aufs_root, vma_area->aufs_rpath);
> >     >     +                                       goto err;
> >     >     +                               }
> >     >     +                               if (stat(path, st_buf) < 0) {
> >     >
> >     >
> >     > If you prefer this approach, I will send you a new patch with this
> change and removal of fixup_aufs_fd_path().
> >
> >     Well yes :) I was confused by the 2nd call to fixup_aufs_path from
> inside
> >     fixup_aufs_fd_path. It would call read_fd_link and walk the array of
> branches
> >     again, which is not necessary, since we did it already in
> parse_smaps(). So
> >     it's better, from my perspective, not to do extra work.
> >
> >     Just an optimization suggestion -- since we see the full path in
> parse_smaps(),
> >     we can save one on vma->ppage_bitmap (it's also restore-only field)
> and just
> >     re-use one here.
> >
> >     Thanks,
> >     Pavel
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20140821/19642d15/attachment-0001.html>


More information about the CRIU mailing list