[CRIU] AUFS Support in CRIU
Pavel Emelyanov
xemul at parallels.com
Thu Aug 21 04:54:16 PDT 2014
On 08/21/2014 02:32 AM, Saied Kazemi wrote:
> OK, I used ppage_bitmap as aufs_fpath during dump to get rid of fixup_aufs_fd_path(). So, we scan the branches only once.
>
> I also got rid of --aufs command line option but kept opts.aufs which is automatically set in the .parse callback.
> Since opts.aufs is now used in only one place, it'll be easier to get rid of it in a future patch.
>
> I rebased my code to the head and made sure all comments correctly reflect the current state. Please review the attached
> patch file and let me know what you think.
I have no more concerns, thanks :) I plan to make 1.3 release at the end of the
next week. If this is the final version of the patch, I can merge it in 1.3
Thanks,
Pavel
> Cheers!
>
> --Saied
>
>
>
>
> On Wed, Aug 20, 2014 at 11:00 AM, Pavel Emelyanov <xemul at parallels.com <mailto:xemul at parallels.com>> wrote:
>
> On 08/20/2014 08:54 PM, Saied Kazemi wrote:
>
> >
> > > @@ -240,6 +244,31 @@ static int vma_get_mapfile(struct vma_area *vma, DIR *mfd,
> > > vma->vm_socket_id = buf.st_ino;
> > > } else if (errno != ENOENT)
> > > return -1;
> > > + } else if (opts.aufs) {
> > > + /*
> > > + * AUFS support to compensate for the kernel bug
> > > + * exposing branch pathnames in map_files.
> > > + *
> > > + * If the link points inside a branch, replace it
> > > + * with a pathname from the root for later use in
> > > + * dump_filemap().
> > > + */
> > > + char p[PATH_MAX];
> > > + int n;
> > > +
> > > + p[0] = '.';
> > > + n = read_fd_link(vma->vm_file_fd, &p[1], sizeof p - 1);
> > > + if (n < 0)
> > > + return -1;
> > > + n = fixup_aufs_path(&p[1], sizeof p - 1, true);
> > > + if (n < 0)
> > > + return -1;
> > > + if (n > 0) {
> > > + vma->aufs_rpath = xmalloc(n + 2);
> > > + if (!vma->aufs_rpath)
> > > + return -1;
> > > + strcpy(vma->aufs_rpath, p);
> > > + }
> > > }
> > >
> > > return 0;
> > > @@ -450,12 +479,24 @@ int parse_smaps(pid_t pid, struct vm_area_list *vma_area_list, bool use_map_file
> > > vma_area->st = prev->st;
> > > } else if (vma_area->vm_file_fd >= 0) {
> > > struct stat *st_buf;
> > > + char *f;
> > >
> > > st_buf = vma_area->st = xmalloc(sizeof(*st_buf));
> > > if (!st_buf)
> > > goto err;
> > >
> > > - if (fstat(vma_area->vm_file_fd, st_buf) < 0) {
> > > + /*
> > > + * For AUFS support, we cannot fstat() a file descriptor that
> > > + * is a symbolic link to a branch. Instead, we obtain the
> > > + * pathname of the file from the root and use stat().
> > > + */
> > > + if (opts.aufs && (f = fixup_aufs_fd_path(vma_area->vm_file_fd))) {
> > > + if (stat(f, st_buf) < 0) {
> > > + pr_perror("Failed stat on %d's map %lu (%s)",
> > > + pid, start, f);
> > > + goto err;
> > > + }
> > > + } else if (fstat(vma_area->vm_file_fd, st_buf) < 0) {
> > > pr_perror("Failed fstat on %d's map %lu", pid, start);
> > > goto err;
> > > }
> >
> > Why do we need the 2nd call to fixup branch name? Can't we just stat
> > the vma->aufs_rpath here?
> >
> >
> > That's because vma->aufs_rpath is relative from the root of the mount namespace (e.g., ./bin/busybox) but we are not in that directory (we're in /). To avoid calling fixup we can construct the full path by prepending aufs_root as follows:
> >
> > - if (opts.aufs && (f = fixup_aufs_fd_path(vma_area->vm_file_fd))) {
> > - if (stat(f, st_buf) < 0) {
> > + if (opts.aufs && vma_area->aufs_rpath && opts.aufs_root) {
> > + char path[PATH_MAX];
> > + int n;
> > +
> > + n = snprintf(path, PATH_MAX, "%s/%s", opts.aufs_root, vma_area->aufs_rpath);
> > + if (n >= PATH_MAX) {
> > + pr_err("Path %s/%s too long\n", opts.aufs_root, vma_area->aufs_rpath);
> > + goto err;
> > + }
> > + if (stat(path, st_buf) < 0) {
> >
> >
> > If you prefer this approach, I will send you a new patch with this change and removal of fixup_aufs_fd_path().
>
> Well yes :) I was confused by the 2nd call to fixup_aufs_path from inside
> fixup_aufs_fd_path. It would call read_fd_link and walk the array of branches
> again, which is not necessary, since we did it already in parse_smaps(). So
> it's better, from my perspective, not to do extra work.
>
> Just an optimization suggestion -- since we see the full path in parse_smaps(),
> we can save one on vma->ppage_bitmap (it's also restore-only field) and just
> re-use one here.
>
> Thanks,
> Pavel
>
>
More information about the CRIU
mailing list