[CRIU] AUFS Support in CRIU
Pavel Emelyanov
xemul at parallels.com
Tue Aug 19 03:03:38 PDT 2014
On 08/18/2014 08:43 PM, Saied Kazemi wrote:
> > @@ -197,6 +199,20 @@ int fill_fdlink(int lfd, const struct fd_parms *p, struct fd_link *link)
> > return -1;
> > }
> >
> > + /*
> > + * If the link file descriptor (lfd) points to a
> > + * file in an AUFS branch, the pathname should be
> > + * replaced with a pathname from root.
> > + */
> > + if (opts.aufs) {
> > + int n = sizeof link->name - 1;
> > + n = fixup_aufs_path(&link->name[1], n, true);
>
> Since you say, that we only see spoiled paths in map_files, do we need
> the path fixup here? Isn't the one performed in parsing the mappings
> not enough?
>
>
> We need this logic here because the link->name is in an AUFS branch. For example, these are the paths that are passed to fill_fdlink() during my test:
>
> /dev/null
> /var/lib/docker/aufs/diff/<ID>/bin/busybox
> /var/lib/docker/aufs/diff/<ID>/lib/libuClibc-0.9.33.2.so <http://libuClibc-0.9.33.2.so>
> /var/lib/docker/aufs/diff/<ID>/lib/ld64-uClibc-0.9.33.2.so <http://ld64-uClibc-0.9.33.2.so>
> /
OK, the only way why spoiled paths leak there is through the dump_filemap(),
and the file we want to dump there was already seen in parse_smaps(). Can we
make path fixup only once? I think we can
* fixup the path in parse_smaps()
* put one on the vma_area() (can be unione-ed with page_bitmap, it's restore field)
* then in dump_filemap() put one as the link pointer on fdparms
after this the if (!p->link) check in dump_one_reg_file() would avoid calling
the fill_fdlink.
What do you think?
>
>
> > + if (n < 0)
> > + return -1;
> > + if (n > 0)
> > + len = n;
> > + }
> > +
> > link->len = len + 1;
> > return 0;
> > }
>
> > +int fixup_aufs_path(char *path, int size, bool chop)
> > +{
> > + char rpath[PATH_MAX];
> > + int n;
> > + int blen;
> > +
> > + if (aufs_branches == NULL) {
> > + pr_err("No aufs branches to search for %s\n", path);
> > + return -1;
>
> Presumably, if we don't have AUFS we shouldn't fail :)
>
>
> If we don't have AUFS, we should not specify --aufs and, therefore, we should not be calling
> fixup_aufs_path(). So, if we get to fixup_aufs_path() but don't have branch information, it's
> an error.
>
> Alternatively, we can decide not to specify --aufs and call AUFS code regardless, returning
> without error if there's no AUFS. But I added --aufs for performance reasons. It's used in
> 3 places:
>
> 1. cr-dump.c/dump_one_task()
> - to collect AUFS branch pathnames and free them when done
> 2. files.c/fill_fdlink()
> - to replace pathnames in branches with pathnames from root
> 3. proc_parse.c/parse_smaps()
> - to get the path for stat() instead of fstat()
>
Ah, I see. I was thinking that we can make CRIU automatically detect, that it
should do paths fixups, but since we'd have to specify --aufs-root anyway, then OK.
>
> > + }
> > +
>
> > +int parse_mountinfo_aufs_sbinfo(pid_t pid, char *sbinfo, int len)
> > +{
> > + char line[PATH_MAX];
> > + char *fstype = NULL;
> > + char *opt = NULL;
> > + char *cp;
> > + int n, ret;
> > +
> > + n = get_mntent_by_mountpoint(pid, "/", line, sizeof line);
> > + if (n < 0)
> > + return -1;
> > +
>
> Presumably, this would get fixed by rework of mountinfo's parser, right?
>
>
> I already reworked the callback in mountinfo's parser which was in the previous patch (see
> proc_parse.c/aufs_parse()). This is different as it's not changing anything in the mountinfo
> entry. It's just getting the sbinfo string from mountinfo root entry to collect AUFS branch
> information.
>
> dump_one_task()
> if (opts.aufs) parse_aufs_branches()
> parse_mountinf_aufs_sbinfo()
> get_mntent_by_mountpoint()
>
> It may be possible to add another call back, .post_parse, and rework mountinfo parsing. But
> .post_parse will only be used by used by AUFS and this code is temporary anyway. It will go
> away as soon as AUFS bug is fixed. So, if you're OK, let's keep it like this. For non-AUFS
> filesystems, this code isn't even called.
Yes, I agree. What bothers me is that we do mountinfo-s parsing multiple
times and introduce two independent parser codes for that. I'd prefer if
we make one-pass scanning.
>
>
> > +int aufs_parse(struct mount_info *new)
> > +{
> > + char *cp;
> > +
> > + if (!opts.aufs_root || strcmp(new->mountpoint, "./"))
>
> If we've met AUFS, but no --aufs-root specified, shouldn't we fail the dump?
>
>
> The reason it doesn't return an error is that as soon as the mountinfo entry for root is
> fixed (i.e., shows the root pathname instead of just /), we won't have to specify --aufs-root
> anymore and shouldn't fail in aufs_parse().
>
>
>
>
> > + return 0;
> > +
> > + cp = malloc(strlen(opts.aufs_root) + 1);
> > + if (!cp) {
> > + pr_err("Cannot allocate memory for %s\n", opts.aufs_root);
> > + return -1;
> > + }
> > + strcpy(cp, opts.aufs_root);
> > +
> > + pr_debug("Replacing %s with %s\n", new->root, opts.aufs_root);
> > + free(new->root);
> > + new->root = cp;
>
> What if we have more than one AUFS mountpoint with different roots? I'm
> OK if we only support one, but then we should fail the 2nd mountpoint met.
>
>
> Here, we're replacing the root of the container which by definition is only 1. But if different
> processes in the process tree have different AUFS mounts, their respective branch pathnames will
> be handed one by one in parse_aufs_branches() when called from dump_one_task().
I meant -- what if one process sees two different AUFS mounts.
> > +
> > + return 0;
> > +}
>
>
> Thanks,
> Pavel
>
>
> Thanks again for your feedback. I hope the answers are clear. If it's easier or more productive,
> I am available on video chat to discuss in greater detail. My schedule is flexible, so time
> difference shouldn't be a problem.
OK, I'll keep that in mind :) Your timezone is US Pacific one, right?
Thanks,
Pavel
More information about the CRIU
mailing list