<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Aug 19, 2014 at 3:03 AM, Pavel Emelyanov <span dir="ltr"><<a href="mailto:xemul@parallels.com" target="_blank">xemul@parallels.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On 08/18/2014 08:43 PM, Saied Kazemi wrote:<br>
<br>
> > @@ -197,6 +199,20 @@ int fill_fdlink(int lfd, const struct fd_parms *p, struct fd_link *link)<br>
> > return -1;<br>
> > }<br>
> ><br>
> > + /*<br>
> > + * If the link file descriptor (lfd) points to a<br>
> > + * file in an AUFS branch, the pathname should be<br>
> > + * replaced with a pathname from root.<br>
> > + */<br>
> > + if (opts.aufs) {<br>
> > + int n = sizeof link->name - 1;<br>
> > + n = fixup_aufs_path(&link->name[1], n, true);<br>
><br>
> Since you say, that we only see spoiled paths in map_files, do we need<br>
> the path fixup here? Isn't the one performed in parsing the mappings<br>
> not enough?<br>
><br>
><br>
> We need this logic here because the link->name is in an AUFS branch. For example, these are the paths that are passed to fill_fdlink() during my test:<br>
><br>
> /dev/null<br>
> /var/lib/docker/aufs/diff/<ID>/bin/busybox<br>
</div>> /var/lib/docker/aufs/diff/<ID>/lib/<a href="http://libuClibc-0.9.33.2.so" target="_blank">libuClibc-0.9.33.2.so</a> <<a href="http://libuClibc-0.9.33.2.so" target="_blank">http://libuClibc-0.9.33.2.so</a>><br>
> /var/lib/docker/aufs/diff/<ID>/lib/<a href="http://ld64-uClibc-0.9.33.2.so" target="_blank">ld64-uClibc-0.9.33.2.so</a> <<a href="http://ld64-uClibc-0.9.33.2.so" target="_blank">http://ld64-uClibc-0.9.33.2.so</a>><br>
> /<br>
<br>
OK, the only way why spoiled paths leak there is through the dump_filemap(),<br>
and the file we want to dump there was already seen in parse_smaps(). Can we<br>
make path fixup only once? I think we can<br>
<br>
* fixup the path in parse_smaps()<br>
* put one on the vma_area() (can be unione-ed with page_bitmap, it's restore field)<br>
* then in dump_filemap() put one as the link pointer on fdparms<br>
<br>
<br>
after this the if (!p->link) check in dump_one_reg_file() would avoid calling<br>
the fill_fdlink.<br>
<br>
What do you think?<br></blockquote><div><br></div><div>It's a good idea and I reworked the code accordingly. Please see the attached patch to make sure that it's how you meant.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="h5"><br>
><br>
><br>
> > + if (n < 0)<br>
> > + return -1;<br>
> > + if (n > 0)<br>
> > + len = n;<br>
> > + }<br>
> > +<br>
> > link->len = len + 1;<br>
> > return 0;<br>
> > }<br>
><br>
> > +int fixup_aufs_path(char *path, int size, bool chop)<br>
> > +{<br>
> > + char rpath[PATH_MAX];<br>
> > + int n;<br>
> > + int blen;<br>
> > +<br>
> > + if (aufs_branches == NULL) {<br>
> > + pr_err("No aufs branches to search for %s\n", path);<br>
> > + return -1;<br>
><br>
> Presumably, if we don't have AUFS we shouldn't fail :)<br>
><br>
><br>
> If we don't have AUFS, we should not specify --aufs and, therefore, we should not be calling<br>
> fixup_aufs_path(). So, if we get to fixup_aufs_path() but don't have branch information, it's<br>
> an error.<br>
><br>
> Alternatively, we can decide not to specify --aufs and call AUFS code regardless, returning<br>
> without error if there's no AUFS. But I added --aufs for performance reasons. It's used in<br>
> 3 places:<br>
><br>
> 1. cr-dump.c/dump_one_task()<br>
> - to collect AUFS branch pathnames and free them when done<br>
> 2. files.c/fill_fdlink()<br>
> - to replace pathnames in branches with pathnames from root<br>
> 3. proc_parse.c/parse_smaps()<br>
> - to get the path for stat() instead of fstat()<br>
><br>
<br>
</div></div>Ah, I see. I was thinking that we can make CRIU automatically detect, that it<br>
should do paths fixups, but since we'd have to specify --aufs-root anyway, then OK.<br></blockquote><div><br></div><div>When I have a bit more time, I will work on this and send a patch to automatically detect AUFS.</div>
<div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="h5"><br>
><br>
> > + }<br>
> > +<br>
><br>
> > +int parse_mountinfo_aufs_sbinfo(pid_t pid, char *sbinfo, int len)<br>
> > +{<br>
> > + char line[PATH_MAX];<br>
> > + char *fstype = NULL;<br>
> > + char *opt = NULL;<br>
> > + char *cp;<br>
> > + int n, ret;<br>
> > +<br>
> > + n = get_mntent_by_mountpoint(pid, "/", line, sizeof line);<br>
> > + if (n < 0)<br>
> > + return -1;<br>
> > +<br>
><br>
> Presumably, this would get fixed by rework of mountinfo's parser, right?<br>
><br>
><br>
> I already reworked the callback in mountinfo's parser which was in the previous patch (see<br>
> proc_parse.c/aufs_parse()). This is different as it's not changing anything in the mountinfo<br>
> entry. It's just getting the sbinfo string from mountinfo root entry to collect AUFS branch<br>
> information.<br>
><br>
> dump_one_task()<br>
> if (opts.aufs) parse_aufs_branches()<br>
> parse_mountinf_aufs_sbinfo()<br>
> get_mntent_by_mountpoint()<br>
><br>
> It may be possible to add another call back, .post_parse, and rework mountinfo parsing. But<br>
> .post_parse will only be used by used by AUFS and this code is temporary anyway. It will go<br>
> away as soon as AUFS bug is fixed. So, if you're OK, let's keep it like this. For non-AUFS<br>
> filesystems, this code isn't even called.<br>
<br>
</div></div>Yes, I agree. What bothers me is that we do mountinfo-s parsing multiple<br>
times and introduce two independent parser codes for that. I'd prefer if<br>
we make one-pass scanning.<br></blockquote><div><br></div><div>I reworked this part too via the .parse callback so there are no additional mountinfo parsings :)</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="h5"><br>
><br>
><br>
> > +int aufs_parse(struct mount_info *new)<br>
> > +{<br>
> > + char *cp;<br>
> > +<br>
> > + if (!opts.aufs_root || strcmp(new->mountpoint, "./"))<br>
><br>
> If we've met AUFS, but no --aufs-root specified, shouldn't we fail the dump?<br>
><br>
><br>
> The reason it doesn't return an error is that as soon as the mountinfo entry for root is<br>
> fixed (i.e., shows the root pathname instead of just /), we won't have to specify --aufs-root<br>
> anymore and shouldn't fail in aufs_parse().<br>
><br>
><br>
><br>
><br>
> > + return 0;<br>
> > +<br>
> > + cp = malloc(strlen(opts.aufs_root) + 1);<br>
> > + if (!cp) {<br>
> > + pr_err("Cannot allocate memory for %s\n", opts.aufs_root);<br>
> > + return -1;<br>
> > + }<br>
> > + strcpy(cp, opts.aufs_root);<br>
> > +<br>
> > + pr_debug("Replacing %s with %s\n", new->root, opts.aufs_root);<br>
> > + free(new->root);<br>
> > + new->root = cp;<br>
><br>
> What if we have more than one AUFS mountpoint with different roots? I'm<br>
> OK if we only support one, but then we should fail the 2nd mountpoint met.<br>
><br>
><br>
> Here, we're replacing the root of the container which by definition is only 1. But if different<br>
> processes in the process tree have different AUFS mounts, their respective branch pathnames will<br>
> be handed one by one in parse_aufs_branches() when called from dump_one_task().<br>
<br>
</div></div>I meant -- what if one process sees two different AUFS mounts.<br></blockquote><div><br></div><div>I think it should work because the branches are set for each nsid. This is another thing that we can address as time allows going forward.</div>
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class=""><br>
> > +<br>
> > + return 0;<br>
> > +}<br>
><br>
><br>
> Thanks,<br>
> Pavel<br>
><br>
><br>
> Thanks again for your feedback. I hope the answers are clear. If it's easier or more productive,<br>
> I am available on video chat to discuss in greater detail. My schedule is flexible, so time<br>
> difference shouldn't be a problem.<br>
<br>
</div>OK, I'll keep that in mind :) Your timezone is US Pacific one, right?<br></blockquote><div><br></div><div>Yes, I am in Mountain View.</div><div><br></div><div>Looking forward to your comments on this patch which I think has addressed everything above.</div>
<div><br></div><div>Cheers!</div><div><br></div><div>--Saied</div><div><br></div></div></div></div>