[CRIU] race dumping fds in wily LXC containers

Tycho Andersen tycho.andersen at canonical.com
Wed Jul 15 13:39:29 PDT 2015


On Wed, Jul 15, 2015 at 01:01:15PM +0300, Pavel Emelyanov wrote:
>
> > Right, this fstat() fails because the previous readlink() returned a
> > "(deleted)", i.e. rpath in the fstatat() call in check_path_remap()
> > has this "(deleted)".
> 
> No, we don't check for "(deleted)" to take any decisions. Only stat/fstat
> results comparisons. The only thing we do with it is strip one from the
> file path if it's there :)

Not to make any decisions, but the problem here is that the call in
check_path_remap() looks like:

fstatat(mntns_root, "/sys/fs/cgroup/systemd/lxc/w1-4 (deleted)", &pst, 0);

which doesn't work.
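
For illustration, here is a minimal C sketch of why that lookup fails and of the suffix stripping Pavel mentions. strip_deleted() and stat_at() are hypothetical names made up for this example, not CRIU functions, and the real CRIU logic may differ.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper (not CRIU code): drop a trailing " (deleted)"
 * from path, in place. Returns 1 if the suffix was stripped, else 0. */
int strip_deleted(char *path)
{
	static const char suffix[] = " (deleted)";
	size_t len, slen = sizeof(suffix) - 1;

	assert(path != NULL);
	len = strlen(path);
	if (len < slen || strcmp(path + len - slen, suffix) != 0)
		return 0;
	path[len - slen] = '\0';
	return 1;
}

/* Hypothetical helper: stat path relative to dirfd.
 * Returns 0 on success or a negative errno value. */
int stat_at(int dirfd, const char *path, struct stat *st)
{
	return fstatat(dirfd, path, st, 0) == 0 ? 0 : -errno;
}
```

With the suffix left in place, the lookup is for a file literally named "w1-4 (deleted)", so it would be expected to return -ENOENT unless such a file happens to exist. After strip_deleted() the same call targets the real path.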

> >> After this the link remap will be called.
> >>
> >> In your case it seems to be the kernel spoofing the /sys/ file names
> >> somehow so that criu is not able to stat() the name in the first place.
> > 
> > I think the initial stat succeeds somehow (since we don't get an error
> > there and it continues on), but the subsequent readlink tacks on
> > "(deleted)" and thus the fstat of that file fails, which doesn't make
> > much sense to me. The file definitely exists, it's like there is some
> > problem readlink()ing it (perhaps because it is sent over a unix
> > socket or something? not sure).
> 
> The reason for going to link remap is that stat (on a file descriptor) succeeded
> and reported a nonzero link count AND the subsequent fstat() on the file path
> reported ENOENT. (There is also an NFS special case, but I don't think it applies here.)

Right, but the problem is that we're stat()ing the wrong file, as above.
What I'm not sure about is why we're getting the wrong thing from
readlink, since the file exists.
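
The condition Pavel describes can be sketched roughly as follows. This is an illustration under assumptions, not CRIU's check_path_remap(): fd_path() and looks_like_link_remap() are hypothetical names, the path is resolved with plain stat() rather than relative to mntns_root, the NFS special case is left out, and /proc is assumed to be mounted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read the path the kernel reports for an open fd
 * via /proc/self/fd. Returns the path length or a negative errno value. */
int fd_path(int fd, char *buf, size_t size)
{
	char link[64];
	ssize_t n;

	assert(size > 0);
	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, buf, size - 1);
	if (n < 0)
		return -errno;
	buf[n] = '\0'; /* readlink() does not NUL-terminate */
	return (int)n;
}

/* Sketch of the condition: the fd still has a nonzero link count, but the
 * path reported for it does not resolve (ENOENT).
 * Returns 1 if so, 0 if not, or a negative errno value on other errors. */
int looks_like_link_remap(int fd)
{
	char path[PATH_MAX];
	struct stat fst, pst;
	int ret;

	if (fstat(fd, &fst) < 0)
		return -errno;
	ret = fd_path(fd, path, sizeof(path));
	if (ret < 0)
		return ret;
	if (stat(path, &pst) == 0)
		return 0; /* the path resolves, nothing to remap */
	if (errno != ENOENT)
		return -errno;
	return fst.st_nlink > 0;
}
```

An fd whose original name was unlinked while another hard link remains makes this return 1. The puzzle in this thread is that the cgroup file was never removed, yet readlink() still reports " (deleted)", so the path lookup fails and the check goes the same way.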

I suspect it has something to do with fuse + sending a fd over unix
socket, but that's a hunch more than anything. I was hoping you might
know where to look :)

Tycho

