[CRIU] race dumping fds in wily LXC containers

Tycho Andersen tycho.andersen at canonical.com
Tue Jun 30 14:45:00 PDT 2015


Hi all,

I'm trying to debug a strange race that sometimes happens when
checkpointing wily Ubuntu LXC containers. The symptom of the race is:

(00.020270) Error (files-reg.c:527): Can't link remap to /sys/fs/cgroup/systemd/lxc/w1 (deleted): Operation not permitted

The problem here seems to be that the readlink on criu's
/proc/self/fd/$the_fd_for_that_file gives a "(deleted)" result, which
subsequently confuses things. (In fact, I'm a little confused about
how dump_linked_remap() works at all, given that just before it is
called the fstatat() fails; but let's ignore that for now.)

The strangest part of all this is that after the dump fails, I can
attach to the container and do a readlink on the /proc/pid/fd/$fd for
the pid in question, and it gives me the right (i.e. non-"(deleted)")
answer.
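
For anyone who wants to repeat that manual check, here is a minimal
sketch in Python. The helper name fd_link_state is made up for this
example and is not part of CRIU. It assumes a Linux /proc and enough
privilege to read the target's fd directory. It reads the link target,
checks for the " (deleted)" suffix, and also reports the link count of
the open file, since a file whose name really ends in " (deleted)"
would otherwise look the same.

```python
import os

# Suffix the kernel appends to the path of an unlinked file in /proc/<pid>/fd.
DELETED_SUFFIX = " (deleted)"


def fd_link_state(pid, fd):
    """Return (target, looks_deleted, nlink) for /proc/<pid>/fd/<fd>.

    Hypothetical helper for illustration only.
    """
    path = f"/proc/{pid}/fd/{fd}"
    # readlink() returns the path string the kernel reports for the open file.
    target = os.readlink(path)
    # stat() follows the /proc link to the open file itself, so it still
    # works when the path string no longer names an existing file.
    nlink = os.stat(path).st_nlink
    return target, target.endswith(DELETED_SUFFIX), nlink


if __name__ == "__main__":
    # Example: inspect fd 0 of the current process.
    print(fd_link_state(os.getpid(), 0))
```

Running this against the same pid and fd both from inside the
container and from the host, around the time of the dump, would show
whether the two views ever disagree.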

Any ideas as to what's going on here? My best guess is a kernel bug
related to sending fds (the underlying filesystem is lxcfs, a FUSE
filesystem, not the traditional cgroup fs), but that's just a hunch.

Any thoughts would be appreciated.

Tycho

