[CRIU] [PATCH 1/2] Quick bug fix for move_in_cgroup() missing fd

Pavel Emelyanov xemul at parallels.com
Wed Aug 6 10:23:11 PDT 2014


On 08/06/2014 09:11 PM, Garrison Bellack wrote:
> These snippets below are from my conversation with Tycho Andersen about the 
> issue where he looked into it as well and confirmed the fix:
> 
> Garrison:
> This is the situation in which the bug arises:
> 1. Make a namespace jail. (I was using the pid and mnt namespaces)
> 2. Run a process in that namespace jail (ex: infiniteLoop) 
> 3. Put the nsinit in cgroups such as:
> /memory/TMP0
> /cpu/TMP0
> 4. Put the child process of nsinit in cgroups such as:
> /memory/TMP0/TMP1
> /cpu/TMP0/TMP1
> 5. Dump the nsinit process (and as such dumping/killing the child process too)
> 6. Remove the cgroups TMP0 and TMP1
> 7. Restore the nsinit process
> 
> This fails on the attempt to move the child process into TMP0/TMP1/tasks.

With what error message?

> Yes move_in_cgroup is being called twice and failing on the second call. The
> first time, move_in_cgroup puts both the parent and the child process in the 
> TMP0/ cgroup set.

move_in_cgroup() puts only one task into cgroups:

                snprintf(aux + aux_off, sizeof(aux) - aux_off, "/%s/tasks", ce->path);

                pr_debug("  `-> %s\n", aux);
                err = fd = openat(cg, aux, O_WRONLY);
                if (fd >= 0) {
                        /*
                         * Writing zero into this file moves current
                         * task w/o any permissions checks :)
                         */
                        err = write(fd, "0", 1);
                        close(fd);
                }

-- the current one.

> The second call is trying to move the child into the TMP0/TMP1 cg set and fails
> due to not getting the correct fd from get_service_fd even though the correct fd 
> still seems to be around.
> 
> By calling close_service_fd() at the end of move_in_cgroup(), you are turning off the bit for CGROUP_YARD in sfd_map which keeps track of all service file descriptors. Then when, move_in_cgroup() gets called a second time, get_service_fd() fails because the bit is still off. 
> 
> 
> Tycho:
> It looks like these are all mutually recursive, there is a call graph:
> 
> restore_task_with_children -> create_children_and_session ->
> fork_with_pid -> restore_task_with_children

Of course, each forked task goes and forks its own kids. Each task
calls restore_task_with_children() only once.

> So it seems definitely incorrect to close the cgroup fd in
> move_in_cgroup. 

Why? It closes an fd in a single tasks, all the others still have
it till _they_ call move_in_cgroup()

> Looking at 203c291 (when move_in_cgroup was
> introduced) it seems that this loop has always existed. Seems best to
> fix it (and I think you're right that we can just delete the call,
> since it is closed in fini_cgroup() on error or success).

fini_cgroup() is only called by criu() processes -- all the others
never reach this code.

> Without the close_service_fd call it all seems to
> work just fine. 

If you check the restored processes you'd find one "extra" file
descriptor in their tables -- the non-closed 

> Anyway, I can confirm this bug and fix.
> 
> On Tue, Aug 5, 2014 at 10:35 PM, Pavel Emelyanov <xemul at parallels.com <mailto:xemul at parallels.com>> wrote:
> 
>     On 08/05/2014 11:50 PM, gbellack at google.com <mailto:gbellack at google.com> wrote:
>     > From: gbellack <gbellack at google.com <mailto:gbellack at google.com>>
>     >
>     > There is an issue where if the proccess to be killed spawns a child proccess and
>     > moves it in a child cgroup of the one the parent proccess is in, upon restore,
>     > move_in_cgroup() is called twice as it should be (once to move the parent
>     > proccess and once to move the child proccess) but the file descriptor has
>     > already been closed causing a failure for the second call to move_in_cgroup().
> 
>     Can you provide more details please? The move_in_cgroup() is supposed to
>     move task only once, how does the 2nd time happen?
> 
>     > Change-Id: I6ae88b95c5410a7f56108e28eb3133f113e868d0
>     > Signed-off-by: Garrison Bellack <gbellack at google.com <mailto:gbellack at google.com>>
>     > ---
>     >  cgroup.c | 1 -
>     >  1 file changed, 1 deletion(-)
>     >
>     > diff --git a/cgroup.c b/cgroup.c
>     > index 06311e4..8c99e9d 100644
>     > --- a/cgroup.c
>     > +++ b/cgroup.c
>     > @@ -617,7 +617,6 @@ static int move_in_cgroup(CgSetEntry *se)
>     >               }
>     >       }
>     >
>     > -     close_service_fd(CGROUP_YARD);
>     >       return 0;
>     >  }
>     >
>     >
> 
> 



More information about the CRIU mailing list