[CRIU] Fix cpuset restore-in-root bug

Wed Oct 8 07:40:23 PDT 2014

On Wed, Oct 08, 2014 at 06:30:58PM +0400, Pavel Emelyanov wrote:
> On 10/08/2014 06:25 PM, Pavel Emelyanov wrote:
> > On 10/08/2014 06:16 PM, Tycho Andersen wrote:
> >> On Wed, Oct 08, 2014 at 06:02:04PM +0400, Pavel Emelyanov wrote:
> >>> On 10/08/2014 05:57 PM, Pavel Emelyanov wrote:
> >>>> On 10/08/2014 02:43 AM, Tycho Andersen wrote:
> >>>>> Hi Pavel,
> >>>>>
> >>>>> On Tue, Oct 07, 2014 at 03:58:39PM -0500, Tycho Andersen wrote:
> >>>>>> On Tue, Oct 07, 2014 at 11:15:34PM +0400, Pavel Emelyanov wrote:
> >>>>>>> On 10/07/2014 06:16 PM, Tycho Andersen wrote:
> >>>>>>>> Hi Pavel,
> >>>>>>>>
> >>>>>>>> On Tue, Oct 07, 2014 at 12:58:34PM +0400, Pavel Emelyanov wrote:
> >>>>>>>>> On 10/06/2014 06:03 PM, Tycho Andersen wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> Based on some discussion, we shouldn't try to do any intelligent copying of any
> >>>>>>>>>> properties anywhere else in the tree. These cpuset properties are special only
> >>>>>>>>>> in the sense that they need to be restored before the move_in_cgroup call, and
> >>>>>>>>>> if the parent cgroups have conflicting options (or empty options), it is best
> >>>>>>>>>> to just let the restore fail.
> >>>>>>>>>
> >>>>>>>>> Patches applied, thanks!
> >>>>>>>>>
> >>>>>>>>>> Also, I noticed when making this patch that I am getting a lot of:
> >>>>>>>>>>
> >>>>>>>>>> (00.092575)    290: Error (util.c:101): Unable to close fd 1016: Bad file descriptor
> >>>>>>>>>
> >>>>>>>>> Ouch. Can you show the full log please?
> >>>>>>>>
> >>>>>>>> Yes, these are with the current master:
> >>>>>>>>
> >>>>>>>> http://paste.ubuntu.com/8514798/
> >>>>>>>
> >>>>>>> Hm... I've run cgroup* tests on recent HEAD and no errors in logs.
> >>>>>>>
> >>>>>>>> also, 1016 is the cg service fd, so it seems like maybe a recent
> >>>>>>>> reordering broke something? I can do a bisect if you take a look at
> >>>>>>>> the log and nothing obvious pops out.
> >>>>>>>
> >>>>>>> Yes, please. If you would be able to shed more light on this, that
> >>>>>>> would be awesome.
> >>>>>>
> >>>>>> Hmm. I tried to bisect with a patch that I know I wasn't seeing this
> >>>>>> problem with, but even that one has this issue now. Something about my
> >>>>>> environment must have changed, I guess? I will investigate more and
> >>>>>> keep you posted.
> >>>>>
> >>>>> The patch below fixes the issue for me. Does it make sense? I have no
> >>>>> idea why we/I didn't see this earlier.
> >>>>>
> >>>>> Tycho
> >>>>>
> >>>>>
> >>>>>
> >>>>> >From 0f3aad8a78aee56ff7794db33cf76c5d5aae5039 Mon Sep 17 00:00:00 2001
> >>>>> From: Tycho Andersen <tycho.andersen at canonical.com>
> >>>>> Date: Tue, 7 Oct 2014 17:25:26 -0500
> >>>>> Subject: [PATCH] restore: don't race when closing cg yard
> >>>>>
> >>>>> If CLONE_FILES is set, we only have one copy of the cg yard fd, so we shouldn't
> >>>>> all close it. Further, we should wait to close it until after the FORKING stage
> >>>>> is done, since it is needed during that stage.
> >>>>>
> >>>>> If CLONE_FILES is not set, each process can safely close their copy of the fd.
> >>>>>
> >>>>> Signed-off-by: Tycho Andersen <tycho.andersen at canonical.com>
> >>>>> ---
> >>>>>  cr-restore.c | 17 +++++++++--------
> >>>>>  1 file changed, 9 insertions(+), 8 deletions(-)
> >>>>>
> >>>>> diff --git a/cr-restore.c b/cr-restore.c
> >>>>> index fd35bef..c12d28e 100644
> >>>>> --- a/cr-restore.c
> >>>>> +++ b/cr-restore.c
> >>>>> @@ -1461,14 +1461,6 @@ static int restore_task_with_children(void *_arg)
> >>>>>  	if (create_children_and_session())
> >>>>>  		goto err_fini_mnt;
> >>>>>  
> >>>>> -	/*
> >>>>> -	 * This must be done after forking to allow child
> >>>>> -	 * to get the cgroup fd so it can move into the
> >>>>> -	 * correct /tasks file if it is in a different cgroup
> >>>>> -	 * set than its parent
> >>>>> -	 */
> >>>>> -	close_service_fd(CGROUP_YARD);
> >>>>> -
> >>>>>  	if (restore_task_mnt_ns(current))
> >>>>>  		goto err_fini_mnt;
> >>>>>  
> >>>>> @@ -1480,6 +1472,15 @@ static int restore_task_with_children(void *_arg)
> >>>>>  	if (restore_finish_stage(CR_STATE_FORKING) < 0)
> >>>>>  		goto err_fini_mnt;
> >>>>>  
> >>>>> +	/*
> >>>>> +	 * This must be done after forking to allow child
> >>>>> +	 * to get the cgroup fd so it can move into the
> >>>>> +	 * correct /tasks file if it is in a different cgroup
> >>>>> +	 * set than its parent
> >>>>> +	 */
> >>>>> +	if (!(ca->clone_flags & CLONE_FILES) || current->parent == NULL)
> >>>>> +		close_service_fd(CGROUP_YARD);
> >>>>
> >>>> I think what really makes sense here is the CLONE_FILES check, because
> >>>> all children forking happens in create_children_and_session().
> >>>>
> >>>> And it looks like I know why this appeared only recently -- the helper
> >>>> tasks are now created with CLONE_FILES, so we may have a lot of helpers.
> >>>
> >>> After a bit more thinking and reading the code. The place where this close
> >>> is called also matters -- if we close CGYARD before other tasks forked, the
> >>> latter ones may fail to obtain this service fd.
> >>
> >> Yeah, it isn't necessarily clear from the diff, but I moved it after
> >> the FORKING stage to avoid this problem.
> > 
> > Yes, and this is also correct. I'm still thinking about the CLONE_FILES check.
> > 
> >>> And one more notice -- it looks like we have generic service descriptors 
> >>> closing problem. Helpers inherit files table from parent but close the
> >>> descriptors.
> >>
> >> By this you mean that there are some other bugs like this as well?
> > 
> > Should be -- other service descriptors are closed just with close_service_fd,
> > but seem not to emit such warnings -- I've run the session00 test that generates 
> > helpers and haven't found any errors like that.
> 
> Ah, of course. All other close_service_fd-s are not called by helpers :)
> The restore_one_task() switches the task state and only the code run by
> alive and stopped has these closes.
> 
> Can you check whether moving the close for cgyard somewhere inside e.g.
> close_old_fds helps?

We can't close it in close_old_fds, that happens too early :(

Tycho

> Thanks,
> Pavel
> 
>