[CRIU] [PATCH] namespace: mark mount namespaces as populated after the forking stage

Andrew Vagin avagin at virtuozzo.com
Tue Jul 5 13:08:04 PDT 2016


On Thu, Jun 16, 2016 at 02:12:29PM +0300, Pavel Emelyanov wrote:
> On 06/15/2016 07:29 PM, Andrew Vagin wrote:
> > On Wed, Jun 15, 2016 at 02:46:52PM +0300, Pavel Emelyanov wrote:
> >> On 06/11/2016 09:01 PM, Andrey Vagin wrote:
> >>> From: Andrew Vagin <avagin at virtuozzo.com>
> >>>
> >>> Currently we mark a mount namespaces as populated when a target process
> >>> (ns_pid) switches into it. But if a process inherited the right
> >>> namespace from a parent, it doesn't call do_restore_task_mnt_ns()
> >>> and a namespace can remain unmarked.
> >>
> >> Why? It should be marked by the task that created one.
> > 
> > What is it create? All namespaces are created by the root task.
> 
> I don't get the "and a namespace remain unmarked" problem.

do_restore_task_mnt_ns() is not executed for nsid->ns_pid, sp
nsid->ns_populated is never set.

@@ -3112,9 +3112,6 @@ static int do_restore_task_mnt_ns(struct ns_id
*nsid, struct pstree_item *curren
        }
        close(fd);
 
-       if (nsid->ns_pid == current->pid.virt)
-               futex_set_and_wake(&nsid->ns_populated, 1);
-
        return 0;
 }
 



> 
> > All these don't matter with this patch.
> > 
> >>
> >>> af55c059fb6 ("mount: fix a race between restoring namespaces and file mappings")
> >>> After this patch we could simplify logic around ns_populated. Currently
> >>> it's a futex, but nodoby waits on it. We can set ns_populated when we
> >>> are going to close namespace descriptors. To avoid additional locks, we
> >>> can do this when all task pass the forking stage and don't start the
> >>> next stage.
> >>
> >> Don't we need barriers between ns->populated = true and close(ns->fd)?
> > 
> > We don't need, because we set ns_populated when all task are stopped.
> 
> OK.
> 
> >>From a root task, we wait when all task have passed the forking stage
> > and then enumirate all namespaces and mark them as populated.
> >>
> >> (more comments inline)
> >>
> >>> https://jira.sw.ru/browse/PSBM-48222
> >>>
> >>> Signed-off-by: Andrew Vagin <avagin at virtuozzo.com>
> >>> ---
> >>>  criu/cr-restore.c         |  7 ++++---
> >>>  criu/include/namespaces.h |  2 +-
> >>>  criu/mount.c              | 14 ++++++--------
> >>>  criu/namespaces.c         |  4 ++--
> >>>  4 files changed, 13 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
> >>> index 1e3d823..5511447 100644
> >>> --- a/criu/cr-restore.c
> >>> +++ b/criu/cr-restore.c
> >>> @@ -1309,16 +1309,17 @@ static int restore_task_with_children(void *_arg)
> >>>  
> >>>  	restore_pgid();
> >>>  
> >>> -	if (restore_finish_stage(CR_STATE_FORKING) < 0)
> >>> -		goto err;
> >>> -
> >>>  	if (current->parent == NULL) {
> >>> +		futex_wait_while_gt(&task_entries->nr_in_progress, 1);
> >>
> >> Why is it needed here?
> > 
> > Wait when all tasks have passed the forking stage.
> 
> We have a helper for it, don't we?

We have a helper to wait from the criu process, but we don't have a
helper to wait from the root task.

> Also I'd appreciate a comment here saying why we need to wait.

Ok
> 
> >>
> >>>  		if (depopulate_roots_yard())
> >>>  			goto err;
> >>>  
> >>>  		fini_restore_mntns();
> >>>  	}
> >>>  
> >>> +	if (restore_finish_stage(CR_STATE_FORKING) < 0)
> >>> +		goto err;
> >>> +
> >>>  	if (restore_one_task(current->pid.virt, ca->core))
> >>>  		goto err;
> >>>  
> 


More information about the CRIU mailing list