[Devel] Re: [RFC][PATCH 2/2] pidns: Remove proc flush races when pid namespaces are exiting.

Louis Rilling Louis.Rilling at kerlabs.com
Fri Jul 9 05:14:25 PDT 2010


On 08/07/10 21:39 -0700, Eric W. Biederman wrote:
> 
> Currently it is possible to put proc_mnt before we have flushed the
> last process that will use the proc_mnt to flush its proc entries.
> 
> This race is fixed by not flushing proc entries for dead pid
> namespaces, and calling pid_ns_release_proc unconditionally from
> zap_pid_ns_processes after the pid namespace has been declared dead.

One comment below.

> 
> To ensure we don't unnecessarily leak any dcache entries with skipped
> flushes, pid_ns_release_proc flushes the entire proc_mnt when it is
> called.
> 
> Signed-off-by: Eric W. Biederman <ebiederm at xmission.com>
> ---
>  fs/proc/base.c         |    9 +++++----
>  fs/proc/root.c         |    3 +++
>  kernel/pid_namespace.c |    1 +
>  3 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index acb7ef8..e9d84e1 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2742,13 +2742,14 @@ void proc_flush_task(struct task_struct *task)
>  
>  	for (i = 0; i <= pid->level; i++) {
>  		upid = &pid->numbers[i];
> +
> +		/* Don't bother flushing dead pid namespaces */
> +		if (test_bit(PIDNS_DEAD, &upid->ns->flags))
> +			continue;
> +

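IMHO, nothing prevents zap_pid_ns_processes() from setting PIDNS_DEAD and
calling pid_ns_release_proc() at this very point, i.e. after the PIDNS_DEAD
test above but before proc_flush_task_mnt() below uses upid->ns->proc_mnt.
zap_pid_ns_processes() does not wait for EXIT_DEAD (self-reaping) children
to be released, so such a child may still be inside proc_flush_task().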
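
To make the window concrete, here is a minimal userspace sketch that replays
the interleaving in a single thread. It is not kernel code: fake_ns, fake_mnt
and all helper names are made up for illustration, and the refcount only
models the reference that mntput() drops.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct fake_mnt {
	int refcount;			/* models the reference dropped by mntput() */
};

struct fake_ns {
	bool dead;			/* models the PIDNS_DEAD bit */
	struct fake_mnt *proc_mnt;
};

/* Flusher, step 1: the liveness check added by the patch. */
static bool flusher_check_alive(const struct fake_ns *ns)
{
	return !ns->dead;
}

/* Zapper: declare the namespace dead, then drop the proc_mnt reference. */
static void zapper_release(struct fake_ns *ns)
{
	ns->dead = true;
	ns->proc_mnt->refcount--;
}

/* Flusher, step 2: returns true if the mount is still pinned when used. */
static bool flusher_use_mnt(const struct fake_ns *ns)
{
	return ns->proc_mnt->refcount > 0;
}

/*
 * Racy order: check, release, use.
 * Returns true if the flusher ended up using a released mount.
 */
static bool replay_race(void)
{
	struct fake_mnt mnt = { .refcount = 1 };
	struct fake_ns ns = { .dead = false, .proc_mnt = &mnt };

	if (!flusher_check_alive(&ns))
		return false;		/* flush skipped, nothing to go wrong */
	zapper_release(&ns);		/* zapper runs between check and use */
	return !flusher_use_mnt(&ns);
}

/*
 * Benign order: release first, then check.
 * The flusher sees the dead flag and never touches the mount.
 */
static bool replay_ordered(void)
{
	struct fake_mnt mnt = { .refcount = 1 };
	struct fake_ns ns = { .dead = false, .proc_mnt = &mnt };

	zapper_release(&ns);
	if (!flusher_check_alive(&ns))
		return false;		/* dead namespace: flush skipped */
	return !flusher_use_mnt(&ns);
}
```

In this model replay_race() reports a use of the released mount, while
replay_ordered() does not: the flag test alone does not close the window
unless something also orders the release after the last flusher.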

Thanks,

Louis

>  		proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr,
>  					tgid->numbers[i].nr);
>  	}
> -
> -	upid = &pid->numbers[pid->level];
> -	if (upid->nr == 1)
> -		pid_ns_release_proc(upid->ns);
>  }
>  
>  static struct dentry *proc_pid_instantiate(struct inode *dir,
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index cfdf032..2298fdd 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -209,5 +209,8 @@ int pid_ns_prepare_proc(struct pid_namespace *ns)
>  
>  void pid_ns_release_proc(struct pid_namespace *ns)
>  {
> +	/* Flush any cached proc dentries for this pid namespace */
> +	shrink_dcache_parent(ns->proc_mnt->mnt_root);
> +
>  	mntput(ns->proc_mnt);
>  }
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index 92032d1..43dec5d 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -189,6 +189,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
>  		rc = sys_wait4(-1, NULL, __WALL, NULL);
>  	} while (rc != -ECHILD);
>  
> +	pid_ns_release_proc(pid_ns);
>  	acct_exit_ns(pid_ns);
>  	return;
>  }
> -- 
> 1.6.5.2.143.g8cc62
> 

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes