[CRIU] Introspecting userns relationships to other namespaces?

Andrew Vagin avagin at virtuozzo.com
Tue Jul 12 17:08:43 PDT 2016


On Sat, Jul 09, 2016 at 01:29:20PM -0500, Eric W. Biederman wrote:
> ebiederm at xmission.com (Eric W. Biederman) writes:
> 
> > Andrew Vagin <avagin at virtuozzo.com> writes:
> >
> >> All these thoughts about security make me think that kcmp is what we
> >> should use here. It's maybe something like this:
> >>
> >> kcmp(pid1, pid2, KCMP_NS_USERNS, fd1, fd2)
> >>
> >> - to check if userns of the fd1 namespace is equal to the fd2 userns
> >>
> >> kcmp(pid1, pid2, KCMP_NS_PARENT, fd1, fd2)
> >>
> >> - to check if a parent namespace of the fd1 pidns is equal to the fd2 pidns.
> >>
> >> fd1 and fd2 are file descriptors to namespace files.
> >>
> >> So if we want to build a hierarchy, we need to collect all namespaces
> >> and then enumerate them to check dependencies with help of kcmp.
> >
> > That is certainly one way to go.
> >
> > There is a funny case where we would want to compare a user namespace
> > file descriptor to a parent user namespace file descriptor.
> >
> >
> > Grumble, Grumble.  I think this may actually be a case for creating ioctls
> > for these two cases.  Now that random nsfs file descriptors are bind
> > mountable the original reason for using proc files is not as pressing.
> >
> > One ioctl for the user namespace that owns a file descriptor.
> > One ioctl for the parent namespace of a namespace file descriptor.
> >
> > We also need some way to get a command file descriptor for a file system
> > super block.  Al Viro has a pet project for cleaning up the mount API
> > and this might be the ideal excuse to start looking at that.
> >
> > (In principle we might be able to run commands through the namespace
> >  file descriptor and using an ioctl feels dirty.  But an ioctl that
> >  only uses the fd and request argument does not suffer from the same
> >  problems that ioctls that have to pass additional arguments suffer
> >  from.)
> 
> Of course it should be an error, perhaps -EINVAL, to get a user
> namespace owner or parent namespace that is outside of a process's
> current user namespace or pid namespace.  That way things stay bounded
> within the current namespaces the process is in.  Which prevents any
> leak possibilities, and keeps CRIU working.

I prepared patches with ioctls to see how this would look.

Here is the whole series:
https://github.com/avagin/linux-task-diag/commits/namespaces

Here are the patches to get the owning user namespace:
https://github.com/avagin/linux-task-diag/commit/7fad8ff3fc4110bebf0920cec2388390b3bd2238
https://github.com/avagin/linux-task-diag/commit/2663bc803d324785e328261f3c07a0fef37d2088

Here is an example of how it looks from user space:
https://github.com/avagin/linux-task-diag/blob/namespaces/tools/testing/selftests/nsfs/owner.c#L49
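
For readers without the tree at hand, here is a minimal user-space sketch
of the idea. It is not the selftest linked above. It assumes the request
numbers that mainline later exposed in <linux/nsfs.h> (NS_GET_USERNS and
NS_GET_PARENT); the patches in this series may use different names or
values. The helper names ns_related_fd, same_ns and print_owner are made
up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* ASSUMPTION: request numbers as in mainline <linux/nsfs.h>; the
 * patch series discussed here may define them differently. */
#ifndef NSIO
#define NSIO 0xb7
#endif
#ifndef NS_GET_USERNS
#define NS_GET_USERNS _IO(NSIO, 0x1)  /* owning user namespace */
#endif
#ifndef NS_GET_PARENT
#define NS_GET_PARENT _IO(NSIO, 0x2)  /* parent namespace */
#endif

/* Hypothetical helper: ask the kernel for a namespace related to ns_fd.
 * Only the fd and the request are passed, no extra argument.
 * Returns a new file descriptor, or -1 with errno set. */
int ns_related_fd(int ns_fd, unsigned long request)
{
    return ioctl(ns_fd, request);
}

/* Hypothetical helper: two fds name the same namespace when they refer
 * to the same nsfs inode. Returns 1 or 0, or -1 if fstat() fails. */
int same_ns(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Hypothetical helper: print the inode of the user namespace that owns
 * the namespace file at path. Returns 0 on success, -1 on any error. */
int print_owner(const char *path)
{
    struct stat st;
    int ns_fd, owner_fd, ret = -1;

    assert(path != NULL);

    ns_fd = open(path, O_RDONLY);
    if (ns_fd < 0) {
        perror(path);
        return -1;
    }

    owner_fd = ns_related_fd(ns_fd, NS_GET_USERNS);
    if (owner_fd < 0) {
        /* Expected when the owner lies outside the caller's namespaces,
         * or when the kernel does not know the request. */
        fprintf(stderr, "%s: NS_GET_USERNS: %s\n", path, strerror(errno));
        close(ns_fd);
        return -1;
    }

    if (fstat(owner_fd, &st) == 0) {
        printf("%s is owned by user:[%lu]\n", path,
               (unsigned long)st.st_ino);
        ret = 0;
    }

    close(owner_fd);
    close(ns_fd);
    return ret;
}
```

Calling print_owner("/proc/self/ns/pid") would report the owner of the
caller's pid namespace. To build the hierarchy, one could walk upwards
with NS_GET_PARENT and compare the results with same_ns until the ioctl
fails with whatever error the kernel picks for out-of-scope namespaces
(Eric suggests -EINVAL above).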

I like the ioctl approach. James, Michael, Trevor, what is your
opinion on this?

> 
> Eric

