[Devel] Device Namespaces

Eric W. Biederman ebiederm at xmission.com
Thu Oct 3 02:17:17 PDT 2013


Amir Goldstein <amir at cellrox.com> writes:

> Excellent! Let's focus the discussion on a new device driver we want
> to write which is namespace aware. Let's call this device driver
> valarm-dev. Similarly to Android's alarm-dev, valarm-dev can be used to
> request RTC wakeup calls from user space and get/set RTC values, but
> with valarm-dev, every container may use different values for current
> time.
>
> As you can see in our patch set, we already have a version of
> alarm-dev that maintains its state inside a context, instead of in a
> global variable, so it is capable of providing a different context per
> namespace.
>
> And now for the $1M question: to *which* namespace do we attribute
> the current realtime clock time?

To none of them.  Just use a different minor per instance; then you
don't have a hard question to answer.
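
To make the "different minor per instance" idea concrete, here is a
minimal userspace sketch, not kernel code.  All names in it
(valarm_instance, valarm_open, VALARM_MAX_MINORS and so on) are
hypothetical and chosen only for illustration.  In a real character
driver the lookup would presumably be keyed off the minor number seen at
open time (for example via iminor()), but that part is an assumption and
is only simulated here with a plain array index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VALARM_MAX_MINORS 8   /* hypothetical limit on instances */

/* Hypothetical per-instance state: each minor keeps its own clock offset. */
struct valarm_instance {
    int64_t offset_sec;   /* offset of this instance from the host clock */
    int     in_use;       /* set once the minor has been opened */
};

/* One slot per minor; no namespace is consulted anywhere. */
static struct valarm_instance valarm_table[VALARM_MAX_MINORS];

/* Simulated open(): the minor number alone selects the instance. */
struct valarm_instance *valarm_open(unsigned int minor)
{
    if (minor >= VALARM_MAX_MINORS)
        return NULL;
    valarm_table[minor].in_use = 1;
    return &valarm_table[minor];
}

/* Set this instance's idea of "now" without touching other instances. */
void valarm_set_time(struct valarm_instance *inst,
                     int64_t host_now, int64_t new_time)
{
    assert(inst != NULL);
    inst->offset_sec = new_time - host_now;
}

/* Read this instance's time, derived from the host clock plus its offset. */
int64_t valarm_get_time(const struct valarm_instance *inst, int64_t host_now)
{
    assert(inst != NULL);
    return host_now + inst->offset_sec;
}
```

With this shape, a container is simply handed the device node for its
own minor, and the question of which namespace owns the clock never
comes up inside the driver.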

> To the UTS namespace (because T historically stands for Time)? To a
> device namespace?
> Even if a device namespace existed, we would not want to tie the policy
> decision of "separate time" to a very wide definition of "separate
> devices".
>
> So what we want to create is an API for device driver writers that
> will enable them to write a namespace-aware device and allow userspace
> to configure when the namespace-aware device context is unshared.


> We would like to share with you our very initial thoughts about how
> this will be implemented:
> - Extend the register_pernet_subsys/device(ops) API
>   to a register_perns_subsys/device(nstype, ops) API
> - Extend pernet_operations to perns_operations, which includes optional
>   migrate() and/or unshare() ops
> - Let valarm-dev call register_peruser_subsys/device(&alarm_userns_ops)

For the network subsystem that makes sense.  But it doesn't make sense
for devices.  It is just an unneeded extra complication.

> - Implement a new syscall (or netlink command if it makes more sense)
> setdevns(int dev_fd, int ns_fd, int nstype, int flags)

ioctl?  master device? How do people communicate with raw devices these
days?

> - Unlike the netlink set netns case, this API is not used solely to
>   *move* a device to a different namespace, but also to *unshare* a
>   device context between namespaces, for those devices that registered
>   unshare() ops.

I really think this all makes the most sense one virtual driver at a
time.

> This is our missing piece of the puzzle.
> After that, whether we make changes to existing drivers (e.g. evdev)
> or write new virtualized drivers (e.g. vevdev) is a technicality.
> We do not care which way we go; we will take whichever seems more
> maintainable.
>
> What do you think of this master plan?

I think by making your devices' behavior depend on which namespace they
are in you are making the drivers unnecessarily fragile, and
unnecessarily unusable.

I think the code will be simpler/cleaner/better if you don't need to
have context outside of your drivers.

> P.S. Please try to refrain from addressing the validity of the use
> case of alarm-dev in particular, as we do not wish to get engaged in
> "Android sucks" wars.
> We simply want to present the case for improving the namespace
> infrastructure to cater to the needs of device driver writers who wish
> to tailor their drivers for container-based products.

I think this is a driver interface problem, not a namespace problem.
None of the similar drivers that exist in the network namespace
change their behavior depending on which namespace they are in.

The two practical choices I see are:
1) Use a bunch of minors for your driver.
2) Act roughly like /dev/pts and use different mounts of the filesystem
   to create new instances.

I think different minors is probably easier, but there are two successful
models I am aware of, so I have mentioned both.
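
For contrast with the minors approach, the second choice (acting
roughly like /dev/pts) can be sketched in the same spirit.  This is a
userspace illustration only, under the assumption that each mount of a
hypothetical "valarmfs" filesystem owns one clock context and that every
file opened through that mount resolves to it.  None of these names
(valarmfs_mount, valarmfs_open and so on) exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-mount state: one clock context per mounted instance. */
struct valarmfs_mount {
    int64_t offset_sec;   /* offset of this mount from the host clock */
};

/* Each simulated mount allocates a fresh context with a zero offset. */
struct valarmfs_mount *valarmfs_mount_new(void)
{
    return calloc(1, sizeof(struct valarmfs_mount));
}

/* Tearing down the mount releases its context. */
void valarmfs_umount(struct valarmfs_mount *m)
{
    free(m);
}

/* An open file only remembers which mount it was opened through. */
struct valarmfs_file {
    struct valarmfs_mount *mnt;
};

struct valarmfs_file valarmfs_open(struct valarmfs_mount *m)
{
    struct valarmfs_file f = { .mnt = m };
    return f;
}

/* Setting the time changes the mount's context, not a global one. */
void valarmfs_set_time(struct valarmfs_file *f,
                       int64_t host_now, int64_t new_time)
{
    assert(f != NULL && f->mnt != NULL);
    f->mnt->offset_sec = new_time - host_now;
}

/* Every file opened through the same mount sees the same time. */
int64_t valarmfs_get_time(const struct valarmfs_file *f, int64_t host_now)
{
    assert(f != NULL && f->mnt != NULL);
    return host_now + f->mnt->offset_sec;
}
```

Here a container would get its own mount, and the instance is chosen by
which mount a file was opened through rather than by a minor number.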

Eric




