[Devel] [PATCH rh7 v2] ve/devpts: Support per-VE mount namespace
Vladimir Davydov
vdavydov at parallels.com
Wed Jul 22 02:59:22 PDT 2015
On Wed, Jul 22, 2015 at 12:23:43PM +0300, Cyrill Gorcunov wrote:
> On Wed, Jul 22, 2015 at 11:55:30AM +0300, Vladimir Davydov wrote:
> > > +
> > > +#ifdef CONFIG_VE
> > > #include <linux/ve.h>
> > > +#endif
> >
> > Why?
>
> There seem to be not many openvz related stuff, so I wrapped everything,
> including header with config, this will make compilation a bit faster.
Please don't do that :-)
>
> > >
> > > +#ifndef CONFIG_VE
> > > /* Require newinstance for all user namespace mounts to ensure
> > > * the mount options are not changed.
> > > */
> > > - if (!IS_ENABLED(CONFIG_VE) &&
> > > - (current_user_ns() != &init_user_ns) && !opts.newinstance)
> > > + if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
> >
> > FWIW, once we switch to userns, which is going to be pretty soon I hope,
> > we have to tweak this again.
>
> Yes, but at moment this hunk reverts code back to former. I believe until
> we don't have a strong reason better keep code unmodified (in turn
> CONFIG_VE marks shows the modifications brought in by us).
I don't mind.
>
> > > return ERR_PTR(-EINVAL);
> > > +#endif
> > >
> > > +#ifdef CONFIG_VE
> > > + /*
> > > + * Each container has to have own devpts superblock for isolation
> > > + * sake but it makes a bad joke for us: in CRIU we test if devpts
> > > + * device in container is the same as on the node, to figure out
> > > + * if @newinstance option has to be passed (simply because in
> > > + * vanilla kernel there is no such devpts virtualization) on
> > > + * the restore. Thus every time we're restoring container
> > > + * we pass @newinstance option even if container has been
> > > + * started without this option initially.
> > > + *
> > > + * To workaround this situation here is an ugly hack: first
> > > + * mount of devpts inside container always runs without
> > > + * @newinstance option providing back virtualized superblock.
> > > + * The next mounts inside container go in a regular way.
> > > + *
> > > + * Note @devpts_once is always set for node. And be careful
> > > + * about @else branch below.
> > > + */
> > > + if (!get_exec_env()->devpts_once && get_exec_env()->_devpts_mnt)
> > > + s = sget(fs_type, compare_init_pts_sb, set_anon_super, flags, NULL);
> > > + else
> > > +#endif
> >
> > If we initialize ve->devpts_sb lazily (as we do now), we don't need this
> > hunk as well as devpts_once flag, do we? This would look cleaner IMO.
>
> 1) _devpts_mnt initialized (mounted) on container start time, as we do for
> a number of other subsystems, I think keeping it in that form better from
> readability view, no?
I'd like to keep ve_start_container as simple as possible.
>
> 2) first attempt to mount devpts inside container should be treated in a
> special way (note that restore procedure starts from inside of ve0, so
> we can't use ve_is_super here) -- ie first mount of devpts must always
> return premounted superblock we allocated when VE has been initialized.
I don't see how your patch helps that.
>
> That's a dirty hack but I don't see other way for workaround -- criu itself
> targeted on vanilla kernel which doesn't provide devpts virtualization by
> default.
>
> Or you mean to mark _devpts_mnt = nil by default, drop init/fini routines
> and use it solely instead of devpts_once + _devpts_mnt pair?
Yeah, that's what I mean, but you'll have to keep a reference to the
super block rather than vfsmount on ve_struct for that.
More information about the Devel
mailing list