[Devel] [PATCH rh7 v2] ve/devpts: Support per-VE mount namespace
Cyrill Gorcunov
gorcunov at virtuozzo.com
Wed Jul 22 02:23:43 PDT 2015
On Wed, Jul 22, 2015 at 11:55:30AM +0300, Vladimir Davydov wrote:
> > +
> > +#ifdef CONFIG_VE
> > #include <linux/ve.h>
> > +#endif
>
> Why?
There doesn't seem to be much OpenVZ-related code here, so I wrapped
everything in CONFIG_VE, including the header; this should make compilation
a bit faster.
> >
> > +#ifndef CONFIG_VE
> > /* Require newinstance for all user namespace mounts to ensure
> > * the mount options are not changed.
> > */
> > - if (!IS_ENABLED(CONFIG_VE) &&
> > - (current_user_ns() != &init_user_ns) && !opts.newinstance)
> > + if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
>
> FWIW, once we switch to userns, which is going to be pretty soon I hope,
> we have to tweak this again.
Yes, but at the moment this hunk just reverts the code to its former state.
I believe that until we have a strong reason, it is better to keep the code
unmodified (in turn, the CONFIG_VE markers show the modifications we brought in).
> > return ERR_PTR(-EINVAL);
> > +#endif
> >
> > +#ifdef CONFIG_VE
> > + /*
> > + * Each container has to have own devpts superblock for isolation
> > + * sake but it makes a bad joke for us: in CRIU we test if devpts
> > + * device in container is the same as on the node, to figure out
> > + * if @newinstance option has to be passed (simply because in
> > + * vanilla kernel there is no such devpts virtualization) on
> > + * the restore. Thus every time we're restoring container
> > + * we pass @newinstance option even if container has been
> > + * started without this option initially.
> > + *
> > + * To workaround this situation here is an ugly hack: first
> > + * mount of devpts inside container always runs without
> > + * @newinstance option providing back virtualized superblock.
> > + * The next mounts inside container go in a regular way.
> > + *
> > + * Note @devpts_once is always set for node. And be careful
> > + * about @else branch below.
> > + */
> > + if (!get_exec_env()->devpts_once && get_exec_env()->_devpts_mnt)
> > + s = sget(fs_type, compare_init_pts_sb, set_anon_super, flags, NULL);
> > + else
> > +#endif
>
> If we initialize ve->devpts_sb lazily (as we do now), we don't need this
> hunk as well as devpts_once flag, do we? This would look cleaner IMO.
1) _devpts_mnt is initialized (mounted) at container start time, as we do for
a number of other subsystems. I think keeping it in that form is better
from a readability point of view, no?
2) The first attempt to mount devpts inside a container should be treated in a
special way (note that the restore procedure starts from inside ve0, so
we can't use ve_is_super here), i.e. the first mount of devpts must always
return the premounted superblock we allocated when the VE was initialized.
That's a dirty hack, but I don't see another way to work around it: CRIU itself
targets the vanilla kernel, which doesn't provide devpts virtualization by
default.
Or do you mean to set _devpts_mnt to NULL by default, drop the init/fini
routines, and use it alone instead of the devpts_once + _devpts_mnt pair?
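
To make point 2 concrete, here is a minimal userspace sketch of the
first-mount rule. It is not the patch itself: struct ve_model, struct
sb_model, regular_mount_model() and devpts_mount_model() are hypothetical
names made up for illustration, the "regular" path is reduced to handing out
a fresh object, and the place where devpts_once gets set is my assumption,
since the quoted hunk doesn't show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types exist in the kernel. */
struct sb_model {
	int id;
};

struct ve_model {
	bool devpts_once;            /* true once the first mount was served */
	struct sb_model *devpts_mnt; /* premounted at VE start, or NULL */
};

#define FRESH_MAX 8
static struct sb_model fresh_pool[FRESH_MAX];
static size_t fresh_used;

/* Stand-in for the regular mount path: hand out a new object. */
static struct sb_model *regular_mount_model(void)
{
	struct sb_model *sb;

	if (fresh_used >= FRESH_MAX)
		return NULL;
	sb = &fresh_pool[fresh_used];
	sb->id = 100 + (int)fresh_used;
	fresh_used++;
	return sb;
}

/*
 * The first mount inside a container returns the premounted superblock,
 * every later mount takes the regular path. On the node devpts_once is
 * always set, so the special branch is never taken there.
 */
static struct sb_model *devpts_mount_model(struct ve_model *ve)
{
	assert(ve != NULL);

	if (!ve->devpts_once && ve->devpts_mnt) {
		/* Assumption: the flag is flipped right here. */
		ve->devpts_once = true;
		return ve->devpts_mnt;
	}
	return regular_mount_model();
}
```

With a container whose devpts_once is clear, the first call returns the
premounted object and any later call falls through to the regular path; with
devpts_once already set (the node case) the premounted object is never
returned.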