[Devel] [RFC rh7 v2] ve/tty: vt -- Implement per VE support for console and terminals

Wed Aug 5 04:59:50 PDT 2015

On Wed, Aug 05, 2015 at 12:51:11PM +0300, Vladimir Davydov wrote:
> > > Why do you add this functionality if it is not supposed to be used? IMO,
> > > if you only allowed opening vtty-slave using ioctl, the implementation
> > > would be much simpler.
> > 
> > The ability of accessing slave peers from inside of VE using cgroups
> > mechanism has been an initial idea of all this code. It happened later
> > when we deside to provide ioctls instead.
> 
> I'm a bit lost :-/ Whose was the idea? I thought the initial idea was to
> port PCS6 stuff as is, but I started to review this code only after
> several iterations had been done, so I might be completely wrong.

At the very first we wanted to move pcs6 code into pcs7 codebase
but overall idea, as far as I remember, was to not use any specific
code (ie our ioctl interfaces and such) until really needed. Thus
in first iterations I didn't use ioctls at all, so any client
can be accessing container's console, not only our prctl/vzctl tool.
One simply may move client/reader into the VE cgroup and open vtty slave
peer to fetch console data. It may be some loggin agent or whatever.
But instead now we using ioctl to open container's console and
I think this is not the best option but let it be.

> > And no, implementation won't be anyhow simplier because you have to
> > find out _which_ VE you're installing terminal from is currently
> > active.
> 
> But you wouldn't have to implement the case when vtty slave (master in
> PCS6) is looked up or installed then. Less branches, easier to read.

Hm? pcs6 does lookup over both masters and slave, see vtty_lookup in pcs6,
it's not anyhow easier.

> > New terminal allocation is happening from inside of lookup/install helpers
> > which have no other data under the hands than driver+index. Since we are
> > going to rip off driver::ve, we can't longer rely on this.
> > 
> > Moreover, currently I'm testing vzctl console CTID index functionality
> > and as you might imagine this routine is called from inside of ve0 context
> > so I added additional helpers which allow to lookup into specified VEs
> > map, instead of get_exec_env.
> > 
> > > BTW, a side question. In PCS6 /dev/console = vtty slave, while in your
> > > code it is master. Why did you decide to reverse things? Why not just
> > > port what we have in PCS6?
> > 
> > Actually I don't understand why it's a slave peers assigned to /dev/console
> > and /dev/tty's. Maybe it was done for some historical reason? I don't know
> > the background. The vtty code is close to pty code in pcs6 and I presume
> > some ideas have been taken: in particular several "readers/writers" might
> > be opening console and ttys, but in terms of pty code only one "master"
> > is allowed (iow to make a pair one have to open ptmx device, next open
> > of the same device cause new pair to generate). So I assume to be somehow
> > close to the ideas of unix98 terminals we provided "slave" peers to emulate
> > console and ttys, because slave peers may be opened several times.
> > 
> > But I decided to make it straight -- there is many to many mapping:
> 
> Sorry, but it is not straight, it is screwed up IMO. If we used Unix98
> ptys (which is what we would do if we were not so lazy), we would have
> pty slave open by a container (/dev/console) and the master open and
> read/flushed by a dispatcher thread. This is how it works in LXC AFAIK.
> Now, we don't want to patch the dispatcher, so we want to make the
> master throw out data automatically if it is not open.

I was about to write a long description why I think it's not
that good, but I see what you mean. Letme try to reverse the
approach and implement pcs6 1:1 behaviour with as much pcs6
code fetched as I could (I already tried but didn't success
can't remember which exactly problems I met).

> > if someone has opened a master peer (/dev/console|/dev/tty),
> > then it's allowed to open it again while it's alive. And because
> > the primary users of these devices are the container itself
> > (agetty service running inside container) they are the master peers.
> 
> The primary user should be a dispatcher, which works outside the
> container and has the master always open. Container can open/close the
> slave, but the master must always be open.
> 
> > 
> > As to why not simply port pcs6 code -- the code is close to one
> > we use in pcs6, except
> > 
> >  - console and ttys are the masters as they should be I think
> 
> I don't think it's right, conceptually, as I explained above.
> 
> >  - we don't patch tty_struct itself to carry pointers to owner_env,
> 
> You don't have to. Place it in tty->driver_data.

And I did.

> >    all VEs are strictly isolated (in turn in pcs6 we carry a global
> >    list of master peers which makes me wonder: every master is
> >    unique in its name, why on earth we keep them in a global list,
> 
> For simplicity I suppose.
> 
> >    this is a mystery to me. having a global list make lookup to be a way
> >    slower as it could be if there are high number of containers
> >    currently running). In the patch each VE has own vtty map and lookup
> >    is going in O(1) time in compare to O(n) as in pcs6
> 
> Nobody cares about the algorithmic complexity here, because it is far
> not a hot path. Anyway, I don't see why one can't use idr to lookup
> masters in PCS6 design.

Every container opens vtty internally (actually a few pairs)
for console and login daemon. Which means every new container
startup is a bit slower if another is already running. So yes,
it's not _that_ hot path but quite far from "nobody cares".

> >  - the rest is due to changes in tty layer itself
> > 
> 
> > >> +static void vtty_cleanup(struct tty_struct *tty)
> > >> +{
> > >> +	cancel_work_sync(&tty->port->buf.work);
> > > 
> > > Why? Isn't this supposed to be done in tty_port_put?
> > 
> > Not sure I follow this moment. Canceling pending work
> 
> tty_port_put -> tty_port_destructor -> tty_port_destroy ->
> cancel_work_sync

Ah this one. Actually the benefit of using cancel_work_sync
here (and it's called in release_tty too for the same reason)
is to speedup exiting as far as I can say. For our case there
can't be several references on the same port because we're
allocating it by hand and don't reuse, so right, not needed,
thanks.

> 
> > is done in tty_release (ie when file is closed) so
> > we move it here for case when peer is closed on kref-put
> > without tty_release call (otherwise for one peer cancel_work_sync
> > will be called while for other -- not).
> > 
> 
> > >> +static int __init vtty_init(void)
> > >> +{
> > >> +#define TTY_DRIVER_ALLOC_FLAGS			\
> > > 
> > > Please zap this macro and pass the flags as is.
> > 
> > I pass them into two place, so no, not going to do that:
> 
> Please submit a patch upstream that would do the same in
> unix98_pty_init() and legacy_pty_init().

Sole cleanup patches on its own are not welcome
and you know this.

> > if I need a new flag I'll update one macro instead of
> > walking the code and update two places.