[Devel] [RFC rh7 v2] ve/tty: vt -- Implement per VE support for console and terminals
Vladimir Davydov
vdavydov at parallels.com
Tue Aug 4 09:12:27 PDT 2015
On Mon, Aug 03, 2015 at 10:08:19PM +0300, Cyrill Gorcunov wrote:
> Previously in commit 8674c044330fad1458bd59b02f9037fb97e8b7af stubs for
> virtual terminals have been added, they support writes from kernel side
> which simply drops into the void.
>
> In the patch the code has been moved from kernel/ve/console.c
> to drivers/tty/pty.c to reuse a couple of pty helpers.
>
> Now we support up to MAX_NR_VTTY_CONSOLES virtual consoles inside container.
> For /dev/console we reserve first virtual terminal. Terminals represent
> the following mapping
>
> | master slave
> | /dev/cosole|/dev/tty0|/dev/vttym0 /dev/vttys0
> | ...
> | /dev/tty11|/dev/vttym11 /dev/vttys11
>
> major:minor numbers for peers are allocated dynamically by the kenel,
> thus if we need to connect to container's slave peer from the node
> we have to
>
> - scan node's /proc/tty/drivers and find the numbers for
> "vtty_slave" driver, for example
>
> | [root at pcs7 ~]# cat /proc/tty/drivers
> | ...
> | vtty_slave /dev/vttys 252 0-11 console
> | vtty_master /dev/vttym 253 0-11 console
> | ...
>
> - add major:minor pairs into allowed devices, for example
>
> | echo 'c 253:* rwm' > /sys/fs/cgroup/devices/$ctid/devices.allow
>
> Once the bullets above is done the userspace utility may
> open up slave peer and read/write data to/from. Note the
> slave may be opened if only if a master peer been previously
> opened.
>
> Some details on the driver itself:
>
> - The drivers carries per-VE tty instances in @vtty_idr map, once
> VE tries to open a terminal we allocate tty map internally and
> keep it intact until VE destructed, this allow us to not bind
> into device namespaces (ie not rely on tty_class);
>
> - Unlike buffered IO to unix98 driver once internal port buffer
> get full we don't block write operations if there is no reader
> assigned yet but zap them. This is done intentionally to behave
> closely to native consoles;
>
> - The kernel choose which VE request terminal using get_exec_env
> helper thus to connect to a slave peer from a node one have to
> "enter" VE cgroup first.
Why do you add this functionality if it is not supposed to be used? IMO,
if you only allowed opening vtty-slave using ioctl, the implementation
would be much simpler.
BTW, a side question. In PCS6 /dev/console = vtty slave, while in your
code it is master. Why did you decide to reverse things? Why not just
port what we have in PCS6?
>
> Same time here is vtty_open_slave() helper provided which should
> allow to connect to a slave peer from a node directly using ioctl
> call (this is to implement yet).
>
> https://jira.sw.ru/browse/PSBM-34533
> https://jira.sw.ru/browse/PSBM-34532
> https://jira.sw.ru/browse/PSBM-34107
> https://jira.sw.ru/browse/PSBM-32686
> https://jira.sw.ru/browse/PSBM-32685
>
> N.B.: I've done some manual testing with python help
>
> 1) Run container (ctid=200)
> 2) Run the script on the node
>
> | #!/bin/sh
> | master=`cat /proc/tty/drivers | grep vtty_master | awk '{ print $3; }'`
> | slave=`cat /proc/tty/drivers | grep vtty_slave | awk '{ print $3; }'i`
> | echo "c $master:* rwm" > /sys/fs/cgroup/devices/$1/devices.allow
> | echo "c $slave:* rwm" > /sys/fs/cgroup/devices/$1/devices.allow
> | vzctl exec $1 mknod /dev/vttym5 c $master 5
> | vzctl exec $1 mknod /dev/vttys5 c $slave 5
> 3) In container run two instances of python where each does
> a)
> | >>> f=open("/dev/vttym5", "r+", 0)
> | >>> f.write("aaa")
> b)
> | >>> f=open("/dev/vttys5", "r+", 0)
> | >>> f.read(3)
> | >>> 'aaa'
>
> v2:
> - Rename terminals from vtz to vtty
> - Merge code into /drivers/tty/pty.c to reuse some of
> pty functionality
> - Get rid of two array of indices, use one for master
> peers and fetch slaves via @link
> - Drop TTY_VT_OPEN and wait() on it
> - Add vtty_open_slave helper
>
> Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>
> CC: Vladimir Davydov <vdavydov at virtuozzo.com>
> CC: Konstantin Khorenko <khorenko at virtuozzo.com>
> CC: Pavel Emelyanov <xemul at virtuozzo.com>
> CC: Andrey Vagin <avagin at virtuozzo.com>
> CC: Igor Sukhih <igor at parallels.com>
> ---
> drivers/tty/pty.c | 543 +++++++++++++++++++++++++++++++++++++++++++++++++++
> drivers/tty/tty_io.c | 29 +-
> include/linux/tty.h | 3
> include/linux/ve.h | 28 +-
> kernel/ve/Makefile | 2
> kernel/ve/console.c | 382 -----------------------------------
> 6 files changed, 572 insertions(+), 415 deletions(-)
>
> FWIW I didn't test yet the opening slave from the node because
> it requires vzctl patching together with the kernel. Will do
> that on top.
>
> Index: linux-pcs7.git/drivers/tty/pty.c
> ===================================================================
> --- linux-pcs7.git.orig/drivers/tty/pty.c
> +++ linux-pcs7.git/drivers/tty/pty.c
> @@ -26,12 +26,21 @@
>
> #include <bc/misc.h>
>
> +#include <linux/ve.h>
> +#include <linux/file.h>
> +#include <linux/anon_inodes.h>
> +
> #ifdef CONFIG_UNIX98_PTYS
> static struct tty_driver *ptm_driver;
> static struct tty_driver *pts_driver;
> static DEFINE_MUTEX(devpts_mutex);
> #endif
>
> +#if defined(CONFIG_VE) && defined(CONFIG_UNIX98_PTYS)
> +static int __init vtty_init(void);
> +static struct file_operations vtty_fops;
> +#endif
> +
> static void pty_close(struct tty_struct *tty, struct file *filp)
> {
> BUG_ON(!tty);
> @@ -824,6 +833,540 @@ static int __init pty_init(void)
> {
> legacy_pty_init();
> unix98_pty_init();
> +#ifdef CONFIG_VE
> + vtty_init();
> +#endif
> return 0;
> }
> module_init(pty_init);
> +
> +#if defined(CONFIG_VE) && defined(CONFIG_UNIX98_PTYS)
> +
> +/*
> + * VTTY architecture overview.
> + *
> + * With VTTY we make /dev/console and /dev/tty[X] virtualized
> + * per container (note the real names may vary because the
> + * kernel itself uses major:minor numbers to distinguish
> + * devices and doesn't care how they are named inside /dev.
> + * /dev/console stands for TTYAUX_MAJOR:1 while /dev/tty[X]
> + * stands for TTY_MAJOR:[0:12]. That said from inside of
> + * VTTY /dev/console is the same as /dev/tty0.
> + *
> + * For every container here is a tty map represented by
> + * vtty_map_t. It carries id of VE and associated masters
> + * tty peers. Upon any master peer open we create
> + * associated slave peer and mark master inside tty map
> + * which allows to reopen it on user request, unlike traditional
> + * Unix98 terminals where only one master is allowed. In turn
> + * when user asks to open a slave peer we lookup for master
> + * first and then use @link member to fetch a slave.
> + *
> + * The major number for vtty is allocated dynamically upon
> + * kernel start and should be found via /proc/tty/drivers output
> + * using vtty[m|s]_driver->driver_name. While the primary purpose
> + * to containerize console and terminals the kernel creates real
> + * vttym[0-11] and vttys[0-11] devices for master and slave peers
> + * appropriately. Thus to read console output one have to open
> + * vttys0 device from inside of container's context.
> + *
> + * Same time to be able to open slave peer from a node we
> + * provide vtty_open_slave helper.
> + */
> +static struct tty_driver *vttym_driver;
> +static struct tty_driver *vttys_driver;
> +static DEFINE_IDR(vtty_idr);
> +
> +#undef VTTY_DEBUG
> +#ifdef VTTY_DEBUG
> +#define vtty_printk(fmt, ...) \
> + printk("%20s: %4d: " fmt, \
> + __func__, __LINE__, ##__VA_ARGS__)
> +#define vtty_printk_one(__tty) \
> + vtty_printk("tty %p count %4d flags 0x%-8lx\n", \
> + __tty, __tty->count, __tty->flags)
> +#define vtty_printk_pair(__tty) \
> + vtty_printk("tty %p count %4d flags 0x%-8lx link %p count %4d flags 0x%-8lx\n", \
> + __tty, __tty->count, __tty->flags, \
> + __tty->link, __tty->link->count, __tty->link->flags)
> +#else
> +#define vtty_printk(fmt, ...)
> +#define vtty_printk_one(__tty)
> +#define vtty_printk_pair(__tty)
> +#endif
> +
> +#define driver_is_master(d) ((d) == vttym_driver)
Call it vtty_driver_is_master to avoid conflicts.
> +#define tty_is_master(t) driver_is_master((t)->driver)
vtty_is_master
> +
> +typedef struct {
> + envid_t veid;
> + struct tty_struct *vttys[MAX_NR_VTTY_CONSOLES];
> + spinlock_t lock;
> +} vtty_map_t;
> +
> +static vtty_map_t *vtty_map_lookup(envid_t veid)
> +{
> + return idr_find(&vtty_idr, veid);
> +}
> +
> +static void vtty_map_remove(vtty_map_t *map)
> +{
> + vtty_printk("map %p id %d\n", map, map ? map->veid : -1);
> + if (map) {
> + idr_remove(&vtty_idr, map->veid);
> + kfree(map);
> + }
> +}
> +
> +/*
> + * While vtty_install and vtty_shutdown are ordered by tty_mutex,
> + * vtty_cleanup can be called in any order so use spinlock to
> + * not throttle in a midle of zapping.
> + */
Can't you call vtty_map_zap from ->shutdown?
> +static void vtty_map_set(vtty_map_t *map, struct tty_struct *tty, int index)
> +{
> + vtty_printk("map %p id %d tty %p index %d\n",
> + map, map->veid, tty, index);
> + spin_lock(&map->lock);
> + map->vttys[index] = tty;
> + spin_unlock(&map->lock);
> +}
> +
> +static void vtty_map_zap(vtty_map_t *map, struct tty_struct *tty, int index)
> +{
> + vtty_printk("map %p id %d tty %p index %d\n", map, map->veid, tty, index);
> + spin_lock(&map->lock);
> + if (map->vttys[index] == tty)
Why is this check?
> + map->vttys[index] = NULL;
> + spin_unlock(&map->lock);
> +}
> +
> +static vtty_map_t *vtty_map_alloc(envid_t veid)
> +{
> + vtty_map_t *map = kzalloc(sizeof(*map), GFP_KERNEL);
> +
> + if (map) {
> + map->veid = veid;
> + veid = idr_alloc(&vtty_idr, map, veid, veid + 1, GFP_KERNEL);
> + if (veid < 0) {
> + kfree(map);
> + return ERR_PTR(veid);
> + }
> + spin_lock_init(&map->lock);
> + } else
> + map = ERR_PTR(-ENOMEM);
> +
> + vtty_printk("map %p id %d\n", map, veid);
> + return map;
> +}
> +
> +static inline int tty_is_exiting(struct tty_struct *tty)
vtty_is_exiting
BTW, why is testing TTY_CLOSING not enough?
> +{
> + return test_bit(TTY_CLOSING, &tty->flags) ||
> + test_bit(TTY_HUPPING, &tty->flags) ||
> + test_bit(TTY_LDISC_CHANGING, &tty->flags) ||
> + tty->count < 1;
> +}
> +
> +static struct tty_struct *vtty_lookup(struct tty_driver *driver,
> + struct inode *inode, int idx)
> +{
> + vtty_map_t *map = vtty_map_lookup(get_exec_env()->veid);
> + struct tty_struct *tty;
> +
> + if (idx < 0 || idx >= MAX_NR_VTTY_CONSOLES)
> + return ERR_PTR(-EIO);
> +
> + vtty_printk("driver %s index %d\n", driver->driver_name, idx);
> + /*
> + * Nothing ever been opened yet, allocate a new
> + * tty map together with both peers from the scratch
> + * in install procedure.
> + */
> + if (!map)
> + return NULL;
> +
> + tty = map->vttys[idx];
> + if (tty) {
> + /*
> + * Fetching existing tty is a bit tricky:
> + * the pair may exist but one of the peer
> + * may start exiting so the caller should
> + * wait until exit complete and re-ask for
> + * a new peer pair. For this we check the
> + * master peer to be alive first and then
> + * the slave peer state if been requested.
> + */
> + if (tty_is_exiting(tty))
> + tty = ERR_PTR(-ENXIO);
tty = NULL ?
> + else if (!driver_is_master(driver)) {
> + tty = tty->link;
> + if (tty_is_exiting(tty))
> + tty = ERR_PTR(-ENXIO);
ditto
> + }
> + }
> + vtty_printk("tty %p count %4d\n",
> + tty, IS_ERR_OR_NULL(tty) ? -1 : tty->count);
> + return tty;
> +}
> +
> +static int vtty_standard_install(struct tty_driver *driver, struct tty_struct *tty)
> +{
> + int ret = tty_init_termios(tty);
> + if (ret)
> + return ret;
> +
> + tty_driver_kref_get(driver);
> + tty->count++;
> + if (driver_is_master(driver))
> + vtty_map_set(tty->driver_data, tty, tty->index);
It's not that standard then. Move this to vtty_install?
> + return 0;
> +}
> +
> +static struct tty_struct *vtty_install_slave(vtty_map_t *map, int index)
> +{
> + struct tty_struct *tty;
> +
> + tty = alloc_tty_struct();
> + if (!tty)
> + return ERR_PTR(-ENOMEM);
> + initialize_tty_struct(tty, vttys_driver, index);
> +
> + tty->port = kzalloc(sizeof(*tty->port), GFP_KERNEL);
> + if (!tty->port) {
> + deinitialize_tty_struct(tty);
> + free_tty_struct(tty);
> + return ERR_PTR(-ENOMEM);
> + }
> + tty->driver_data = map;
You do exactly the same thing is in vtty_install. Move this to
vtty_standard_install?
> + WARN_ON(vtty_standard_install(vttys_driver, tty));
Make vtty_standard_install return void and wrap this WARN_ON around
tty_init_termios please.
> + tty_port_init(tty->port);
> + tty->port->itty = tty;
Again, this could live in vtty_standard_install.
> +
> + set_bit(TTY_EXTRA_REF, &tty->flags);
> + return tty;
> +}
> +
> +static int vtty_install(struct tty_driver *driver, struct tty_struct *tty)
> +{
> + envid_t veid = get_exec_env()->veid;
> + struct tty_struct *peer;
> + vtty_map_t *map;
> +
> + map = vtty_map_lookup(veid);
> + if (!map) {
> + map = vtty_map_alloc(veid);
> + if (IS_ERR(map))
> + return PTR_ERR(map);
> + }
> + tty->driver_data = map;
> +
> + tty->port = kzalloc(sizeof(*tty->port), GFP_KERNEL);
> + if (!tty->port)
> + return -ENOMEM;
> +
> + peer = vtty_install_slave(map, tty->index);
> + if (IS_ERR(peer)) {
> + vtty_map_zap(map, tty, tty->index);
> + kfree(tty->port);
> + return PTR_ERR(peer);
> + }
> + WARN_ON(vtty_standard_install(driver, tty));
> + tty_port_init(tty->port);
> + tty->port->itty = tty;
> +
> + tty->link = peer;
> + peer->link = tty;
> +
> + vtty_printk_pair(tty);
> + return 0;
> +}
> +
> +static int vtty_open(struct tty_struct *tty, struct file *filp)
> +{
> + if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
> + return -EIO;
> +
> + clear_bit(TTY_IO_ERROR, &tty->flags);
> + clear_bit(TTY_OTHER_CLOSED, &tty->flags);
You check that this bit is unset a couple of lines above.
> + clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> + set_bit(TTY_THROTTLED, &tty->flags);
> +
> + vtty_printk_pair(tty);
> + return 0;
> +}
> +
> +static void vtty_close(struct tty_struct *tty, struct file *filp)
> +{
> + struct tty_struct *peer = tty->link;
> +
> + if (tty->count > 1)
> + return;
> +
> + if (test_bit(TTY_IO_ERROR, &tty->flags))
> + return;
> +
> + wake_up_interruptible(&tty->read_wait);
> + wake_up_interruptible(&tty->write_wait);
> +
> + wake_up_interruptible(&peer->read_wait);
> + wake_up_interruptible(&peer->write_wait);
> +
> + set_bit(TTY_IO_ERROR, &tty->flags);
> + set_bit(TTY_OTHER_CLOSED, &peer->flags);
> +
> + /*
> + * TTY_EXTRA_REF set for the peers which
> + * carries additional @count (ie peers
> + * which were created as a pair to a master
> + * one, thus when master is closed we drop
> + * the reference so slave will be closed
> + * as well).
> + */
> + if (test_bit(TTY_EXTRA_REF, &peer->flags)) {
> + clear_bit(TTY_EXTRA_REF, &peer->flags);
> + peer->count--;
> + }
> +
> + if (tty_is_master(tty)) {
> + tty_unlock(tty);
> + tty_vhangup(peer);
> + tty_lock(tty);
> + }
> +
> + vtty_printk_pair(tty);
> +}
> +
> +static void vtty_cleanup(struct tty_struct *tty)
> +{
> + cancel_work_sync(&tty->port->buf.work);
Why? Isn't this supposed to be done in tty_port_put?
> +
> + /*
> + * Make sure line discipline is off already.
> + * It is just for debug reasons.
> + */
> + WARN_ON_ONCE(!test_bit(TTY_LDISC_HALTED, &tty->flags));
> + WARN_ON_ONCE(test_bit(TTY_LDISC, &tty->flags));
> +
> + if (tty_is_master(tty))
> + vtty_map_zap(tty->driver_data, tty, tty->index);
> +
> + tty_port_put(tty->port);
> + vtty_printk_one(tty);
> +}
> +
> +static int vtty_write(struct tty_struct *tty, const unsigned char *buf, int count)
> +{
> + struct tty_struct *peer = tty->link;
> +
> + if (tty->stopped)
> + return 0;
> +
> + if (count > 0) {
> + count = tty_insert_flip_string(peer->port, buf, count);
> + if (count) {
> + tty_flip_buffer_push(peer->port);
> + tty_wakeup(tty);
> + } else {
> + int _count = test_bit(TTY_EXTRA_REF, &tty->flags) ? 2 : 1;
> + /*
> + * Flush the slave reader if noone
> + * is actually hooked on. Otherwise
> + * wait until reader fetch all data.
> + */
> + if (peer->count < _count)
How can peer->count be less than 1 here?
> + tty_perform_flush(peer, TCIFLUSH);
> + }
> + }
> +
> + return count;
> +}
> +
> +static const struct tty_operations vtty_ops = {
> + .lookup = vtty_lookup,
> + .install = vtty_install,
> + .open = vtty_open,
> + .close = vtty_close,
> + .cleanup = vtty_cleanup,
> + .write = vtty_write,
> + .write_room = pty_space,
That's wrong. There must always be enough room if slave is not open.
> + .set_termios = pty_set_termios,
> + .unthrottle = pty_unthrottle,
> + .remove = pty_unix98_remove,
> +};
> +
> +struct tty_driver *vtty_console_driver(int *index)
> +{
> + *index = 0;
> + return vttym_driver;
> +}
> +
> +int vtty_match(dev_t device)
> +{
> + return MAJOR(device) == TTY_MAJOR && MINOR(device) < MAX_NR_VTTY_CONSOLES;
> +}
> +
> +struct tty_driver *vtty_driver(dev_t dev, int *index)
> +{
> + BUG_ON(MINOR(dev) >= MAX_NR_VTTY_CONSOLES);
> +
> + *index = MINOR(dev);
> + return vttym_driver;
> +}
> +
> +static void ve_vtty_fini(void *data)
> +{
> + struct ve_struct *ve = data;
> + vtty_map_remove(vtty_map_lookup(ve->veid));
Making vtty_map_remove take veid would make the code more accurate IMO.
> +}
> +
> +static struct ve_hook vtty_hook = {
> + .fini = ve_vtty_fini,
> + .priority = HOOK_PRIO_DEFAULT,
> + .owner = THIS_MODULE,
> +};
> +
> +static int __init vtty_init(void)
> +{
> +#define TTY_DRIVER_ALLOC_FLAGS \
Please zap this macro and pass the flags as is.
> + (TTY_DRIVER_REAL_RAW | \
> + TTY_DRIVER_RESET_TERMIOS | \
> + TTY_DRIVER_DEVPTS_MEM)
> +
> + vttym_driver = tty_alloc_driver(MAX_NR_VTTY_CONSOLES, TTY_DRIVER_ALLOC_FLAGS);
> + if (IS_ERR(vttym_driver))
> + panic(pr_fmt("Can't allocate master vtty driver\n"));
> +
> + vttys_driver = tty_alloc_driver(MAX_NR_VTTY_CONSOLES, TTY_DRIVER_ALLOC_FLAGS);
> + if (IS_ERR(vttys_driver))
> + panic(pr_fmt("Can't allocate slave vtty driver\n"));
[...]
More information about the Devel
mailing list