[Devel] [PATCH RH9 4/8] ve/tty: vt -- Implement per VE support for console and terminals

Cyrill Gorcunov gorcunov at gmail.com
Thu Oct 7 18:18:49 MSK 2021


Commit 8674c044330fad1458bd59b02f9037fb97e8b7af added stubs for
virtual terminals; they accept writes from the kernel side and
simply drop the data.

This patch moves the code from kernel/ve/console.c to
drivers/tty/pty.c to reuse a couple of pty helpers.

Now we support up to MAX_NR_VTTY_CONSOLES virtual consoles inside a container.
For /dev/console we reserve the first virtual terminal.

Some details on the driver itself:

 - The driver carries per-VE tty instances in the @vtty_idr map. Once
   a VE tries to open a terminal we allocate a tty map internally and
   keep it intact until the VE is destroyed; this allows us not to
   bind to device namespaces (i.e. not to rely on tty_class);

 - Unlike buffered IO in the unix98 driver, once the internal port
   buffer gets full we don't block write operations if there is no
   reader assigned yet, but zap the data. This is done intentionally
   to behave closely to native consoles;

 - The kernel determines which VE requests a terminal using the
   get_exec_env helper, but when the master peer is opened from the
   node (ve0) it uses
   vtty_set_context/vtty_get_context/vtty_drop_context to tell the
   tty layer which @vtty_idr entry to use instead of get_exec_env.
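
The context override from the last item can be modelled in user space
roughly as follows. This is only a sketch: current_veid is a
hypothetical stand-in for get_exec_env()->veid, the function names are
shortened, and the @tty_mutex locking that the real helpers assert on
is left out.

```c
#include <assert.h>

typedef unsigned int envid_t;

/* Zero means "no override": vttys are not opened directly from ve0. */
static envid_t context_veid;

/* Hypothetical stand-in for get_exec_env()->veid. */
static envid_t current_veid;

static void set_context(envid_t veid)
{
	assert(veid != 0);	/* zero is reserved as the "unused" sign */
	context_veid = veid;
}

static void drop_context(void)
{
	context_veid = 0;
}

/* The override wins if set, otherwise fall back to the caller's VE. */
static envid_t get_context(void)
{
	return context_veid ? context_veid : current_veid;
}
```

With the override set to a container id, lookups issued from ve0
resolve to that container's map; after drop_context() they follow the
calling task's VE again.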

https://jira.sw.ru/browse/PSBM-34533
https://jira.sw.ru/browse/PSBM-34532
https://jira.sw.ru/browse/PSBM-34107
https://jira.sw.ru/browse/PSBM-32686
https://jira.sw.ru/browse/PSBM-32685

v2:
 - Rename terminals from vtz to vtty
 - Merge code into /drivers/tty/pty.c to reuse some of
   pty functionality
 - Get rid of two arrays of indices, use one for master
   peers and fetch slaves via @link
 - Drop TTY_VT_OPEN and wait() on it
 - Add vtty_open_slave helper

v3:
 - Reverse the scheme: the peers opened from inside of a
   container are the slave peers, as it was in pcs6
 - Add vtty_set_context/vtty_drop_context/vtty_get_context
   to open needed tty from ve0 context
 - In vtty_open_master reuse existing vtty_lookup, vtty_open
   helpers
 - In ve_vtty_fini zap active tty tracking: such ttys are
   still around because the node has opened the console and
   hasn't yet released the file descriptors associated with the tty.
   The kernel will clean them up once they are closed, but the
   tracking map pointer should be zapped to avoid a NULL dereference

v4:
 - Use lockdep_assert_held in vtty @map operations to make sure
   we're under @tty_mutex
 - vtty_install now requests port memory earlier to simplify
   vtty_install_peer
 - Drop tty_vhangup call from vtty_close, as it was found to
   bring no benefit
 - Drop TTY_BUFFER_PAGE and fix typo in vtty_write_room
 - Rework tty counting to be the same as in pcs6: drivers
   became TTY_DRIVER_TYPE_PTY and @count adjusted accordingly

v5:
 - Treat zero as unused flag in vtty_get_context
 - vtty_printk helpers are dropped off
 - Don't test for exit state in lookup procedure: the kernel
   will do that on its own when slave is opened from inside
   of a container and for ioctl call we do such test explicitly
 - When a pair is to be opened from the node and the existing peer is
   exiting we allocate a new pair early, removing the old one from the
   per-VE ttys map; this is done to speed up open from the node
 - vtty_match is no longer exported into the rest of the tty code
 - When a peer is to be closed we use our own per-VE spinlock to read
   and modify our own and the peer's counters. This is because the
   general tty->close routine is called without tty_mutex held and only
   one peer is locked, thus such modifications are unsafe if done
   locklessly. In the current vanilla kernel there is no need
   for such a lock if Unix ptys are used because master peers are
   always opened first and always get closed, in contrast to
   our driver where either peer end may be opened alone

v6:
 - Reworked tty counting: no need for an extra reference; instead make
   it close to how native Unix98 ptys work: once the master
   is opened it takes the new TTY_PINNED flag, and when it is
   closed with an active slave peer we defer tty destruction until
   both ends are free.

v7:
 - Move MAX_NR_VTTY_CONSOLES from header into pty.c
 - Drop vtty_zap_tty_map
 - Assign @driver_data in vtty_map_set
 - Rename vtty_map_del to vtty_map_clear
 - Merge map cleaning into vtty_map_free
 - Rename @current_veid to @vtty_context_veid
 - Rename TTY_PINNED to TTY_PINNED_BY_OTHER
 - Assign TTY_PINNED_BY_OTHER early in pair creation
 - Wake both ends of a peer in vtty_close

vdavydov:
 - Remove kernel/ve/console.c
 - Drop CONFIG_VTTYS
 - Export vtty_open_master

Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>

Reviewed-by: Vladimir Davydov <vdavydov at parallels.com>
CC: Konstantin Khorenko <khorenko at virtuozzo.com>

+++
ve/tty: WARN_ON commented out in vtty_open_master()

I've just commented it out for now because of compilation issues
and the fact that we are going to rework the tty code anyway.

Reason of the problem: RedHat applied
36697529b5bbe36911e39a6309e7a7c9250d280a
("tty: Replace ldisc locking with ldisc_sem").

(related to rebase to 3.10.0-327.3.1.el7)

To be merged into 9ad5a211409ee78dbd7c92696c87ceca6e435622
("ve/tty: vt -- Implement per VE support for console and terminals").

Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>

tty/pty: lockdep warning fixed in vtty_open_master()

I expect it should fix PSBM-80049

[   96.177828] [ INFO: possible recursive locking detected ]
[   96.179329] 3.10.0-862.ovz.rh7-3.10.0-862.el7 #34 Tainted: G        W      ------------
[   96.181556] ---------------------------------------------
[   96.182956] prl_vzvncserver/2795 is trying to acquire lock:
[   96.184406]  (&tty->legacy_mutex){+.+.+.}, at: [<ffffffff95fa5717>] tty_lock+0x57/0xb0
[   96.186611]
[   96.186611] but task is already holding lock:
[   96.188133]  (&tty->legacy_mutex){+.+.+.}, at: [<ffffffff95fa5717>] tty_lock+0x57/0xb0
[   96.190424]
[   96.190424] other info that might help us debug this:
[   96.192114]  Possible unsafe locking scenario:
[   96.192114]
[   96.193656]        CPU0
[   96.194307]        ----
[   96.194968]   lock(&tty->legacy_mutex);
[   96.196074]   lock(&tty->legacy_mutex);
[   96.197181]
[   96.197181]  *** DEADLOCK ***
[   96.197181]
[   96.198737]  May be due to missing lock nesting notation
[   96.198737]
[   96.200518] 1 lock held by prl_vzvncserver/2795:
[   96.201743]  #0:  (&tty->legacy_mutex){+.+.+.}, at: [<ffffffff95fa5717>] tty_lock+0x57/0xb0

vtty slave should be marked properly to hide lockdep warning

Signed-off-by: Vasily Averin <vvs at virtuozzo.com>

Acked-by: Cyrill Gorcunov <gorcunov at openvz.org>

+++
tty/pty: drop dead vtty code

The code below has been dead since the rebase to 3.10.0-514.el7,
i.e. for about a year, and we still have no related complaints.

Let's drop it.

Signed-off-by: Vasily Averin <vvs at virtuozzo.com>

Acked-by: Cyrill Gorcunov <gorcunov at openvz.org>

+++
vtty: fixed error path in vtty_map_alloc

found by smatch:
drivers/tty/pty.c:935 vtty_map_alloc() warn:
 unsigned 'veid' is never less than zero.

Signed-off-by: Vasily Averin <vvs at virtuozzo.com>

+++
vtty: possible ERR_PTR dereferencing in vtty_open_master

found by smatch:
drivers/tty/pty.c:1306 vtty_open_master() error:
 'tty' dereferencing possible ERR_PTR()

Signed-off-by: Vasily Averin <vvs at virtuozzo.com>
Acked-by: Konstantin Khorenko <khorenko at virtuozzo.com>

v2: do set proper "ret" values on error paths
(cherry-picked from 368f1f2e66d928394c157b505d642439c4450782)
Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>

https://jira.sw.ru/browse/PSBM-132299

+++
ve/tty: vtty fix noctty flag in tty_open after port from vz7

During the vtty port tty_open has been reworked to also cover scenarios
where /dev/console and /dev/tty are opened from inside of a CT (!= VE0).
tty_open is implemented on top of some other helper functions:
- tty_open_by_driver
- tty_lookup_driver
In vz7 these helpers were arranged differently: tty_lookup_driver took
noctty as one of its arguments. In newer kernels this has been reworked
and tty_open deduces the noctty value itself. 'noctty' is a crucial
variable that tells whether the current terminal should become the
controlling terminal or not. For a virtual inside-CT terminal the logic
determining this value used to live in tty_lookup_driver, but that
function no longer has a pointer to noctty, so the same logic has to be
placed in tty_open, where the value is now computed.
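
As a rough user-space illustration of the adjustment (not the kernel
code itself), the check can be reduced to the sketch below. vtty_match
and adjust_noctty are hypothetical names; vtty_match only mirrors the
major/minor test that vtty_driver() performs, and the incoming noctty
value stands for whatever tty_open has already deduced on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TTY_MAJOR		4
#define MAX_NR_VTTY_CONSOLES	12

/* Mirrors the device-number test done by vtty_driver(). */
static bool vtty_match(unsigned int major, unsigned int minor, int *index)
{
	assert(index != NULL);
	if (major == TTY_MAJOR && minor < MAX_NR_VTTY_CONSOLES) {
		*index = (int)minor;
		return true;
	}
	return false;
}

/* Force noctty for the first virtual terminal, keep it otherwise. */
static bool adjust_noctty(bool noctty, unsigned int major, unsigned int minor)
{
	int index = -1;

	if (!noctty && vtty_match(major, minor, &index) && minor == 0)
		noctty = true;
	return noctty;
}
```

Only minor 0, the terminal reserved for the console, is forced to
noctty; the other virtual terminals keep the value computed earlier.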

https://jira.sw.ru/browse/PSBM-132299

Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>

+++
ve/tty: Fix NULL pointer dereference at vtty_open_master error path.

The ported function vtty_open_master crashes the kernel with a NULL pointer
dereference on its error path, while destroying a file object.
This started with ms commit 72c2d53192004845cbc19cd8a30b3212a9288140
'file->f_op is never NULL...'

vtty_open_master used to assign NULL to file->f_op. Let's fix that
by removing this NULL assignment and ensuring that the remaining f_op
operations can safely run in this context. For that let's:
1. make sure that file->f_op->release (tty_release) is safe to call with
file->private_data being NULL, because on the error path it is already destroyed.
2. make sure that f_op->fasync never gets called by clearing the FASYNC
flag manually before fput.
Currently __fput only calls these two file->f_op's, so this should suffice.
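
The two guards can be pictured with a small user-space model. Everything
here is hypothetical scaffolding (struct model_file, MODEL_FASYNC and the
model_* functions are not kernel interfaces); it only assumes, as stated
above, that the final put calls fasync when the flag is set and then
calls release.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FASYNC	0x2000	/* illustrative value only */

struct model_file {
	unsigned int flags;
	void *private_data;
	int fasync_calls;	/* how often the fasync hook ran */
	int release_teardowns;	/* how often release touched the tty */
};

/* Guard 1: release returns early when private_data is already gone. */
static int model_release(struct model_file *f)
{
	if (!f->private_data)
		return 0;
	f->release_teardowns++;
	return 0;
}

/* Models the final put: fasync only with the flag set, then release. */
static void model_fput(struct model_file *f)
{
	if (f->flags & MODEL_FASYNC)
		f->fasync_calls++;
	model_release(f);
}

/* Guard 2: the error path clears the flag before the final put. */
static void model_error_path(struct model_file *f)
{
	assert(f != NULL);
	f->private_data = NULL;		/* per-file tty data already freed */
	f->flags &= ~MODEL_FASYNC;
	model_fput(f);
}
```

On the error path neither hook does any work, while a normal final put
on a fully set up file still runs both.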

https://jira.sw.ru/browse/PSBM-132299

Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>

+++
ve/vtty: n_tty -- Allow write on sole slave vtty peer

In pcs6 the pty counting was somewhat sophisticated, so in
pcs7 we've simplified it to bring as few changes into vanilla
code as possible (introducing the TTY_PINNED_BY_OTHER bit to order
the closing sequence). The new accounting works as expected but
there is a small issue: until the master peer gets a real user hooked
on it (say a container opens /dev/console on its own and writes
its log into it) any write operation returns -EIO because the line
discipline module tests the @count on the other side of a peer.

I think we can add one small code snippet (just the same
as we did in the tty_release() helper) to track such a situation
and allow writes into a sole open vtty.

Basically the issue was that getty inside a container wrote
some data upon the container's startup and a connection from the
node simply didn't get it because the data was lost, which as
a side effect forced a console user to hit "enter" a second
time.
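
The relaxed test can be sketched in user space as below. struct
model_tty and write_gets_eio are hypothetical names, is_vtty_master
stands in for vtty_is_master(), and the tty_hung_up_p() half of the
real condition is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_tty {
	struct model_tty *link;	/* other end of the pair, or NULL */
	int count;		/* open count of this end */
	bool is_vtty_master;	/* stands in for vtty_is_master() */
};

/*
 * True when the write would fail with -EIO: the other end exists,
 * nobody has it open, and it is not a vtty master.
 */
static bool write_gets_eio(const struct model_tty *tty)
{
	assert(tty != NULL);
	return tty->link && !tty->link->count && !tty->link->is_vtty_master;
}
```

A slave whose vtty master has not been opened yet can therefore still
be written to, while an ordinary pty pair keeps returning -EIO once the
other end is closed.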

https://jira.sw.ru/browse/PSBM-40740

Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>
Reviewed-by: Vladimir Davydov <vdavydov at virtuozzo.com>

CC: Konstantin Khorenko <khorenko at virtuozzo.com>
CC: Igor Sukhih <igor at parallels.com>
CC: Nikolay Breykin <nbreykin at odin.com>

https://jira.sw.ru/browse/PSBM-132299

(cherry-picked from b84f8d5546698afeb04011f8f24ea548284ae742)
Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>

+++
ve/tty: vtty -- Drop TTY_PINNED_BY_OTHER bit

This bit was introduced during our vtty code rework but in the end
we don't need it: a plain comparison with the slave vtty driver is enough.
So let's drop it, since it might conflict with some new tty bits
in the future.

Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>
Reviewed-by: Vladimir Davydov <vdavydov at virtuozzo.com>

https://jira.sw.ru/browse/PSBM-132299

(cherry-picked from ba9b6c7897be3a591342fe102dd3e4ef6e105a2c)
Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>
---
 drivers/tty/n_tty.c  |   6 +
 drivers/tty/pty.c    | 516 +++++++++++++++++++++++++++++++++++++++++++
 drivers/tty/tty_io.c |  53 ++++-
 include/linux/ve.h   |   7 +
 kernel/ve/vecalls.c  |   3 +
 5 files changed, 576 insertions(+), 9 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 0ec93f1a61f5..f8df123de258 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -49,6 +49,7 @@
 #include <linux/module.h>
 #include <linux/ratelimit.h>
 #include <linux/vmalloc.h>
+#include <linux/ve.h>
 #include "tty.h"
 
 /*
@@ -2279,7 +2280,12 @@ static ssize_t n_tty_write(struct tty_struct *tty, struct file *file,
 			retval = -ERESTARTSYS;
 			break;
 		}
+#ifdef CONFIG_VE
+		if (tty_hung_up_p(file) ||
+		 (tty->link && !tty->link->count && !vtty_is_master(tty->link))) {
+#else
 		if (tty_hung_up_p(file) || (tty->link && !tty->link->count)) {
+#endif
 			retval = -EIO;
 			break;
 		}
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index 74bfabe5b453..c830fe09712c 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -595,6 +595,521 @@ static void __init legacy_pty_init(void)
 static inline void legacy_pty_init(void) { }
 #endif
 
+#if defined(CONFIG_VE)
+
+/*
+ * VTTY architecture overview.
+ *
+ * With VTTY we make /dev/console and /dev/tty[X] virtualized
+ * per container (note the real names may vary because the
+ * kernel itself uses major:minor numbers to distinguish
+ * devices and doesn't care how they are named inside /dev.
+ * /dev/console stands for TTYAUX_MAJOR:1 while /dev/tty[X]
+ * stands for TTY_MAJOR:[0:12]. That said from inside of
+ * VTTY /dev/console is the same as /dev/tty0.
+ *
+ * For every container here is a tty map represented by
+ * vtty_map_t. It carries @veid of VE and associated slave
+ * tty peers.
+ *
+ * map
+ *  veid -> CTID
+ *    vttys -> [ 0 ]
+ *               `- @slave -> link -> @master
+ *             [ 1 ]
+ *               `- @slave -> link -> @master
+ */
+
+#include <linux/ve.h>
+#include <linux/file.h>
+#include <linux/anon_inodes.h>
+
+static struct tty_driver *vttym_driver;
+static struct tty_driver *vttys_driver;
+static DEFINE_IDR(vtty_idr);
+
+static struct file_operations vtty_fops;
+
+#define MAX_NR_VTTY_CONSOLES	(12)
+#define vtty_match_index(idx)	((idx) >= 0 && (idx) < MAX_NR_VTTY_CONSOLES)
+
+bool vtty_is_master(struct tty_struct *tty)
+{
+	return tty->driver == vttym_driver;
+}
+
+typedef struct {
+	envid_t			veid;
+	struct tty_struct	*vttys[MAX_NR_VTTY_CONSOLES];
+} vtty_map_t;
+
+static vtty_map_t *vtty_map_lookup(envid_t veid)
+{
+	lockdep_assert_held(&tty_mutex);
+	return idr_find(&vtty_idr, veid);
+}
+
+static void vtty_map_set(vtty_map_t *map, struct tty_struct *tty)
+{
+	lockdep_assert_held(&tty_mutex);
+	WARN_ON(map->vttys[tty->index]);
+
+	tty->driver_data = tty->link->driver_data = map;
+	map->vttys[tty->index] = tty;
+}
+
+static void vtty_map_clear(struct tty_struct *tty)
+{
+	vtty_map_t *map = tty->driver_data;
+
+	lockdep_assert_held(&tty_mutex);
+	if (map) {
+		struct tty_struct *p = map->vttys[tty->index];
+
+		WARN_ON(p != (tty->driver == vttys_driver ? tty : tty->link));
+		map->vttys[tty->index] = NULL;
+		tty->driver_data = tty->link->driver_data = NULL;
+	}
+}
+
+static void vtty_map_free(vtty_map_t *map)
+{
+	int i;
+
+	lockdep_assert_held(&tty_mutex);
+
+	for (i = 0; i < MAX_NR_VTTY_CONSOLES; i++) {
+		struct tty_struct *tty = map->vttys[i];
+		if (!tty)
+			continue;
+		tty->driver_data = tty->link->driver_data = NULL;
+	}
+
+	idr_remove(&vtty_idr, map->veid);
+	kfree(map);
+}
+
+static vtty_map_t *vtty_map_alloc(envid_t veid)
+{
+	vtty_map_t *map = kzalloc(sizeof(*map), GFP_KERNEL);
+
+	lockdep_assert_held(&tty_mutex);
+	if (map) {
+		int id;
+
+		map->veid = veid;
+		id = idr_alloc(&vtty_idr, map, veid, veid + 1, GFP_KERNEL);
+		if (id < 0) {
+			kfree(map);
+			return ERR_PTR(id);
+		}
+	} else
+		map = ERR_PTR(-ENOMEM);
+	return map;
+}
+
+/*
+ * vttys are never supposed to be opened from inside
+ * of VE0 except special ioctl call, so treat zero as
+ * "unused" sign.
+ */
+static envid_t vtty_context_veid;
+
+static void vtty_set_context(envid_t veid)
+{
+	lockdep_assert_held(&tty_mutex);
+	WARN_ON(!veid);
+	vtty_context_veid = veid;
+}
+
+static void vtty_drop_context(void)
+{
+	lockdep_assert_held(&tty_mutex);
+	vtty_context_veid = 0;
+}
+
+static envid_t vtty_get_context(void)
+{
+	lockdep_assert_held(&tty_mutex);
+	return vtty_context_veid ?: get_exec_env()->veid;
+}
+
+static struct tty_struct *vtty_lookup(struct tty_driver *driver,
+				      struct file *file, int idx)
+{
+	vtty_map_t *map = vtty_map_lookup(vtty_get_context());
+	struct tty_struct *tty;
+
+	if (!vtty_match_index(idx))
+		return ERR_PTR(-EIO);
+
+	/*
+	 * Nothing ever been opened yet, allocate a new
+	 * tty map together with both peers from the scratch
+	 * in install procedure.
+	 */
+	if (!map)
+		return NULL;
+
+	tty = map->vttys[idx];
+	if (tty) {
+		if (driver == vttym_driver)
+			tty = tty->link;
+		WARN_ON(!tty);
+	}
+	return tty;
+}
+
+static void vtty_standard_install(struct tty_driver *driver,
+				  struct tty_struct *tty)
+{
+	tty_init_termios(tty);
+
+	tty_driver_kref_get(driver);
+	tty_port_init(tty->port);
+	tty->port->itty = tty;
+}
+
+static struct tty_struct *vtty_install_peer(struct tty_driver *driver,
+					    struct tty_port *port, int index)
+{
+	struct tty_struct *tty;
+
+	tty = alloc_tty_struct(driver, index);
+	if (!tty)
+		return ERR_PTR(-ENOMEM);
+	tty->port = port;
+	vtty_standard_install(driver, tty);
+	return tty;
+}
+
+static int vtty_install(struct tty_driver *driver, struct tty_struct *tty)
+{
+	envid_t veid = vtty_get_context();
+	struct tty_port *peer_port;
+	struct tty_struct *peer;
+	vtty_map_t *map;
+	int ret;
+
+	WARN_ON_ONCE(driver != vttys_driver);
+
+	map = vtty_map_lookup(veid);
+	if (!map) {
+		map = vtty_map_alloc(veid);
+		if (IS_ERR(map))
+			return PTR_ERR(map);
+	}
+
+	tty->port = kzalloc(sizeof(*tty->port), GFP_KERNEL);
+	peer_port = kzalloc(sizeof(*peer_port), GFP_KERNEL);
+	if (!tty->port || !peer_port) {
+		ret = -ENOMEM;
+		goto err_free;
+	}
+
+	peer = vtty_install_peer(vttym_driver, peer_port, tty->index);
+	if (IS_ERR(peer)) {
+		ret = PTR_ERR(peer);
+		goto err_free;
+	}
+
+	vtty_standard_install(vttys_driver, tty);
+	tty->count++;
+
+	tty->link = peer;
+	peer->link = tty;
+
+	vtty_map_set(map, tty);
+	return 0;
+
+err_free:
+	kfree(tty->port);
+	kfree(peer_port);
+	return ret;
+}
+
+static int vtty_open(struct tty_struct *tty, struct file *filp)
+{
+	set_bit(TTY_THROTTLED, &tty->flags);
+	return 0;
+}
+
+static void vtty_close(struct tty_struct *tty, struct file *filp)
+{
+	int count = (tty->driver == vttys_driver) ? 2 : 1;
+	if (tty->count <= count) {
+		wake_up_interruptible(&tty->read_wait);
+		wake_up_interruptible(&tty->write_wait);
+
+		wake_up_interruptible(&tty->link->read_wait);
+		wake_up_interruptible(&tty->link->write_wait);
+	}
+}
+
+static void vtty_shutdown(struct tty_struct *tty)
+{
+	vtty_map_clear(tty);
+}
+
+static int vtty_write(struct tty_struct *tty,
+		      const unsigned char *buf, int count)
+{
+	struct tty_struct *peer = tty->link;
+
+	if (tty->flow.stopped)
+		return 0;
+
+	if (count > 0) {
+		count = tty_insert_flip_string(peer->port, buf, count);
+		if (count) {
+			tty_flip_buffer_push(peer->port);
+			tty_wakeup(tty);
+		} else {
+			int _count = (tty->driver == vttym_driver) ? 2 : 1;
+			/*
+			 * Flush the slave reader if noone
+			 * is actually hooked on. Otherwise
+			 * wait until reader fetch all data.
+			 */
+			if (peer->count < _count)
+				tty_perform_flush(peer, TCIFLUSH);
+		}
+	}
+
+	return count;
+}
+
+static unsigned int vtty_write_room(struct tty_struct *tty)
+{
+	struct tty_struct *peer = tty->link;
+	int count = (tty->driver == vttym_driver) ? 2 : 1;
+
+	if (tty->flow.stopped)
+		return 0;
+
+	if (peer->count < count)
+		return 2048;
+
+	return tty_buffer_space_avail(peer->port);
+}
+
+static void vtty_remove(struct tty_driver *driver, struct tty_struct *tty)
+{
+}
+
+static const struct tty_operations vtty_ops = {
+	.lookup		= vtty_lookup,
+	.install	= vtty_install,
+	.open		= vtty_open,
+	.close		= vtty_close,
+	.shutdown	= vtty_shutdown,
+	.cleanup	= pty_cleanup,
+	.write		= vtty_write,
+	.write_room	= vtty_write_room,
+	.set_termios	= pty_set_termios,
+	.unthrottle	= pty_unthrottle,
+	.remove		= vtty_remove,
+};
+
+struct tty_driver *vtty_console_driver(int *index)
+{
+	*index = 0;
+	return vttys_driver;
+}
+
+struct tty_driver *vtty_driver(dev_t dev, int *index)
+{
+	if (MAJOR(dev) == TTY_MAJOR &&
+	    MINOR(dev) < MAX_NR_VTTY_CONSOLES) {
+		*index = MINOR(dev);
+		return vttys_driver;
+	}
+	return NULL;
+}
+
+static void ve_vtty_fini(void *data)
+{
+	struct ve_struct *ve = data;
+	vtty_map_t *map;
+
+	mutex_lock(&tty_mutex);
+	map = vtty_map_lookup(ve->veid);
+	if (map)
+		vtty_map_free(map);
+	mutex_unlock(&tty_mutex);
+}
+
+static struct ve_hook vtty_hook = {
+	.fini           = ve_vtty_fini,
+	.priority       = HOOK_PRIO_DEFAULT,
+	.owner          = THIS_MODULE,
+};
+
+static int __init vtty_init(void)
+{
+#define VTTY_DRIVER_ALLOC_FLAGS			\
+	(TTY_DRIVER_REAL_RAW		|	\
+	 TTY_DRIVER_RESET_TERMIOS	|	\
+	 TTY_DRIVER_DYNAMIC_DEV		|	\
+	 TTY_DRIVER_INSTALLED		|	\
+	 TTY_DRIVER_DEVPTS_MEM)
+
+	vttym_driver = tty_alloc_driver(MAX_NR_VTTY_CONSOLES,
+					VTTY_DRIVER_ALLOC_FLAGS);
+	if (IS_ERR(vttym_driver))
+		panic(pr_fmt("Can't allocate master vtty driver\n"));
+
+	vttys_driver = tty_alloc_driver(MAX_NR_VTTY_CONSOLES,
+					VTTY_DRIVER_ALLOC_FLAGS);
+	if (IS_ERR(vttys_driver))
+		panic(pr_fmt("Can't allocate slave vtty driver\n"));
+
+	vttym_driver->driver_name		= "vtty_master";
+	vttym_driver->name			= "vttym";
+	vttym_driver->name_base			= 0;
+	vttym_driver->major			= 0;
+	vttym_driver->minor_start		= 0;
+	vttym_driver->type			= TTY_DRIVER_TYPE_PTY;
+	vttym_driver->subtype			= PTY_TYPE_MASTER;
+	vttym_driver->init_termios		= tty_std_termios;
+	vttym_driver->init_termios.c_iflag	= 0;
+	vttym_driver->init_termios.c_oflag	= 0;
+
+	/* 38400 baud rate, 8 bit char size, enable receiver */
+	vttym_driver->init_termios.c_cflag	= B38400 | CS8 | CREAD;
+	vttym_driver->init_termios.c_lflag	= 0;
+	vttym_driver->init_termios.c_ispeed	= 38400;
+	vttym_driver->init_termios.c_ospeed	= 38400;
+	tty_set_operations(vttym_driver, &vtty_ops);
+
+	vttys_driver->driver_name		= "vtty_slave";
+	vttys_driver->name			= "vttys";
+	vttys_driver->name_base			= 0;
+	vttys_driver->major			= 0;
+	vttys_driver->minor_start		= 0;
+	vttys_driver->type			= TTY_DRIVER_TYPE_PTY;
+	vttys_driver->subtype			= PTY_TYPE_SLAVE;
+	vttys_driver->init_termios		= tty_std_termios;
+	vttys_driver->init_termios.c_iflag	= 0;
+	vttys_driver->init_termios.c_oflag	= 0;
+	vttys_driver->init_termios.c_cflag	= B38400 | CS8 | CREAD;
+	vttys_driver->init_termios.c_lflag	= 0;
+	vttys_driver->init_termios.c_ispeed	= 38400;
+	vttys_driver->init_termios.c_ospeed	= 38400;
+	tty_set_operations(vttys_driver, &vtty_ops);
+
+	if (tty_register_driver(vttym_driver))
+		panic(pr_fmt("Can't register master vtty driver\n"));
+
+	if (tty_register_driver(vttys_driver))
+		panic(pr_fmt("Can't register slave vtty driver\n"));
+
+	ve_hook_register(VE_SS_CHAIN, &vtty_hook);
+	tty_default_fops(&vtty_fops);
+	return 0;
+}
+
+int vtty_open_master(envid_t veid, int idx)
+{
+	struct tty_struct *tty;
+	struct file *file;
+	char devname[64];
+	int fd, ret;
+
+	if (!vtty_match_index(idx))
+		return -EIO;
+
+	fd = get_unused_fd_flags(0);
+	if (fd < 0)
+		return fd;
+
+	snprintf(devname, sizeof(devname), "v%utty%d", veid, idx);
+	file = anon_inode_getfile(devname, &vtty_fops, NULL, O_RDWR);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto err_put_unused_fd;
+	}
+	nonseekable_open(NULL, file);
+
+	ret = tty_alloc_file(file);
+	if (ret)
+		goto err_fput;
+
+	/*
+	 * Opening comes from ve0 context so
+	 * setup VE's context until master fetched.
+	 * This is done under @tty_mutex so noone
+	 * else would access it while we're holding
+	 * the lock.
+	 */
+	mutex_lock(&tty_mutex);
+	vtty_set_context(veid);
+
+	tty = vtty_lookup(vttym_driver, NULL, idx);
+	if (IS_ERR(tty)) {
+		ret = PTR_ERR(tty);
+		goto err_install;
+	}
+
+	if (!tty) {
+		tty = tty_init_dev(vttys_driver, idx);
+		if (IS_ERR(tty)) {
+			ret = PTR_ERR(tty);
+			goto err_install;
+		}
+		tty->count--;
+		tty_unlock(tty);
+		tty_set_lock_subclass(tty);
+		tty = tty->link;
+	}
+
+	/* One master at a time */
+	if (tty->count >= 1) {
+		ret = -EBUSY;
+		goto err_install;
+	}
+
+	vtty_drop_context();
+
+	/*
+	 * We're the master peer so increment
+	 * slave counter as well.
+	 */
+	tty_add_file(tty, file);
+	tty->count++;
+	tty->link->count++;
+	fd_install(fd, file);
+	vtty_open(tty, file);
+
+	mutex_unlock(&tty_mutex);
+	ret = fd;
+out:
+	return ret;
+
+err_install:
+	vtty_drop_context();
+	mutex_unlock(&tty_mutex);
+	tty_free_file(file);
+err_fput:
+	/*
+	 * __fput will try to call file->f_op->fasync and file->f_op->release
+	 * We don't want that.
+	 * fasync will not get called without the FASYNC flag.
+	 * release (tty_release in our case) has private_data == NULL checked
+	 * for early return.
+	 */
+	file->f_flags &= ~FASYNC;
+	fput(file);
+err_put_unused_fd:
+	put_unused_fd(fd);
+	goto out;
+}
+EXPORT_SYMBOL(vtty_open_master);
+#else
+static void vtty_init(void) { };
+#endif /* CONFIG_VE */
+
+
 /* Unix98 devices */
 #ifdef CONFIG_UNIX98_PTYS
 static struct cdev ptmx_cdev;
@@ -952,6 +1467,7 @@ static int __init pty_init(void)
 {
 	legacy_pty_init();
 	unix98_pty_init();
+	vtty_init();
 	return 0;
 }
 device_initcall(pty_init);
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 7f8006227451..ca5bc2bad1b9 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -108,6 +108,7 @@
 
 #include <linux/kmod.h>
 #include <linux/nsproxy.h>
+#include <linux/ve.h>
 #include "tty.h"
 
 #undef TTY_DEBUG_HANGUP
@@ -1740,13 +1741,22 @@ EXPORT_SYMBOL_GPL(tty_release_struct);
 
 int tty_release(struct inode *inode, struct file *filp)
 {
-	struct tty_struct *tty = file_tty(filp);
+	struct tty_struct *tty;
 	struct tty_struct *o_tty = NULL;
 	int	do_sleep, final;
 	int	idx;
 	long	timeout = 0;
 	int	once = 1;
 
+	/*
+	 * filp can be released at error path with private_data already
+	 * reverted to NULL, see vtty_open_master.
+	 */
+	if (!filp->private_data)
+		return 0;
+
+	tty = file_tty(filp);
+
 	if (tty_paranoia_check(tty, inode, __func__))
 		return 0;
 
@@ -1932,6 +1942,20 @@ static struct tty_driver *tty_lookup_driver(dev_t device, struct file *filp,
 {
 	struct tty_driver *driver = NULL;
 
+#ifdef CONFIG_VE
+	struct ve_struct *ve = get_exec_env();
+
+	if (!ve_is_super(ve)) {
+		driver = vtty_driver(device, index);
+		if (driver)
+			/*
+			 * noctty = 1 has been removed at porting in hope that
+			 * at tty_open noctty will be set as expected.
+			 */
+			return tty_driver_kref_get(driver);
+	}
+#endif
+
 	switch (device) {
 #ifdef CONFIG_VT
 	case MKDEV(TTY_MAJOR, 0): {
@@ -1944,7 +1968,10 @@ static struct tty_driver *tty_lookup_driver(dev_t device, struct file *filp,
 #endif
 	case MKDEV(TTYAUX_MAJOR, 1): {
 		struct tty_driver *console_driver = console_device(index);
-
+#ifdef CONFIG_VE
+		if (!ve_is_super(ve))
+			console_driver = vtty_console_driver(index);
+#endif
 		if (console_driver) {
 			driver = tty_driver_kref_get(console_driver);
 			if (driver && filp) {
@@ -2051,23 +2078,22 @@ EXPORT_SYMBOL_GPL(tty_kopen_shared);
  *	  - concurrent tty driver removal w/ lookup
  *	  - concurrent tty removal from driver table
  */
-static struct tty_struct *tty_open_by_driver(dev_t device,
-					     struct file *filp)
+static struct tty_struct *tty_open_by_driver(dev_t device, struct inode *inode,
+					     struct file *filp, int *index)
 {
 	struct tty_struct *tty;
 	struct tty_driver *driver = NULL;
-	int index = -1;
 	int retval;
 
 	mutex_lock(&tty_mutex);
-	driver = tty_lookup_driver(device, filp, &index);
+	driver = tty_lookup_driver(device, filp, index);
 	if (IS_ERR(driver)) {
 		mutex_unlock(&tty_mutex);
 		return ERR_CAST(driver);
 	}
 
 	/* check whether we're reopening an existing tty */
-	tty = tty_driver_lookup_tty(driver, filp, index);
+	tty = tty_driver_lookup_tty(driver, filp, *index);
 	if (IS_ERR(tty)) {
 		mutex_unlock(&tty_mutex);
 		goto out;
@@ -2095,7 +2121,7 @@ static struct tty_struct *tty_open_by_driver(dev_t device,
 			tty = ERR_PTR(retval);
 		}
 	} else { /* Returns with the tty_lock held for now */
-		tty = tty_init_dev(driver, index);
+		tty = tty_init_dev(driver, *index);
 		mutex_unlock(&tty_mutex);
 	}
 out:
@@ -2133,6 +2159,7 @@ static int tty_open(struct inode *inode, struct file *filp)
 	int noctty, retval;
 	dev_t device = inode->i_rdev;
 	unsigned saved_flags = filp->f_flags;
+	int index = -1;
 
 	nonseekable_open(inode, filp);
 
@@ -2143,7 +2170,7 @@ static int tty_open(struct inode *inode, struct file *filp)
 
 	tty = tty_open_current_tty(device, filp);
 	if (!tty)
-		tty = tty_open_by_driver(device, filp);
+		tty = tty_open_by_driver(device, inode, filp, &index);
 
 	if (IS_ERR(tty)) {
 		tty_free_file(filp);
@@ -2191,6 +2218,14 @@ static int tty_open(struct inode *inode, struct file *filp)
 		 device == MKDEV(TTYAUX_MAJOR, 1) ||
 		 (tty->driver->type == TTY_DRIVER_TYPE_PTY &&
 		  tty->driver->subtype == PTY_TYPE_MASTER);
+#ifdef CONFIG_VE
+	if (!noctty) {
+		if (vtty_driver(device, &index)) {
+			if (MINOR(device) == 0)
+				noctty = 1;
+		}
+	}
+#endif
 	if (!noctty)
 		tty_open_proc_set_tty(filp, tty);
 	tty_unlock(tty);
diff --git a/include/linux/ve.h b/include/linux/ve.h
index fab80f0da567..91ee80b58ecf 100644
--- a/include/linux/ve.h
+++ b/include/linux/ve.h
@@ -167,6 +167,13 @@ static inline void ve_set_task_start_time(struct ve_struct *ve,
 extern bool current_user_ns_initial(void);
 struct user_namespace *ve_init_user_ns(void);
 
+#ifdef CONFIG_TTY
+extern struct tty_driver *vtty_driver(dev_t dev, int *index);
+extern struct tty_driver *vtty_console_driver(int *index);
+extern int vtty_open_master(envid_t veid, int idx);
+extern bool vtty_is_master(struct tty_struct *tty);
+#endif /* CONFIG_TTY */
+
 extern struct cgroup *cgroup_get_ve_root1(struct cgroup *cgrp);
 
 #define ve_uevent_seqnum       (get_exec_env()->_uevent_seqnum)
diff --git a/kernel/ve/vecalls.c b/kernel/ve/vecalls.c
index 6bb9d477275d..1b23181acc56 100644
--- a/kernel/ve/vecalls.c
+++ b/kernel/ve/vecalls.c
@@ -370,6 +370,9 @@ static int ve_configure(envid_t veid, unsigned int key,
 	struct ve_struct *ve;
 	int err = -ENOKEY;
 
+	if (key == VE_CONFIGURE_OPEN_TTY)
+		return vtty_open_master(veid, val);
+
 	ve = get_ve_by_id(veid);
 	if (!ve)
 		return -EINVAL;
-- 
2.31.1


