[Devel] [PATCH vz8 v5 5/5] trusted/ve/mmap: Protect from unsecure library load from CT image

Konstantin Khorenko khorenko at virtuozzo.com
Tue Jun 8 19:31:30 MSK 2021


From: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>

The current kernel prevents a privileged task from running insecure
binaries via the exec or uselib syscalls.

A binary is considered insecure if a series of sub-checks finds that it
is located inside a Container image. Running such a binary is unsafe and
is only allowed if the admin has explicitly permitted executing code
from that image or on the system as a whole.

The problem is that uselib is not the only way to link a dynamic library
into a running process. A simple userspace linker reads the ELF headers
and loads all loadable sections from the file into the process address
space via the mmap syscall. If a section is executable, the linker mmaps
it with the PROT_EXEC flag. This makes it possible to load a potentially
malicious library into a privileged task, and there is no mechanism to
stop it.

* This can happen if a privileged task has entered a mount namespace
  that belongs to some container and calls a function that is not
  yet linked into the process.
* Another way to link against a Container library is to pass the
  LD_LIBRARY_PATH=/vz/root/VEID/lib64 env var to a starting process.

In both cases any lazy-linked function will trigger library loading at
the moment the function from that library gets called for the first
time.

To stop code from a Container library from being executed in a
host-level process, let's add a ve_check_trusted_mmap() check to the
mmap code, applied when mmap is called for a non-anonymous (file-backed)
mapping with PROT_EXEC.

https://jira.sw.ru/browse/PSBM-129741

Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>
Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
Reviewed-by: Konstantin Khorenko <khorenko at virtuozzo.com>
---
 include/linux/ve.h |  1 +
 kernel/ve/ve.c     | 21 +++++++++++++++++++++
 mm/util.c          |  5 +++++
 3 files changed, 27 insertions(+)

diff --git a/include/linux/ve.h b/include/linux/ve.h
index edf4e95d97e7..42883b4b898a 100644
--- a/include/linux/ve.h
+++ b/include/linux/ve.h
@@ -163,6 +163,7 @@ extern void put_ve(struct ve_struct *ve);
 void ve_stop_ns(struct pid_namespace *ns);
 void ve_exit_ns(struct pid_namespace *ns);
 bool ve_check_trusted_exec(struct file *file, struct filename *name);
+bool ve_check_trusted_mmap(struct file *file);
 
 #ifdef CONFIG_TTY
 #define MAX_NR_VTTY_CONSOLES	(12)
diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
index 1ff815f30a4d..a04cdd5fe1f2 100644
--- a/kernel/ve/ve.c
+++ b/kernel/ve/ve.c
@@ -1822,6 +1822,27 @@ static bool ve_check_trusted_file(struct file *file)
 #define SIGSEGV_RATELIMIT_INTERVAL	(24 * 60 * 60 * HZ)
 #define SIGSEGV_RATELIMIT_BURST		3
 
+bool ve_check_trusted_mmap(struct file *file)
+{
+	const char *filename = "";
+
+	static DEFINE_RATELIMIT_STATE(sigsegv_rs, SIGSEGV_RATELIMIT_INTERVAL,
+						  SIGSEGV_RATELIMIT_BURST);
+	if (ve_check_trusted_file(file))
+		return true;
+
+	if (!__ratelimit(&sigsegv_rs))
+		return false;
+
+	if (file->f_path.dentry)
+		filename = file->f_path.dentry->d_name.name;
+
+	WARN(1, "VE0 %s tried to map code from file '%s' from VEX\n",
+			current->comm, filename);
+	force_sigsegv(SIGSEGV, current);
+	return false;
+}
+
 /*
  * We don't want a VE0-privileged user intentionally or by mistake
  * to execute files of container, these files are untrusted.
diff --git a/mm/util.c b/mm/util.c
index fc3d40eb1fc0..e915d5c74ea6 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -15,6 +15,7 @@
 #include <linux/hugetlb.h>
 #include <linux/vmalloc.h>
 #include <linux/userfaultfd_k.h>
+#include <linux/ve.h>
 
 #include <linux/uaccess.h>
 
@@ -418,6 +419,10 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 	unsigned long populate;
 	LIST_HEAD(uf);
 
+	if (!(flag & MAP_ANONYMOUS) && (prot & PROT_EXEC) &&
+		!ve_check_trusted_mmap(file))
+		return -EBADF;
+
 	ret = security_mmap_file(file, prot, flag);
 	if (!ret) {
 		if (down_write_killable(&mm->mmap_sem))
-- 
2.28.0


