[Debian] Re: lenny updates

Mon Mar 9 20:17:47 EDT 2009

Kir Kolyshkin wrote:
> I am currently checking all the ~80 patches that are not in openvz 
> lenny kernel. Looks like most are really needed. Let me suggest some 
> in a few emails I will send as a reply to this one.


Miscellaneous patches that do not fall into any of the above categories.
I am including only the important ones.


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=5d70bbc8780b474371b555cd6eeaaafdea82efe9
binfmt_misc: fix false -ENOEXEC when coupled with other binary handlers
A backport of a mainline patch.
Attached as 0014*
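
To see why the placement of the misc_bang assignment matters, here is a minimal userspace sketch in C. It is not kernel code: file_model, binprm_model, exec_model and the set_flag_early switch are hypothetical names invented for illustration, and the handler chain is reduced to three toy handlers tried in the order misc, script, native.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an executable: its format and its interpreter. */
enum kind { NATIVE, SCRIPT, MISC };

struct file_model {
    enum kind kind;
    const struct file_model *interp; /* interpreter for SCRIPT and MISC files */
};

/* Hypothetical stand-in for struct linux_binprm. */
struct binprm_model {
    const struct file_model *file;
    int misc_bang;
    int set_flag_early; /* 1 = flag set before the format check, 0 = patched order */
};

static int search_handler(struct binprm_model *b);

static int load_misc(struct binprm_model *b)
{
    if (b->misc_bang)
        return -ENOEXEC;
    if (b->set_flag_early)
        b->misc_bang = 1;       /* old placement: set even if the file is not ours */
    if (b->file->kind != MISC)
        return -ENOEXEC;
    if (!b->set_flag_early)
        b->misc_bang = 1;       /* patched placement: just before recursing */
    b->file = b->file->interp;
    return search_handler(b);
}

static int load_script(struct binprm_model *b)
{
    if (b->file->kind != SCRIPT)
        return -ENOEXEC;
    b->file = b->file->interp;
    return search_handler(b);
}

static int load_native(struct binprm_model *b)
{
    return b->file->kind == NATIVE ? 0 : -ENOEXEC;
}

/* Try each handler in registration order until one accepts the file. */
static int search_handler(struct binprm_model *b)
{
    int (*handlers[])(struct binprm_model *) = { load_misc, load_script, load_native };

    for (size_t i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++) {
        int ret = handlers[i](b);
        if (ret != -ENOEXEC)
            return ret;
    }
    return -ENOEXEC;
}

int exec_model(const struct file_model *f, int set_flag_early)
{
    struct binprm_model b = { f, 0, set_flag_early };
    return search_handler(&b);
}
```

In this model, a script whose interpreter is a misc binary fails with -ENOEXEC when the flag is set early and succeeds when the flag is set only just before the recursive search.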

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=4c9010eff11d97bf013f53601a76990b017e45b7
autofs4: pidns friendly oz_mode
Fix oz_mode detection to prevent the autofs daemon from hanging inside a CT.
Fix for OpenVZ bug #959 (http://bugzilla.openvz.org/959)
Attached as 0020*
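
A rough userspace sketch of the idea behind this patch follows. It assumes the daemon passes its process group number as seen from inside the container, while the old check compared it with the host-side number. All names here (pid_model, task_model, find_by_vnr and the two oz_mode helpers) are hypothetical stand-ins, not the kernel API.

```c
#include <stddef.h>

/* Hypothetical model: one process group, numbered differently per namespace. */
struct pid_model {
    int nr;  /* number as seen from the host */
    int vnr; /* number as seen from inside the container */
};

struct task_model {
    struct pid_model *pgrp;
};

/* Old style: compare the host-side number with the stored number. */
int oz_mode_by_number(const struct task_model *t, int stored_pgrp)
{
    return t->pgrp->nr == stored_pgrp;
}

/* Patched style: compare object identity, which is namespace independent. */
int oz_mode_by_object(const struct task_model *t, const struct pid_model *stored)
{
    return t->pgrp == stored;
}

/* Stand-in for a lookup by virtual number, done once at mount time. */
struct pid_model *find_by_vnr(struct pid_model *table, size_t n, int vnr)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].vnr == vnr)
            return &table[i];
    return NULL;
}
```

With a group that is 4321 on the host and 12 in the container, the number-based check does not recognise the daemon, while the object-based check does.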


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=7ebcbe3c7ad977f1a9bfb03a6d7f7dca9f883b83
autofs: fix default pgrp vnr
Attached as 0021*

http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=ff3483aef4dbbddf6ee5ca483555c0ef8f8a047f
Fix erratum that causes memory corruption
Attached as 0027*.


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=6b9fe0296b1aa5b2e70e9ba9790e4bd9af5908c6
vzwdog: walk through the block devices list properly
A fix for a kernel oops, OpenVZ bug #1064 (http://bugzilla.openvz.org/1064)
Attached as 0044*
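
The pattern this fix relies on can be sketched in plain C: check a device's type tag before converting it to its containing structure. The types and names below (device_model, disk_model, part_model, total_disk_minors) are hypothetical and only mimic the shape of the kernel structures.

```c
#include <stddef.h>

struct device_type_model { const char *name; };

struct device_model { const struct device_type_model *type; };

/* Two different containers embed the same device_model member. */
struct disk_model { int minors; struct device_model dev; };
struct part_model { char label[2]; struct device_model dev; };

const struct device_type_model disk_type_model = { "disk" };
const struct device_type_model part_type_model = { "partition" };

/* container_of-style conversion; only valid if dev really sits in a disk_model. */
static struct disk_model *dev_to_disk_model(struct device_model *dev)
{
    return (struct disk_model *)((char *)dev - offsetof(struct disk_model, dev));
}

/* Walk a mixed list and touch only the entries that are whole disks. */
int total_disk_minors(struct device_model **devs, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i]->type != &disk_type_model)
            continue; /* a partition: converting it would read the wrong object */
        total += dev_to_disk_model(devs[i])->minors;
    }
    return total;
}
```

Without the type check, the conversion would be applied to the partition entry as well and would read memory that is not a disk_model.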


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=134416f49ad04db56afd7eb2a41ddef4f157ea6f
Correct per-process capabilities bounding set in CT
Important security fix.
Attached as 0045*
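
As an illustration of why the bounding set has to be limited as well, here is a small sketch using plain bitmasks. The struct and function names are hypothetical, and the exec rule is the simplified formula from capabilities(7), P'(permitted) = (P(inheritable) & F(inheritable)) | (F(permitted) & bset), not the kernel implementation.

```c
#include <stdint.h>

/* Hypothetical model of a task's capability sets as 64-bit masks. */
struct task_caps_model {
    uint64_t effective;
    uint64_t inheritable;
    uint64_t permitted;
    uint64_t bset;
};

/* Intersect the task's sets with the container's mask on entry. */
void limit_caps_model(struct task_caps_model *t, uint64_t ve_mask, int patched)
{
    t->effective &= ve_mask;
    t->inheritable &= ve_mask;
    t->permitted &= ve_mask;
    if (patched)
        t->bset &= ve_mask; /* the one line the patch adds */
}

/* Simplified permitted set after exec of a file with the given file caps. */
uint64_t permitted_after_exec_model(const struct task_caps_model *t,
                                    uint64_t file_inheritable,
                                    uint64_t file_permitted)
{
    return (t->inheritable & file_inheritable) | (file_permitted & t->bset);
}
```

In this model, a task whose bounding set was left untouched regains every capability the executed file grants, while the patched variant stays within the container mask.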


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=86d74166a99f5ece5bcd46b85cba4ebd54126685
ms: fix inotify umount
A fix for inotify vs. umount, backported from mainstream.
Attached as 0052*
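
The fix follows the usual "hold a reference across a call that may drop the last one" pattern. A toy model in C, with hypothetical names (watch_model, deliver_unmount, unmount_watch_model) and a freed flag instead of a real free so the sketch stays well defined:

```c
/* Hypothetical reference-counted watch. */
struct watch_model {
    int refcount;
    int oneshot;
    int freed; /* set instead of actually freeing, to keep the model safe */
};

static void get_watch(struct watch_model *w)
{
    w->refcount++;
}

static void put_watch(struct watch_model *w)
{
    if (--w->refcount == 0)
        w->freed = 1;
}

/* Delivering an event to a one-shot watch drops the reference that kept it alive. */
static void deliver_unmount(struct watch_model *w)
{
    if (w->oneshot)
        put_watch(w);
}

/*
 * Returns 1 if the removal step would touch an already released watch,
 * 0 if the watch was still alive at that point.
 */
int unmount_watch_model(struct watch_model *w, int hold_ref)
{
    int touched_freed;

    if (hold_ref)
        get_watch(w);
    deliver_unmount(w);
    touched_freed = w->freed; /* the removal step dereferences w here */
    if (hold_ref)
        put_watch(w);
    return touched_freed;
}
```

With hold_ref set, the watch survives event delivery and is released only by the final put.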


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=14131d2abbd2554276fe4488e3403d4c0a747cdf
ve: sanitize capability checks for namespaces creation
Fix for OpenVZ bug #1113 (http://bugzilla.openvz.org/1113)
Attached as 0054*
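
The resulting policy can be summarised by a small decision function. This is a sketch only: copy_namespaces_check and NEW_NS_FLAGS_MODEL are hypothetical, and the capability test is reduced to a boolean argument.

```c
#include <errno.h>

/* Hypothetical single bit standing in for all CLONE_NEW* namespace flags. */
#define NEW_NS_FLAGS_MODEL 0x1UL

/*
 * Returns 0 if namespace copying may proceed, -EPERM otherwise.
 * force_admin models the internal container-setup path, which must be
 * allowed to create namespaces regardless of the caller's capabilities.
 */
int copy_namespaces_check(unsigned long flags, int has_cap_sys_admin,
                          int force_admin)
{
    if (!(flags & NEW_NS_FLAGS_MODEL))
        return 0; /* no new namespace requested, nothing to check */
    if (!has_cap_sys_admin && !force_admin)
        return -EPERM;
    return 0;
}
```

An unprivileged caller is refused, a caller with CAP_SYS_ADMIN is allowed, and the container-setup path passes force_admin to bypass the check.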


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=c5c1032d4b6519d1e3a37853c5c0fd7fbd1f8798
Don't dereference NULL tsk->mm in ve_move_task
Attached as 0059*


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=8aa704481f80e55dce430c0c01d276e8ca13018e
Fix broken permissions for Unix98 pty.
Attached as 0065*


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=397500cb89baf75c8035060585c0886b3012708a
autofs4: fix ia32 compat mode
Attached as 0067*
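
A minimal sketch of the approach, assuming the packet sent to the daemon must match the daemon's bitness and not that of the task triggering the mount. The names (sbi_model, fill_super_model, packet_size_old, packet_size_new) are hypothetical, and the 4-byte difference is taken from the patch.

```c
#include <stddef.h>

/* Hypothetical superblock info: bitness is recorded once, at mount time. */
struct sbi_model {
    unsigned is32bit:1;
};

/* Mount is performed by the daemon, so its bitness is what gets stored. */
void fill_super_model(struct sbi_model *sbi, int mounter_is_32bit)
{
    sbi->is32bit = mounter_is_32bit ? 1 : 0;
}

/* Old behaviour: size depends on whichever task happens to trigger the request. */
size_t packet_size_old(size_t full_size, int current_is_32bit)
{
    return current_is_32bit ? full_size - 4 : full_size;
}

/* Patched behaviour: size depends on the bitness recorded for the daemon. */
size_t packet_size_new(size_t full_size, const struct sbi_model *sbi)
{
    return sbi->is32bit ? full_size - 4 : full_size;
}
```

With a 32-bit daemon and a 64-bit task accessing the mount, the old variant produces the full size, while the patched variant produces the size the daemon expects.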


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=a65ea96551f370afb7174472dcd4c43b8165710c
simfs: don't work with buggy input
Attached as 0069*


http://git.openvz.org/?p=linux-2.6.26-openvz;a=commitdiff;h=0328e3d32c6915650b14dd40fcd7598a420b1364
pidns: update leader_pid at pidns attach
Fix for OpenVZ bug #1160 (http://bugzilla.openvz.org/1160)
Attached as 0070*
-------------- next part --------------
From 5d70bbc8780b474371b555cd6eeaaafdea82efe9 Mon Sep 17 00:00:00 2001
From: Pavel Emelyanov <xemul at openvz.org>
Date: Wed, 20 Aug 2008 22:50:13 +0000
Subject: [PATCH] binfmt_misc: fix false -ENOEXEC when coupled with other binary handlers

commit ff9bc512f198eb47204f55b24c6fe3d36ed89592 upstream

Date: Wed, 20 Aug 2008 14:09:10 -0700
Subject: binfmt_misc: fix false -ENOEXEC when coupled with other binary handlers

In case the binfmt_misc binary handler is registered *before* e.g. the
script one (for example, when compiled as a module), the following
situation may occur:

1. the user launches a script whose interpreter is a misc binary;
2. load_misc_binary sets misc_bang and returns -ENOEXEC,
   since the binary is a script;
3. load_script_binary loads the script and calls search_binary_handler
   to run the interpreter;
4. load_misc_binary is called again, but refuses to load the
   binary because the misc_bang bit is set.

The fix is to move the misc_bang setting lower, to just before the actual
call to search_binary_handler.

Caused by the commit 3a2e7f47 (binfmt_misc.c: avoid potential kernel
stack overflow)

Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
Reported-by: Kirill A. Shutemov <kirill at shutemov.name>
Tested-by: Kirill A. Shutemov <kirill at shutemov.name>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh at suse.de>
---
 fs/binfmt_misc.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 7191306..a0a7157 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -119,8 +119,6 @@ static int load_misc_binary(struct linux_binprm *bprm, struct pt_regs *regs)
 	if (bprm->misc_bang)
 		goto _ret;
 
-	bprm->misc_bang = 1;
-
 	/* to keep locking time low, we copy the interpreter string */
 	read_lock(&entries_lock);
 	fmt = check_file(bprm);
@@ -198,6 +196,8 @@ static int load_misc_binary(struct linux_binprm *bprm, struct pt_regs *regs)
 	if (retval < 0)
 		goto _error;
 
+	bprm->misc_bang = 1;
+
 	retval = search_binary_handler (bprm, regs);
 	if (retval < 0)
 		goto _error;
-- 
1.6.0.6

-------------- next part --------------
From 4c9010eff11d97bf013f53601a76990b017e45b7 Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Mon, 22 Sep 2008 13:20:00 +0400
Subject: [PATCH] autofs4: pidns friendly oz_mode

Fix oz_mode detection to prevent the autofs daemon from hanging inside a CT.

Switch from pid_t to struct pid for the oz mode process group.
The same change as mainline commit fa0334f1 made for autofs.

http://bugzilla.openvz.org/show_bug.cgi?id=959

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 fs/autofs4/autofs_i.h |    4 ++--
 fs/autofs4/inode.c    |   24 ++++++++++++++++++------
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index c3d352d..4c8d035 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -98,7 +98,7 @@ struct autofs_sb_info {
 	u32 magic;
 	int pipefd;
 	struct file *pipe;
-	pid_t oz_pgrp;
+	struct pid *oz_pgrp;
 	int catatonic;
 	int version;
 	int sub_version;
@@ -131,7 +131,7 @@ static inline struct autofs_info *autofs4_dentry_ino(struct dentry *dentry)
    filesystem without "magic".) */
 
 static inline int autofs4_oz_mode(struct autofs_sb_info *sbi) {
-	return sbi->catatonic || task_pgrp_nr(current) == sbi->oz_pgrp;
+	return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp;
 }
 
 /* Does a dentry have some pending activity? */
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index 2fdcf5e..2d8dcb2 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -165,6 +165,8 @@ void autofs4_kill_sb(struct super_block *sb)
 	/* Clean up and release dangling references */
 	autofs4_force_release(sbi);
 
+	put_pid(sbi->oz_pgrp);
+
 	sb->s_fs_info = NULL;
 	kfree(sbi);
 
@@ -186,7 +188,7 @@ static int autofs4_show_options(struct seq_file *m, struct vfsmount *mnt)
 		seq_printf(m, ",uid=%u", root_inode->i_uid);
 	if (root_inode->i_gid != 0)
 		seq_printf(m, ",gid=%u", root_inode->i_gid);
-	seq_printf(m, ",pgrp=%d", sbi->oz_pgrp);
+	seq_printf(m, ",pgrp=%d", pid_vnr(sbi->oz_pgrp));
 	seq_printf(m, ",timeout=%lu", sbi->exp_timeout/HZ);
 	seq_printf(m, ",minproto=%d", sbi->min_proto);
 	seq_printf(m, ",maxproto=%d", sbi->max_proto);
@@ -231,7 +233,7 @@ static int parse_options(char *options, int *pipefd, uid_t *uid, gid_t *gid,
 
 	*uid = current->uid;
 	*gid = current->gid;
-	*pgrp = task_pgrp_nr(current);
+	*pgrp = task_pgrp_vnr(current);
 
 	*minproto = AUTOFS_MIN_PROTO_VERSION;
 	*maxproto = AUTOFS_MAX_PROTO_VERSION;
@@ -316,6 +318,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
 	int pipefd;
 	struct autofs_sb_info *sbi;
 	struct autofs_info *ino;
+	pid_t pgrp;
 
 	sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
 	if (!sbi)
@@ -328,7 +331,6 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
 	sbi->pipe = NULL;
 	sbi->catatonic = 1;
 	sbi->exp_timeout = 0;
-	sbi->oz_pgrp = task_pgrp_nr(current);
 	sbi->sb = s;
 	sbi->version = 0;
 	sbi->sub_version = 0;
@@ -366,7 +368,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
 
 	/* Can this call block? */
 	if (parse_options(data, &pipefd, &root_inode->i_uid, &root_inode->i_gid,
-				&sbi->oz_pgrp, &sbi->type, &sbi->min_proto,
+				&pgrp, &sbi->type, &sbi->min_proto,
 				&sbi->max_proto)) {
 		printk("autofs: called with bogus options\n");
 		goto fail_dput;
@@ -394,12 +396,20 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
 		sbi->version = sbi->max_proto;
 	sbi->sub_version = AUTOFS_PROTO_SUBVERSION;
 
-	DPRINTK("pipe fd = %d, pgrp = %u", pipefd, sbi->oz_pgrp);
+	DPRINTK("pipe fd = %d, pgrp = %u", pipefd, pgrp);
+
+	sbi->oz_pgrp = find_get_pid(pgrp);
+
+	if (!sbi->oz_pgrp) {
+		printk("autofs: could not find process group %d\n", pgrp);
+		goto fail_dput;
+	}
+
 	pipe = fget(pipefd);
 	
 	if (!pipe) {
 		printk("autofs: could not open pipe file descriptor\n");
-		goto fail_dput;
+		goto fail_put_pid;
 	}
 	if (!pipe->f_op || !pipe->f_op->write)
 		goto fail_fput;
@@ -420,6 +430,8 @@ fail_fput:
 	printk("autofs: pipe file descriptor does not contain proper ops\n");
 	fput(pipe);
 	/* fall through */
+fail_put_pid:
+	put_pid(sbi->oz_pgrp);
 fail_dput:
 	dput(root);
 	goto fail_free;
-- 
1.6.0.6

-------------- next part --------------
From 7ebcbe3c7ad977f1a9bfb03a6d7f7dca9f883b83 Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Mon, 22 Sep 2008 13:21:20 +0400
Subject: [PATCH] autofs: fix default pgrp vnr

The default pgrp should be a virtual nr,
because autofs looks up the pid struct via find_get_pid.

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 fs/autofs/inode.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c
index dda510d..1f6e222 100644
--- a/fs/autofs/inode.c
+++ b/fs/autofs/inode.c
@@ -78,7 +78,7 @@ static int parse_options(char *options, int *pipefd, uid_t *uid, gid_t *gid,
 
 	*uid = current->uid;
 	*gid = current->gid;
-	*pgrp = task_pgrp_nr(current);
+	*pgrp = task_pgrp_vnr(current);
 
 	*minproto = *maxproto = AUTOFS_PROTO_VERSION;
 
-- 
1.6.0.6

-------------- next part --------------
From ff3483aef4dbbddf6ee5ca483555c0ef8f8a047f Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <vgusev at openvz.org>
Date: Thu, 25 Sep 2008 13:03:45 +0400
Subject: [PATCH] Fix erratum that causes memory corruption.

Signed-off-by: Vitaliy Gusev <vgusev at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 drivers/base/core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 660ecc0..47d5db2 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1089,7 +1089,7 @@ EXPORT_SYMBOL_GPL(devices_init);
 
 void devices_fini(void)
 {
-	kset_unregister(devices_kset);
+	kset_unregister(ve_devices_kset);
 }
 EXPORT_SYMBOL_GPL(devices_fini);
 
-- 
1.6.0.6

-------------- next part --------------
From 6b9fe0296b1aa5b2e70e9ba9790e4bd9af5908c6 Mon Sep 17 00:00:00 2001
From: Pavel Emelyanov <xemul at openvz.org>
Date: Wed, 5 Nov 2008 11:53:48 +0300
Subject: [PATCH] vzwdog: walk through the block devices list properly

Copied check from the show_partitions...

http://bugzilla.openvz.org/show_bug.cgi?id=1064

Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 block/genhd.c         |    5 +++--
 include/linux/genhd.h |    1 +
 kernel/ve/vzwdog.c    |    6 +++++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 901cf04..93ffcfb 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -24,7 +24,8 @@ static DEFINE_MUTEX(block_class_lock);
 struct kobject *block_depr;
 #endif
 
-static struct device_type disk_type;
+struct device_type disk_type;
+EXPORT_SYMBOL(disk_type);
 
 /*
  * Can be deleted altogether. Later.
@@ -515,7 +516,7 @@ struct class block_class = {
 };
 EXPORT_SYMBOL(block_class);
 
-static struct device_type disk_type = {
+struct device_type disk_type = {
 	.name		= "disk",
 	.groups		= disk_attr_groups,
 	.release	= disk_release,
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ae7aec3..8f28767 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -21,6 +21,7 @@
 extern struct device_type part_type;
 extern struct kobject *block_depr;
 extern struct class block_class;
+extern struct device_type disk_type;
 
 extern const struct seq_operations partitions_op;
 extern const struct seq_operations diskstats_op;
diff --git a/kernel/ve/vzwdog.c b/kernel/ve/vzwdog.c
index 7117365..4510f5d 100644
--- a/kernel/ve/vzwdog.c
+++ b/kernel/ve/vzwdog.c
@@ -184,8 +184,12 @@ static void show_diskio(void)
 
 	list_for_each_entry(dev, &block_class.devices, node) {
 		char *name;
-		struct gendisk *gd = dev_to_disk(dev);
+		struct gendisk *gd;
+		
+		if (dev->type != &disk_type)
+			continue;
 
+		gd = dev_to_disk(dev);
 		name = disk_name(gd, 0, buf);
 		if ((strlen(name) > 4) && (strncmp(name, "loop", 4) == 0) &&
 		    isdigit(name[4]))
-- 
1.6.0.6

-------------- next part --------------
From 134416f49ad04db56afd7eb2a41ddef4f157ea6f Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Fri, 14 Nov 2008 19:19:40 +0300
Subject: [PATCH] Correct per-process capabilities bounding set in CT

Otherwise tasks in container may have unlimited capabilities...

(#127136)

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 kernel/ve/vecalls.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/ve/vecalls.c b/kernel/ve/vecalls.c
index 55d8b7b..4a4a24b 100644
--- a/kernel/ve/vecalls.c
+++ b/kernel/ve/vecalls.c
@@ -869,6 +869,7 @@ static void set_task_ve_caps(struct task_struct *tsk, struct ve_struct *ve)
 	tsk->cap_effective = cap_intersect(tsk->cap_effective, bset);
 	tsk->cap_inheritable = cap_intersect(tsk->cap_inheritable, bset);
 	tsk->cap_permitted = cap_intersect(tsk->cap_permitted, bset);
+	tsk->cap_bset = cap_intersect(tsk->cap_bset, bset);
 	spin_unlock(&task_capability_lock);
 }
 
-- 
1.6.0.6

-------------- next part --------------
From 86d74166a99f5ece5bcd46b85cba4ebd54126685 Mon Sep 17 00:00:00 2001
From: Dmitri Monakhov <dmonakhov at openvz.org>
Date: Wed, 26 Nov 2008 15:29:09 +0300
Subject: [PATCH] ms: fix inotify umount

On umount two events will be dispatched to the watcher:
1: inotify_dev_queue_event(.., IN_UNMOUNT,..)
2: remove_watch(watch, dev)
    ->inotify_dev_queue_event(.., IN_IGNORED, ..)
But if the watcher has the IN_ONESHOT bit set, the watch will be released
inside the first event, which results in accessing an invalid object later.
IMHO this is not a pure regression. The bug was not triggered during the
initial inotify interface testing phase because of another bug in the
IN_ONESHOT handling logic :)
  commit ac74c00e499ed276a965e5b5600667d5dc04a84a
  Author: Ulisses Furquim <ulissesf at gmail.com>
  Date:   Fri Feb 8 04:18:16 2008 -0800
    inotify: fix check for one-shot watches before destroying them
    As the IN_ONESHOT bit is never set when an event is sent we must check it
    in the watch's mask and not in the event's mask.

TESTCASE:
#Seems rkagan@ was the only one who tried this since Feb 2008 :)
mkdir mnt
mount -ttmpfs none mnt
mkdir mnt/d
/inotify mnt/d&
umount mnt ## << lockup or crash here

TESTSOURCE:
/* gcc -oinotify inotify.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* read() */
#include <sys/inotify.h>

int main(int argc, char **argv)
{
        char buf[1024];
        struct inotify_event *ie;
        char *p;
        int i;
        ssize_t l;

        p = argv[1];
        i = inotify_init();
        /* watch for every event; ~0 also sets IN_ONESHOT */
        inotify_add_watch(i, p, ~0);

        l = read(i, buf, sizeof(buf));
        printf("read %zd bytes\n", l);
        ie = (struct inotify_event *) buf;
        printf("event mask: %u\n", ie->mask);
        return 0;
}

From: Dmitri Monakhov <dmonakhov at openvz.org>
Date: Wed, 26 Nov 2008 15:18:24 +0300
Subject: [PATCH] Fix incorrect refcount while dispatching unmount event

Signed-off-by: Dmitri Monakhov <dmonakhov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 fs/inotify.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/inotify.c b/fs/inotify.c
index 01ddb06..b2671b6 100644
--- a/fs/inotify.c
+++ b/fs/inotify.c
@@ -398,11 +398,13 @@ void inotify_unmount_inodes(struct list_head *list)
 		watches = &inode->inotify_watches;
 		list_for_each_entry_safe(watch, next_w, watches, i_list) {
 			struct inotify_handle *ih= watch->ih;
+			get_inotify_watch(watch);
 			mutex_lock(&ih->mutex);
 			ih->in_ops->handle_event(watch, watch->wd, IN_UNMOUNT, 0,
 						 NULL, NULL);
 			inotify_remove_watch_locked(ih, watch);
 			mutex_unlock(&ih->mutex);
+			put_inotify_watch(watch);
 		}
 		mutex_unlock(&inode->inotify_mutex);
 		iput(inode);		
-- 
1.6.0.6

-------------- next part --------------
From 14131d2abbd2554276fe4488e3403d4c0a747cdf Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Fri, 9 Jan 2009 12:18:20 +0300
Subject: [PATCH] ve: sanitize capability checks for namespaces creation

The existing hard check on the namespaces mask is too strict. The
intention was to ban namespace creation inside containers, but
there already exists a proper security mechanism to govern this.

Switch to the existing capability-driven policy, thus allowing
namespace creation from the HN.

http://bugzilla.openvz.org/show_bug.cgi?id=1113

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 include/linux/nsproxy.h |    2 +-
 include/linux/sched.h   |    4 ----
 kernel/fork.c           |   15 +--------------
 kernel/nsproxy.c        |    7 +++----
 kernel/ve/vecalls.c     |    5 +++--
 5 files changed, 8 insertions(+), 25 deletions(-)

diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index dd6d50f..e707e2c 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -62,7 +62,7 @@ static inline struct nsproxy *task_nsproxy(struct task_struct *tsk)
 	return rcu_dereference(tsk->nsproxy);
 }
 
-int copy_namespaces(unsigned long flags, struct task_struct *tsk);
+int copy_namespaces(unsigned long flags, struct task_struct *tsk, int force_admin);
 void exit_task_namespaces(struct task_struct *tsk);
 void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
 void free_nsproxy(struct nsproxy *ns);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 272da80..ab38d35 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -29,10 +29,6 @@
 #define CLONE_NEWNET		0x40000000	/* New network namespace */
 #define CLONE_IO		0x80000000	/* Clone io context */
 
-/* mask of clones which are disabled in OpenVZ VEs */
-#define CLONE_NAMESPACES_MASK	(CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWUSER | \
-				 CLONE_NEWPID | CLONE_NEWNET)
-
 /*
  * Scheduling policies
  */
diff --git a/kernel/fork.c b/kernel/fork.c
index f366869..2cd4ab7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -917,13 +917,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	struct task_struct *p;
 	int cgroup_callbacks_done = 0;
 
-#ifdef CONFIG_VE
-	if (clone_flags & CLONE_NAMESPACES_MASK)
-		return ERR_PTR(-EINVAL);
-#else
 	if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS))
 		return ERR_PTR(-EINVAL);
-#endif
 
 	/*
 	 * Thread groups must share signals as well, and detached threads
@@ -1099,7 +1094,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 		goto bad_fork_cleanup_signal;
 	if ((retval = copy_keys(clone_flags, p)))
 		goto bad_fork_cleanup_mm;
-	if ((retval = copy_namespaces(clone_flags, p)))
+	if ((retval = copy_namespaces(clone_flags, p, 0)))
 		goto bad_fork_cleanup_keys;
 	if ((retval = copy_io(clone_flags, p)))
 		goto bad_fork_cleanup_namespaces;
@@ -1651,10 +1646,6 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
 				CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|
 				CLONE_NEWNET))
 		goto bad_unshare_out;
-#ifdef CONFIG_VE
-	if (unshare_flags & CLONE_NAMESPACES_MASK)
-		goto bad_unshare_out;
-#endif
 
 	/*
 	 * CLONE_NEWIPC must also detach from the undolist: after switching
@@ -1673,11 +1664,9 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
 		goto bad_unshare_cleanup_sigh;
 	if ((err = unshare_fd(unshare_flags, &new_fd)))
 		goto bad_unshare_cleanup_vm;
-#ifndef CONFIG_VE
 	if ((err = unshare_nsproxy_namespaces(unshare_flags, &new_nsproxy,
 			new_fs)))
 		goto bad_unshare_cleanup_fd;
-#endif
 
 	if (new_fs ||  new_mm || new_fd || do_sysvsem || new_nsproxy) {
 		if (do_sysvsem) {
@@ -1721,9 +1710,7 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
 	if (new_nsproxy)
 		put_nsproxy(new_nsproxy);
 
-#ifndef CONFIG_VE
 bad_unshare_cleanup_fd:
-#endif
 	if (new_fd)
 		put_files_struct(new_fd);
 
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 1c0848f..49ff461 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -127,7 +127,8 @@ out_ns:
  * called from clone.  This now handles copy for nsproxy and all
  * namespaces therein.
  */
-int copy_namespaces(unsigned long flags, struct task_struct *tsk)
+int copy_namespaces(unsigned long flags, struct task_struct *tsk,
+		int force_admin)
 {
 	struct nsproxy *old_ns = tsk->nsproxy;
 	struct nsproxy *new_ns;
@@ -142,12 +143,10 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
 				CLONE_NEWUSER | CLONE_NEWPID | CLONE_NEWNET)))
 		return 0;
 
-#ifndef CONFIG_VE
-	if (!capable(CAP_SYS_ADMIN)) {
+	if (!capable(CAP_SYS_ADMIN) && !force_admin) {
 		err = -EPERM;
 		goto out;
 	}
-#endif
 
 	/*
 	 * CLONE_NEWIPC must detach from the undolist: after switching
diff --git a/kernel/ve/vecalls.c b/kernel/ve/vecalls.c
index 4a4a24b..33e3ab1 100644
--- a/kernel/ve/vecalls.c
+++ b/kernel/ve/vecalls.c
@@ -680,7 +680,8 @@ static inline int init_ve_namespaces(struct ve_struct *ve,
 	tsk = current;
 	cur = tsk->nsproxy;
 
-	err = copy_namespaces(CLONE_NAMESPACES_MASK & ~CLONE_NEWNET, tsk);
+	err = copy_namespaces(CLONE_NEWUTS | CLONE_NEWIPC
+			| CLONE_NEWUSER | CLONE_NEWPID, tsk, 1);
 	if (err < 0)
 		return err;
 
@@ -723,7 +724,7 @@ static int init_ve_netns(struct ve_struct *ve, struct nsproxy **old)
 	tsk = current;
 	cur = tsk->nsproxy;
 
-	err = copy_namespaces(CLONE_NEWNET, tsk);
+	err = copy_namespaces(CLONE_NEWNET, tsk, 1);
 	if (err < 0)
 		return err;
 
-- 
1.6.0.6

-------------- next part --------------
From c5c1032d4b6519d1e3a37853c5c0fd7fbd1f8798 Mon Sep 17 00:00:00 2001
From: Vitaliy Gusev <vgusev at openvz.org>
Date: Tue, 13 Jan 2009 18:23:55 +0300
Subject: [PATCH] Don't dereference NULL tsk->mm in ve_move_task

Kthreads are mmless...

Signed-off-by: Vitaliy Gusev <vgusev at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 kernel/ve/vecalls.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/ve/vecalls.c b/kernel/ve/vecalls.c
index 376bbfb..34bcbcd 100644
--- a/kernel/ve/vecalls.c
+++ b/kernel/ve/vecalls.c
@@ -883,7 +883,8 @@ void ve_move_task(struct task_struct *tsk, struct ve_struct *new)
 	BUG_ON(!(thread_group_leader(tsk) && thread_group_empty(tsk)));
 
 	/* this probihibts ptracing of task entered to VE from host system */
-	tsk->mm->vps_dumpable = 0;
+	if (tsk->mm)
+		tsk->mm->vps_dumpable = 0;
 	/* setup capabilities before enter */
 	set_task_ve_caps(tsk, new);
 
-- 
1.6.0.6

-------------- next part --------------
From 8aa704481f80e55dce430c0c01d276e8ca13018e Mon Sep 17 00:00:00 2001
From: Konstantin Ozerkov <kozerkov at openvz.org>
Date: Fri, 23 Jan 2009 17:43:33 +0300
Subject: [PATCH] Fix broken permissions for Unix98 pty.

This bug is not very critical because modern software can
automatically choose between a legacy pty and a Unix98 one.

Signed-off-by: Konstantin Ozerkov <kozerkov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 security/device_cgroup.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index d1da90a..ef9fc6b 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -88,7 +88,7 @@ static int devcgroup_can_attach(struct cgroup_subsys *ss,
 #ifdef CONFIG_VE
 static struct dev_whitelist_item default_whitelist_items[] = {
 	{ ~0,                     ~0, DEV_ALL,  ACC_MKNOD },
-	{ UNIX98_PTY_SLAVE_MAJOR, ~0, DEV_CHAR, ACC_READ | ACC_WRITE },
+	{ UNIX98_PTY_MASTER_MAJOR, ~0, DEV_CHAR, ACC_READ | ACC_WRITE },
 	{ UNIX98_PTY_SLAVE_MAJOR, ~0, DEV_CHAR, ACC_READ | ACC_WRITE },
 	{ PTY_MASTER_MAJOR,       ~0, DEV_CHAR, ACC_READ | ACC_WRITE },
 	{ PTY_SLAVE_MAJOR,        ~0, DEV_CHAR, ACC_READ | ACC_WRITE },
-- 
1.6.0.6

-------------- next part --------------
From 397500cb89baf75c8035060585c0886b3012708a Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Tue, 27 Jan 2009 14:34:57 +0300
Subject: [PATCH] autofs4: fix ia32 compat mode

autofs4_notify_daemon is called in the context of the task accessing
autofs, not that of the daemon. Thus checking the bitness of current is
wrong in mixed environments.

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 fs/autofs4/autofs_i.h |    1 +
 fs/autofs4/inode.c    |    4 ++++
 fs/autofs4/waitq.c    |    2 +-
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 4c8d035..80dc520 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -114,6 +114,7 @@ struct autofs_sb_info {
 	struct autofs_wait_queue *queues; /* Wait queue pointer */
 	spinlock_t rehash_lock;
 	struct list_head rehash_list;
+	unsigned is32bit:1;
 };
 
 static inline struct autofs_sb_info *autofs4_sbi(struct super_block *sb)
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index 2d8dcb2..40b7b90 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -337,6 +337,10 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
 	sbi->type = 0;
 	sbi->min_proto = 0;
 	sbi->max_proto = 0;
+#if defined CONFIG_X86_64 && defined CONFIG_IA32_EMULATION
+	if (test_thread_flag(TIF_IA32))
+		sbi->is32bit = 1;
+#endif
 	mutex_init(&sbi->wq_mutex);
 	spin_lock_init(&sbi->fs_lock);
 	sbi->queues = NULL;
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index 67d444c..c6d34ea 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -143,7 +143,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
 		 *
 		 * reduce size if work in 32-bit mode to satisfy userspace hope
 		 */
-		if (test_thread_flag(TIF_IA32))
+		if (sbi->is32bit)
 			pktsz -= 4;
 #endif
 
-- 
1.6.0.6

-------------- next part --------------
From a65ea96551f370afb7174472dcd4c43b8165710c Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Tue, 3 Feb 2009 13:57:32 +0300
Subject: [PATCH] simfs: don't work with buggy input

Some (buggy) filesystems (aufs, for example) pass NULL as mnt to getattr
and hope for the best...

Let's at least not confuse the user with an oops.

http://bugzilla.openvz.org/show_bug.cgi?id=1054

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 fs/simfs.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/simfs.c b/fs/simfs.c
index 211f604..c062d6d 100644
--- a/fs/simfs.c
+++ b/fs/simfs.c
@@ -55,6 +55,8 @@ static int sim_getattr(struct vfsmount *mnt, struct dentry *dentry,
 			return err;
 	}
 
+	if (!mnt)
+		return 0;
 	sb = mnt->mnt_sb;
 	if (sb->s_op == &sim_super_ops)
 		stat->dev = sb->s_dev;
-- 
1.6.0.6

-------------- next part --------------
From 0328e3d32c6915650b14dd40fcd7598a420b1364 Mon Sep 17 00:00:00 2001
From: Konstantin Khlebnikov <khlebnikov at openvz.org>
Date: Tue, 24 Feb 2009 16:47:23 +0300
Subject: [PATCH] pidns: update leader_pid at pidns attach

After commit fea9d17, it_real_fn sends SIGALRM to task->signal->leader_pid
(used for sys_alarm(...) and sys_setitimer(ITIMER_REAL,...)).

Thus the quick-and-dirty cross-pid-ns task movement in __pid_ns_attach_task
must update this pid too.

http://bugzilla.openvz.org/show_bug.cgi?id=1160
127384

Signed-off-by: Konstantin Khlebnikov <khlebnikov at openvz.org>
Signed-off-by: Pavel Emelyanov <xemul at openvz.org>
---
 kernel/pid_namespace.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index c478b80..1445b22 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -209,6 +209,7 @@ static int __pid_ns_attach_task(struct pid_namespace *ns,
 	set_task_session(tsk, pid_nr(pid));
 	reattach_pid(tsk, PIDTYPE_PGID, pid);
 	tsk->signal->__pgrp = pid_nr(pid);
+	tsk->signal->leader_pid = pid;
 	current->signal->tty_old_pgrp = NULL;
 
 	reattach_pid(tsk, PIDTYPE_PID, pid);
-- 
1.6.0.6