[Devel] [PATCH RHEL7 COMMIT] ms/fs/namespace.c: WARN if mnt_count has become negative

Konstantin Khorenko khorenko at virtuozzo.com
Tue Nov 29 19:43:18 MSK 2022


The commit is pushed to "branch-rh7-3.10.0-1160.80.1.vz7.190.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1160.80.1.vz7.190.1
------>
commit 4b1fd5d747cf1fcf8e3f495c627fa1c294f8ca8a
Author: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
Date:   Mon Nov 28 21:48:23 2022 +0200

    ms/fs/namespace.c: WARN if mnt_count has become negative
    
    Missing calls to mntget() (or equivalently, too many calls to mntput())
    are hard to detect because mntput() delays freeing mounts using
    task_work_add(), then again using call_rcu().  As a result, mnt_count
    can often be decremented to -1 without getting a KASAN use-after-free
    report.  Such cases are still bugs though, and they point to real
    use-after-frees being possible.
    
    For an example of this, see the bug fixed by commit 1b0b9cc8d379
    ("vfs: fsmount: add missing mntget()"), discussed at
    https://lkml.kernel.org/linux-fsdevel/20190605135401.GB30925@xxxxxxxxxxxxxxxxxxxxxxxxx/T/#u.
    This bug *should* have been trivial to find.  But actually, it wasn't
    found until syzkaller happened to use fchdir() to manipulate the
    reference count just right for the bug to be noticeable.
    
    Address this by making mntput_no_expire() issue a WARN if mnt_count has
    become negative.
    
    Suggested-by: Miklos Szeredi <miklos at szeredi.hu>
    Signed-off-by: Eric Biggers <ebiggers at google.com>
    Signed-off-by: Al Viro <viro at zeniv.linux.org.uk>
    
    (mainstream commit edf7ddbf1c5eb98b720b063b73e20e8a4a1ce673)
    https://jira.sw.ru/browse/PSBM-142996
    Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
---
 fs/namespace.c | 9 ++++++---
 fs/pnode.h     | 2 +-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 16a94d9ba877..fcfe15ed28f2 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -176,10 +176,10 @@ static inline void mnt_add_count(struct mount *mnt, int n)
 /*
  * vfsmount lock must be held for write
  */
-unsigned int mnt_get_count(struct mount *mnt)
+int mnt_get_count(struct mount *mnt)
 {
 #ifdef CONFIG_SMP
-	unsigned int count = 0;
+	int count = 0;
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
@@ -1263,6 +1263,7 @@ static inline bool stack_is_low(void)
 
 static void mntput_no_expire(struct mount *mnt)
 {
+	int count;
 	rcu_read_lock();
 	if (likely(READ_ONCE(mnt->mnt_ns))) {
 		/*
@@ -1280,7 +1281,9 @@ static void mntput_no_expire(struct mount *mnt)
 	}
 	lock_mount_hash();
 	mnt_add_count(mnt, -1);
-	if (mnt_get_count(mnt)) {
+	count = mnt_get_count(mnt);
+	if (count != 0) {
+		WARN_ON(count < 0);
 		rcu_read_unlock();
 		unlock_mount_hash();
 		return;
diff --git a/fs/pnode.h b/fs/pnode.h
index 6317777d7d32..3429a5ca6a7f 100644
--- a/fs/pnode.h
+++ b/fs/pnode.h
@@ -46,7 +46,7 @@ int propagate_mount_busy(struct mount *, int);
 void propagate_mount_unlock(struct mount *);
 void mnt_release_group_id(struct mount *);
 int get_dominating_id(struct mount *mnt, const struct path *root);
-unsigned int mnt_get_count(struct mount *mnt);
+int mnt_get_count(struct mount *mnt);
 void mnt_set_mountpoint(struct mount *, struct mountpoint *,
 			struct mount *);
 void mnt_change_mountpoint(struct mount *parent, struct mountpoint *mp,

