[Devel] [PATCH RH8 1/2] fs/sync: fix nullptr dereference ve->ve_ns->mnt_ns

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Tue Jun 29 12:33:00 MSK 2021



On 25.06.2021 20:12, Andrey Zhadchenko wrote:
> ve_ns is not guaranteed to be non-NULL. Fix
> is_sb_ve_accessible() and sync_collect_filesystems()
> Also add rcu_dereference since ve->ve_ns is rcu-protected
> 
> An example of shell commands to crash kernel:
> 
> mkdir /sys/fs/cgroup/ve/10001
> cat /sys/fs/cgroup/ve/10001/tasks
> echo 10001 >  /sys/fs/cgroup/ve/10001/ve.veid
> echo $$ > /sys/fs/cgroup/ve/10001/tasks
> sync
> 
> [59390.889322] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> [59390.889395] PGD 0 P4D 0
> [59390.889442] Oops: 0000 [#1] SMP PTI
> [59390.889492] CPU: 1 PID: 8950 Comm: sync ve: 10001 Kdump: loaded Not tainted 4.18.0-240.1.1.vz8.5.47 #1 5.47
> [59390.889554] Hardware name: Virtuozzo KVM, BIOS 1.10.2-3.1.vz7.3 04/01/2014
> [59390.889622] RIP: 0010:sync_filesystems_ve+0x34/0x220
> [59390.889673] Code: 55 41 54 55 53 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 48 8b 87 98 01 00 00 48 8d 6c 24 08 48 89 6c 24 08 <4c> 8b 68 18 48 8b 44 24 08 48 89 6c 24 10 48 39 c5 0f 85 ce 01 00
> [59390.889798] RSP: 0018:ffffb1b7810a7ec0 EFLAGS: 00010246
> [59390.889849] RAX: 0000000000000000 RBX: ffff92309ab7c418 RCX: 0000000000000000
> [59390.889903] RDX: ffff92308bbff180 RSI: 0000000000000000 RDI: ffff92309ab7c418
> [59390.889958] RBP: ffffb1b7810a7ec8 R08: 0000000000000000 R09: 0000000000000000
> [59390.890016] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [59390.890071] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [59390.890126] FS:  00007fd7880b6540(0000) GS:ffff9230bbb00000(0000) knlGS:0000000000000000
> [59390.890184] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [59390.890235] CR2: 0000000000000018 CR3: 000000010b22e000 CR4: 00000000000006e0
> [59390.890293] Call Trace:
> [59390.890351]  ? __do_page_fault+0x23a/0x4f0
> [59390.890407]  ksys_sync+0x10d/0x130
> [59390.890456]  __ia32_sys_sync+0xa/0x10
> [59390.890509]  do_syscall_64+0x5b/0x1a0
> [59390.890562]  entry_SYSCALL_64_after_hwframe+0x65/0xca
> [59390.890620] RIP: 0033:0x7fd787fe4ffb
> [59390.890667] Code: c3 48 8b 0d a7 8e 0c 00 f7 d8 64 89 01 b8 ff ff ff ff eb c2 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 a2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8e 0c 00 f7 d8 64 89 01 48
> [59390.890791] RSP: 002b:00007ffd853dd328 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2
> [59390.890848] RAX: ffffffffffffffda RBX: 00007ffd853dd468 RCX: 00007fd787fe4ffb
> [59390.890903] RDX: 00007fd7880b2001 RSI: 0000000000000000 RDI: 00007fd788079b5e
> [59390.890957] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> [59390.891012] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> [59390.891067] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fd7880ae1b4
> [59390.896038] CR2: 0000000000000018
> 
> https://jira.sw.ru/browse/PSBM-130894
> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
> ---
>   fs/sync.c | 24 ++++++++++++++++++++++--
>   1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/sync.c b/fs/sync.c
> index ef4b1d1..2a7c96b 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -127,10 +127,20 @@ static int sync_filesystem_collected(struct list_head *sync_list, struct super_b
>   static int sync_collect_filesystems(struct ve_struct *ve, struct list_head *sync_list)
>   {
>   	struct mount *mnt;
> -	struct mnt_namespace *mnt_ns = ve->ve_ns->mnt_ns;
> +	struct mnt_namespace *mnt_ns;
> +	struct nsproxy *ve_ns;
>   	struct sync_sb *ss;
>   	int ret = 0;
>   
> +	rcu_read_lock();
> +	ve_ns = rcu_dereference(ve->ve_ns);

Above you read ve->ve_ns to ve_ns variable but never use it. You 
probably designed it to use in the below code, because ve->ve_ns can 
change to NULL after the check below and we still get crash.

> +	if (!ve->ve_ns) {
> +		rcu_read_unlock();
> +		return 0;
> +	}
> +	mnt_ns = ve->ve_ns->mnt_ns;
> +	rcu_read_unlock();
> +
>   	BUG_ON(!list_empty(sync_list));
>   
>   	down_read(&namespace_sem);
> @@ -185,9 +195,19 @@ static void sync_filesystems_ve(struct ve_struct *ve, int wait)
>   static int is_sb_ve_accessible(struct ve_struct *ve, struct super_block *sb)
>   {
>   	struct mount *mnt;
> -	struct mnt_namespace *mnt_ns = ve->ve_ns->mnt_ns;
> +	struct mnt_namespace *mnt_ns;
> +	struct nsproxy *ve_ns;
>   	int ret = 0;
>   
> +	rcu_read_lock();
> +	ve_ns = rcu_dereference(ve->ve_ns);

Same thing here.

> +	if (!ve->ve_ns) {
> +		rcu_read_unlock();
> +		return 0;
> +	}
> +	mnt_ns = ve->ve_ns->mnt_ns;
> +	rcu_read_unlock();
> +
>   	down_read(&namespace_sem);
>   	list_for_each_entry(mnt, &mnt_ns->list, mnt_list) {
>   		if (mnt->mnt.mnt_sb == sb) {
> 

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


More information about the Devel mailing list