[Devel] [PATCH RH8 1/2] fs/sync: fix nullptr dereference ve->ve_ns->mnt_ns

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Tue Jun 29 12:41:15 MSK 2021



On 29.06.2021 12:33, Pavel Tikhomirov wrote:
> 
> 
> On 25.06.2021 20:12, Andrey Zhadchenko wrote:
>> ve_ns is not guaranteed to be non-NULL. Fix
>> is_sb_ve_accessible() and sync_collect_filesystems()
>> Also add rcu_dereference since ve->ve_ns is rcu-protected
>>
>> An example of shell commands to crash kernel:
>>
>> mkdir /sys/fs/cgroup/ve/10001
>> cat /sys/fs/cgroup/ve/10001/tasks
>> echo 10001 >  /sys/fs/cgroup/ve/10001/ve.veid
>> echo $$ > /sys/fs/cgroup/ve/10001/tasks
>> sync
>>
>> [59390.889322] BUG: unable to handle kernel NULL pointer dereference 
>> at 0000000000000018
>> [59390.889395] PGD 0 P4D 0
>> [59390.889442] Oops: 0000 [#1] SMP PTI
>> [59390.889492] CPU: 1 PID: 8950 Comm: sync ve: 10001 Kdump: loaded Not 
>> tainted 4.18.0-240.1.1.vz8.5.47 #1 5.47
>> [59390.889554] Hardware name: Virtuozzo KVM, BIOS 1.10.2-3.1.vz7.3 
>> 04/01/2014
>> [59390.889622] RIP: 0010:sync_filesystems_ve+0x34/0x220
>> [59390.889673] Code: 55 41 54 55 53 48 83 ec 20 65 48 8b 04 25 28 00 
>> 00 00 48 89 44 24 18 31 c0 48 8b 87 98 01 00 00 48 8d 6c 24 08 48 89 
>> 6c 24 08 <4c> 8b 68 18 48 8b 44 24 08 48 89 6c 24 10 48 39 c5 0f 85 ce 
>> 01 00
>> [59390.889798] RSP: 0018:ffffb1b7810a7ec0 EFLAGS: 00010246
>> [59390.889849] RAX: 0000000000000000 RBX: ffff92309ab7c418 RCX: 
>> 0000000000000000
>> [59390.889903] RDX: ffff92308bbff180 RSI: 0000000000000000 RDI: 
>> ffff92309ab7c418
>> [59390.889958] RBP: ffffb1b7810a7ec8 R08: 0000000000000000 R09: 
>> 0000000000000000
>> [59390.890016] R10: 0000000000000000 R11: 0000000000000000 R12: 
>> 0000000000000000
>> [59390.890071] R13: 0000000000000000 R14: 0000000000000000 R15: 
>> 0000000000000000
>> [59390.890126] FS:  00007fd7880b6540(0000) GS:ffff9230bbb00000(0000) 
>> knlGS:0000000000000000
>> [59390.890184] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [59390.890235] CR2: 0000000000000018 CR3: 000000010b22e000 CR4: 
>> 00000000000006e0
>> [59390.890293] Call Trace:
>> [59390.890351]  ? __do_page_fault+0x23a/0x4f0
>> [59390.890407]  ksys_sync+0x10d/0x130
>> [59390.890456]  __ia32_sys_sync+0xa/0x10
>> [59390.890509]  do_syscall_64+0x5b/0x1a0
>> [59390.890562]  entry_SYSCALL_64_after_hwframe+0x65/0xca
>> [59390.890620] RIP: 0033:0x7fd787fe4ffb
>> [59390.890667] Code: c3 48 8b 0d a7 8e 0c 00 f7 d8 64 89 01 b8 ff ff 
>> ff ff eb c2 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 a2 00 00 
>> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8e 0c 00 f7 d8 64 89 
>> 01 48
>> [59390.890791] RSP: 002b:00007ffd853dd328 EFLAGS: 00000246 ORIG_RAX: 
>> 00000000000000a2
>> [59390.890848] RAX: ffffffffffffffda RBX: 00007ffd853dd468 RCX: 
>> 00007fd787fe4ffb
>> [59390.890903] RDX: 00007fd7880b2001 RSI: 0000000000000000 RDI: 
>> 00007fd788079b5e
>> [59390.890957] RBP: 0000000000000001 R08: 0000000000000000 R09: 
>> 0000000000000000
>> [59390.891012] R10: 0000000000000000 R11: 0000000000000246 R12: 
>> 0000000000000000
>> [59390.891067] R13: 0000000000000000 R14: 0000000000000000 R15: 
>> 00007fd7880ae1b4
>> [59390.896038] CR2: 0000000000000018
>>
>> https://jira.sw.ru/browse/PSBM-130894
>> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
>> ---
>>   fs/sync.c | 24 ++++++++++++++++++++++--
>>   1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/sync.c b/fs/sync.c
>> index ef4b1d1..2a7c96b 100644
>> --- a/fs/sync.c
>> +++ b/fs/sync.c
>> @@ -127,10 +127,20 @@ static int sync_filesystem_collected(struct 
>> list_head *sync_list, struct super_b
>>   static int sync_collect_filesystems(struct ve_struct *ve, struct 
>> list_head *sync_list)
>>   {
>>       struct mount *mnt;
>> -    struct mnt_namespace *mnt_ns = ve->ve_ns->mnt_ns;
>> +    struct mnt_namespace *mnt_ns;
>> +    struct nsproxy *ve_ns;
>>       struct sync_sb *ss;
>>       int ret = 0;
>> +    rcu_read_lock();
>> +    ve_ns = rcu_dereference(ve->ve_ns);
> 
> Above you read ve->ve_ns to ve_ns variable but never use it. You 
> probably designed it to use in the below code, because ve->ve_ns can 
> change to NULL after the check below and we still get crash.
> 
>> +    if (!ve->ve_ns) {
>> +        rcu_read_unlock();
>> +        return 0;
>> +    }
>> +    mnt_ns = ve->ve_ns->mnt_ns;

One more thing, should not we get some refference on mnt_ns? Can not it 
be released/freed by put_nsproxy() on ve_ns?

>> +    rcu_read_unlock();
>> +
>>       BUG_ON(!list_empty(sync_list));
>>       down_read(&namespace_sem);
>> @@ -185,9 +195,19 @@ static void sync_filesystems_ve(struct ve_struct 
>> *ve, int wait)
>>   static int is_sb_ve_accessible(struct ve_struct *ve, struct 
>> super_block *sb)
>>   {
>>       struct mount *mnt;
>> -    struct mnt_namespace *mnt_ns = ve->ve_ns->mnt_ns;
>> +    struct mnt_namespace *mnt_ns;
>> +    struct nsproxy *ve_ns;
>>       int ret = 0;
>> +    rcu_read_lock();
>> +    ve_ns = rcu_dereference(ve->ve_ns);
> 
> Same thing here.
> 
>> +    if (!ve->ve_ns) {
>> +        rcu_read_unlock();
>> +        return 0;
>> +    }
>> +    mnt_ns = ve->ve_ns->mnt_ns;
>> +    rcu_read_unlock();
>> +
>>       down_read(&namespace_sem);
>>       list_for_each_entry(mnt, &mnt_ns->list, mnt_list) {
>>           if (mnt->mnt.mnt_sb == sb) {
>>
> 

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.


More information about the Devel mailing list