[Devel] [PATCH RH8] ve/devtmpfs: lightweight virtualization

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Mon Jul 26 15:15:38 MSK 2021


>> @@ -62,6 +63,13 @@ static struct dentry *public_dev_mount(struct file_system_type *fs_type, int fla
>>                const char *dev_name, void *data)
>>  {
>>      struct super_block *s = mnt->mnt_sb;
>> +#ifdef CONFIG_VE
>> +    struct ve_struct *ve = get_exec_env();
>> +
>> +    if (!ve_is_super(ve))
>> +        s = ve->devtmpfs_mnt->mnt_sb;
>> +#endif
>> +
> 
> We don't have any lock here, so why can't we get a race with ve_destroy()?
> Because ve_destroy() is called on ve cgroup destruction and no processes
> are in this cgroup and thus get_exec_env() can't return semi-dead ve 
> cgroup?

Yes, in the common case current is holding the ve cgroup, and the cgroup
can't get to ve_destroy() while it is populated.

https://github.com/OpenVZ/vzkernel/blob/branch-rh8-4.18.0-305.3.1.vz8.7.x-ovz/kernel/cgroup/cgroup.c#L5485

But there is an uncommon case: we first do ve = get_exec_env(), then the
current task is moved into another ve cgroup, and our old ve cgroup is
destroyed and freed. Only after that do we access
ve->devtmpfs_mnt->mnt_sb, and we probably crash on it.

Any other place where we access pointers from a ve obtained via
get_exec_env() without locks is affected, e.g. ve_relative_clock() and
probably many more. We could fix this by making get_exec_env() take an
actual reference count and adding a new helper, e.g. put_ve(ve), to drop
this reference. But that would change code everywhere, and the case of a
task being moved to another ve by another task while in a syscall is
probably a rare one.
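
To make the get/put idea concrete, here is a minimal userspace sketch of
the pattern, not kernel code. Everything named ve_sketch_* is
hypothetical and only stands in for ve_struct, a reference-taking
get_exec_env() and the proposed put_ve(). It assumes the caller already
knows the object is alive when taking the reference; in the kernel the
lookup itself would presumably also need protection (RCU or a lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for ve_struct; not the vzkernel definition. */
struct ve_sketch {
    atomic_int refcnt;   /* starts at 1: the reference held by the owner */
    int devtmpfs_sb_id;  /* stand-in for ve->devtmpfs_mnt->mnt_sb */
};

/* Counts how many objects were freed, so the demo can be checked. */
int ve_sketch_freed;

static struct ve_sketch *ve_sketch_alloc(int sb_id)
{
    struct ve_sketch *ve = malloc(sizeof(*ve));

    if (!ve)
        return NULL;
    atomic_init(&ve->refcnt, 1);
    ve->devtmpfs_sb_id = sb_id;
    return ve;
}

/* Take an extra reference (what a counting get_exec_env() would do). */
static struct ve_sketch *ve_sketch_get(struct ve_sketch *ve)
{
    atomic_fetch_add(&ve->refcnt, 1);
    return ve;
}

/* Drop a reference (the proposed put_ve()); the last put frees. */
static void ve_sketch_put(struct ve_sketch *ve)
{
    int old = atomic_fetch_sub(&ve->refcnt, 1);

    assert(old > 0);     /* catch unbalanced puts */
    if (old == 1) {
        ve_sketch_freed++;
        free(ve);
    }
}

/*
 * Replays the uncommon case: we take a reference, the owner drops its
 * own reference ("cgroup destroyed"), and we still read the field
 * safely because our reference keeps the object alive.
 */
int ve_sketch_demo(void)
{
    struct ve_sketch *owner = ve_sketch_alloc(42);
    struct ve_sketch *ve;
    int sb;

    if (!owner)
        return -1;
    ve = ve_sketch_get(owner);   /* like ve = get_exec_env() */
    ve_sketch_put(owner);        /* old ve cgroup goes away meanwhile */
    sb = ve->devtmpfs_sb_id;     /* still valid: we hold a reference */
    ve_sketch_put(ve);           /* last reference, object is freed */
    return sb;
}
```

The cost is visible even in this sketch: every user of the pointer has
to pair the get with a put, which is the "change code everywhere" part.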

> 
>>      atomic_inc(&s->s_active);
>>      down_write(&s->s_umount);
>>      return dget(s->s_root);
>> @@ -82,6 +90,7 @@ static struct file_system_type internal_fs_type = {
>>  static struct file_system_type dev_fs_type = {
>>      .name = "devtmpfs",
>>      .mount = public_dev_mount,
>> +    .fs_flags = FS_VIRTUALIZED | FS_USERNS_MOUNT,
> 
> i'll put FS_VE_MOUNT instead of FS_USERNS_MOUNT.
> 
> i've checked on host:
>   # unshare -U
>   # mount -t devtmpfs devtmpfs /mnt
>   does not work
> 
> If you have any arguments against this - please let me know.
> 

Right, I was initially planning to do so, but forgot.

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.

