[Devel] [PATCH RH8] ve/devtmpfs: lightweight virtualization
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Mon Jul 26 15:15:38 MSK 2021
>> @@ -62,6 +63,13 @@ static struct dentry *public_dev_mount(struct file_system_type *fs_type, int fla
>> const char *dev_name, void *data)
>> {
>> struct super_block *s = mnt->mnt_sb;
>> +#ifdef CONFIG_VE
>> + struct ve_struct *ve = get_exec_env();
>> +
>> + if (!ve_is_super(ve))
>> + s = ve->devtmpfs_mnt->mnt_sb;
>> +#endif
>> +
>
> We don't have any lock here, so why can't we get a race with ve_destroy()?
> Because ve_destroy() is called on ve cgroup destruction and no processes
> are in this cgroup and thus get_exec_env() can't return semi-dead ve
> cgroup?
Yes, in the common case current holds the ve cgroup, so the cgroup
can't reach ve_destroy() while it is populated:
https://github.com/OpenVZ/vzkernel/blob/branch-rh8-4.18.0-305.3.1.vz8.7.x-ovz/kernel/cgroup/cgroup.c#L5485
But there is an uncommon case: we first do ve = get_exec_env(), then the
current task is moved into another ve cgroup, our old ve cgroup is
destroyed and freed, and only after that do we access
ve->devtmpfs_mnt->mnt_sb and probably crash on it.
Any other place where we access pointers from a ve obtained via
get_exec_env() without locks is affected, e.g. ve_relative_clock() and
probably many more... We could fix this by making get_exec_env() take an
actual reference count and adding a new helper, e.g. put_ve(ve), to drop
that reference... But it would change code everywhere, and this case of a
task being moved to another ve by another task while in a syscall is
probably a rare one...
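
To make the reference-count idea concrete, here is a minimal userspace
sketch, not kernel code. All names in it (struct ve_sketch, get_ve(),
ve_sketch_alloc(), demo_pinned_access()) are hypothetical and only
illustrate the get/put pairing; put_ve() is the helper name suggested
above, with an assumed signature. It replays the ordering from the
uncommon case: snapshot the ve, destroy the cgroup, then access the
pointer. It does not model how a real get_exec_env() would make the
lookup and the increment safe against a concurrent free, which would
need extra care in the kernel.

```c
/* Userspace sketch only: none of these names are the vzkernel API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct ve_sketch {
	atomic_int refcount;   /* hypothetical lifetime counter */
	int devtmpfs_sb_id;    /* stands in for ve->devtmpfs_mnt->mnt_sb */
};

/* Counts frees so a caller can observe them (single-threaded demo). */
static int ve_sketch_freed;

static struct ve_sketch *ve_sketch_alloc(int sb_id)
{
	struct ve_sketch *ve = malloc(sizeof(*ve));

	if (!ve)
		return NULL;
	atomic_init(&ve->refcount, 1);  /* reference owned by the "cgroup" */
	ve->devtmpfs_sb_id = sb_id;
	return ve;
}

/* Hypothetical get side: pin the ve before dereferencing its pointers. */
static struct ve_sketch *get_ve(struct ve_sketch *ve)
{
	atomic_fetch_add(&ve->refcount, 1);
	return ve;
}

/* Put side: drop one reference; the last put frees the object. */
static void put_ve(struct ve_sketch *ve)
{
	if (atomic_fetch_sub(&ve->refcount, 1) == 1) {
		ve_sketch_freed++;
		free(ve);
	}
}

/* Replays the ordering: snapshot, destroy, then access. */
static int demo_pinned_access(void)
{
	struct ve_sketch *ve = ve_sketch_alloc(42);
	struct ve_sketch *pinned;
	int sb_id;

	if (!ve)
		return -1;
	pinned = get_ve(ve);            /* step 1: snapshot, now pinned */
	put_ve(ve);                     /* step 2: "cgroup" drops its reference */
	assert(ve_sketch_freed == 0);   /* the pin kept the object alive */
	sb_id = pinned->devtmpfs_sb_id; /* step 3: access is still valid */
	put_ve(pinned);                 /* step 4: last reference, object freed */
	return sb_id;
}
```

Without the get_ve() in step 1, step 2 would free the object and step 3
would be a use-after-free, which is the crash described above. The cost
is the one already mentioned: every get_exec_env() user would need a
matching put.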
>
>> atomic_inc(&s->s_active);
>> down_write(&s->s_umount);
>> return dget(s->s_root);
>> @@ -82,6 +90,7 @@ static struct file_system_type internal_fs_type = {
>> static struct file_system_type dev_fs_type = {
>> .name = "devtmpfs",
>> .mount = public_dev_mount,
>> + .fs_flags = FS_VIRTUALIZED | FS_USERNS_MOUNT,
>
> I'll put FS_VE_MOUNT instead of FS_USERNS_MOUNT.
>
> I've checked on the host:
> # unshare -U
> # mount -t devtmpfs devtmpfs /mnt
> does not work
>
> If you have any arguments against this - please let me know.
>
Right, I was initially planning to do so, but forgot.
--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.