[Devel] [PATCH RH7 00/32] port nsfs from vz8
Konstantin Khorenko
khorenko at virtuozzo.com
Tue Jun 16 13:17:34 MSK 2020
On 06/08/2020 08:05 PM, Pavel Tikhomirov wrote:
> We have problems with /proc/pid/ns/name bind-mounts in CRIU
>
> 1) Currently (without nsfs) such a bind mount has the same superblock as
> the /proc mount, but with nested pid namespaces a container can have
> multiple different /proc mounts, and an ns-bind-mount must be bound
> from the right pidns. So we would need to enter the proper pid namespace
> to reopen the ns-file fd from the proper proc, which looks too complex.
>
> If we port nsfs, all ns-bind-mounts will be on the same superblock, which
> does not depend on the procfs we opened the ns-file on.
>
> 2) A bigger problem will come when we want to migrate ns-bind-mounts
> from a non-nsfs kernel to an nsfs (vz8) kernel. This would bring a lot of
> crutches: we would need to work around the fact that before migration the
> mounts were on the same superblock as /proc and after migration they can't be.
>
> To overcome these we can port nsfs to vz7 and implement ns-bind-mount
> support in the new world of nsfs; it looks like that would be easier.
>
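For illustration only, the difference between the two worlds can be observed from userspace. The following is a minimal C sketch, assuming a Linux host, that opens a namespace file and checks the filesystem magic reported by fstatfs(). The helper name ns_file_on_nsfs() is hypothetical, and the fallback NSFS_MAGIC value is the upstream one; nothing here is taken from the patch set itself.

```c
#include <fcntl.h>
#include <sys/vfs.h>
#include <unistd.h>

/* Upstream value from include/uapi/linux/magic.h; assumed to be the same
 * in the ported kernel. */
#ifndef NSFS_MAGIC
#define NSFS_MAGIC 0x6e736673
#endif

/*
 * Hypothetical helper: returns 1 if the file at `path` is backed by nsfs,
 * 0 if it is backed by some other filesystem (for example procfs on a
 * kernel without nsfs), and -1 if the file cannot be opened or queried.
 */
static int ns_file_on_nsfs(const char *path)
{
    struct statfs sfs;
    int fd, ret;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ret = fstatfs(fd, &sfs);
    close(fd);
    if (ret < 0)
        return -1;

    return (long)sfs.f_type == (long)NSFS_MAGIC ? 1 : 0;
}
```

On a kernel with nsfs, ns_file_on_nsfs("/proc/self/ns/net") would be expected to return 1 no matter which /proc mount the path goes through; on a kernel without nsfs the file lives on procfs and the result would be 0.
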
> First we need to revert all patches which depend on nsfs:
>
> 8782a0069f1b proc: add a proc_show_path method to fix mountinfo
> b823f8df2fcb ms/tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of tun device
> 302889fa2e3d ms/net: add an ioctl to get a socket network namespace
> 7cb9e7ae7041 ms/tun: Add ioctl() SIOCGSKNS cmd to allow obtaining net ns of tun device
> ac08c64138ac nsfs: add ioctl to get a parent namespace
> a8e0dd94d5cd nsfs: add ioctl to get an owning user namespace for ns file descriptor
> 93dca538d184 kernel: add a helper to get an owning user namespace for a namespace
> edaecdb8adac ms/pidns: expose task pid_ns_for_children to userspace
> 2b151c3f8909 ms/ns: allow ns_entries to have custom symlink content
>
> Cherry-pick nsfs from VZ8:
>
> 435d5f4bb2cc common object embedded into various struct ....ns
> 58be28256d98 make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
> ff24870f46d5 netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
> 3c0411846118 switch the rest of proc_ns_operations to working with &...->ns
> 64964528b24e make proc_ns_operations work with struct ns_common * instead of void *
> 6344c433a452 new helpers: ns_alloc_inum/ns_free_inum
> 33c429405a2c copy address of proc_ns_ops into ns_common
> f77c80142e1a bury struct proc_ns in fs/proc
> 292662014509 dcache.c: call ->d_prune() regardless of d_unhashed()
> e149ed2b805f take the targets of /proc/*/ns/* symlinks to separate fs
>
> Cherry-pick part of reverted patches back from VZ8:
>
> bcac25a58bfc kernel: add a helper to get an owning user namespace for a namespace
> 6786741dbf99 nsfs: add ioctl to get an owning user namespace for ns file descriptor
> a7306ed8d94a nsfs: add ioctl to get a parent namespace
> c62cce2caee5 net: add an ioctl to get a socket network namespace
> 25b14e92af1a ns: allow ns_entries to have custom symlink content
> eaa0d190bfe1 pidns: expose task pid_ns_for_children to userspace
>
> Cherry-pick reverted patches back from MS (we also need them in vz8):
>
> 75509fd88fbd nsfs: Add a show_path method to fix mountinfo
> 24dce0800baa net: Export open_related_ns()
> d8d211a2a0c3 net: Make extern and export get_net_ns()
> f2780d6d7475 tun: Add ioctl() SIOCGSKNS cmd to allow obtaining net ns of tun device
> 0c3e0e3bb623 tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of tun device
Only these two tun-related patches were missing from the vz8 kernel, so I backported them from ms.
> 073c516ff735 nsfs: mark dentry with DCACHE_RCUACCESS
>
> On this kernel I've run zdtm, so the change should not break interfaces.
>
> https://jira.sw.ru/browse/PSBM-102357
>
> Al Viro (10):
> ms: common object embedded into various struct ....ns
> make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
> netns: switch ->get()/->put()/->install()/->inum() to working with
> &net->ns
> switch the rest of proc_ns_operations to working with &...->ns
> make proc_ns_operations work with struct ns_common * instead of void *
> new helpers: ns_alloc_inum/ns_free_inum
> copy address of proc_ns_ops into ns_common
> bury struct proc_ns in fs/proc
> dcache.c: call ->d_prune() regardless of d_unhashed()
> take the targets of /proc/*/ns/* symlinks to separate fs
>
> Andrey Vagin (4):
> kernel: add a helper to get an owning user namespace for a namespace
> nsfs: add ioctl to get an owning user namespace for ns file descriptor
> nsfs: add ioctl to get a parent namespace
> net: add an ioctl to get a socket network namespace
>
> Cong Wang (1):
> nsfs: mark dentry with DCACHE_RCUACCESS
>
> Eric W. Biederman (1):
> nsfs: Add a show_path method to fix mountinfo
>
> Kirill Tkhai (6):
> ns: allow ns_entries to have custom symlink content
> pidns: expose task pid_ns_for_children to userspace
> net: Export open_related_ns()
> net: Make extern and export get_net_ns()
> tun: Add ioctl() SIOCGSKNS cmd to allow obtaining net ns of tun device
> tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of
> tun device
>
> Pavel Tikhomirov (10):
> Revert "proc: add a proc_show_path method to fix mountinfo"
> Revert "ms/tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real
> net ns of tun device"
> Revert "ms/net: add an ioctl to get a socket network namespace"
> Revert "ms/tun: Add ioctl() SIOCGSKNS cmd to allow obtaining net ns of
> tun device"
> Revert "nsfs: add ioctl to get a parent namespace"
> Revert "nsfs: add ioctl to get an owning user namespace for ns file
> descriptor"
> Revert "kernel: add a helper to get an owning user namespace for a
> namespace"
> Revert "ms/pidns: expose task pid_ns_for_children to userspace"
> Revert "ms/ns: allow ns_entries to have custom symlink content"
> userns: move EXPORT_SYMBOL closer to current_in_userns
>
> drivers/net/tun.c | 15 +-
> fs/Makefile | 2 +-
> fs/dcache.c | 2 +-
> fs/internal.h | 5 +
> fs/mount.h | 3 +-
> fs/namespace.c | 56 ++++----
> fs/nfs_common/grace.c | 2 +-
> fs/nfsd/nfs4recover.c | 2 +-
> fs/nsfs.c | 255 +++++++++++++++++++++++++++++++++
> fs/proc/inode.c | 24 ----
> fs/proc/internal.h | 5 +
> fs/proc/namespaces.c | 245 +++----------------------------
> include/linux/ipc_namespace.h | 3 +-
> include/linux/ns_common.h | 12 ++
> include/linux/pid_namespace.h | 3 +-
> include/linux/proc_fs.h | 4 +
> include/linux/proc_ns.h | 52 ++++---
> include/linux/socket.h | 2 +
> include/linux/user_namespace.h | 10 +-
> include/linux/utsname.h | 3 +-
> include/net/net_namespace.h | 3 +-
> include/uapi/linux/magic.h | 1 +
> include/uapi/linux/nsfs.h | 13 ++
> init/main.c | 2 +
> init/version.c | 5 +-
> ipc/msgutil.c | 5 +-
> ipc/namespace.c | 37 +++--
> kernel/nsproxy.c | 10 +-
> kernel/pid.c | 5 +-
> kernel/pid_namespace.c | 55 ++++---
> kernel/user.c | 5 +-
> kernel/user_namespace.c | 46 +++---
> kernel/utsname.c | 36 +++--
> net/core/net_namespace.c | 44 +++---
> net/socket.c | 26 +---
> 35 files changed, 521 insertions(+), 477 deletions(-)
> create mode 100644 fs/nsfs.c
> create mode 100644 include/linux/ns_common.h
> create mode 100644 include/uapi/linux/nsfs.h
>
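
As a rough illustration of what the nsfs ioctls listed in the series give userspace, here is a minimal C sketch that asks for the user namespace owning a given namespace file. It assumes the ported include/uapi/linux/nsfs.h carries the same ioctl numbers as upstream (NSIO 0xb7, NS_GET_USERNS 0x1); the helper name open_owning_userns() is hypothetical.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Upstream definitions from include/uapi/linux/nsfs.h; assumed unchanged
 * in the port. Defined here so the sketch builds without that header. */
#ifndef NS_GET_USERNS
#define NSIO 0xb7
#define NS_GET_USERNS _IO(NSIO, 0x1)
#endif

/*
 * Hypothetical helper: opens the namespace file at `ns_path` and returns a
 * new fd referring to the user namespace that owns it, or -1 on failure
 * (missing file, kernel without the ioctl, or insufficient permissions).
 * The caller closes the returned fd.
 */
static int open_owning_userns(const char *ns_path)
{
    int ns_fd, userns_fd;

    ns_fd = open(ns_path, O_RDONLY);
    if (ns_fd < 0)
        return -1;

    /* On success the ioctl returns a file descriptor, not a status code. */
    userns_fd = ioctl(ns_fd, NS_GET_USERNS);
    close(ns_fd);

    return userns_fd < 0 ? -1 : userns_fd;
}
```

A caller would pass something like "/proc/self/ns/net" and treat -1 as "not supported or not permitted" rather than as a hard error.
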