[CRIU] [RFC PATCH 0/4] namespacefs: Proof-of-Concept

Mike Rapoport rppt at kernel.org
Fri Nov 19 00:24:02 MSK 2021


(added more CRIU folks)

On Thu, Nov 18, 2021 at 08:12:06PM +0200, Yordan Karadzhov (VMware) wrote:
> We introduce a simple read-only virtual filesystem that provides a
> direct mechanism for examining the existing hierarchy of namespaces
> on the system. For the purposes of this PoC, we tried to keep the
> implementation of the pseudo filesystem as simple as possible. Only
> two namespace types (PID and UTS) are coupled to it for the moment.
> Nevertheless, we do not expect significant problems in adding
> all other namespace types.
> 
> When fully functional, 'namespacefs' will allow the user to see all
> namespaces that are active on the system and to easily retrieve the
> specific data managed by each namespace, for example the PIDs of
> all tasks enclosed in the individual PID namespaces. Every existing
> namespace on the system will be represented by its corresponding
> directory in 'namespacefs'. When a namespace is created, a directory
> will be added. When a namespace is destroyed, its corresponding
> directory will be removed. The hierarchy of the directories will
> follow the hierarchy of the namespaces.
> 
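To make the directory layout described above concrete, here is a minimal
sketch of walking such a tree. It assumes (this is not confirmed by the
cover letter beyond the description above) that every namespace is a
directory named by its inode number and that child namespaces are nested
inside their parent's directory. The helper name ns_tree is hypothetical,
and the mount point is the one used by the example script in this thread.

```python
import os

def ns_tree(root, depth=0):
    """Return (depth, name) pairs for every directory below root.

    Assumption: each directory stands for one namespace, and nested
    namespaces appear as subdirectories of their parent namespace.
    Regular files (such as a 'tasks' file) are skipped.
    """
    entries = []
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            entries.append((depth, entry.name))
            entries.extend(ns_tree(entry.path, depth + 1))
    return entries

if __name__ == "__main__":
    # Mount point as used by the example script in this thread.
    root = '/sys/fs/namespaces/pid'
    if os.path.isdir(root):
        for depth, name in ns_tree(root):
            print('  ' * depth + name)
```

On a kernel without the patches the directory does not exist and the
script prints nothing.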
> One may argue that most of the information exposed by this
> new filesystem is already provided by 'procfs' in /proc/*/ns/. In
> fact, 'namespacefs' aims to be complementary to 'procfs', showing not
> only the individual connections between a process and its namespaces,
> but also the global hierarchy of these connections. As a usage example,
> before playing with 'namespacefs', I had no idea that the Chrome web
> browser creates a number of nested PID namespaces. I can only guess
> that each tab or each site is isolated in a nested namespace.
> 
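For comparison, the per-process view that 'procfs' already offers can be
sketched as follows. The code reads the /proc/<pid>/ns/<type> symlinks,
whose targets have the form 'pid:[4026531836]', and groups processes by
namespace inode number. The function names and the 'proc' parameter are
illustrative only. The result is a flat mapping: the links by themselves
say nothing about which namespace is nested in which.

```python
import os
import re
from collections import defaultdict

# Link targets under /proc/<pid>/ns/ look like 'pid:[4026531836]'.
NS_LINK = re.compile(r'^(\w+):\[(\d+)\]$')

def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError('unexpected namespace link: {0!r}'.format(target))
    return match.group(1), int(match.group(2))

def group_by_ns(ns_type='pid', proc='/proc'):
    """Map namespace inode number -> list of PIDs using that namespace."""
    groups = defaultdict(list)
    for name in os.listdir(proc):
        if not name.isdigit():
            continue  # not a process directory
        link = os.path.join(proc, name, 'ns', ns_type)
        try:
            target = os.readlink(link)
        except OSError:
            continue  # process exited or access was denied
        groups[parse_ns_link(target)[1]].append(int(name))
    return dict(groups)

if __name__ == "__main__":
    for inum, pids in sorted(group_by_ns('pid').items()):
        print('{0}: {1} task(s)'.format(inum, len(pids)))
```

Run without privileges, this only covers processes whose ns links the
caller is allowed to read.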
> Being able to see the structure of the namespaces can be very useful
> in the context of containerized workloads. This will provide
> universal methods for detecting, examining and monitoring all sorts
> of containers running on the system, without relying on any specific
> user-space software. For example, with the help of 'namespacefs',
> the simple Python script below can discover all containers created
> by 'Docker' and 'Podman' (by all users) that are currently running on
> the system.
> 
> 
> import sys
> import os
> import pwd
> 
> path = '/sys/fs/namespaces'
> 
> def pid_ns_tasks(inum):
>     # PIDs of all tasks enclosed in the PID namespace with this inode number.
>     tasks_file = '{0}/pid/{1}/tasks'.format(path, inum)
>     with open(tasks_file) as f:
>         return [int(pid) for pid in f]
> 
> def uts_ns_inum(pid):
>     # The link target looks like 'uts:[4026531838]'; extract the number.
>     uts_ns_file = '/proc/{0}/ns/uts'.format(pid)
>     uts_ns = os.readlink(uts_ns_file)
>     return uts_ns.split('[')[1].split(']')[0]
> 
> def container_info(pid_inum):
>     pids = pid_ns_tasks(pid_inum)
>     name = ''
>     uid = -1
> 
>     if len(pids):
>         # Look up the UTS namespace of the first task in the PID namespace.
>         uts_inum = uts_ns_inum(pids[0])
>         uname_file = '{0}/uts/{1}/uname'.format(path, uts_inum)
>         if os.path.exists(uname_file):
>             stat_info = os.stat(uname_file)
>             uid = stat_info.st_uid
>             with open(uname_file) as f:
>                 name = f.read().split()[1]
> 
>     return name, pids, uid
> 
> if __name__ == "__main__":
>     pid_ns_list = os.listdir('{0}/pid'.format(path))
>     for inum in pid_ns_list:
>         name, pids, uid = container_info(inum)
>         if name:
>             user = pwd.getpwuid(uid).pw_name
>             print("{0} -> pids: {1} user: {2}".format(name, pids, user))
> 
> 
> 
> The idea for 'namespacefs' is inspired by the discussion of the
> 'Container tracing' topic [1] during the 'Tracing micro-conference' [2]
> at LPC 2021.
> 
> 1. https://www.youtube.com/watch?v=09bVK3f0MPg&t=5455s
> 2. https://www.linuxplumbersconf.org/event/11/page/104-accepted-microconferences
> 
> 
> Yordan Karadzhov (VMware) (4):
>   namespacefs: Introduce 'namespacefs'
>   namespacefs: Add methods to create/remove PID namespace directories
>   namespacefs: Couple namespacefs to the PID namespace
>   namespacefs: Couple namespacefs to the UTS namespace
> 
>  fs/Kconfig                  |   1 +
>  fs/Makefile                 |   1 +
>  fs/namespacefs/Kconfig      |   6 +
>  fs/namespacefs/Makefile     |   4 +
>  fs/namespacefs/inode.c      | 410 ++++++++++++++++++++++++++++++++++++
>  include/linux/namespacefs.h |  73 +++++++
>  include/linux/ns_common.h   |   4 +
>  include/uapi/linux/magic.h  |   2 +
>  kernel/pid_namespace.c      |   9 +
>  kernel/utsname.c            |   9 +
>  10 files changed, 519 insertions(+)
>  create mode 100644 fs/namespacefs/Kconfig
>  create mode 100644 fs/namespacefs/Makefile
>  create mode 100644 fs/namespacefs/inode.c
>  create mode 100644 include/linux/namespacefs.h
> 
> -- 
> 2.33.1
> 

-- 
Sincerely yours,
Mike.