[CRIU] [PATCH RESEND v1 00/55] Nested pid namespaces support
Andrei Vagin
avagin at virtuozzo.com
Tue Mar 28 14:41:21 PDT 2017
On Tue, Mar 28, 2017 at 01:06:12PM +0300, Kirill Tkhai wrote:
> On 27.03.2017 21:13, Andrei Vagin wrote:
> > On Fri, Mar 24, 2017 at 06:09:03PM +0300, Kirill Tkhai wrote:
> >> Hi,
> >>
> >> this is the first version of nested pid namespaces support.
> >> Dump and restore of tasks and threads are completely implemented,
> >> and the pidns00 test is added.
> >>
> >> The biggest problem, and a strange thing I bumped into, is that a task may
> >> write "/proc/sys/kernel/ns_last_pid" only for its active pid namespace,
> >> regardless of which pid ns the file was opened in and which caps the
> >> task has. To solve this problem, pid_ns helpers that populate the appropriate
> >> ns_last_pid were implemented. See the corresponding patches for the
> >> details.
> >>
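For illustration, here is a minimal sketch of the ns_last_pid write described above. It is not the code from the patches: request_next_pid() is a hypothetical name, and the sketch assumes the usual behaviour that the kernel normally gives the next fork()/clone() the value last_pid + 1. Since the write only affects the writer's active pid namespace, a call like this would have to run in a helper that already lives in the target pid_ns.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* File holding the last pid allocated in the caller's active pid_ns. */
#define LAST_PID_PATH "/proc/sys/kernel/ns_last_pid"

/*
 * Hypothetical helper (not a CRIU function): request that the next
 * fork()/clone() gets `next_pid` by writing next_pid - 1 to `path`.
 * Returns the still-locked fd on success, so the caller can fork and
 * only then close it; returns -1 on error.
 */
int request_next_pid(const char *path, pid_t next_pid)
{
	char buf[32];
	int fd, len;

	if (next_pid < 1)
		return -1;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Keep other writers away until the caller has forked. */
	if (flock(fd, LOCK_EX) < 0) {
		close(fd);
		return -1;
	}

	len = snprintf(buf, sizeof(buf), "%d", (int)next_pid - 1);
	if (write(fd, buf, len) != len) {
		close(fd);
		return -1;
	}

	return fd;
}
```

A caller inside the right namespace would use request_next_pid(LAST_PID_PATH, pid), fork, check the child's pid, and close the returned fd to drop the lock.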
> >> Patches [1-5/55] duplicate <<[PATCH 0/5] Restore of "/proc/self/ns/net" fixes>>,
> >> which I sent today, because they fix errors that are seen only after
> >> patch [6/55], and I don't want to receive crappy travis errors.
> >>
> >> Also, there is an error with nested net_ns (see <<[net_ns] Problem of
> >> restoring tun in nested net namespace>>), so I temporarily switched the tun
> >> test off in 55/55 till it's fixed.
> >>
> >> The patch set doesn't do anything to restore pgid/sid if a nested pid_ns
> >> is dumped (Pavel Tikhomirov is completely reworking pgid/sid
> >> at the moment, so we do not both do the same work),
> >> so for such cases I switch restore of pgid/sid off in [55/55]. If you
> >> want, we may put it under a criu option till it's implemented.
> >>
> >
> > Can you explain how you restore userns for pid namespaces?
>
> I do not restore it in any way, but it can be done simply by using an intermediate
> helper with the appropriate user_ns set. Do you think we need this in the next
> version of the patchset?
Yes, I think we do.
>
> > If I
> > understand you right, userns is now restored before opening files. If
> > the answer is yes, we can hit a situation where a process will not have
> > enough rights to restore a file descriptor. Have you thought about this
> > problem and do you have any ideas how to solve it?
>
> We may create fake file_list_entry'es in parent processes, which have the rights,
> mark such files as masters, and receive the fds via the transport socket.
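The fd hand-off mentioned above relies on the standard SCM_RIGHTS mechanism of Unix sockets. A minimal sketch follows, assuming a connected or paired AF_UNIX socket; send_fd() and recv_fd() are illustrative names, not CRIU's transport-socket API, and the "master" bookkeeping is left out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor `fd` over the Unix socket `sock`. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
	char dummy = 'F';	/* at least one byte of real data is sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment */
	} ctrl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`. Returns the new fd or -1. */
int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the scheme proposed above, the parent that has the rights would open the file and call send_fd(); the task that lacks the rights would call recv_fd() and install the result at the fd number it needs.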
>
> >> ---
> >>
> >> Kirill Tkhai (55):
> >> user_ns: Close pid proc in create_user_ns_hierarhy_fn()
> >> ns: Pack functionality of storing ns fd to store_self_ns()
> >> user_ns: Keep root_user_ns ns fd in fdstore
> >> ns: Fix wrong opened net ns file
> >> zdtm: Add proc-self01 test
> >> net: Do not change net_ns of root_item in create_net_ns()
> >> cr-restore: Open transport socket earlier
> >> zdtm: Add pidns00 test
> >> kerndat: Check that "/proc/[pid]/status" file has NS{pid,..} lines
> >> pid: Add pid::level field and level argument for __alloc_pstree_item()
> >> pid: Add equel_pid() helper
> >> pid: Add last_level_pid() helper
> >> pid: Make pgid and sid be allocated dynamically
> >> pid: Use last_level_pid() in restore_pgid()
> >> pid: Alloc threads dynamically
> >> pid: Pass thread pid to caller
> >> pstree: Change arguments of read_pstree_ids()
> >> pstree: Read ids earlier in read_pstree_image()
> >> pid: Add top_pid_ns
> >> pid: Add ns::pid::rb_root
> >> ids: Copy unexisted ids from root_item
> >> pstree: Move parent assignment in read_pstree_image() up
> >> pstree: Assign ids for dead tasks in read_pstree_image()
> >> pstree: Dump pid and user ns ids for dead tasks
> >> pstree: Add pid_ns check in read_pstree_image
> >> pstree: Split lookup_create_pid()
> >> pstree: Add pid_ns id argument to lookup_create_pid()
> >> ns: Add MAX_NS_NESTING
> >> pstree: Make lookup_create_pid() able to create tasks with pid->level > 1
> >> pid: Implement populate_ns_pids() helper
> >> proc_parse: Implement collect_pid_status()
> >> pid_ns: Implement pid_ns_root_off()
> >> pid: Use collect_pid_status() to populate item's pids
> >> images: Add NSpids pstree descriptions
> >> pstree: Dump and restore NSpid, NSsid etc
> >> pstree: Make get_free_pid() work for different pid_ns and export it
> >> pstree: Extract __pstree_item_by_virt() to act on any pid_ns
> >> ns: Reserve pid_ns helpers
> >> restore: Implement set_next_pid() helper
> >> pid: Always lock last pid file on clone()
> >> pid: Add fdstore id for pid_ns descriptor
> >> fdstore: Init fdstore earlier
> >> pid: Save created pid_ns fd to fdstore
> >> ns: Always start usernsd
> >> pid: Add pid ns futex helper_created
> >> ns: Install transport fd socket in usernsd
> >> pid: Create pid_ns helpers
> >> pid: Wait till pid_ns created before we create a child of this ns
> >> pid: Set pid_ns before we create a child
> >> pid: Teach set_next_pid() working with nested pid_ns
> >> restorer: Close transport socket later
> >> restorer: Set NStids in all pid_ns for thread before we create it.
> >> pid: Check for equality of getpid() of child to last_level_pid
> >> pstree: Use CLONE_NEWPID only to create child reaper of pid_ns
> >> ns: Nested pid_ns support
> >>
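Several patches in the list above depend on the NSpid line of "/proc/[pid]/status", which lists a task's pid in each nested pid namespace, outermost (as seen from the procfs mount) first. A minimal sketch of parsing one such line is below; parse_nspid_line() is a hypothetical name and is not the collect_pid_status() from the series.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical parser for a line such as "NSpid:\t1234\t56\t1\n".
 * Stores up to `max` pids in `pids`, outermost namespace first.
 * Returns the number of levels found, or -1 if the line is not an
 * NSpid line, holds no pids, or has more than `max` levels.
 */
int parse_nspid_line(const char *line, pid_t *pids, int max)
{
	static const char prefix[] = "NSpid:";
	char *end;
	int n = 0;

	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return -1;
	line += sizeof(prefix) - 1;

	for (;;) {
		/* strtol() skips the tabs between the fields. */
		long v = strtol(line, &end, 10);

		if (end == line)
			break;		/* no more numbers on the line */
		if (v <= 0 || n == max)
			return -1;
		pids[n++] = (pid_t)v;
		line = end;
	}

	return n > 0 ? n : -1;
}
```

The returned count corresponds to the nesting depth of the task's pid namespace relative to the reader, which is the kind of value a per-pid level field would hold.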
> >>
> >>  criu/cr-dump.c                  | 123 +++++++--
> >>  criu/cr-restore.c               | 237 +++++++++++++----
> >>  criu/files-reg.c                |  10 -
> >>  criu/files.c                    |  20 +
> >>  criu/include/kerndat.h          |   1
> >>  criu/include/namespaces.h       |  21 +
> >>  criu/include/parasite-syscall.h |   2
> >>  criu/include/pid.h              |  23 ++
> >>  criu/include/proc_parse.h       |  13 +
> >>  criu/include/pstree.h           |  30 +-
> >>  criu/include/restorer.h         |   6
> >>  criu/include/rst_info.h         |   1
> >>  criu/kerndat.c                  |  28 ++
> >>  criu/namespaces.c               | 404 ++++++++++++++++++++++++++--
> >>  criu/net.c                      |  62 +---
> >>  criu/ns-common.c                |  51 ++++
> >>  criu/parasite-syscall.c         |   6
> >>  criu/pie/restorer.c             |  50 +++
> >>  criu/proc_parse.c               | 123 ++++++++-
> >>  criu/pstree.c                   | 559 +++++++++++++++++++++++++++------------
> >>  criu/seize.c                    |  32 ++
> >>  criu/tty.c                      |   6
> >>  images/pstree.proto             |  17 +
> >>  test/zdtm/static/Makefile       |   3
> >>  test/zdtm/static/pidns00.c      | 206 ++++++++++++++
> >>  test/zdtm/static/pidns00.desc   |   1
> >>  test/zdtm/static/proc-self.c    |   4
> >>  test/zdtm/static/proc-self01.c  |   1
> >>  test/zdtm/static/tun.desc       |   2
> >> 29 files changed, 1650 insertions(+), 392 deletions(-)
> >> create mode 100644 criu/ns-common.c
> >> create mode 100644 test/zdtm/static/pidns00.c
> >> create mode 100644 test/zdtm/static/pidns00.desc
> >> create mode 120000 test/zdtm/static/proc-self01.c
> >>
> >> --
> >> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>