[CRIU] [PATCH RESEND v1 00/55] Nested pid namespaces support

Kirill Tkhai ktkhai at virtuozzo.com
Tue Mar 28 03:06:12 PDT 2017


On 27.03.2017 21:13, Andrei Vagin wrote:
> On Fri, Mar 24, 2017 at 06:09:03PM +0300, Kirill Tkhai wrote:
>> Hi,
>>
>> this is the first version of nested pid namespaces support.
>> Dump and restore of tasks and threads are completely implemented,
>> and the pidns01 test is added.
>>
>> The biggest problem, and the strangest thing I ran into, is that a task can
>> write "/proc/sys/kernel/ns_last_pid" only for its active pid namespace,
>> regardless of which pid ns the file was opened in and which caps the
>> task has. To solve this problem, pid_ns helpers that populate the appropriate
>> ns_last_pid were implemented. See the corresponding patches for the
>> details.
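
For illustration only, the ns_last_pid mechanism described above can be
sketched as follows. This is a minimal sketch, not the code from the patches:
the function names and the path parameter are hypothetical. It assumes the
usual convention of writing pid - 1 under an exclusive lock right before
fork(). Since the write only affects the writer's active pid namespace, a
process living inside each target pid_ns has to run something like this,
which is why helpers are needed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Real location of the file; the path is a parameter below so the
 * sketch can also be pointed at an ordinary file. */
#define LAST_PID_PATH "/proc/sys/kernel/ns_last_pid"

/*
 * Hypothetical helper: lock the last-pid file and store pid - 1 in it,
 * so that the next fork() in the caller's active pid namespace is
 * expected to get `pid`. Returns the locked fd (close it after fork())
 * or -1 on error.
 */
int lock_and_set_last_pid(const char *path, pid_t pid)
{
	char buf[32];
	int fd, len;

	if (pid < 1)
		return -1;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Keep other writers out between the write and the fork(). */
	if (flock(fd, LOCK_EX) < 0)
		goto err;

	len = snprintf(buf, sizeof(buf), "%d", (int)pid - 1);
	assert(len > 0 && len < (int)sizeof(buf));

	if (write(fd, buf, len) != len)
		goto err;

	return fd;
err:
	close(fd);
	return -1;
}

/*
 * Hypothetical wrapper: fork a child that is expected to get `pid`.
 * The caller must still compare the result with `pid`, because the
 * kernel picks another value if the requested one is busy.
 */
pid_t fork_with_pid(const char *path, pid_t pid)
{
	pid_t child;
	int fd = lock_and_set_last_pid(path, pid);

	if (fd < 0)
		return -1;

	child = fork();
	close(fd); /* parent and child both drop the lock descriptor */
	return child;
}
```

A caller inside the target pid_ns would use
fork_with_pid(LAST_PID_PATH, wanted_pid) and treat a different returned
pid as a failure. Writing the real file requires sufficient privileges.
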
>>
>> Patches [1-5/55] are a duplicate of <<[PATCH 0/5] Restore of "/proc/self/ns/net" fixes>>,
>> which I sent today, because they fix errors that are seen only after
>> patch [6/55], and I don't want to receive crappy travis errors.
>>
>> Also, there is an error with nested net_ns (see <<[net_ns] Problem of
>> restoring tun in nested net namespace>>), so I temporarily switched the tun
>> test off in 55/55 until it's fixed.
>>
>> The patch set doesn't do anything to restore pgid/sid if a nested pid_ns
>> is dumped (Pavel Tikhomirov is completely reworking pgid/sid
>> at the moment, so that we do not both do the same work),
>> so for such cases I switch restore of pgid/sid off in [55/55]. If you
>> want, we may put it under a criu option until it's implemented.
>>
> 
> Can you explain how you restore userns for pid namespaces?

I do not restore it at all, but this can easily be done using an intermediate
helper with the appropriate user_ns set. Do you think we need this in the next
version of the patchset?

> If I
> understand you right, userns is now restored before opening files. If
> the answer is yes, we can run into a situation where a process does not have
> enough rights to restore a file descriptor. Have you thought about this
> problem and do you have any ideas how to solve it?

We may create fake file_list_entry's in parent processes that have the rights,
mark such files as masters, and receive the fds via the transport socket.
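
As a rough illustration of the "receive fds by transport socket" part, here
is a minimal sketch of passing an open descriptor over an AF_UNIX socket with
SCM_RIGHTS. The function names are illustrative and are not taken from the
patches; the fake file_list_entry and master bookkeeping are not modeled
here. A parent that has the rights would open the file and call send_fd(),
and the less privileged task would call recv_fd().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one open descriptor over an AF_UNIX socket. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
	char dummy = 'x'; /* one byte of payload carries the control data */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align; /* forces correct alignment of buf */
	} ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0); /* passing an invalid fd is a caller bug */

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor sent by send_fd(). Returns the new fd or -1. */
int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	/* Accept only a single SCM_RIGHTS descriptor. */
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor refers to the same open file as the sender's one, so
the receiver does not need the rights that were required to open it.
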
 
>> ---
>>
>> Kirill Tkhai (55):
>>       user_ns: Close pid proc in create_user_ns_hierarhy_fn()
>>       ns: Pack functionality of storing ns fd to store_self_ns()
>>       user_ns: Keep root_user_ns ns fd in fdstore
>>       ns: Fix wrong opened net ns file
>>       zdtm: Add proc-self01 test
>>       net: Do not change net_ns of root_item in create_net_ns()
>>       cr-restore: Open transport socket earlier
>>       zdtm: Add pidns00 test
>>       kerndat: Check that "/proc/[pid]/status" file has NS{pid,..} lines
>>       pid: Add pid::level field and level argument for __alloc_pstree_item()
>>       pid: Add equel_pid() helper
>>       pid: Add last_level_pid() helper
>>       pid: Make pgid and sid be allocated dynamically
>>       pid: Use last_level_pid() in restore_pgid()
>>       pid: Alloc threads dynamically
>>       pid: Pass thread pid to caller
>>       pstree: Change arguments of read_pstree_ids()
>>       pstree: Read ids earlier in read_pstree_image()
>>       pid: Add top_pid_ns
>>       pid: Add ns::pid::rb_root
>>       ids: Copy unexisted ids from root_item
>>       pstree: Move parent assignment in read_pstree_image() up
>>       pstree: Assign ids for dead tasks in read_pstree_image()
>>       pstree: Dump pid and user ns ids for dead tasks
>>       pstree: Add pid_ns check in read_pstree_image
>>       pstree: Split lookup_create_pid()
>>       pstree: Add pid_ns id argument to lookup_create_pid()
>>       ns: Add MAX_NS_NESTING
>>       pstree: Make lookup_create_pid() able to create tasks with pid->level > 1
>>       pid: Implement populate_ns_pids() helper
>>       proc_parse: Implement collect_pid_status()
>>       pid_ns: Implement pid_ns_root_off()
>>       pid: Use collect_pid_status() to populate item's pids
>>       images: Add NSpids pstree descriptions
>>       pstree: Dump and restore NSpid, NSsid etc
>>       pstree: Make get_free_pid() work for different pid_ns and export it
>>       pstree: Extract __pstree_item_by_virt() to act on any pid_ns
>>       ns: Reserve pid_ns helpers
>>       restore: Implement set_next_pid() helper
>>       pid: Always lock last pid file on clone()
>>       pid: Add fdstore id for pid_ns descriptor
>>       fdstore: Init fdstore earlier
>>       pid: Save created pid_ns fd to fdstore
>>       ns: Always start usernsd
>>       pid: Add pid ns futex helper_created
>>       ns: Install transport fd socket in usernsd
>>       pid: Create pid_ns helpers
>>       pid: Wait till pid_ns created before we create a child of this ns
>>       pid: Set pid_ns before we create a child
>>       pid: Teach set_next_pid() working with nested pid_ns
>>       restorer: Close transport socket later
>>       restorer: Set NStids in all pid_ns for thread before we create it.
>>       pid: Check for equality of getpid() of child to last_level_pid
>>       pstree: Use CLONE_NEWPID only to create child reaper of pid_ns
>>       ns: Nested pid_ns support
>>
>>
>>  criu/cr-dump.c                  |  123 +++++++--
>>  criu/cr-restore.c               |  237 +++++++++++++----
>>  criu/files-reg.c                |   10 -
>>  criu/files.c                    |   20 +
>>  criu/include/kerndat.h          |    1 
>>  criu/include/namespaces.h       |   21 +
>>  criu/include/parasite-syscall.h |    2 
>>  criu/include/pid.h              |   23 ++
>>  criu/include/proc_parse.h       |   13 +
>>  criu/include/pstree.h           |   30 +-
>>  criu/include/restorer.h         |    6 
>>  criu/include/rst_info.h         |    1 
>>  criu/kerndat.c                  |   28 ++
>>  criu/namespaces.c               |  404 ++++++++++++++++++++++++++--
>>  criu/net.c                      |   62 +---
>>  criu/ns-common.c                |   51 ++++
>>  criu/parasite-syscall.c         |    6 
>>  criu/pie/restorer.c             |   50 +++
>>  criu/proc_parse.c               |  123 ++++++++-
>>  criu/pstree.c                   |  559 +++++++++++++++++++++++++++------------
>>  criu/seize.c                    |   32 ++
>>  criu/tty.c                      |    6 
>>  images/pstree.proto             |   17 +
>>  test/zdtm/static/Makefile       |    3 
>>  test/zdtm/static/pidns00.c      |  206 ++++++++++++++
>>  test/zdtm/static/pidns00.desc   |    1 
>>  test/zdtm/static/proc-self.c    |    4 
>>  test/zdtm/static/proc-self01.c  |    1 
>>  test/zdtm/static/tun.desc       |    2 
>>  29 files changed, 1650 insertions(+), 392 deletions(-)
>>  create mode 100644 criu/ns-common.c
>>  create mode 100644 test/zdtm/static/pidns00.c
>>  create mode 100644 test/zdtm/static/pidns00.desc
>>  create mode 120000 test/zdtm/static/proc-self01.c
>>
>> --
>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>

