[CRIU] [PATCH] mount: fix a race between restoring namespaces and file mappings (v2)
Andrey Vagin
avagin at openvz.org
Wed Dec 9 07:58:06 PST 2015
From: Andrew Vagin <avagin at virtuozzo.com>
Currently, to get the root of a mount namespace, we wait until that
namespace has been restored. We need the namespace root in order to open
a file when restoring a memory mapping. A process restores its mappings
and only then forks its children, so it can end up needing a file from a
namespace that will be "restored" by one of its own children, which it
has not forked yet.

The root task restores all mount namespaces and opens a file descriptor
for each of them. With this patch, the root task also opens the root of
each mntns.

If we need the root of a namespace that isn't populated yet, we get it
from the root task. After the CR_STATE_FORKING stage, the root task
closes all namespace descriptors, and we know that all namespaces are
populated by that point.
v2: don't close root_fd for root ns, because it was not opened
Signed-off-by: Andrew Vagin <avagin at virtuozzo.com>
---
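Not part of the patch: a minimal standalone sketch of the /proc lookup
that the fallback path in mntns_get_root_fd() relies on. The helper names
(open_own_root, reopen_fd_of, same_file) are hypothetical and are not CRIU
functions; CRIU itself goes through open_proc(). The sketch assumes Linux
with /proc mounted and enough permission to read the target process's
/proc/<pid>/fd entries.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the root directory of the calling process,
 * similar in spirit to open_proc(PROC_SELF, "root") in the patch.
 */
static int open_own_root(void)
{
	return open("/proc/self/root", O_RDONLY | O_DIRECTORY);
}

/*
 * Hypothetical helper: obtain our own descriptor for a directory that
 * process `pid` holds open as descriptor number `fd`, by opening
 * /proc/<pid>/fd/<fd>.
 */
static int reopen_fd_of(pid_t pid, int fd)
{
	char path[PATH_MAX];

	snprintf(path, sizeof(path), "/proc/%d/fd/%d", (int)pid, fd);
	return open(path, O_RDONLY | O_DIRECTORY);
}

/* Returns 1 if both descriptors refer to the same inode, 0 if not, -1 on error. */
static int same_file(int a, int b)
{
	struct stat sa, sb;

	if (fstat(a, &sa) || fstat(b, &sb))
		return -1;

	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Called with the pid of the root task and the number stored in
mnt.root_fd, reopen_fd_of() would play the role of the
open_proc(root_item->pid.virt, "fd/%d", ...) call in the patch. Called
with getpid(), it just yields a second descriptor for the caller's own
root, which same_file() can confirm.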
include/namespaces.h | 1 +
mount.c | 39 +++++++++++++++++++++++++++++++++------
2 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/include/namespaces.h b/include/namespaces.h
index c655890..953b874 100644
--- a/include/namespaces.h
+++ b/include/namespaces.h
@@ -38,6 +38,7 @@ struct ns_id {
 			struct mount_info *mntinfo_list;
 			struct mount_info *mntinfo_tree;
 			int ns_fd;
+			int root_fd;
 		} mnt;
 		struct {
diff --git a/mount.c b/mount.c
index 3de13cf..662ecf4 100644
--- a/mount.c
+++ b/mount.c
@@ -2972,6 +2972,8 @@ void fini_restore_mntns(void)
 		if (nsid->nd != &mnt_ns_desc)
 			continue;
 		close(nsid->mnt.ns_fd);
+		if (nsid->type != NS_ROOT)
+			close(nsid->mnt.root_fd);
 	}
 }
@@ -3177,6 +3179,8 @@ int prepare_mnt_ns(void)
 			nsid->mnt.ns_fd = open_proc(PROC_SELF, "ns/mnt");
 			if (nsid->mnt.ns_fd < 0)
 				goto err;
+			/* we set ns_populated so we don't need to open root_fd */
+			futex_set(&nsid->ns_populated, 1);
 			continue;
 		}
@@ -3197,6 +3201,11 @@ int prepare_mnt_ns(void)
 		if (nsid->mnt.ns_fd < 0)
 			goto err;
+		/* root_fd is used to restore file mappings */
+		nsid->mnt.root_fd = open_proc(PROC_SELF, "root");
+		if (nsid->mnt.root_fd < 0)
+			goto err;
+
 		/* And return back to regain the access to the roots yard */
 		if (setns(rst, CLONE_NEWNS)) {
 			pr_perror("Can't restore mntns back");
@@ -3287,15 +3296,33 @@ set_root:
 int mntns_get_root_fd(struct ns_id *mntns)
 {
 	/*
-	 * We need to find a task from the target namespace and open its root.
-	 * For that we need to wait when one of tasks enters into required
-	 * namespaces.
+	 * All namespaces are restored from the root task and during the
+	 * CR_STATE_FORKING stage the root task has two file descriptors for
+	 * each mntns. One is associated with a namespace and another one is a
+	 * root of this mntns.
+	 *
+	 * When a non-root task is forked, it enters into a proper mount
+	 * namespace, restores private mappings and forks children. Some of
+	 * these mappings can be associated with files from other namespaces.
 	 *
-	 * The root task is born in the root mount namespace.
+	 * After the CR_STATE_FORKING stage the root task has to close all
+	 * mntns file descriptors to restore its descriptors and at this moment
+	 * we know that all tasks live in their mount namespaces.
+	 *
+	 * If we find that a mount namespace isn't populated, we can get its
+	 * root from the root task.
 	 */
-	if (mntns->type != NS_ROOT)
-		futex_wait_while_eq(&mntns->ns_populated, 0);
+	if (!futex_get(&mntns->ns_populated)) {
+		int fd;
+
+		fd = open_proc(root_item->pid.virt, "fd/%d", mntns->mnt.root_fd);
+		if (fd < 0)
+			return -1;
+
+		return mntns_set_root_fd(mntns->ns_pid, fd);
+	}
+
 	return __mntns_get_root_fd(mntns->ns_pid);
 }
--
2.4.3