[CRIU] [PATCH] mount: fix regression where open_mountpoint failed on readonly fs

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Wed Jul 11 13:43:33 MSK 2018


If we fail to create temporary directory for doing a clean mount we can
make mount clean reusing the code which enters new mountns to umount
overmounts. As when last process exits mntns all mounts are implicitly
cleaned from children, see in kernel source - sys_exit->do_exit
->exit_task_namespaces->switch_task_namespaces->free_nsproxy
->put_mnt_ns->umount_tree->drop_collected_mounts->umount_tree:

    /* Hide the mounts from mnt_mounts */
    list_for_each_entry(p, &tmp_list, mnt_list) {
            list_del_init(&p->mnt_child);
    }

Fixes commit b6cfb1ce2948 ("mount: make open_mountpoint handle overmouts
properly")

https://github.com/checkpoint-restore/criu/issues/520
Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
 criu/mount.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/criu/mount.c b/criu/mount.c
index 0a8406c80..8689cecdd 100644
--- a/criu/mount.c
+++ b/criu/mount.c
@@ -1298,10 +1298,18 @@ int ns_open_mountpoint(void *arg)
 	if (umount_overmounts(mi))
 		goto err;
 
-	/* Save fd which we opened for parent due to CLONE_FILES flag */
-	*fd = get_clean_fd(mi);
-	if (*fd < 0)
+	/*
+	 * Save fd which we opened for parent due to CLONE_FILES flag
+	 *
+	 * Mount can still have children in it, but we don't need to clean it
+	 * explicitly as when last process exits mntns all mounts in it are
+	 * cleaned from their children, and we are exactly the last process.
+	 */
+	*fd = open(mi->mountpoint, O_DIRECTORY|O_RDONLY);
+	if (*fd < 0) {
+		pr_perror("Unable to open %s", mi->mountpoint);
 		goto err;
+	}
 
 	return 0;
 err:
@@ -1340,18 +1348,22 @@ int open_mountpoint(struct mount_info *pm)
 
 	if (!mnt_is_overmounted(pm)) {
 		pr_info("\tmount has children %s\n", pm->mountpoint);
-
 		fd = get_clean_fd(pm);
-		if (fd < 0)
-			goto err;
-	} else {
+	}
+
+	/*
+	 * Mount is overmounted or probably we can't create a temporary
+	 * direcotry for a cleaned mount
+	 */
+	if (fd < 0) {
 		int pid, status;
 		struct clone_arg ca = {
 			.mi = pm,
 			.fd = &fd
 		};
 
-		pr_info("\tmount is overmounted %s\n", pm->mountpoint);
+		pr_info("\tmount is overmounted or has children %s\n",
+				pm->mountpoint);
 
 		/*
 		 * We are overmounted - not accessible in a regular way. We
-- 
2.17.0



More information about the CRIU mailing list