[Devel] [PATCH] spfs: Main process wakes and kills its children on exit

Kirill Tkhai ktkhai at virtuozzo.com
Tue Jan 23 12:41:46 MSK 2018


Stanislav Kinsburskiy says:

"SPFS manager has a special "--exit-with-spfs" options, which is used by CRIU.
 The idea of the option is simple: force SPFS manager to exit, when it has some
 SPFS processes among its children (i.e. spfs was mounted at least once),
 but all these processes have exited for whatever reason (which usually happens
 when restore has failed and spfs mounts where unmounted).
 Although it works in overall (main SPFS manager process exits), its children
 (responsible to SPFS replacement) may wait on FUTEX for "release" command
 for corresponding SPFS mount and thus never stop until they are killed".

1 spfs-manager
2   \_ spfs
3   \_ spfs-manager
4   \_ spfs
5   \_ spfs-manager

2 and 3 are pair of a mount, and 4 and 5 are pair of another mount.
The patch makes spfs-manager 1 kill 3 in case of 2 exited.

https://jira.sw.ru/browse/PSBM-80055

Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
---
 manager/context.c |    4 ++++
 manager/spfs.c    |    1 +
 2 files changed, 5 insertions(+)

diff --git a/manager/context.c b/manager/context.c
index 1eb37c9..4464a23 100644
--- a/manager/context.c
+++ b/manager/context.c
@@ -53,6 +53,9 @@ static void cleanup_spfs_mount(struct spfs_manager_context_s *ctx,
 		/* SPFS master was killed. We need to release the reference */
 		spfs_release_mnt(info);
 
+	if (killed || WEXITSTATUS(status))
+		kill(info->replacer, SIGKILL);
+
 	info->dead = true;
 	del_spfs_info(ctx->spfs_mounts, info);
 
@@ -88,6 +91,7 @@ static void sigchld_handler(int signal, siginfo_t *siginfo, void *data)
 				 * corresponding fd.
 				 */
 				spfs_release_mnt(info);
+			info->replacer = -1;
 		} else {
 			info = find_spfs_by_pid(ctx->spfs_mounts, pid);
 			if (info) {
diff --git a/manager/spfs.c b/manager/spfs.c
index 3e0f667..0b94587 100644
--- a/manager/spfs.c
+++ b/manager/spfs.c
@@ -39,6 +39,7 @@ int create_spfs_info(const char *id,
 		pr_err("failed to allocate\n");
 		return -ENOMEM;
 	}
+	info->replacer = -1;
 
 	err = init_mount_info(&info->mnt, id, mountpoint, ns_mountpoint);
 	if (err)



More information about the Devel mailing list