[CRIU] [PATCH] seize: Wait the freezer to complete before processing tags

Cyrill Gorcunov gorcunov at gmail.com
Mon Jul 11 02:06:45 PDT 2016


Currently, when we use cgroup freezer to seize the tasks we start freezer
and then without waiting the completion of transition procedure we are
seizing tasks read from freezer @tasks file, using fgets.

This is fragile construction because fgets uses internal buffer and tasks
we've read might be exiting same time while we're freezing them,
the kernel won't freeze these exiting tasks because they are dying
anyway and I fear we might read a pid here which is not even in
our cgroup anymore but reused with another out of cgroup task.

Thus lets do the following: use iterations to freeze tasks waiting
for freezer to change its state and then collect/seize all tasks
in one pass.

For example on container I'm playing with it takes just one iteration

 | (00.013690) cg: Set 1 is criu one
 | (00.013705) freezing processes: 1800000 attempst with 100 ms steps
 | (00.013720) freezer.state=THAWED
 | (00.013795) freezer.state=FREEZING
 | (00.113962) freezer.state=FROZEN
 | (00.113990) freezing processes: 1 attempts done
 | (00.114073) SEIZE 240893 (comm systemd): success
 | (00.114110) Warn  (ptrace.c:121): Unable to interrupt task: 240905 (comm kthreadd/1) (Operation not permitted)
 | (00.114136) Warn  (ptrace.c:121): Unable to interrupt task: 240906 (comm khelper) (Operation not permitted)
 | (00.114155) SEIZE 240969 (comm screen): success
 | (00.114166) SEIZE 240970 (comm sendmail): success
 | (00.114179) SEIZE 240971 (comm sendmail): success
 | (00.114189) SEIZE 240972 (comm saslauthd): success
 | (00.114202) SEIZE 240973 (comm crond): success
 | (00.114211) SEIZE 240974 (comm agetty): success
 | (00.114221) SEIZE 240975 (comm agetty): success
 | ...

https://jira.sw.ru/browse/PSBM-49439

Signed-off-by: Cyrill Gorcunov <gorcunov at virtuozzo.com>
---
Andrew, take a look please. The patch is for vz7's criu but
same may be used for vanilla's one.

 criu/seize.c | 53 ++++++++++++++++++++++++-----------------------------
 1 file changed, 24 insertions(+), 29 deletions(-)

diff --git a/criu/seize.c b/criu/seize.c
index ad88ea0..59e133b 100644
--- a/criu/seize.c
+++ b/criu/seize.c
@@ -291,45 +291,40 @@ static int freeze_processes(void)
 			close(fd);
 			return -1;
 		}
-	}
 
-	/*
-	 * There is not way to wait a specified state, so we need to poll the
-	 * freezer.state.
-	 * Here is one extra attempt to check that everything are frozen.
-	 */
-	for (i = 0; i <= nr_attempts; i++) {
-		if (seize_cgroup_tree(opts.freeze_cgroup, state) < 0)
-			goto err;
+		/*
+		 * Wait the freezer to complete before
+		 * processing tasks. They might be exiting
+		 * before freezing complete so we should
+		 * not read @tasks pids while freezer in
+		 * transition stage.
+		 */
+		for (i = 0; i <= nr_attempts; i++) {
+			state = get_freezer_state(fd);
+			if (!state) {
+				close(fd);
+				return -1;
+			}
 
-		if (state == frozen)
-			break;
+			if (state == frozen)
+				break;
+			if (alarm_timeouted())
+				goto err;
+			nanosleep(&req, NULL);
+		}
 
-		state = get_freezer_state(fd);
-		if (!state)
+		if (i > nr_attempts) {
+			pr_err("Unable to freeze cgroup %s\n", opts.freeze_cgroup);
 			goto err;
-
-		if (state == frozen) {
-			/*
-			 * Enumerate all tasks one more time to collect all new
-			 * tasks, which can be born while the cgroup is being frozen.
-			 */
-
-			continue;
 		}
 
-		if (alarm_timeouted())
-			goto err;
-		nanosleep(&req, NULL);
+		pr_debug("freezing processes: %lu attempts done\n", i);
+		exit_code = 0;
 	}
 
-	if (i > nr_attempts) {
-		pr_err("Unable to freeze cgroup %s\n", opts.freeze_cgroup);
+	if (seize_cgroup_tree(opts.freeze_cgroup, state) < 0)
 		goto err;
-	}
 
-	pr_debug("freezing processes: %lu attempts done\n", i);
-	exit_code = 0;
 err:
 	if (exit_code == 0 || freezer_thawed) {
 		lseek(fd, 0, SEEK_SET);
-- 
2.7.4



More information about the CRIU mailing list