[CRIU] [PATCH 4/8] memory: don't use parent memdump if detected possible pid reuse

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Fri Feb 9 19:06:41 MSK 2018


When a pid is reused between consecutive dumps, we currently have no way
to tell that the pagemap and pages in the parent dump's images are no
longer valid for restoring that pid. This can even lead to wrong memory
being restored for that pid; see the test in the last patch.

So this is an attempt to separate processes with a (likely) invalid
previous memory dump from processes with a 100% valid previous dump.

For that we use the start_time value from /proc/<pid>/stat and the
timestamp (uptime) of each (pre)dump. If the start time is strictly less
than the timestamp, the pagemap for this pid from the previous dump is
valid: it was made for exactly the same process.
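
For readers unfamiliar with the stat format, here is a minimal sketch of
pulling start_time out of a /proc/<pid>/stat line. It is not the
parse_pid_stat() used by this patch; stat_line_start_time() is a
hypothetical helper written only for illustration. It relies on
start_time being field 22 of the line and on the comm field (field 2)
being wrapped in parentheses, so scanning starts after the last ')'.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract field 22 (start time, in clock ticks
 * since boot) from one /proc/<pid>/stat line.
 * Returns 0 on success, -1 if the line is malformed or too short.
 */
static int stat_line_start_time(const char *line, unsigned long long *start_time)
{
	/* comm may itself contain ')' or spaces, so use the last ')' */
	const char *p = strrchr(line, ')');
	int field;

	if (!p)
		return -1;
	p++;

	/* Skip fields 3..21; each is a space-separated token. */
	for (field = 3; field < 22; field++) {
		p += strspn(p, " ");
		p += strcspn(p, " ");
	}

	p += strspn(p, " ");
	if (*p < '0' || *p > '9')
		return -1;

	*start_time = strtoull(p, NULL, 10);
	return 0;
}
```

The value obtained this way is in clock ticks, so it has to be compared
against a dump timestamp expressed in the same unit.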

The start time is measured in clock ticks (centiseconds by default), so
if a predump is really fast (<1 csec) we can get false negatives for some
processes, but for long-running processes we are fine.
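
The granularity issue above can be seen in a small standalone sketch of
the comparison. The helper names (usec_to_ticks, possible_pid_reuse) are
made up for this example; it assumes, as in the diff below, that the
parent dump's uptime is stored in microseconds and that the ticks-per-
second value comes from sysconf(_SC_CLK_TCK).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Convert an uptime in microseconds to clock ticks, truncating. */
static uint64_t usec_to_ticks(uint64_t uptime_us, uint64_t ticks_per_sec)
{
	/* Guard against a zero divisor in either division below. */
	assert(ticks_per_sec > 0 && ticks_per_sec <= USEC_PER_SEC);
	return uptime_us / (USEC_PER_SEC / ticks_per_sec);
}

/*
 * A process is trusted only if it started strictly before the parent
 * dump. Starting in the same tick as the dump counts as possible reuse.
 */
static bool possible_pid_reuse(uint64_t start_time_ticks,
			       uint64_t parent_dump_uptime_us,
			       uint64_t ticks_per_sec)
{
	return start_time_ticks >=
	       usec_to_ticks(parent_dump_uptime_us, ticks_per_sec);
}
```

With 100 ticks per second, a parent dump taken at 5009999 usec of uptime
maps to tick 500, so a process whose start_time is 500 is treated as
possibly reused even if it really started before the dump.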

https://jira.sw.ru/browse/PSBM-67502

Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
 criu/mem.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/criu/mem.c b/criu/mem.c
index 4c6942a11..355c992c7 100644
--- a/criu/mem.c
+++ b/criu/mem.c
@@ -30,9 +30,11 @@
 #include "fault-injection.h"
 #include "prctl.h"
 #include <compel/compel.h>
+#include "proc_parse.h"
 
 #include "protobuf.h"
 #include "images/pagemap.pb-c.h"
+#include "images/stats.pb-c.h"
 
 static int task_reset_dirty_track(int pid)
 {
@@ -303,6 +305,7 @@ static int __parasite_dump_pages_seized(struct pstree_item *item,
 	int ret = -1;
 	unsigned cpp_flags = 0;
 	unsigned long pmc_size;
+	bool possible_pid_reuse = false;
 
 	if (opts.check_only)
 		return 0;
@@ -360,6 +363,38 @@ static int __parasite_dump_pages_seized(struct pstree_item *item,
 			xfer.parent = NULL + 1;
 	}
 
+	if (xfer.parent) {
+		struct proc_pid_stat pps_buf;
+		StatsEntry *stats = NULL;
+		unsigned long dump_ticks;
+		unsigned long clock_ticks;
+
+		clock_ticks = sysconf(_SC_CLK_TCK);
+		if (clock_ticks == -1) {
+			pr_perror("Failed to get clock ticks via sysconf");
+			goto out_xfer;
+		}
+
+		ret = parse_pid_stat(item->pid->real, &pps_buf);
+		if (ret < 0)
+			goto out_xfer;
+
+		ret = get_parent_stats((void**)&stats);
+		if (ret < 0)
+			goto out_xfer;
+		dump_ticks = stats->dump->dump_uptime/(USEC_PER_SEC / clock_ticks);
+		stats_entry__free_unpacked(stats, NULL);
+
+		if (pps_buf.start_time >= dump_ticks) {
+			pr_warn("Detected possible pid reuse pid=%d, " \
+				"start_time=%llu, parent's dump_uptime=%lu\n",
+				item->pid->real, pps_buf.start_time,
+				dump_ticks);
+			possible_pid_reuse = true;
+		}
+	}
+
+
 	/*
 	 * Step 1 -- generate the pagemap
 	 */
@@ -386,7 +421,7 @@ static int __parasite_dump_pages_seized(struct pstree_item *item,
 		else {
 again:
 			ret = generate_iovs(vma_area, pp, map, &off,
-				has_parent);
+				has_parent && !possible_pid_reuse);
 			if (ret == -EAGAIN) {
 				BUG_ON(!(pp->flags & PP_CHUNK_MODE));
 
-- 
2.14.3


