[CRIU] [PATCH v3] dump: check for conflicts with the dead processes
Evgenii Shatokhin
eshatokhin at virtuozzo.com
Thu Mar 10 07:41:45 PST 2016
It may happen that a process has completed but its files in /proc/PID/
are still open by another process (see remap_dead_pid test from zdtm
suite, for example).
If the PID number has been given to some newer thread since then, this
can be problematic. If that thread is the main thread of some process,
it seems to be handled OK on restore. However, if it is a secondary
thread, restore fails with an error like:
pie: Error (pie/restorer.c:439): Thread pid mismatch 4404/4403
This is because open_remap_dead_process() adds a helper with PID 4403 to
restore /proc/4403/* and that clashes with the thread's PID.
It seems reasonable to detect such conflicts at the checkpoint stage and
refuse to dump, rather than fail later during restore.
https://jira.sw.ru/browse/PSBM-44217
v.3:
* Loop over dead_pids[] first: the array is likely to be empty.
Note that we still need to iterate over the threads explicitly
because pstree contains no items for the threads at that point.
v.2:
* Check for conflicts after all tasks have been dumped. One cannot rely
on any particular order of tasks being dumped.
Signed-off-by: Evgenii Shatokhin <eshatokhin at virtuozzo.com>
---
criu/cr-dump.c | 9 +++++++++
criu/files-reg.c | 42 +++++++++++++++++++++++++++++++++++++++---
criu/include/files-reg.h | 1 +
3 files changed, 49 insertions(+), 3 deletions(-)
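
For readers skimming the diff below, here is a minimal standalone sketch of
the check the patch performs, outside of CRIU. The types sketch_pid and
sketch_task and the sketch_* functions are hypothetical stand-ins for
CRIU's pid and pstree_item handling; they are assumptions made for
illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-ins for CRIU's pid and pstree_item. */
struct sketch_pid {
	pid_t real;	/* PID as seen by the dumper */
	pid_t virt;	/* PID as seen inside the dumped tree */
};

struct sketch_task {
	struct sketch_pid pid;		/* thread-group leader */
	struct sketch_pid *threads;	/* all threads, leader included */
	int nr_threads;
};

/* Return 1 if a secondary thread of the task has virtual PID 'dead'. */
static int sketch_check_threads(const struct sketch_task *t, pid_t dead)
{
	int i;

	for (i = 0; i < t->nr_threads; i++) {
		/* A main thread reusing the PID is handled on restore. */
		if (t->threads[i].real == t->pid.real)
			continue;
		if (t->threads[i].virt != dead)
			continue;

		fprintf(stderr, "dead PID %d reused by thread (real %d)\n",
			(int)t->threads[i].virt, (int)t->threads[i].real);
		return 1;
	}

	return 0;
}

/* Outer loop runs over the dead PIDs: that array is usually empty. */
int sketch_dead_pid_conflict(const struct sketch_task *tasks, int n_tasks,
			     const pid_t *dead, int n_dead)
{
	int i, j;

	assert(n_tasks == 0 || tasks != NULL);
	assert(n_dead == 0 || dead != NULL);

	for (i = 0; i < n_dead; i++)
		for (j = 0; j < n_tasks; j++)
			if (sketch_check_threads(&tasks[j], dead[i]))
				return 1;

	return 0;
}
```

In this sketch a dead PID that matches only a thread-group leader is not
reported, mirroring the commit message's observation that this case seems
to be handled on restore.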
diff --git a/criu/cr-dump.c b/criu/cr-dump.c
index 1cb4608..c6c1995 100644
--- a/criu/cr-dump.c
+++ b/criu/cr-dump.c
@@ -1689,6 +1689,15 @@ int cr_dump_tasks(pid_t pid)
goto err;
}
+ /*
+ * It may happen that a process has completed but its files in
+ * /proc/PID/ are still open by another process. If the PID has been
+ * given to some newer thread since then, restore may fail, so
+ * refuse to dump in that case.
+ */
+ if (dead_pid_conflict())
+ goto err;
+
/* MNT namespaces are dumped after files to save remapped links */
if (dump_mnt_namespaces() < 0)
goto err;
diff --git a/criu/files-reg.c b/criu/files-reg.c
index 55b2eff..6956ea5 100644
--- a/criu/files-reg.c
+++ b/criu/files-reg.c
@@ -804,11 +804,47 @@ static int dump_linked_remap(char *path, int len, const struct stat *ost,
&rpe, PB_REMAP_FPATH);
}
+static pid_t *dead_pids;
+static int n_dead_pids;
+
+static int dead_pid_check_threads(struct pstree_item *item, pid_t pid)
+{
+ int i;
+
+ for (i = 0; i < item->nr_threads; i++) {
+ /*
+ * If the dead PID was given to a main thread of another
+ * process, this is handled during restore.
+ */
+ if (item->pid.real == item->threads[i].real ||
+ item->threads[i].virt != pid)
+ continue;
+
+ pr_err("Conflict with a dead task that had the same PID as this thread (virt %d, real %d).\n",
+ item->threads[i].virt, item->threads[i].real);
+ return 1;
+ }
+
+ return 0;
+}
+
+int dead_pid_conflict(void)
+{
+ struct pstree_item *item;
+ int i;
+
+ for (i = 0; i < n_dead_pids; i++) {
+ for_each_pstree_item(item)
+ if (dead_pid_check_threads(item, dead_pids[i]))
+ return 1;
+ }
+
+ return 0;
+}
+
static int have_seen_dead_pid(pid_t pid)
{
- static pid_t *dead_pids = NULL;
- static int n_dead_pids = 0;
- size_t i;
+ int i;
for (i = 0; i < n_dead_pids; i++) {
if (dead_pids[i] == pid)
diff --git a/criu/include/files-reg.h b/criu/include/files-reg.h
index 50e1b30..9e5944d 100644
--- a/criu/include/files-reg.h
+++ b/criu/include/files-reg.h
@@ -56,5 +56,6 @@ extern void try_clean_remaps(int ns_fd);
extern int strip_deleted(struct fd_link *link);
extern int prepare_procfs_remaps(void);
+extern int dead_pid_conflict(void);
#endif /* __CR_FILES_REG_H__ */
--
2.6.3