[CRIU] [PATCH v2] dump: check for conflicts with the dead processes

Evgenii Shatokhin eshatokhin at virtuozzo.com
Wed Mar 9 04:04:00 PST 2016


09.03.2016 13:04, Pavel Emelyanov пишет:
> On 03/04/2016 05:06 PM, Evgenii Shatokhin wrote:
>> It may happen that a process has completed but its files in /proc/PID/
>> are still open by another process (see remap_dead_pid test from zdtm
>> suite, for example).
>>
>> If the PID number has been given to some newer thread since then, this
>> can be problematic. If that thread is the main thread of some process,
>> it seems to be handled OK on restore. However, if it is a secondary
>> thread, restore fails with an error like:
>>
>>    pie: Error (pie/restorer.c:439): Thread pid mismatch 4404/4403
>>
>> This is because open_remap_dead_process() adds a helper with PID 4403 to
>> restore /proc/4403/* and that clashes with the thread's PID.
>>
>> It seems reasonable to detect such things at the checkpoint stage and
>> refuse to dump, rather than to fail during restore.
>>
>> https://jira.sw.ru/browse/PSBM-44217
>>
>> v.2:
>> * Check for conflicts after all tasks have been dumped. One cannot rely
>>    on any particular order of tasks being dumped.
>>
>> Signed-off-by: Evgenii Shatokhin <eshatokhin at virtuozzo.com>
>> ---
>>   criu/cr-dump.c           | 31 +++++++++++++++++++++++++++++++
>>   criu/files-reg.c         | 16 ++++++++++++----
>>   criu/include/files-reg.h |  1 +
>>   3 files changed, 44 insertions(+), 4 deletions(-)
>>
>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>> index 4ec2a55..9112b6a 100644
>> --- a/criu/cr-dump.c
>> +++ b/criu/cr-dump.c
>> @@ -1608,6 +1608,26 @@ static int cr_dump_finish(int ret)
>>   	return post_dump_ret ? : (ret != 0);
>>   }
>>
>> +static int dead_pid_conflict(struct pstree_item *item)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < item->nr_threads; i++) {
>> +		/*
>> +		 * If the dead PID was given to a main thread of another
>> +		 * process, this is handled during restore.
>> +		 */
>> +		if (item->pid.real == item->threads[i].real ||
>> +		    !dead_pid_exists(item->threads[i].virt))
>> +			continue;
>> +
>> +		pr_err("Conflict with a dead task with the same PID as of this thread (virt %d, real %d).\n",
>> +			item->threads[i].virt, item->threads[i].real);
>> +		return -1;
>> +	}
>> +	return 0;
>> +}
>> +
>>   int cr_dump_tasks(pid_t pid)
>>   {
>>   	InventoryEntry he = INVENTORY_ENTRY__INIT;
>> @@ -1692,6 +1712,17 @@ int cr_dump_tasks(pid_t pid)
>>   			goto err;
>>   	}
>>
>> +	/*
>> +	 * It may happen that a process has completed but its files in
>> +	 * /proc/PID/ are still open by another process. If the PID has been
>> +	 * given to some newer thread since then, we may be unable to dump
>> +	 * all this.
>> +	 */
>> +	for_each_pstree_item(item) {
>> +		if (dead_pid_conflict(item))
>
> This is not nice as well. Typically we have many pstree items and zero
> dead pids. So your nested loop always iterates all pstree items to see
> that there's no conflicts with empty array.
>
> Invert the nesting so in the typical case we do zero iterations. Allso
> use the pid_in_pstree() helper, we'll optimize it to make logN search
> instead of N as it is now.
>

OK. I doubt the effect will be significant in this particular case but I 
agree, the changes you suggest won't hurt anyway. Will do it in v.3.

>> +			goto err;
>> +	}
>> +
>>   	/* MNT namespaces are dumped after files to save remapped links */
>>   	if (dump_mnt_namespaces() < 0)
>>   		goto err;
>> diff --git a/criu/files-reg.c b/criu/files-reg.c
>> index 511abfe..3bf1d9e 100644
>> --- a/criu/files-reg.c
>> +++ b/criu/files-reg.c
>> @@ -805,16 +805,24 @@ static int dump_linked_remap(char *path, int len, const struct stat *ost,
>>   			&rpe, PB_REMAP_FPATH);
>>   }
>>
>> -static int have_seen_dead_pid(pid_t pid)
>> +pid_t *dead_pids;
>> +int n_dead_pids;
>> +
>> +int dead_pid_exists(pid_t pid)
>>   {
>> -	static pid_t *dead_pids = NULL;
>> -	static int n_dead_pids = 0;
>> -	size_t i;
>> +	int i;
>>
>>   	for (i = 0; i < n_dead_pids; i++) {
>>   		if (dead_pids[i] == pid)
>>   			return 1;
>>   	}
>> +	return 0;
>> +}
>> +
>> +static int have_seen_dead_pid(pid_t pid)
>> +{
>> +	if (dead_pid_exists(pid))
>> +		return 1;
>>
>>   	if (xrealloc_safe(&dead_pids, sizeof(*dead_pids) * (n_dead_pids + 1)))
>>   		return -1;
>> diff --git a/criu/include/files-reg.h b/criu/include/files-reg.h
>> index 1cbefb0..0740fe4 100644
>> --- a/criu/include/files-reg.h
>> +++ b/criu/include/files-reg.h
>> @@ -55,5 +55,6 @@ extern void try_clean_remaps(int ns_fd);
>>   extern int strip_deleted(struct fd_link *link);
>>
>>   extern int prepare_procfs_remaps(void);
>> +extern int dead_pid_exists(pid_t pid);
>>
>>   #endif /* __CR_FILES_REG_H__ */
>>
>
> .
>



More information about the CRIU mailing list