[CRIU] [PATCH v3 2/4] mount: Add forced mounts sceleton

Kirill Tkhai ktkhai at virtuozzo.com
Wed Jul 13 07:52:06 PDT 2016



On 13.07.2016 17:34, Pavel Emelyanov wrote:
> On 07/13/2016 04:43 PM, Kirill Tkhai wrote:
>> On 13.07.2016 15:46, Pavel Emelyanov wrote:
>>> On 06/29/2016 02:49 PM, Kirill Tkhai wrote:
>>>> Mechanism for mounts, which are forced by criu during dump stage.
>>>>
>>>> v2: New patch, splitted from v1's "[2/3] mount: Forced mount unmounted binfmt_misc to do not lost its content"
>>>>     Cleanup for dump failed case is added
>>>>
>>>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
>>>> ---
>>>>  criu/cr-dump.c       |    1 +
>>>>  criu/include/mount.h |    1 +
>>>>  criu/mount.c         |   76 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  3 files changed, 78 insertions(+)
>>>>
>>>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>>>> index 00d28e9..7609544 100644
>>>> --- a/criu/cr-dump.c
>>>> +++ b/criu/cr-dump.c
>>>> @@ -1588,6 +1588,7 @@ static int cr_dump_finish(int ret)
>>>>  	if (ret || post_dump_ret || opts.final_state == TASK_ALIVE) {
>>>>  		network_unlock();
>>>>  		delete_link_remaps();
>>>> +		cleanup_forced_mounts();
>>>>  	}
>>>>  	pstree_switch_state(root_item,
>>>>  			    (ret || post_dump_ret) ?
>>>> diff --git a/criu/include/mount.h b/criu/include/mount.h
>>>> index c7992ac..423b6d0 100644
>>>> --- a/criu/include/mount.h
>>>> +++ b/criu/include/mount.h
>>>> @@ -127,5 +127,6 @@ extern int ext_mount_add(char *key, char *val);
>>>>  extern int mntns_maybe_create_roots(void);
>>>>  extern int read_mnt_ns_img(void);
>>>>  extern void cleanup_mnt_ns(void);
>>>> +extern void cleanup_forced_mounts(void);
>>>>  
>>>>  #endif /* __CR_MOUNT_H__ */
>>>> diff --git a/criu/mount.c b/criu/mount.c
>>>> index 55f248d..4a288e4 100644
>>>> --- a/criu/mount.c
>>>> +++ b/criu/mount.c
>>>> @@ -41,6 +41,15 @@
>>>>  #define LOG_PREFIX "mnt: "
>>>>  
>>>>  static struct fstype fstypes[];
>>>> +static LIST_HEAD(forced_mounts_list);
>>>> +
>>>> +struct forced_mount {
>>>> +	struct list_head list;
>>>> +	unsigned int ns_id;
>>>> +	unsigned int mnt_id;
>>>> +	char *path;
>>>> +	pid_t pid;
>>>> +};
>>>
>>> Why separate structure? We have mount_info already, let's add
>>> necessary flags on it.
>>
>> Firstly, we add such mounts, when namespaces are not collected,
>> and there are no any mount_infos populated. If we create a new
>> mount_info, it will be terrible and crutch'ly to use it in
>> mnt_ns collecting engine.
>>
>> Secondly, the patch introduces some new fields, which are not
>> need for common case. We need list entry to keep forced mounts
>> order, and to cleanup them in proper way in case of dump failed.
>>
>> We use separate structure for ghost files, and this simplifies
>> logic, so why should we overload mountpoints another way?!
>> I think we shouldn't.
>>  
>>>>  
>>>>  int ext_mount_add(char *key, char *val)
>>>>  {
>>>> @@ -3685,4 +3694,71 @@ int dump_mnt_namespaces(void)
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +static int add_forced_mount(pid_t pid, const char *path)
>>>> +{
>>>> +	struct forced_mount *fm;
>>>> +	int fd, mnt_id, ret = 0;
>>>> +	unsigned int ns_id;
>>>> +
>>>> +	if (read_ns_id(pid, &mnt_ns_desc, &ns_id) < 0 || !ns_id) {
>>>
>>> I don't get why we need to read this stuff from kernel, while we have all
>>> the ns_id engine at hands. If you need to add binfmt_misc mountpoint, you
>>> want it per-namespace, not per-task, don't you?
>>
>> Which engine do you mean? Namespaces are not parced yet; no information at all.
> 
> Thus don't add forced mounts that early. Add them once namespace is parsed
> and collected, you don't need this thing before.

What is the problem of parsing namespace on mounting? Anyway you need umount
them in case of dumping fail, and there is need task's pid to switch its
namespace.

You'd escape read_ns_id(), but you'd get another problems (such as determining
of that namespace is root, etc). You'd merge pre-dump code and ns-parcing code.
It's really terrible, I've tried.

I don't think using of read_ns_id() is a huge problem.
 
>> See next patch, and places where we add forced mounts.
>>
>> Mounts is per-namespace, but you need a task to attach a ns. You can't save
>> "mnt:[4026531840]" and attach to it when you want. If so, you should iterate
>> over tasks list and look for one, who owns it. Linux doesn't provide a list
>> of used namespaces.
>>  
>>>> +		pr_err("Can't read mnt_ns id\n");
>>>> +		return -1;
>>>> +	}
>>>> +
>>>> +	fm = xmalloc(sizeof(*fm));
>>>> +	if (!fm)
>>>> +		return -1;
>>>> +
>>>> +	fd = open(path, O_PATH|O_DIRECTORY);
>>>> +	if (fd < 0) {
>>>> +		pr_perror("Can't open %s\n", path);
>>>> +		goto err_free;
>>>> +	}
>>>> +
>>>> +	ret = get_fd_mntid(fd, &mnt_id);
>>>> +	close(fd);
>>>> +
>>>> +	if (ret < 0) {
>>>> +		pr_err("Can't get mnt_id of %s\n", path);
>>>> +		goto err_free;
>>>> +	}
>>>> +
>>>> +	fm->path = xstrdup(path);
>>>> +	if (!fm->path)
>>>> +		goto err_free;
>>>> +
>>>> +	fm->ns_id = ns_id;
>>>> +	fm->mnt_id = mnt_id;
>>>> +	fm->pid = pid;
>>>> +	list_add(&fm->list, &forced_mounts_list);
>>>> +
>>>> +	return 0;
>>>> +err_free:
>>>> +	pr_err("Can't add forced mount\n");
>>>> +	free(fm);
>>>> +	return -1;
>>>> +}
>>>> +
>>>> +void cleanup_forced_mounts(void)
>>>> +{
>>>> +	struct forced_mount *fm;
>>>> +	int mnt_fd, ret;
>>>> +
>>>> +	list_for_each_entry(fm, &forced_mounts_list, list) {
>>>> +		ret = switch_ns(fm->pid, &mnt_ns_desc, &mnt_fd);
>>>> +		if (ret) {
>>>> +			pr_err("Can't switch to pid's %u mnt_ns\n", fm->pid);
>>>> +			continue;
>>>> +		}
>>>> +
>>>> +		if (umount(fm->path) < 0)
>>>> +			pr_perror("Can't umount forced mount %s\n", fm->path);
>>>> +
>>>> +		if (restore_ns(mnt_fd, &mnt_ns_desc)) {
>>>> +			pr_err("cleanup_forced_mounts exiting with wrong mnt_ns\n");
>>>> +			return;
>>>> +		}
>>>> +	}
>>>> +}
>>>> +
>>>>  struct ns_desc mnt_ns_desc = NS_DESC_ENTRY(CLONE_NEWNS, "mnt");
>>>>
>>>> _______________________________________________
>>>> CRIU mailing list
>>>> CRIU at openvz.org
>>>> https://lists.openvz.org/mailman/listinfo/criu
>>>> .
>>>>
>>>
>> .
>>
> 


More information about the CRIU mailing list