[CRIU] [PATCH] fsnotify: Use longest mount point for inotify watchee

Pavel Emelyanov xemul at parallels.com
Mon Oct 12 02:52:06 PDT 2015


On 10/12/2015 11:58 AM, Cyrill Gorcunov wrote:
> On Mon, Oct 12, 2015 at 11:41:45AM +0300, Pavel Emelyanov wrote:
>> On 10/12/2015 11:31 AM, Andrey Vagin wrote:
>>>
>>> On Oct 12, 2015 11:18, "Pavel Emelyanov" <xemul at parallels.com> wrote:
>>>>
>>>> On 10/12/2015 11:14 AM, Cyrill Gorcunov wrote:
>>>>> On Mon, Oct 12, 2015 at 11:06:32AM +0300, Pavel Emelyanov wrote:
>>>>>> On 10/10/2015 11:07 AM, Cyrill Gorcunov wrote:
>>>>>>> In a debian-8 container we faced a problem -- systemd
>>>>>>> creates nested mount namespaces and inotify watchees
>>>>>>> are resolved into a path which is inaccessible on
>>>>>>> restore (because we're operating in the task's mount
>>>>>>> namespace). So here is a dirty hack for now -- choose
>>>>>>> the widest mount point as a reference. The proper
>>>>>>> fix requires kernel patching.
>>>>>>
>>>>>> How about Andrey's patch that saved mnt-id for an inotify?
>>>>>
>>>>> This won't work on its own. Please look at the second patch
>>>>> I sent in reply to this one. We need mnt_id but the
>>>>> path may be inaccessible during the restore because the
>>>>> restore happens in another mount namespace where the former
>>>>> path is no longer accessible.
>>>>
>>>> But we do know the namespace the path comes from, so why do we
>>>> _guess_ anything instead of just going to that ns and opening
>>>> the file there?
>>>
>>> The problem is that we can open a handle through a bind mount from which the file is not reachable by path.
>>>
>>> For example:
>>>
>>> mount /dev/xxx /
>>> mount --bind /var/yyy /zzz
>>> inotify_add_watch("/test")
>>> Read handle from /proc/pid/fdinfo/inotify_fd
>>> FD = open_by_handle_at("/zzz", handle)
>>
>> You call open_by_handle_at() on /zzz because the device /dev/xxx
>> is shown for both / and /zzz, right?
>>
>>> readlink /proc/pid/fd/FD
>>> /zzz/test
>>
>> Heh :) But /zzz/test is a non-existent path, isn't it? Isn't it a
>> kernel issue? I mean -- why does the kernel report this path? When
>> I do open_by_handle_at() and the "test" dentry is created, where
>> does it get connected to?
> 
> handle_to_path(mountdirfd)
>   ...
>   do_handle_to_path
>   ...
>     path->mnt = get_vfsmount_from_fd(mountdirfd);
>     path->dentry = exportfs_decode_fh
>      ...
>      find_acceptable_alias
> 
> For some reason in fhandle there is a call that validates
> the dentry obtained
> 
> static int vfs_dentry_acceptable(void *context, struct dentry *dentry)
> {
> -->	return 1;

But this one is called for an _existing_ dentry found in the inode alias
list. Those are either disconnected or in-tree ones. The former do not
produce any path in fd symlinks, the latter only produce a path that is
accessible from somewhere. The '/zzz/test' one you're showing here is
neither of them :)

> }
> 
> static int do_handle_to_path(int mountdirfd, struct file_handle *handle,
> 			     struct path *path)
> {
> 	int retval = 0;
> 	int handle_dwords;
> 
> 	path->mnt = get_vfsmount_from_fd(mountdirfd);
> 	if (IS_ERR(path->mnt)) {
> 		retval = PTR_ERR(path->mnt);
> 		goto out_err;
> 	}
> 	/* change the handle size to multiple of sizeof(u32) */
> 	handle_dwords = handle->handle_bytes >> 2;
> 	path->dentry = exportfs_decode_fh(path->mnt,
> 					  (struct fid *)handle->f_handle,
> 					  handle_dwords, handle->handle_type,
> 					  vfs_dentry_acceptable, NULL);
> 	if (IS_ERR(path->dentry)) {
> 		retval = PTR_ERR(path->dentry);
> 		goto out_mnt;
> 	}
> 	return 0;
> out_mnt:
> 	mntput(path->mnt);
> out_err:
> 	return retval;
> }
> 
> .
> 
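
For reference, the "widest mount point" heuristic from the patch
description could be sketched roughly as below. This is only an
illustration and not the code from the patch: struct mnt_entry and
pick_widest_mount() are hypothetical names, and "widest" is assumed
here to mean the mount whose root inside the filesystem is shortest,
so that it exposes the largest subtree of that device.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one mountinfo entry; not CRIU's real structure. */
struct mnt_entry {
	unsigned int s_dev;      /* device number of the backing filesystem */
	const char *root;        /* root of the mount within that filesystem */
	const char *mountpoint;  /* where the mount is attached in the namespace */
};

/*
 * Among the mounts of device s_dev pick the one with the shortest root,
 * i.e. (by the assumption above) the one exposing the largest part of
 * the filesystem. Ties keep the first entry seen. Returns NULL if no
 * mount of that device is in the table.
 */
const struct mnt_entry *pick_widest_mount(const struct mnt_entry *tbl,
					  size_t n, unsigned int s_dev)
{
	const struct mnt_entry *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].s_dev != s_dev)
			continue;
		if (!best || strlen(tbl[i].root) < strlen(best->root))
			best = &tbl[i];
	}
	return best;
}
```

With the two mounts from Andrey's example (root "/" at "/" and root
"/var/yyy" at "/zzz", same device) this picks "/", where /test is
reachable by path.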


