[CRIU] Re: [PATCH 4/4] inotify: Add checkpoint/restore

Pavel Emelyanov xemul at parallels.com
Tue Apr 10 09:59:26 EDT 2012


>>> +struct inotify_file_entry {
>>> +	u32	id;
>>> +	u64	i_ino;
>>> +	u32	mask;
>>> +	u32	s_dev;
>>> +	u32	r_dev;
>>> +	u32	wd;
>>> +	fh_t	f_handle;
>>> +} __packed;
>>> +
>>
>> Just putting watches in a row is not enough. You should have an entry describing
>> the inotify fd itself. The thing is that it has flags (only O_NONBLOCK is used now,
>> but still) and an fowner, and we'll have to dump them there.
> 
> Shouldn't the flags already be provided in fdinfo?

No. fdinfo is info about file descriptors. fowners and flags sit on struct file.
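
To illustrate the distinction, here is a minimal sketch (not CRIU code; the
function names are made up for this example). A status flag such as O_NONBLOCK
lives on the open file, so it is visible through every descriptor that refers
to that file, while FD_CLOEXEC belongs to a single descriptor:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Sets O_NONBLOCK through one descriptor and reads the flags back through
 * a dup() of it. Returns 1 if the dup sees the flag, 0 if not, -1 on error.
 */
int status_flag_shared_across_dup(void)
{
	int p[2], dupfd, fl, ret = -1;

	if (pipe(p) < 0)
		return -1;

	dupfd = dup(p[0]);
	if (dupfd < 0)
		goto out;

	fl = fcntl(p[0], F_GETFL);
	if (fl < 0 || fcntl(p[0], F_SETFL, fl | O_NONBLOCK) < 0)
		goto out_dup;

	fl = fcntl(dupfd, F_GETFL);
	if (fl >= 0)
		ret = !!(fl & O_NONBLOCK);
out_dup:
	close(dupfd);
out:
	close(p[0]);
	close(p[1]);
	return ret;
}

/*
 * Same experiment with the per-descriptor FD_CLOEXEC flag.
 * Returns 1 if the dup sees the flag, 0 if not, -1 on error.
 */
int descriptor_flag_shared_across_dup(void)
{
	int p[2], dupfd, fl, ret = -1;

	if (pipe(p) < 0)
		return -1;

	dupfd = dup(p[0]);
	if (dupfd < 0)
		goto out;

	if (fcntl(p[0], F_SETFD, FD_CLOEXEC) < 0)
		goto out_dup;

	fl = fcntl(dupfd, F_GETFD);
	if (fl >= 0)
		ret = !!(fl & FD_CLOEXEC);
out_dup:
	close(dupfd);
out:
	close(p[0]);
	close(p[1]);
	return ret;
}

/* States the two expectations as assertions. */
void check_flag_placement(void)
{
	assert(status_flag_shared_across_dup() == 1);
	assert(descriptor_flag_shared_across_dup() == 0);
}
```

This is why a per-descriptor record is not the natural place for O_NONBLOCK or
the fowner: two descriptors sharing one file would carry the same data twice.
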

> As for fowners, sure, but
> I can't add them in this series yet, simply because there is no fowners engine
> merged. Once it's merged I'll update this entry. But thanks for the reminder, I'll
> note that.
> 
>>> +/* Returns path for mount device @s_dev */
>>> +static char *inotify_get_mnt_root(unsigned int s_dev)
>>> +{
>>> +	static int last = 0;
>>> +	int i;
>>> +
>>> +	/* Cache hit rate is big */
>>> +again:
>>> +	for (i = last; i < nr_mntinfo; i++) {
>>> +		if (s_dev == mntinfo[i].s_dev) {
>>> +			last = i;
>>> +			return mntinfo[i].mnt_root;
>>> +		}
>>> +	}
>>> +
>>> +	if (last) {
>>> +		last = 0;
>>> +		goto again;
>>> +	}
>>> +
>>> +	return NULL;
>>> +}
>>
>> This is wrong, but enough for the first time (please, put this comment into the code).
>> Besides, this has to be in some generic .c file.
> 
> OK, but since this is an RFC anyway: Pavel, what is the way to make it correct then?

The path you see in mnt_root can in theory be overmounted by some other mount.
To dive "under" the top mount you have to do complex things which are impossible
without checkpointing/restoring the mount tree. This is a long-term TODO.

>>
>>> +	snprintf(path, sizeof(path), "/proc/self/fd/%d", wd);
>>> +	ret = readlink(path, link, sizeof(link));
>>> +	close(wd);
>>> +	close(mntfd);
>>> +
>>> +	if (ret < 1) {
>>> +		pr_perror("Can't read self-link for %d\n", wd);
>>> +		return -1;
>>> +	}
>>
>> Check if we can just push the /proc/self/fd/xxx path into inotify_add_watch.
> 
> OK.
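
A minimal sketch of what that check could look like (not taken from the patch;
the helper names are made up for this example, and it assumes a Linux system
with /proc mounted). The /proc/self/fd link is handed to inotify_add_watch()
as is, with no readlink() step in between:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Adds a watch on the object that the open descriptor @fd refers to by
 * passing its /proc/self/fd link straight to inotify_add_watch().
 * Returns the new watch descriptor, or -1 on error.
 */
int add_watch_via_proc_fd(int inotify_fd, int fd, unsigned int mask)
{
	char path[64];

	assert(fd >= 0);
	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

	/* IN_DONT_FOLLOW must not be set in @mask, or the link itself is watched */
	return inotify_add_watch(inotify_fd, path, mask);
}

/* Tries the helper on "/". Returns 1 if a watch was created, 0 otherwise. */
int proc_fd_watch_works(void)
{
	int ifd, dfd, wd;

	ifd = inotify_init();
	if (ifd < 0)
		return 0;

	dfd = open("/", O_RDONLY);
	if (dfd < 0) {
		close(ifd);
		return 0;
	}

	wd = add_watch_via_proc_fd(ifd, dfd, IN_ALL_EVENTS);

	close(dfd);
	close(ifd);
	return wd >= 0;
}
```
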
> 
>>
>>> +	attempt = 10, wd = 1;
>>
>> This means that you can only restore wd-s less than 10 (or 11, I didn't check
>> the corner case carefully). This is wrong.
> 
> No, attempt is simply a counter. I can't restore if the gap between the
> wd we need and the one the kernel allocates is greater than 10. I try 10 times
> and if there is no success -- just break, simply to avoid spinning here too long.

OK, suppose the 1st wd is 20. Will you restore this one?
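
The problem can be shown with a small model (a sketch under stated assumptions,
not the patch's code). fake_add_watch() is a hypothetical stand-in for the
kernel and is assumed to hand out 1, 2, 3, ... in order; restore_wd() mimics
the fixed attempt counter, and restore_wd_unbounded() lets the target drive the
loop instead:

```c
#include <assert.h>

/* Hypothetical model of the kernel side: the next wd it would hand out. */
struct fake_inotify {
	int next_wd;
};

/* Assumption: wd-s are allocated sequentially and never chosen by the caller. */
static int fake_add_watch(struct fake_inotify *in)
{
	return in->next_wd++;
}

/*
 * Keeps allocating until the allocator returns @target, giving up after
 * @max_attempts tries. Returns @target on success, -1 otherwise.
 */
int restore_wd(struct fake_inotify *in, int target, int max_attempts)
{
	assert(target > 0 && max_attempts > 0);

	while (max_attempts--) {
		int wd = fake_add_watch(in);

		if (wd == target)
			return wd;
		if (wd > target)
			return -1;	/* overshot, target is unreachable */
		/* a real restorer would inotify_rm_watch() the unwanted wd here */
	}

	return -1;
}

/* Same loop without the fixed limit: it stops on success or overshoot. */
int restore_wd_unbounded(struct fake_inotify *in, int target)
{
	assert(target > 0);

	for (;;) {
		int wd = fake_add_watch(in);

		if (wd == target)
			return wd;
		if (wd > target)
			return -1;
	}
}
```

Starting from a fresh instance, restore_wd(&in, 20, 10) gives up after wd-s
1..10 and never reaches 20, while the unbounded variant gets there after 20
allocations.
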


