[CRIU] Sync TODO-s for mount.c work

Pavel Emelyanov xemul at parallels.com
Tue Apr 28 10:20:28 PDT 2015


On 04/28/2015 08:09 PM, Oleg Nesterov wrote:
> Sorry for the delay, I am fighting with other internal rh bugs...
> 
> On 04/27, Pavel Emelyanov wrote:
>>
>> On 04/25/2015 07:12 PM, Oleg Nesterov wrote:
>>> On 04/21, Pavel Emelyanov wrote:
>>>>
>>>> This is just to make sure I properly track what's going on with mount.c :)
>>>
>>> And let me report another problem, just for the record. Sorry if this
>>> was already discussed or documented somewhere. And please correct
>>> me if I am wrong, I still do not really understand this "mount"
>>> magic.
>>>
>>> I do not see how CRIU can dump/restore a mount_nodev() mount, say,
>>> tmpfs. Yes, tmpfs_dump() and tmpfs_restore() are clever and I am
>>> not saying they are always wrong, but this depends on the use case.
>>>
>>> Once again, my lovely trivial example:
>>>
>>> 	# unshare -m
>>>
>>> 	# grep run /proc/self/mountinfo
>>> 	52 26 0:18 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,seclabel,mode=755
>>>
>>> 	# perl -e 'close STDIN; close STDOUT; close STDERR; sleep'
>>>
>>> and dump/restore works. In particular it dumps/restores /run. But
>>> it actually restores the "copy" of this mount and this is pointless
>>> in this particular case.
>>
>> Sure, but from my perspective that's because /run is not detected as a clone
>> from some existing (host-side) mountpoint.
> 
> Yes. Well, yes and no.
> 
> It is not detected, but note also that do_new_mount() calls ->restore()
> unconditionally, and tmpfs_restore() populates the new mount.
> 
> So do_new_mount() should not be used in this case, and dump_one_mountpoint()
> probably needs more checks before fstype->dump(pm)...
> 
> But I am a bit confused, probably this is what you actually meant, so my
> "and no" can be wrong.

I think it is :) For bind-mounts and external mounts, dump doesn't call
the ->dump callback. This complex if

        if (pm->parent && !pm->dumped && !pm->need_plugin &&
            pm->fstype->dump && fsroot_mounted(pm))

is all about it and (taking into account the previous discussion) is
all wrong :(
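
To make the condition above easier to reason about, here is a minimal,
self-contained sketch of it as a predicate. The struct layouts, the
root_is_fs_root flag, noop_dump() and wants_fs_dump() are simplified
stand-ins invented for illustration and are not CRIU's real definitions;
only the names used inside the condition itself come from the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT CRIU's real definitions. */
struct mount_info;

struct fstype {
	const char *name;
	int (*dump)(struct mount_info *pm);	/* NULL if the fs has no content dumper */
};

struct mount_info {
	struct mount_info *parent;	/* NULL for the root of the mount tree */
	bool dumped;			/* content was already dumped */
	bool need_plugin;		/* handled by an external plugin */
	bool root_is_fs_root;		/* hypothetical flag backing fsroot_mounted() */
	struct fstype *fstype;
};

/* Stand-in: true when the mount exposes the root of its filesystem. */
static bool fsroot_mounted(const struct mount_info *pm)
{
	return pm->root_is_fs_root;
}

/* Hypothetical dumper that does nothing; only its presence matters here. */
static int noop_dump(struct mount_info *pm)
{
	(void)pm;
	return 0;
}

/* Same shape as the quoted condition: should ->dump be called for pm? */
static bool wants_fs_dump(const struct mount_info *pm)
{
	assert(pm != NULL && pm->fstype != NULL);
	return pm->parent && !pm->dumped && !pm->need_plugin &&
	       pm->fstype->dump && fsroot_mounted(pm);
}
```

With these stand-ins, a non-root tmpfs mount that exposes its fs root passes
the check, while a mount without a parent, or one showing only a subtree,
does not. Nothing in the predicate looks at whether the mount is a clone of
a host-side one, which is the gap discussed in this thread.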

>>> CRIU doesn't have an option to change this behaviour but this is
>>> minor, to simplify the discussion let's suppose we want to change
>>> the current behaviour so /run is still "shared" after restore.
>>>
>>> How can we do this?
>>>
>>> "restore" does clean_mnt_ns(), this umounts "/run", and after that
>>> I don't see how we can re-mount it "correctly" without nontrivial
>>> setns-like hacks.
>>
>> This is the case when all mountpoints are shares or slaves of some others from
>> the host side, is it?
> 
> Again, I am not sure I understand you... prepare_mnt_ns() always does
> clean_mnt_ns() if !opts.root... If root_ns_mask & CLONE_NEWNS of course.

Exactly. That's because we only considered the case when the mount namespace
we dump is ... quite new as compared to the host ns. So we depopulate it
then populate it back _trying_ to resolve external dependencies. Sometimes
this doesn't work optimally, sometimes it doesn't work correctly at all.

But after all the ->restore callback is only called for _new_ mounts, not
for external or bind.
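
A rough sketch of that rule, assuming a three-way classification of mounts.
The enum, struct fs_ops, counting_restore() and restore_one() are
hypothetical names made up for this example and do not match the actual
code in mount.c:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a mount at restore time. */
enum mount_kind { MNT_NEW, MNT_BIND, MNT_EXTERNAL };

/* Hypothetical per-filesystem operations; restore may be NULL. */
struct fs_ops {
	int (*restore)(void *mnt);
};

/* Counts invocations so the effect of the rule can be observed. */
static int restore_calls;

static int counting_restore(void *mnt)
{
	(void)mnt;
	restore_calls++;
	return 0;
}

/*
 * Populate a mount's content only when the mount is created from
 * scratch; bind and external mounts get their content from elsewhere.
 */
static int restore_one(enum mount_kind kind, const struct fs_ops *ops, void *mnt)
{
	assert(ops != NULL);

	if (kind != MNT_NEW)
		return 0;	/* bind or external: skip ->restore */
	if (ops->restore)
		return ops->restore(mnt);
	return 0;
}
```

In this sketch the counter only moves for MNT_NEW. A mount such as the /run
example, if it were classified as external, would be left unpopulated.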

-- Pavel


