[CRIU] Dealing with VDSO remap

Mon Mar 9 07:04:04 PDT 2015

Hi Laurent,

On 03/09/2015 09:34 AM, Laurent Dufour wrote:
> Hi Chris,
> 
> On 06/03/2015 15:58, Christopher Covington wrote:
>> Hi Laurent,
>>
>> On 03/06/2015 09:15 AM, Laurent Dufour wrote:
>>> Hi,
>>>
>>> I'm porting CRIU to the PowerPC architecture, and among other issues,
>>> I'm facing a major one with the VDSO remapping at restart time.
>>>
>>> On PowerPC, as on ARM64, the kernel keeps track of the VDSO base address
>>> because it uses it to jump back to a sigreturn trampoline at the end
>>> of signal handling (see handle_rt_signal64 in
>>> arch/powerpc/kernel/signal_64.c, and for ARM64, setup_return in
>>> arch/arm64/kernel/signal.c).
>>>
>>> When remapping the VDSO at restart time, the kernel keeps the reference
>>> to the previous VDSO mapping, the one inherited from criu, so
>>> handling a signal after the restart leads to unpredictable results; most
>>> of the time a SIGSEGV is raised.
>>>
>>> I didn't find a smart way to update the kernel reference to the vdso
>>> mapping once the VDSO is remapped, so no way to work around that today.
>>>
>>> Furthermore, since this is the same picture on ARM64, I'm wondering how
>>> it could work on this architecture. Am I missing something major here?
>>>
>>> If not, is there a plan in the CRIU project to deal with that, other
>>> than by hacking the kernel to update its reference at restart time?
>>
>> It's been a while since I worked on this, and I feel like I never had a really
>> solid understanding of all the parts, but hopefully this can help.
>>
>> I think the ideal solution would be for a remap system call to move the VDSO.
>> This may have been implemented for x86, but I think it's a new feature and
>> missing on most other architectures. There's a lot of duplication in the VDSO
>> code between architectures. If there was less duplication, the x86 additions
>> might easily apply to other architectures as well, but I've never gotten
>> around to consolidating the VDSO code and I haven't noticed anyone else having
>> gotten around to it either.
> 
> I came to the same conclusion: when the VDSO area is remapped, some
> architecture-specific code should be triggered in the kernel to update
> the VDSO reference.
> I'll take a closer look at mremap in the kernel.
> 
>> The workaround is to put trampolines/branches at the location where the
>> restored process expects the VDSO, pointing to the location where the VDSO
>> is actually mapped at restore time. See vdso_redirect_calls in
>> arch/aarch64/vdso-pie.c.
> 
> I put the same code in the new ppc64 branch I created, but it only
> deals with user space's references to the VDSO, not the kernel ones.
> 
> Unfortunately, creating a trampoline at the place where the kernel put the
> VDSO at restart does not always work, since this area may conflict
> with a checkpointed memory region. Updating the kernel reference to the
> VDSO when it has been moved looks to be the only way to address that.

I see. My "production" runs are known/trusted code running under qemu system
emulation with the norandmaps kernel parameter set for run-to-run
determinism/reproducibility. I've only done light testing with randomization
fully enabled, so I'm afraid my experience here is limited. My recollection,
which some quick double checking confirms, is that the vdso00 and vdso01 test
cases pass for me with /proc/sys/kernel/randomize_va_space == 2. Triggering
the case you describe requires an ET_DYN binary (CFLAGS="-pie -fPIE"), right?
My binaries are currently ET_EXEC. Should we update the vdso test cases to use
those flags?
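
To check which kind of binary a given test actually is, a minimal sketch in C
that reads the e_type field from the ELF header might look like the following.
The helper names elf_type_of and elf_type_name are hypothetical, made up for
illustration and not part of CRIU or its test suite. The sketch relies only on
the fact that e_type is the 16-bit field immediately after e_ident in both
ELF32 and ELF64.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a CRIU function): return the ELF e_type of the
 * file at `path` (ET_EXEC, ET_DYN, ...), or -1 if the file cannot be read
 * or is not an ELF file.
 */
static int elf_type_of(const char *path)
{
    unsigned char hdr[EI_NIDENT + 2];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (!f)
        return -1;
    n = fread(hdr, 1, sizeof(hdr), f);
    fclose(f);

    /* Need the full e_ident plus the two e_type bytes, and the ELF magic. */
    if (n != sizeof(hdr) || memcmp(hdr, ELFMAG, SELFMAG) != 0)
        return -1;

    /* Decode e_type according to the byte order recorded in e_ident. */
    if (hdr[EI_DATA] == ELFDATA2MSB)
        return (hdr[EI_NIDENT] << 8) | hdr[EI_NIDENT + 1];
    return hdr[EI_NIDENT] | (hdr[EI_NIDENT + 1] << 8);
}

/* Hypothetical helper: printable name for the two types discussed here. */
static const char *elf_type_name(int type)
{
    switch (type) {
    case ET_EXEC:
        return "ET_EXEC"; /* fixed-address executable */
    case ET_DYN:
        return "ET_DYN";  /* position-independent executable or shared object */
    default:
        return "other";
    }
}
```

Calling elf_type_name(elf_type_of(path)) on a test binary would then report
whether it was linked as a fixed-address or a position-independent executable.
Note that shared libraries are also ET_DYN, so this only distinguishes the two
cases for files already known to be executables.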

Chris

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project