[CRIU] [PATCH 14/15] restorer: rework unmaping old VMA-s (v3)
Pavel Emelyanov
xemul at parallels.com
Tue Oct 7 02:05:43 PDT 2014
On 10/06/2014 11:33 PM, Christopher Covington wrote:
> Hi Andrey, Pavel,
>
> On 09/23/2013 06:33 AM, Andrey Vagin wrote:
>> All process VMA-s are in the "premmapped area". All restorer stuff is in
>> the "bootstrap area", so we have two areas.
>>
>> So we don't need to unmap extra VMA-s one by one. We can call munmap
>> three times for the region before the first area, for the hole between
>> areas and for the region after the second area.
>>
>> The old scheme didn't work, because the list of VMA-s can change
>> after it has been collected, for example due to memory allocations by
>> libc or due to stack growth.
>
>> diff --git a/pie/restorer.c b/pie/restorer.c
>> index 8e43609..59b801f 100644
>> --- a/pie/restorer.c
>> +++ b/pie/restorer.c
>> @@ -524,6 +524,51 @@ void __export_unmap(void)
>> }
>>
>> /*
>> + * This function unmaps all VMAs, which don't belong to
>> + * the restored process or the restorer
>> + */
>> +static int unmap_old_vmas(void *premmapped_addr, unsigned long premmapped_len,
>> +			  void *bootstrap_start, unsigned long bootstrap_len)
>> +{
>> +	unsigned long s1, s2;
>> +	void *p1, *p2;
>> +	int ret;
>> +
>> +	if ((void *) premmapped_addr < bootstrap_start) {
>> +		p1 = premmapped_addr;
>> +		s1 = premmapped_len;
>> +		p2 = bootstrap_start;
>> +		s2 = bootstrap_len;
>> +	} else {
>> +		p2 = premmapped_addr;
>> +		s2 = premmapped_len;
>> +		p1 = bootstrap_start;
>> +		s1 = bootstrap_len;
>> +	}
>> +
>> +	ret = sys_munmap(NULL, p1 - NULL);
>> +	if (ret) {
>> +		pr_err("Unable to unmap (%p-%p): %d\n", NULL, p1, ret);
>> +		return -1;
>> +	}
>> +
>> +	ret = sys_munmap(p1 + s1, p2 - (p1 + s1));
>> +	if (ret) {
>> +		pr_err("Unable to unmap (%p-%p): %d\n", p1 + s1, p2, ret);
>> +		return -1;
>> +	}
>> +
>> +	ret = sys_munmap(p2 + s2, (void *) TASK_SIZE - (p2 + s2));
>
> Experimenting with various kernel configurations on AArch64 such as 64K pages
> (which change the default VA_BITS and therefore TASK_SIZE), it has become
> apparent to me that TASK_SIZE as used here cannot be a compile-time constant
> if we are to have one AArch64 CRIU binary that works regardless of the kernel
> configuration it is paired with.
>
> Currently, the shift for TASK_SIZE could be 39, 42, or 48. What do you all
> think is the best way to handle this? Return -1 if unmapping up to bit 39
> fails, but just give a debug print if unmapping between bits 39 and 42 or bits
> 42 and 48 fails? Is there an existing /proc entry or sysconf() or
> prctl(PR_GET_MM, ...) to determine task size dynamically that I've overlooked?
This is presumably a hack, but when reading /proc/$pid/pagemap, EOF would
(should) occur on hitting TASK_SIZE. If this is true, we can estimate the value
in kerndat.c on criu start and use it as a variable.
BTW, on x86_64 this value is constant, so can TASK_SIZE remain a constant
on x86 and become a variable on arm?
Thanks,
Pavel
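A rough sketch of the pagemap idea, for illustration only and not actual CRIU
code. It assumes that a read from the pagemap file at the offset of an address
at or above TASK_SIZE returns 0 bytes, which is the behaviour guessed at above
and would need checking on the target kernel. The names task_size_shifts,
addr_probe_fn, estimate_task_size_shift, pagemap_probe and simulated_probe are
all hypothetical; only the candidate shifts 39, 42 and 48 come from this thread.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Candidate TASK_SIZE shifts for AArch64, as listed in this thread. */
static const unsigned int task_size_shifts[] = { 39, 42, 48 };

/*
 * Hypothetical probe type: returns 1 if addr lies below TASK_SIZE,
 * 0 if it does not, and -1 if the probe itself failed.
 */
typedef int (*addr_probe_fn)(uint64_t addr, void *ctx);

/*
 * Returns the smallest candidate shift S for which the address 1 << S
 * is not below TASK_SIZE. Falls back to the largest candidate if every
 * probed address is inside the task, and returns 0 on a probe error.
 */
unsigned int estimate_task_size_shift(addr_probe_fn probe, void *ctx)
{
	size_t i, n = sizeof(task_size_shifts) / sizeof(task_size_shifts[0]);

	assert(probe != NULL);

	for (i = 0; i < n; i++) {
		int inside = probe((uint64_t)1 << task_size_shifts[i], ctx);

		if (inside < 0)
			return 0;
		if (!inside)
			return task_size_shifts[i];
	}

	return task_size_shifts[n - 1];
}

/*
 * Probe backed by an open pagemap descriptor passed via ctx (int *).
 * Assumption: a read at or beyond TASK_SIZE hits EOF and returns 0.
 */
int pagemap_probe(uint64_t addr, void *ctx)
{
	int fd = *(int *)ctx;
	long page_size = sysconf(_SC_PAGESIZE);
	uint64_t entry;
	ssize_t n;

	if (page_size <= 0)
		return -1;

	/* One 8-byte pagemap entry per virtual page. */
	n = pread(fd, &entry, sizeof(entry),
		  (off_t)(addr / (uint64_t)page_size * sizeof(entry)));
	if (n < 0)
		return -1;

	return n > 0;
}

/* Stand-in probe simulating a kernel whose TASK_SIZE is *ctx (uint64_t *). */
int simulated_probe(uint64_t addr, void *ctx)
{
	return addr < *(const uint64_t *)ctx;
}
```

At start-up one could open /proc/self/pagemap read-only, pass the descriptor
to estimate_task_size_shift() through pagemap_probe, and keep 1UL << shift in
a variable. A return of 0 means the probe failed, in which case falling back
to the compile-time value seems the safest choice.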
> If not, should I propose one? Should I try to probe the value with mmap calls
> or similar?
>
>> +	if (ret) {
>> +		pr_err("Unable to unmap (%p-%p): %d\n",
>> +				p2 + s2, (void *)TASK_SIZE, ret);
>> +		return -1;
>> +	}
>> +
>> +	return 0;
>> +}
>
> Thanks,
> Christopher
>
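For reference, the three-munmap scheme from the quoted commit message can be
sketched as a pure range computation with the task size passed in as a runtime
value. This is a minimal illustration, not the patch itself: struct range and
gaps_around are hypothetical names, and it uses uintptr_t arithmetic in place
of the void * arithmetic in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a half-open range [start, start + len). */
struct range {
	uintptr_t start;
	uintptr_t len;
};

/*
 * Given two non-overlapping areas a and b that must stay mapped, fill
 * out[] with the three gaps to unmap: below the lower area, between the
 * two areas, and from the end of the upper area up to task_size.
 * Returns 0 on success, -1 if the areas overlap or exceed task_size.
 */
int gaps_around(struct range a, struct range b, uintptr_t task_size,
		struct range out[3])
{
	struct range lo = a, hi = b;

	/* Order the areas by start address, as the patch does with p1/p2. */
	if (a.start > b.start) {
		lo = b;
		hi = a;
	}

	/* The lower area must end at or before the upper one starts. */
	if (lo.len > hi.start - lo.start)
		return -1;

	/* The upper area must fit below task_size. */
	if (hi.start > task_size || hi.len > task_size - hi.start)
		return -1;

	out[0] = (struct range){ 0, lo.start };
	out[1] = (struct range){ lo.start + lo.len,
				 hi.start - (lo.start + lo.len) };
	out[2] = (struct range){ hi.start + hi.len,
				 task_size - (hi.start + hi.len) };

	return 0;
}
```

A caller would then issue one munmap per entry of out[]. On Linux, munmap
rejects a zero length with EINVAL, so skipping empty gaps (for example when the
two areas are adjacent) is probably what one wants.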