[CRIU] [PATCH] Add test case for code page sharing
Pavel Emelyanov
xemul at parallels.com
Mon Apr 6 23:25:57 PDT 2015
On 04/07/2015 08:57 AM, Andrew Vagin wrote:
> On Mon, Apr 06, 2015 at 04:33:08PM -0400, Christopher Covington wrote:
>> Hi Andrew,
>>
>> On 04/02/2015 04:30 PM, Andrey Wagin wrote:
>>> 2015-03-31 20:51 GMT+03:00 Christopher Covington <cov at codeaurora.org>:
>>>> We've observed at least one scenario where processes originally
>>>> sharing physical memory no longer do so after dump and restore.
>>>> This increases memory usage and degrades performance. Here is a
>>>> test case (created outside of the ZDTM framework for expedience)
>>>> that demonstrates the issue. The test fails with CRIU 1.4. Run
>>>> `make && ./test_sharing.sh` to build and run the test.
>>>
>>> I've found a root cause of this problem. CRIU modifies code to inject
>>> a parasite:
>>> https://github.com/xemul/criu/blob/master/parasite-syscall.c#L219
>>>
>>> So we affect only one page in a process. Is this really so critical?
>>
>> I think this is a different reason for pages to not be shared. Ptrace parasite
>> injection affecting the sharing of just one page is a problem that it would be
>> nice to fix, but I agree that it's not that critical.
>
> Your test detects this page. I've changed your test to check main +
> getpagesize() and it works fine.
>
>>
>> But what we have observed isn't just one page but every page mapping the text
>> sections of an executable. Please correct me if I'm wrong, but CRIU never
>> restores text sections from the actual executable, but instead uses the image
>> file. So the text sections will never be shared, except in simple parent-child
>> relationships that are handled specially.
>
> The text section is a private file mapping. In such cases CRIU maps the file
> and restores only the changed pages.
>
> static inline bool should_dump_page(VmaEntry *vmae, u64 pme)
> {
> ...
> 	/*
> 	 * Optimisation for private mapping pages, that haven't
> 	 * yet being COW-ed
> 	 */
> 	if (vma_entry_is(vmae, VMA_FILE_PRIVATE) && (pme & PME_FILE))
> 		return false;
But that's the dump routine. For COW restore different code is used:
static int map_private_vma(pid_t pid, struct vma_area *vma, void **tgt_addr,
			struct vma_area **pvma, struct list_head *pvma_list)
{
...
	list_for_each_entry_from(p, pvma_list, list) {
		if (p->e->start > vma->e->start)
			break;

		if (!vma_area_is_private(p))
			continue;

		if (p->e->end != vma->e->end ||
		    p->e->start != vma->e->start)
			continue;

		/* Check flags, which must be identical for both vma-s */
		if ((vma->e->flags ^ p->e->flags) & (MAP_GROWSDOWN | MAP_ANONYMOUS))
			break;

		if (!(vma->e->flags & MAP_ANONYMOUS) &&
		    vma->e->shmid != p->e->shmid)
			break;

		pr_info("COW 0x%016"PRIx64"-0x%016"PRIx64" 0x%016"PRIx64" vma\n",
			vma->e->start, vma->e->end, vma->e->pgoff);
		paddr = decode_pointer(p->premmaped_addr);
		break;
That one.
-- Pavel