<div dir="ltr"><div>Hi Pavel,<br></div><div><br></div><div>I have a few final doubts regarding the implementation. Could we have a quick Skype call or Slack chat?<br></div><div><br></div><div>-Abhishek<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 23, 2019 at 2:13 PM Pavel Emelianov <<a href="mailto:xemul@virtuozzo.com">xemul@virtuozzo.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 5/22/19 11:12 AM, Radostin Stoyanov wrote:<br>
> Hi Abhishek,<br>
> <br>
> I have some suggestions/ideas that may be useful.<br>
> <br>
> On 22/05/2019 01:11, Abhishek Dubey wrote:<br>
>> Hi Pavel,<br>
>><br>
>> I have gone through the cr_pre_dump_tasks() function tree and am quite comfortable with parts of it. The Compel stuff seems a bit difficult to digest in one go.<br>
>> I will ask if I get stuck somewhere in the code. I think we can start with the design discussion.<br>
>> <br>
>> Some queries related to new approach:<br>
>> 1) We need to replace the page pipe with a user-space supplied buffer. There is a list of pipes in struct page_pipe. If I understand correctly, each pipe buffer in the list has to be replaced with a user-supplied buffer, and these buffers should exhibit the same properties as the pipes in the current implementation?<br>
>><br>
> There is a prototype implementation which you can use as a starting point:<br>
> <br>
> <a href="https://github.com/avagin/criu/tree/process_vm_readv" rel="noreferrer" target="_blank">https://github.com/avagin/criu/tree/process_vm_readv</a><br>
<br>
Yup, that's a good starting point, thank you, Radostin.<br>
<br>
>> 2) We settled on a fixed-size user-space buffer for process_vm_readv. How do we go about deciding the best size (= max size of a pipe)?<br>
> <br>
> Currently, CRIU creates a pipe and keeps increasing its buffer size (see __ppb_resize_pipe() in criu/page-pipe.c). In the case of pre-dump (or when --leave-running is used) it would be more efficient to compute the necessary memory space and allocate it prior to freezing the process tree, thus reducing the downtime during pre-copy migration.<br>
> <br>
> Dump is currently using chunks (see commit bb98a82) and perhaps the same idea could be applied with memory buffer(s). This reduces the required amount of memory during checkpoint (e.g. when we want to dump a process tree that occupies 90% of the available memory).<br>
<br>
Agreed. Let's start with fixed-size buffers for pre-dumps and use the same size as for chunked dump mode.<br>
One thing is that criu doesn't have an explicit constant for that; instead it uses several of them (max<br>
number of pipes, page-alloc-costly-order, etc.). I propose not to overengineer things here (at least for now)<br>
and just agree on some pre-defined constant. Say, 4 MB.<br>
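<br>
To make the fixed-size idea concrete, here is a minimal sketch of how a list of remote iovecs could be cut into batches that each fit into one 4 MB buffer. The names PREDUMP_BUF_SIZE and next_batch() are hypothetical and do not come from criu or from the prototype branch; this only illustrates the chunking, under the assumption that a single staging buffer is reused for every batch.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Assumed constant from this thread: one fixed 4 MB staging buffer. */
#define PREDUMP_BUF_SIZE (4UL << 20)

/*
 * Hypothetical helper (not a criu function): fill `out` with remote
 * iovecs totalling at most buf_size bytes, starting at iovec *idx and
 * byte offset *off inside it. Returns the number of entries written to
 * `out`, stores their total length in *batch_bytes, and advances the
 * cursor (*idx, *off) so the next call continues where this one stopped.
 */
static size_t next_batch(const struct iovec *riov, size_t nr,
                         size_t *idx, size_t *off,
                         struct iovec *out, size_t out_cap,
                         size_t buf_size, size_t *batch_bytes)
{
    size_t n = 0, room = buf_size;

    assert(buf_size > 0);
    *batch_bytes = 0;

    while (*idx < nr && n < out_cap && room > 0) {
        size_t left = riov[*idx].iov_len - *off;
        size_t take = left < room ? left : room;

        if (take > 0) {
            out[n].iov_base = (char *)riov[*idx].iov_base + *off;
            out[n].iov_len = take;
            n++;
            room -= take;
            *batch_bytes += take;
        }

        if (take == left) {
            /* This iovec is fully consumed, move to the next one. */
            (*idx)++;
            *off = 0;
        } else {
            /* Buffer is full, remember where to resume in this iovec. */
            *off += take;
        }
    }
    return n;
}
```

Each batch could then be handed to process_vm_readv() as the remote iovec array, with a single local iovec describing the 4 MB buffer. A 6 MB region followed by a 1 MB region would come out as a 4 MB batch and then a 3 MB batch.<br>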
<br>
>><br>
>> 3) iov generation skips shared mappings, and shared mappings are handled separately. Will the new approach handle shared memory similarly?<br>
<br>
We're talking about __parasite_dump_pages_seized; this routine just ignores shared mappings.<br>
<br>
>> 4) Freeze - collect vmas - Unfreeze: how do we go about handling the following events?<br>
>> a) a process does something such that a vma gets modified<br>
>> - we can't ignore such mappings<br>
<br>
When saving memory contents you will generate a set of pagemaps. The pagemaps do _not_ coincide with<br>
the collected mappings, but are those that have been successfully read. Those that were collected as<br>
mappings but failed to be read should just be ignored.<br>
<br>
Note that some mappings may be only partially read. For those, the pagemap size should be "tuned" accordingly.<br>
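<br>
As a rough illustration of that "tuning", the sketch below shrinks a list of requested regions to what was actually read, given only a byte count such as the return value of process_vm_readv(). The helper trim_to_read() and the PAGE_SZ constant are hypothetical and not criu code; it assumes the regions were read in order and that only whole pages should end up in a pagemap.<br>

```c
#include <stddef.h>
#include <sys/uio.h>

/* Assumed page size, for illustration only. */
#define PAGE_SZ 4096UL

/*
 * Hypothetical helper (not a criu function): given the regions that
 * were requested, in order, and the number of bytes actually read,
 * rewrite `iov` in place so that it describes only the data that was
 * read, rounded down to whole pages. Regions that were not reached at
 * all are dropped. Returns the number of entries kept.
 */
static size_t trim_to_read(struct iovec *iov, size_t nr, size_t bytes_read)
{
    size_t i, kept = 0;

    for (i = 0; i < nr && bytes_read > 0; i++) {
        size_t got = iov[i].iov_len < bytes_read ? iov[i].iov_len
                                                 : bytes_read;

        bytes_read -= got;
        got &= ~(PAGE_SZ - 1);  /* keep only complete pages */
        if (got == 0)
            continue;           /* less than one page was read here */

        iov[kept].iov_base = iov[i].iov_base;
        iov[kept].iov_len = got;
        kept++;
    }
    return kept;
}
```

The entries that remain would be the ones to turn into pagemaps; everything past the read count is simply left out, as described above.<br>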
<br>
>> - we can't freeze a single process again; it becomes inconsistent with the other processes in the tree<br>
<br>
Why again? Freezing happens once.<br>
<br>
>> b) one of the processes in the pstree dies<br>
<br>
That's OK, this can happen even in the current scheme.<br>
<br>
-- Pavel<br>
</blockquote></div>