[CRIU] GSoC: Boot up week

abhishek dubey dubeyabhishek777 at gmail.com
Wed May 29 04:07:35 MSK 2019


Thanks Pavel,

I got my doubts cleared and have started modifying the code as per the discussed steps.

In case I get stuck with the design, we can set up a call next time :)

Inline reply included.

On 28/05/19 2:03 PM, Pavel Emelianov wrote:
> On 5/28/19 5:51 AM, abhishek dubey wrote:
>> Hi Pavel,
>>
>> This week I will start with following task:
>>
>>       - Pre-allocate a 4Mb user buffer (or a chain of buffers - questioned
>> inline below)
>>
>>       - collecting VMAs after freezing
>>
>>               - VMAs of the complete pstree will be collected at once in a
>> list, unlike the current handling of one process at a time
> The list of VMAs must be kept on a per-process basis, otherwise you wouldn't
> be able to find out the pid on which to call the vm-reading syscall.
>
>>               - I will use the existing function for collection - a
>> modification is needed to skip non-readable VMAs (even the root user can't
>> read such VMAs using process_vm_readv)
> OK
>
>>               - No more injection of parasite code
> Yup
>
>>       - unfreeze the process just after collection of VMAs
>>
>>               - pstree_switch_state() will unfreeze pstree
>>
>>
>> Is above approach fine?
> Looks OK.
>
>> Please look for inline question below :
>>
>> On 23/05/19 2:13 PM, Pavel Emelianov wrote:
>>> On 5/22/19 11:12 AM, Radostin Stoyanov wrote:
>>>> Hi Abhishek,
>>>>
>>>> I have some suggestions/ideas that may be useful.
>>>>
>>>> On 22/05/2019 01:11, Abhishek Dubey wrote:
>>>>> Hi Pavel,
>>>>>
>>>>> I have gone through the cr_pre_dump_tasks() function tree and am quite comfortable with parts of it. The compel stuff seems a bit difficult to digest in one go.
>>>>> I will ask if I get stuck somewhere in the code. I think we can start with the design discussion.
>>>>>    
>>>>> Some queries related to new approach:
>>>>> 1) We need to replace the page pipe with a user-space supplied buffer. There is a list of pipes in struct page_pipe. If I got it correctly, the pipe buffers in the list have to be replaced with user-supplied buffers, and these buffers exhibit the same properties as the pipes in the current implementation?
>>>>>
>>>> There is a prototype implementation which you can use as a starting point:
>>>>
>>>> https://github.com/avagin/criu/tree/process_vm_readv
>>> Yup, that's the good starting point, thank you, Radostin.
>> I went through the pointed commit for pipe size limiting. If I am not
>> mistaken, a page_pipe can have a maximum of 8 page_pipe_bufs, each with a
>> pipe size of up to PIPE_MAX_SIZE.
> Yes, something like that.
>
>>>>> 2) We finalized the user-space buffer for process_vm_readv to be of fixed size. How do we go about deciding the best size (= max size of a pipe)?
>>>> Currently, CRIU is creating a pipe and continuously increasing its buffer size (see __ppb_resize_pipe() in criu/page-pipe.c). In the case of pre-dump (or when --leave-running is used) it would be more efficient to compute the necessary memory space and allocate it prior to freezing the process tree, thus reducing the downtime during pre-copy migration.
>>>>
>>>> Dump is currently using chunks (see commit bb98a82) and perhaps the same idea could be applied with memory buffer(s). This reduces the required amount of memory during checkpoint (e.g. when we want to dump a process tree that occupies 90% of the available memory).
>>> Agree. Let's start with the fixed-size buffers for pre-dumps and use the same size as for chunked dump mode.
>>> One thing is that criu doesn't have an explicit constant for that; instead it uses several of them (max
>>> number of pipes, page-alloc-costly-order, etc.). I propose not to over-engineer things here (at least for now)
>>> and just agree on some pre-defined constant. Say, 4Mb.
>> Since we have to utilize the existing xfer functions, we need to adhere
>> to the "page_pipe -- page_pipe_buf" model for the user buffer. In that case,
>> will these 4Mb chunks be similar to the page_pipe_buf of the current implementation?
> That's a tricky thing. Page-pipe is the description of the process pagemap with the
> data sitting in file descriptors. In your case you will have the same, but the
> data sitting right in the memory. So as a quick hack we can vmsplice() the local
> buffer into a pipe and then feed this pipe into the page_pipe. But as a longer-term
> solution we'd need to generalize the page_pipe_buf structure to allow for keeping
> raw memory pointers instead of file descriptors.
Indeed it is tricky! Let me look into the pipe code from this perspective.
>
>>>>> 3) iovs generation for shared mapping are ignored and shared mapping is handled separately. Will new approach handle shared memory similarly?
>>> We're talking about the __parasite_dump_pages_seized, this routine just ignores the shared mappings.
>> Yes.
>>>>> 4) Freeze - collect vmas - Unfreeze : How we go about handling following events -
>>>>>            a) process does something such that vma gets modified
>>>>>                 - we can't ignore such mappings
>>> When saving memory contents you will generate a set of pagemaps. The pagemaps do _not_ coincide with
>>> the collected mappings, but are those that have been successfully read. Those that were collected as
>>> mappings but failed to be read should just be ignored.
>>>
>>> Note, that some mappings may be partially read. For those, the pagemap size should be "tuned" respectively.
>> Sure!
>>>>>                 - we can't freeze a single process again; it becomes inconsistent with the other processes in the tree
>>> Why again? Freezing happens once.
>>>
>>>>>            b) one of the process in pstree dies
>>> That's OK, this can happen even in the current scheme.
>>>
>>> -- Pavel
>> .
>>
-- Abhishek

