<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>I had a few follow-up questions. I&#39;m not kernel savvy, so some of my questions may seem naive. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="">&gt; 1) Is &#39;criu restore&#39; operation time complexity mostly CPU bound,<br>

&gt;    IO bound, or memory bound?<br>

<br>

</div>It&#39;s mostly CPU-bound, but for images with a large amount of process memory<br>

it can become mem-bound.<br></blockquote><div><br></div><div>Does the CRIU restore operation take advantage of all CPU cores on the system when restoring tasks? I didn&#39;t see any pthread calls in the source.</div><div><br></div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Between each step there are global synchronization points.<br>

Other than this, in stage 2 tasks may sometimes wait for each other<br>

to restore shared resources, e.g. opened files.</blockquote><div><br></div><div>Is there any opportunity to reduce or merge the synchronization points so that more work can be done in parallel?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="">

&gt; 3) Is performance of restore a function of the size of the images folder?<br>

<br>

</div>Well, yes. The more data we have to restore, the more time it takes. The<br>

dependency has not been researched, but it&#39;s non-linear for sure.<br>

<div class=""><br>

&gt; 4) Any tricks/advice/hacks to speed up restore?<br>

<br>

</div>It&#39;s a WIP at the moment. We do know some things that slow down restore (and<br>

dump), but the list is not complete and the issues are not all fixed yet. E.g.<br>

<br>

1. The more image files we have, the slower it works. Currently criu generates<br>

   8 files per task; we are trying to reduce that number.<br>

<br>

2. Criu writes data into images in small portions. This performs badly<br>

   because of the many actions the kernel takes on every write() call, especially<br>

   for disk filesystems (even for page-cache writes).<br></blockquote><div><br></div><div>Is there any reason not to increase the write block size? </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


3. The /proc interface we use heavily on dump is too damn slow.</blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

4. Shared file descriptors can be inherited by tasks on restore. Instead,<br>

   we share them via unix sockets, which is slower.<br>

<br>

5. Potentially COW-ed pages in memory mapping are memcmp-ed on restore to<br>

   decide whether or not to COW the page. No good ideas how to deal with it</blockquote><div><br></div><div>Of the five items you listed, which do you think is the biggest performance bottleneck for a restore operation on a large-memory application (e.g. my dump is about 0.5 to 2 GB)?</div>

<div><br></div><div>Thanks,</div><div>-J</div></div><br></div></div>