<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>I had a few follow-up questions. I'm not kernel savvy, so some of my questions may seem naive. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="">> 1) Is 'criu restore' operation time complexity mostly CPU bound,<br>
> IO bound, or memory bound?<br>
<br>
</div>It's mostly CPU-bound, but for images with a large amount of process memory<br>
it can become memory-bound.<br></blockquote><div><br></div><div>Does the CRIU restore operation take advantage of all CPU cores on the system when restoring tasks? I didn't see any calls to pthread in the source.</div><div><br></div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Between each step there are global synchronization points.<br>
Other than this, in stage 2 tasks may sometimes wait for each other<br>
to restore shared resources, e.g. open files.</blockquote><div><br></div><div>Is there any opportunity to reduce or merge synchronization points so that more work can be done in parallel?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="">
> 3) Is performance of restore a function of the size of the images folder?<br>
<br>
</div>Well, yes. The more data we have to restore, the more time it takes. The<br>
exact dependency has not been researched, but it's non-linear for sure.<br>
<div class=""><br>
> 4) Any tricks/advice/hacks to speed up restore?<br>
<br>
</div>It's a WIP at the moment. We do know some things that slow down restore (and<br>
dump), but the list is not complete and not everything is fixed yet. E.g.<br>
<br>
1. The more image files we have, the slower it works. Currently criu generates<br>
8 files per task; we are trying to reduce that number.<br>
<br>
2. Criu writes data into images in small portions. This performs badly<br>
because the kernel does a lot of work on every write() call, especially<br>
for on-disk filesystems (even for page-cache writes).<br></blockquote><div><br></div><div>Is there any reason not to increase the write block size? </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
3. The /proc interface, which we use heavily on dump, is too damn slow.</blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
4. Shared file descriptors can be inherited by tasks on restore. Instead,<br>
we share them via unix sockets, which is slower.<br>
<br>
5. Potentially COW-ed pages in memory mappings are memcmp-ed on restore to<br>
decide whether or not to COW the page. No good ideas yet on how to deal with it.</blockquote><div><br></div><div>Of the 5 items you listed, which do you think is the biggest performance bottleneck for a restore operation on a large-memory application (e.g. my dumps are about 0.5 to 2 GB)?</div>
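<div><br></div><div>To make my block-size question on item 2 concrete, here is a rough sketch of what I had in mind. It is Python rather than CRIU's C, and it is not taken from the CRIU source: the function names (write_all, write_small, write_batched, demo) and the 64 KiB block size are my own placeholders. It only counts how many write() calls each approach issues for the same data.</div>
<pre>
```python
import os
import tempfile


def write_all(fd, data):
    """Write all of data to fd, retrying on short writes.

    Returns the number of os.write() calls that were needed.
    """
    view = memoryview(data)
    calls = 0
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]
        calls += 1
    return calls


def write_small(fd, records):
    """Hypothetical 'small portions' path: one write per record."""
    calls = 0
    for rec in records:
        calls += write_all(fd, rec)
    return calls


def write_batched(fd, records, block_size=64 * 1024):
    """Hypothetical batched path: collect records in a user-space
    buffer and flush once it reaches block_size bytes.

    block_size is an arbitrary value chosen for illustration.
    """
    calls = 0
    buf = bytearray()
    for rec in records:
        buf += rec
        if len(buf) >= block_size:
            calls += write_all(fd, bytes(buf))
            buf.clear()
    if buf:
        # Flush whatever is left after the last full block.
        calls += write_all(fd, bytes(buf))
    return calls


def demo(n_records=10000, record_size=32):
    """Write the same records both ways and report the call counts."""
    records = [bytes([i % 256]) * record_size for i in range(n_records)]
    results = {}
    for name, writer in (("small", write_small), ("batched", write_batched)):
        with tempfile.TemporaryFile() as f:
            results[name] = writer(f.fileno(), records)
            # Both paths must produce a file of the same total size.
            assert os.fstat(f.fileno()).st_size == n_records * record_size
    return results


if __name__ == "__main__":
    print(demo())
```
</pre>
<div>With the defaults (10,000 records of 32 bytes), the per-record path issues one write() per record, while the batched path needs only a handful of calls, so the per-call kernel work you describe would be paid far less often. The cost is an extra copy into the user-space buffer, which is why I'm asking whether there is a reason not to do this.</div>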
<div><br></div><div>Thanks,</div><div>-J</div></div><br></div></div>