[CRIU] criu restore performance

J F jgmb45 at gmail.com
Sun Jul 6 13:10:38 PDT 2014


I had a few follow-up questions. I'm not kernel savvy, so some of my
questions may seem naive.


> > 1) Is 'criu restore' operation time complexity mostly CPU bound,
> >    IO bound, or memory bound?
>
> It's mostly CPU-bound, but for images with large amount of process memory
> it can become mem-bound.
>

Does the CRIU restore operation take advantage of all CPU cores on the
system when restoring tasks? I didn't see any pthread calls in the source.

> Between each step there are global synchronization points.
> Other than this in stage 2 tasks may sometimes wait for each other
> to restore shared resources, e.g. opened files.


Is there any opportunity to reduce or merge the synchronization points so
that more work can be done in parallel?
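
To make the question concrete, here is how I picture the stage barriers.
This is only a sketch under my own assumptions, not CRIU code: I assume each
task is restored in its own forked process (which would explain the missing
pthread calls) and that all tasks must reach the end of a stage before any of
them starts the next one. The names stage_sync and run_staged_restore are
made up for illustration, and the busy-wait stands in for whatever wait
primitive is really used.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_STAGES 8 /* hypothetical limit for this sketch */

/* Lives in shared memory so every forked task sees the same counters. */
struct stage_sync {
	atomic_int arrived[MAX_STAGES]; /* tasks that finished each stage */
	atomic_int work_done;           /* total units of per-stage work */
	atomic_int failed;              /* set if the parent could not fork */
};

/* Forks nr_tasks children; each runs nr_stages stages with a global
 * barrier after every stage. Returns total work units, or -1 on error. */
static int run_staged_restore(int nr_tasks, int nr_stages)
{
	struct stage_sync *s;
	int total;

	if (nr_stages > MAX_STAGES)
		return -1;
	/* Anonymous shared mappings start zero-filled. */
	s = mmap(NULL, sizeof(*s), PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (s == MAP_FAILED)
		return -1;

	for (int t = 0; t < nr_tasks; t++) {
		pid_t pid = fork();
		if (pid < 0) {
			atomic_store(&s->failed, 1); /* release waiting children */
			break;
		}
		if (pid == 0) {
			for (int st = 0; st < nr_stages; st++) {
				atomic_fetch_add(&s->work_done, 1); /* stage work */
				atomic_fetch_add(&s->arrived[st], 1);
				/* Global synchronization point: wait for all tasks. */
				while (atomic_load(&s->arrived[st]) < nr_tasks) {
					if (atomic_load(&s->failed))
						_exit(1);
					sched_yield();
				}
			}
			_exit(0);
		}
	}

	while (wait(NULL) > 0)
		; /* reap every child */
	total = atomic_load(&s->failed) ? -1 : atomic_load(&s->work_done);
	munmap(s, sizeof(*s));
	return total;
}
```

In this picture every stage costs as much as its slowest task, so fewer
barriers would mean fewer points where fast tasks sit idle.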


> > 3) Is performance of restore a function of the size of the images folder?
>
> Well, yes. The more data we have to restore the more time it takes. But
> the dependency is not researched, but it's non-linear for sure.
>
> > 4) Any tricks/advice/hacks to speed up restore?
>
> It's a WIP at the moment. We do know some things that slow restore (and
> dump), but the list is not complete and is not fully fixed yet. E.g.
>
> 1. More image files we have the slower it works. Currently criu generates
>    8 files per-task, we try to make less of them.
>
> 2. Criu writes data into images with small portions. This behaves badly
>    due to many actions taken by kernel on every write() call especially
>    for disk FS-s (even for page-cache writes).
>

Is there any reason not to increase the size of the blocks passed to write()?
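
What I have in mind is something like the sketch below: collect the small
records in a user-space buffer and hand them to the kernel in one write()
call. This is not taken from the CRIU source. The wbuf structure, the
function names, and the 64 KiB size are all my own assumptions, picked only
to illustrate the idea.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define WBUF_SIZE (64 * 1024) /* hypothetical batch size, not a CRIU constant */

struct wbuf {
	int fd;             /* image file descriptor */
	size_t used;        /* bytes currently buffered */
	size_t nr_syscalls; /* successful write() calls, for comparison */
	char data[WBUF_SIZE];
};

/* write() may be partial or interrupted, so loop until all is written. */
static int write_all(int fd, const char *p, size_t len, size_t *nr)
{
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		(*nr)++;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Push everything buffered so far to the file descriptor. */
static int wbuf_flush(struct wbuf *b)
{
	if (b->used == 0)
		return 0;
	if (write_all(b->fd, b->data, b->used, &b->nr_syscalls) < 0)
		return -1;
	b->used = 0;
	return 0;
}

/* Append one small record; only touch the kernel when the buffer is full. */
static int wbuf_put(struct wbuf *b, const void *rec, size_t len)
{
	if (len > WBUF_SIZE - b->used && wbuf_flush(b) < 0)
		return -1;
	if (len > WBUF_SIZE) /* too big to buffer: write it directly */
		return write_all(b->fd, rec, len, &b->nr_syscalls);
	memcpy(b->data + b->used, rec, len);
	b->used += len;
	return 0;
}
```

With this, a thousand 16-byte records would normally cost a single write()
at flush time instead of a thousand separate calls.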


> 3. /proc interface we use heavily on dump is too damn slow
>
> 4. Shared file descriptors can be inherited by tasks on restore. Instead
>    we share them via unix sockets which is slower.
>
> 5. Potentially COW-ed pages in memory mapping are memcmp-ed on restore to
>    decide whether or not to COW the page. No good ideas how to deal with it
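
To check that I understand item 5, this is roughly how I read it. The sketch
is my own guess at the logic, not CRIU's actual code: restore_pages and
PAGE_SZ are names I made up, and I assume the mapping already holds the
parent's contents, so a page only needs to be written (and thereby COW-ed)
when it differs from the page stored in the image.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096 /* assumed page size for this sketch */

/*
 * Compare each page of an inherited mapping with the page from the image.
 * Identical pages are left alone so they stay shared; differing pages are
 * overwritten, which is the write that would break COW sharing.
 * Returns the number of pages that had to be copied.
 */
static size_t restore_pages(unsigned char *mapping,
			    const unsigned char *image, size_t nr_pages)
{
	size_t copied = 0;

	for (size_t i = 0; i < nr_pages; i++) {
		unsigned char *dst = mapping + i * PAGE_SZ;
		const unsigned char *src = image + i * PAGE_SZ;

		if (memcmp(dst, src, PAGE_SZ) == 0)
			continue; /* same contents: keep the shared page */
		memcpy(dst, src, PAGE_SZ);
		copied++;
	}
	return copied;
}
```

If that reading is right, every candidate page gets read once for the
comparison even when nothing is copied, which I guess is where the cost
comes from on large mappings.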


Of the 5 items you listed, which do you think is the biggest performance
bottleneck for a restore operation on a large-memory application (e.g. my
dump is about 0.5 to 2 GB)?

Thanks,
-J