[CRIU] Question: Tightly coupled applications

Thouraya TH thouraya87 at gmail.com
Sat Jun 23 16:11:10 MSK 2018


Hi,
Thank you so much for your answer.
Ok.



*One way to use CRIU in distributed MPI applications would be to be aware
of the communication that is going on and try to restore that
communication state after the process restore.*

This is about the MPI library: https://www.open-mpi.org/
1) Running HPC applications in containers is gaining significant interest
because containers offer lightweight virtualisation compared to VMs (as far
as I know).

And what about web applications (a web client, a MySQL server in an LXC
container, and a Tomcat web server in a container)? There is communication
there as well.
2) If I would like to save snapshots of this application using CRIU, do I
therefore have to restore that communication state after the process
restore?
3) I would also like to ask whether checkpoint/restore is useful for this
kind of application.


Kind regards.


2018-06-23 11:35 GMT+01:00 Adrian Reber <adrian at lisas.de>:

> On Sat, Jun 23, 2018 at 10:19:13AM +0100, Thouraya TH wrote:
> > I have a question about tightly coupled applications and their
> > checkpointing:
> > https://dl.acm.org/citation.cfm?id=568525
> >
> > As far as I know, for this kind of application I have to record the
> > state of the communication channel and the state of each process.
> > Following a failure, I have to find a coherent state to restart from
> > (coordinated or uncoordinated protocol).
> >
> > Is there already a solution you have proposed to achieve that?
>
> No, there is nothing I know of. The whole MPI/HPC part of
> checkpoint/restore with CRIU has not seen much development in the last
> years.
>
> One way to use CRIU in distributed MPI applications would be to be aware
> of the communication that is going on and try to restore that
> communication state after the process restore.
>
> Another way to use CRIU in MPI applications is to make sure that all
> communication has been quiesced before the actual checkpoint/restore.
> This probably does not work for fault tolerance.
>
>                 Adrian
>
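
To make the second approach quoted above (quiescing all communication
before the checkpoint) more concrete, here is a rough toy sketch in Python.
It is only a simulation under stated assumptions: the Rank class and the
send, quiesced_checkpoint and restore functions are hypothetical names made
up for illustration, not part of CRIU or of any MPI library. In a real setup
the per-process snapshot step would be done by CRIU rather than by copying a
counter.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Rank:
    """Toy stand-in for one MPI process (hypothetical, not a real MPI/CRIU API)."""
    rank: int
    received: int = 0                            # application state: sum of payloads seen
    inbox: deque = field(default_factory=deque)  # messages sent to us but not yet consumed

    def drain(self):
        """Consume every in-flight message so the channel holds no state."""
        while self.inbox:
            self.received += self.inbox.popleft()


def send(dst, payload):
    """Put a message on the channel towards dst."""
    dst.inbox.append(payload)


def quiesced_checkpoint(ranks):
    """Drain all channels first, then snapshot each process independently."""
    for r in ranks:
        r.drain()
    # With empty channels, the per-process states alone form a coherent
    # global state, so no channel state has to be recorded.
    assert all(not r.inbox for r in ranks)
    return {r.rank: r.received for r in ranks}


def restore(snapshot):
    """Rebuild the processes from the snapshot; channels start out empty."""
    return [Rank(rank=k, received=v) for k, v in sorted(snapshot.items())]


ranks = [Rank(0), Rank(1)]
send(ranks[1], 5)   # still in flight when the checkpoint is requested
send(ranks[0], 7)
snapshot = quiesced_checkpoint(ranks)
restored = restore(snapshot)
```

The point of the sketch is that once every channel is empty, each process
can be checkpointed on its own and no communication state has to be
restored. The price is that all processes must cooperate at checkpoint
time, which fits the remark above that this probably does not work for
fault tolerance.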

