<div dir="ltr">That's an interesting idea. However, my process runs inside a Docker container, and I think it would get upset if it were restored into a different container. I think I need the coordination Docker is doing in order for my system to work.</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 12, 2015 at 2:27 PM, Ruslan Kuprieiev <span dir="ltr">&lt;<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
I'm saying that you might want to consider calling criu_dump() from
the process that you are<br>
trying to dump. We call it self dump[1]. For example, using
criu_dump() from libcriu, it might look like:<br>
<br>
...<br>
while (1) {<br>
ret = criu_dump();<br>
if (ret < 0) {<br>
/*error*/<br>
} else if (ret == 0) {<br>
/*dump is ok*/<br>
} else if (ret == 1) {<br>
/*This process is restored*/<br>
/*reestablish connection or do whatever needs to be done<br>
* in case of broken connection */<br>
}<br>
/*accept connection and evaluate code*/<br>
}<br>
...<br>
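<br>
A hedged sketch of how the three outcomes above could be handled in one place. The return-value convention (negative on error, 0 after a successful dump, 1 in the restored process, treated here as any positive value) follows the pseudocode above; classify_dump_result(), struct dump_hooks, and checkpoint_once() are hypothetical names introduced only for illustration. The dump step is a plain function pointer, on the assumption that criu_dump could be plugged in there, so the sketch itself does not depend on libcriu headers.<br>
<br>

```c
/* Outcome of one self-dump attempt, mirroring the three branches above. */
enum dump_outcome { DUMP_ERROR, DUMP_OK, DUMP_RESTORED };

/* Map the integer returned by a criu_dump()-style call to an outcome.
 * Assumption: negative = error, 0 = dump succeeded and we keep running,
 * positive = we are the restored copy of the process. */
enum dump_outcome classify_dump_result(int ret)
{
    if (ret < 0)
        return DUMP_ERROR;
    if (ret == 0)
        return DUMP_OK;
    return DUMP_RESTORED;
}

/* Hypothetical hooks: the caller supplies the dump call and the
 * reconnect logic, so this sketch stays independent of libcriu. */
struct dump_hooks {
    int (*dump)(void);          /* e.g. criu_dump from libcriu */
    void (*on_restored)(void);  /* e.g. drop and reopen the socket */
};

/* Perform one checkpoint and run the reconnect hook only in the
 * restored process. Returns the outcome so the caller can log errors. */
enum dump_outcome checkpoint_once(const struct dump_hooks *hooks)
{
    enum dump_outcome outcome = classify_dump_result(hooks->dump());

    if (outcome == DUMP_RESTORED && hooks->on_restored)
        hooks->on_restored();
    return outcome;
}
```

<br>
The evaluation loop would then call checkpoint_once() between messages and treat DUMP_RESTORED as "the old connection is gone", which is the same signal the time-gap thread is trying to infer.<br>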
<br>
[1] <a href="http://criu.org/Self_dump" target="_blank">http://criu.org/Self_dump</a><div><div class="h5"><br>
<br>
<br>
<div>On 05/12/2015 11:25 PM, Ross Boucher
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I'm not sure I follow. You're saying, the process
that actually calls restore would get notified? Or, are you
saying that somehow in the restored process I can access
something set by criu?
<div><br>
</div>
<div>Assuming the former, I don't think that's necessary -- I
already know that I've just restored the process. I could try
to send a signal from the coordinating process and then use
that signal to cancel the read thread, which would be mostly
the same thing. But because that would have to travel through
quite a few layers, it seems like it would be better and more
performant to do it from within the restored process itself.</div>
<div><br>
</div>
<div>Perhaps I am just misunderstanding your suggestion though.</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May 12, 2015 at 12:37 PM,
Ruslan Kuprieiev <span dir="ltr">&lt;<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Hi, Ross<br>
<br>
When restoring using RPC or libcriu, the response message
contains a "restored" field set to true,<br>
which helps a process detect whether it was restored. You say
that every time you restore, the connection<br>
is broken, right? So maybe you could use the "restored"
flag?<br>
<br>
Thanks,<br>
Ruslan <br>
<div>
<div> <br>
<div>On 05/12/2015 09:59 PM, Ross Boucher wrote:<br>
</div>
</div>
</div>
<blockquote type="cite">
<div>
<div>
<div dir="ltr">In order to get support working in my
application, I've resorted to a hack that works
but is almost certainly not the best way to do
things. I'm interested if anyone has suggestions
for a better way. First, let me explain how it
works.
<div><br>
</div>
<div>The process I'm checkpointing is a node.js
process that opens a socket, and waits for a
connection on that socket. Once established, the
connecting process sends code for the node.js
process to evaluate, in a loop. The node process
is checkpointed between every message containing
new code to evaluate. </div>
<div><br>
</div>
<div>Now, when we restore, it is always a
completely new process sending code to the
node.js process, so the built-in TCP socket
restoration won't work. We had lots of
difficulty figuring out how to detect that the
socket connection had been broken. Ultimately,
the hack we ended up using was to simply loop
forever on a separate thread checking the time,
and noticing if an unexplained huge gap in time
had occurred. The looping thread looks like
this:</div>
<div><br>
</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div>void * canceler(void * threadPointer)</div>
</div>
<div>
<div>{</div>
</div>
<div>
<div> pthread_t thread = *(pthread_t
*)threadPointer;</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> time_t start,end;</div>
</div>
<div>
<div> time(&start);</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> while(true)</div>
</div>
<div>
<div> {</div>
</div>
<div>
<div> usleep(1000);</div>
</div>
<div>
<div> time(&end);</div>
</div>
<div>
<div> double diff =
difftime(end,start);</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> if (diff > 1.0) {<br>
</div>
</div>
<div>
<div> // THIS IS ALMOST CERTAINLY A
RESTORE</div>
</div>
<div>
<div> break;</div>
</div>
<div>
<div> }</div>
</div>
<div>
<div> start = end; // compare consecutive samples, not time since thread start</div>
</div>
<div>
<div> }</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> // cancel the read thread<br>
</div>
</div>
</blockquote>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div> int result = pthread_cancel(thread);</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> return NULL;</div>
</div>
</blockquote>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div>}</div>
</div>
</blockquote>
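<div>A minimal sketch of the gap check on its own, written so that each time sample is compared against the previous one and ordinary running time is never counted as a gap. The names struct gap_detector, gap_detector_init(), and gap_detector_observe() are hypothetical and not part of the code above; the 1.0 second threshold is the one used in the loop.</div>

```c
#include <stdbool.h>
#include <time.h>

/* Remembers the previous wall-clock sample and the largest gap
 * (in seconds) that is still considered normal. */
struct gap_detector {
    time_t last;
    double threshold;
};

/* Start tracking from the given sample. */
void gap_detector_init(struct gap_detector *d, time_t now,
                       double threshold_seconds)
{
    d->last = now;
    d->threshold = threshold_seconds;
}

/* Feed a new sample. Returns true when the time since the previous
 * sample exceeds the threshold, which the polling thread interprets
 * as "the process was probably frozen and restored". The stored
 * sample is always advanced, so a gap is reported only once. */
bool gap_detector_observe(struct gap_detector *d, time_t now)
{
    double diff = difftime(now, d->last);

    d->last = now;
    return diff > d->threshold;
}
```

<div>In the polling thread this would be called as gap_detector_observe(&d, time(NULL)) after each sleep. Because time() has one-second resolution, a threshold of 1.0 only fires on a gap of two seconds or more.</div>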
<div><br>
</div>
<div><br>
</div>
<div>Elsewhere, in the code that actually does the
reading, we spawn this thread with a handle to
the read thread:</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div>pthread_create(&cancelThread, NULL,
canceler, (void *)&readThread);</div>
</div>
</blockquote>
<div><br>
</div>
<div><br>
</div>
<div>The rest of our code understands how to deal
with a broken connection and is able to
seamlessly reconnect. This is all working well,
but it seems like there is probably a better way
so I wanted to ask for suggestions. I also tried
getting things to work with a file based socket
rather than a TCP socket, but that proved even
more difficult (and was far more complicated in
our architecture anyway, so I'd prefer not to
return down that path).</div>
<div><br>
</div>
<div>- Ross</div>
<div><br>
</div>
<div>[1] From my other email thread, this video
might help illustrate the actual process going
on, if my description isn't that clear: </div>
<div><br>
</div>
<div><a href="https://www.youtube.com/watch?v=F2L6JLFuFWs&feature=youtu.be" target="_blank">https://www.youtube.com/watch?v=F2L6JLFuFWs&feature=youtu.be</a><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<br>
<br>
</div>
</div>
<pre>_______________________________________________
CRIU mailing list
<a href="mailto:CRIU@openvz.org" target="_blank">CRIU@openvz.org</a>
<a href="https://lists.openvz.org/mailman/listinfo/criu" target="_blank">https://lists.openvz.org/mailman/listinfo/criu</a>
</pre>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div>