<div dir="ltr">The container runs only the one process, but I have pools of identical containers and checkpoint/restore into an unpredictable one of them, so underlying details such as mount points and file descriptors would change, and those are what I'm using Docker to manage.</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 12, 2015 at 2:46 PM, Ruslan Kuprieiev <span dir="ltr"><<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
    Oh, so the whole container is being dumped and not only that one
    process?<br>
    Hm, you might be able to just call criu_dump on the whole container<br>
    from within that process, just as I showed you in the code below (but
    specifying the container's<br>
    pid), and get the same results. The way the return value of 1 from criu_dump
    works is that criu<br>
    puts a proper response packet into that service socket when
    restoring a process tree,<br>
    so everything should work.<div><div class="h5"><br>
<br>
<div>On 05/13/2015 12:36 AM, Ross Boucher
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">That's an interesting idea. Though, my process is
inside of a docker container, and I think it would get upset by
being restored into a different container. I think I need the
coordination docker is doing in order for my system to work.</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May 12, 2015 at 2:27 PM, Ruslan
Kuprieiev <span dir="ltr"><<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> I'm saying that you
              might want to consider calling criu_dump() from the process
              that you are<br>
              trying to dump. We call it a self dump [1]. For example,
              using criu_dump() from libcriu, it might look like:<br>
<br>
...<br>
while (1) {<br>
ret = criu_dump();<br>
if (ret < 0) {<br>
/*error*/<br>
} else if (ret == 0) {<br>
/*dump is ok*/<br>
} else if (ret == 1) {<br>
/*This process is restored*/<br>
/*reestablish connection or do whatever needs to be
done<br>
* in case of broken connection */<br>
}<br>
/*accept connection and evaluate code*/<br>
}<br>
...<br>
<br>
[1] <a href="http://criu.org/Self_dump" target="_blank">http://criu.org/Self_dump</a>
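<div>As a rough sketch of the branching in the loop above, the three outcomes can be separated into a small helper. The enum and function names here are hypothetical and not part of libcriu; the return-value convention (negative for error, 0 after a successful dump, 1 in the restored process) is the one described in this message, and treating any other value as a failure is an assumption.</div>

```c
#include <stddef.h>

/* Hypothetical names for the outcomes of a self dump. */
enum dump_outcome {
    DUMP_FAILED,      /* criu_dump() reported an error */
    DUMP_OK,          /* images were written, original process continues */
    PROCESS_RESTORED  /* we are running again after a restore */
};

/* Classify the value returned by criu_dump() during a self dump.
 * Assumption: any value other than 0 or 1 is treated as a failure. */
enum dump_outcome classify_dump_result(int ret)
{
    if (ret == 0)
        return DUMP_OK;
    if (ret == 1)
        return PROCESS_RESTORED;
    return DUMP_FAILED;
}

/* Describe what the caller would do next for each outcome. */
const char *next_step(enum dump_outcome outcome)
{
    switch (outcome) {
    case DUMP_OK:
        return "keep serving the current connection";
    case PROCESS_RESTORED:
        return "drop the stale connection and accept a new one";
    case DUMP_FAILED:
        return "report the error";
    }
    return NULL; /* not reached for valid enum values */
}
```

<div>In the loop, the caller would pass the result of criu_dump() to classify_dump_result() and reconnect only in the PROCESS_RESTORED case.</div>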
<div>
<div><br>
<br>
<br>
<div>On 05/12/2015 11:25 PM, Ross Boucher wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I'm not sure I follow. You're saying,
the process that actually calls restore would get
notified? Or, are you saying that somehow in the
restored process I can access something set by
criu?
<div><br>
</div>
<div>Assuming the former, I don't think that's
necessary -- I already know that I've just
restored the process. I could try to send a
signal from the coordinating process and then
use that signal to cancel the read thread, which
would be mostly the same thing. But because that
would have to travel through quite a few layers,
it seems like it would be better and more
performant to do it from within the restored
process itself.</div>
<div><br>
</div>
<div>Perhaps I am just misunderstanding your
suggestion though.</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May 12, 2015 at
12:37 PM, Ruslan Kuprieiev <span dir="ltr"><<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Hi,
Ross<br>
<br>
                          When restoring using RPC or libcriu, the response
                          message contains a "restored" field set to
                          true,<br>
                          which helps a process detect whether it was
                          restored. You say that every time you
                          restore, the connection<br>
                          is broken, right? So maybe you could use the
                          "restored" flag?<br>
<br>
Thanks,<br>
Ruslan <br>
<div>
<div> <br>
<div>On 05/12/2015 09:59 PM, Ross
Boucher wrote:<br>
</div>
</div>
</div>
<blockquote type="cite">
<div>
<div>
<div dir="ltr">In order to get support
working in my application, I've
resorted to a hack that works but is
almost certainly not the best way to
do things. I'm interested if anyone
has suggestions for a better way.
First, let me explain how it works.
<div><br>
</div>
<div>The process I'm checkpointing
is a node.js process that opens a
socket, and waits for a connection
on that socket. Once established,
the connecting process sends code
for the node.js process to
evaluate, in a loop. The node
process is checkpointed between
every message containing new code
to evaluate. </div>
<div><br>
</div>
                                <div>Now, when we restore, it is
                                  always a completely new process
                                  sending code to the node.js
                                  process, so the built-in TCP
                                  socket restoration won't work. We
                                  had a lot of difficulty figuring
                                  out how to detect that the socket
                                  connection had been broken.
                                  Ultimately, the hack we ended up
                                  using was to loop forever
                                  on a separate thread, checking the
                                  time and noticing whether an
                                  unexplained huge gap in time had
                                  occurred. The looping thread looks
                                  like this:</div>
<div><br>
</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div>void * canceler(void *
threadPointer)</div>
</div>
<div>
<div>{</div>
</div>
<div>
<div> pthread_t thread =
*(pthread_t *)threadPointer;</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> time_t start,end;</div>
</div>
<div>
<div> time(&start);</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> while(true)</div>
</div>
<div>
<div> {</div>
</div>
<div>
<div> usleep(1000);</div>
</div>
<div>
<div> time(&end);</div>
</div>
<div>
<div> double diff =
difftime(end,start);</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> if (diff > 1.0)
{<br>
</div>
</div>
<div>
<div> // THIS IS
ALMOST CERTAINLY A RESTORE</div>
</div>
<div>
<div> break;</div>
</div>
                                  <div>
                                    <div>    }</div>
                                  </div>
                                  <div>
                                    <div>    start = end; // no gap this time: measure the next interval from here</div>
                                  </div>
<div>
<div> }</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> // cancel the read
thread<br>
</div>
</div>
</blockquote>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div> int result =
pthread_cancel(thread);</div>
</div>
<div>
<div><br>
</div>
</div>
<div>
<div> return NULL;</div>
</div>
</blockquote>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div>}</div>
</div>
</blockquote>
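<div>The time-gap check in the thread above can be sketched as a small helper that compares each wake-up with the previous one, so only the interval between two consecutive checks is measured rather than the time since startup. The struct and function names are hypothetical, and the threshold is whatever the caller considers an "unexplained" gap.</div>

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper state for the watcher thread. */
struct gap_watch {
    time_t last_seen;      /* wall-clock time of the previous check */
    double threshold_sec;  /* longer gaps are treated as a restore */
};

/* Start watching from the given time. */
void gap_watch_init(struct gap_watch *w, time_t now, double threshold_sec)
{
    w->last_seen = now;
    w->threshold_sec = threshold_sec;
}

/* Return true if more than threshold_sec elapsed since the previous
 * check. The baseline is advanced on every call, so a long-running
 * process does not trip the check just by staying alive. */
bool gap_watch_check(struct gap_watch *w, time_t now)
{
    double diff = difftime(now, w->last_seen);
    w->last_seen = now;
    return diff > w->threshold_sec;
}
```

<div>In the watcher loop this would be called as gap_watch_check(&w, time(NULL)) after each usleep(), cancelling the read thread when it returns true. Because time() has one-second resolution, a threshold of 1.0 only fires for gaps of two seconds or more.</div>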
<div><br>
</div>
<div><br>
</div>
<div>Elsewhere, in the code that
actually does the reading, we
spawn this thread with a handle to
the read thread:</div>
<div><br>
</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div>
<div>pthread_create(&cancelThread,
NULL, canceler, (void
*)readThread);</div>
</div>
</blockquote>
<div><br>
</div>
<div><br>
</div>
                                <div>The rest of our code understands
how to deal with a broken
connection and is able to
seamlessly reconnect. This is all
working well, but it seems like
there is probably a better way so
I wanted to ask for suggestions. I
also tried getting things to work
with a file based socket rather
than a TCP socket, but that proved
even more difficult (and was far
more complicated in our
architecture anyway, so I'd prefer
not to return down that path).</div>
<div><br>
</div>
<div>- Ross</div>
<div><br>
</div>
<div>[1] From my other email thread,
this video might help illustrate
the actual process going on, if my
description isn't that clear: </div>
<div><br>
</div>
<div><a href="https://www.youtube.com/watch?v=F2L6JLFuFWs&feature=youtu.be" target="_blank">https://www.youtube.com/watch?v=F2L6JLFuFWs&feature=youtu.be</a><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</div>
</div>
<pre>_______________________________________________
CRIU mailing list
<a href="mailto:CRIU@openvz.org" target="_blank">CRIU@openvz.org</a>
<a href="https://lists.openvz.org/mailman/listinfo/criu" target="_blank">https://lists.openvz.org/mailman/listinfo/criu</a>
</pre>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div>