<div dir="ltr">That&#39;s an interesting idea. Though, my process is inside of a docker container, and I think it would get upset by being restored into a different container. I think I need the coordination docker is doing in order for my system to work.</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 12, 2015 at 2:27 PM, Ruslan Kuprieiev <span dir="ltr">&lt;<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    I&#39;m saying that you might want to consider calling criu_dump() from
    the process that you are<br>
    trying to dump. We call this a self dump[1]. For example, using
    criu_dump() from libcriu, it might look like:<br>
    <br>
    ...<br>
    while (1) {<br>
        ret = criu_dump();<br>
        if (ret &lt; 0) {<br>
            /*error*/<br>
        } else if (ret == 0) {<br>
           /*dump is ok*/<br>
        } else if (ret == 1) {<br>
          /*This process is restored*/<br>
          /*reestablish connection or do whatever needs to be done<br>
           * in case of broken connection */<br>
        }<br>
        /*accept connection and evaluate code*/<br>
    }<br>
    ...<br>
    <br>
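    The return-value handling in the loop above can also be pulled into a
    small helper. This is only a sketch: the names dump_outcome,
    classify_dump_result, and checkpoint_step are hypothetical and not part
    of libcriu, dump_fn stands in for criu_dump(), and treating any value
    other than 0 or 1 as an error is an assumption.<br>
    <br>

```c
#include <stddef.h>

/* Hypothetical outcome labels; these are not libcriu identifiers. */
enum dump_outcome { DUMP_FAILED, DUMP_OK, DUMP_RESTORED };

/* Map a dump return value onto an outcome: 0 means the dump succeeded,
 * 1 means this process has just been restored. Any other value is
 * treated as a failure here (an assumption of this sketch). */
enum dump_outcome classify_dump_result(int ret)
{
    if (ret == 0)
        return DUMP_OK;
    if (ret == 1)
        return DUMP_RESTORED;
    return DUMP_FAILED;
}

/* One checkpoint step. dump_fn is a stand-in for criu_dump();
 * on_restore is application code that re-establishes the connection.
 * Returns -1 on a failed dump and 0 otherwise. */
int checkpoint_step(int (*dump_fn)(void), void (*on_restore)(void))
{
    enum dump_outcome outcome = classify_dump_result(dump_fn());

    if (outcome == DUMP_RESTORED && on_restore != NULL)
        on_restore();

    return outcome == DUMP_FAILED ? -1 : 0;
}
```

    The main loop would then call checkpoint_step() before each accept and
    evaluate cycle, and stop or log when it returns -1.<br>
    <br>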
    [1] <a href="http://criu.org/Self_dump" target="_blank">http://criu.org/Self_dump</a><div><div class="h5"><br>
    <br>
    <br>
    <div>On 05/12/2015 11:25 PM, Ross Boucher
      wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">I&#39;m not sure I follow. You&#39;re saying, the process
        that actually calls restore would get notified? Or, are you
        saying that somehow in the restored process I can access
        something set by criu?
        <div><br>
        </div>
        <div>Assuming the former, I don&#39;t think that&#39;s necessary -- I
          already know that I&#39;ve just restored the process. I could try
          to send a signal from the coordinating process and then use
          that signal to cancel the read thread, which would be mostly
          the same thing. But because that would have to travel through
          quite a few layers, it seems like it would be better and more
          performant to do it from within the restored process itself.</div>
        <div><br>
        </div>
        <div>Perhaps I am just misunderstanding your suggestion though.</div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Tue, May 12, 2015 at 12:37 PM,
          Ruslan Kuprieiev <span dir="ltr">&lt;<a href="mailto:kupruser@gmail.com" target="_blank">kupruser@gmail.com</a>&gt;</span> wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> Hi, Ross<br>
              <br>
              When you restore using RPC or libcriu, the response message
              contains a &quot;restored&quot; field set to true,<br>
              which helps the process detect that it was restored. You say
              that every time you restore, the connection<br>
              is broken, right? So maybe you could use the &quot;restored&quot;
              flag?<br>
              <br>
              Thanks,<br>
              Ruslan <br>
              <div>
                <div> <br>
                  <div>On 05/12/2015 09:59 PM, Ross Boucher wrote:<br>
                  </div>
                </div>
              </div>
              <blockquote type="cite">
                <div>
                  <div>
                    <div dir="ltr">In order to get support working in my
                      application, I&#39;ve resorted to a hack that works
                      but is almost certainly not the best way to do
                      things. I&#39;m interested if anyone has suggestions
                      for a better way. First, let me explain how it
                      works. 
                      <div><br>
                      </div>
                      <div>The process I&#39;m checkpointing is a node.js
                        process that opens a socket, and waits for a
                        connection on that socket. Once established, the
                        connecting process sends code for the node.js
                        process to evaluate, in a loop. The node process
                        is checkpointed between every message containing
                        new code to evaluate. </div>
                      <div><br>
                      </div>
                      <div>Now, when we restore, it is always a
                        completely new process sending code to the
                        node.js process, so the built-in TCP socket
                        restoration won&#39;t work. We had a lot of
                        difficulty figuring out how to detect that the
                        socket connection had been broken. Ultimately,
                        the hack we ended up using was to loop
                        forever on a separate thread, checking the time
                        and noticing when an unexplained large gap in time
                        had occurred. The looping thread looks like
                        this:</div>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
                      <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
                        <div>
                          <div>void * canceler(void * threadPointer)</div>
                        </div>
                        <div>
                          <div>{</div>
                        </div>
                        <div>
                          <div>    pthread_t thread = *(pthread_t
                            *)threadPointer;</div>
                        </div>
                        <div>
                          <div><br>
                          </div>
                        </div>
                        <div>
                          <div>    time_t start,end;</div>
                        </div>
                        <div>
                          <div>    time(&amp;start);</div>
                        </div>
                        <div>
                          <div><br>
                          </div>
                        </div>
                        <div>
                          <div>    while(true)</div>
                        </div>
                        <div>
                          <div>    {</div>
                        </div>
                        <div>
                          <div>        usleep(1000);</div>
                        </div>
                        <div>
                          <div>        time(&amp;end);</div>
                        </div>
                        <div>
                          <div>        double diff =
                            difftime(end,start);</div>
                        </div>
                        <div>
                          <div><br>
                          </div>
                        </div>
                        <div>
                          <div>        if (diff &gt; 1.0) {<br>
                          </div>
                        </div>
                        <div>
                          <div>            // THIS IS ALMOST CERTAINLY A
                            RESTORE</div>
                        </div>
                        <div>
                          <div>            break;</div>
                        </div>
                        <div>
                          <div>        }</div>
                        </div>
                        <div>
                          <div>        start = end; // compare consecutive samples, not time since thread start</div>
                        </div>
                        <div>
                          <div>    }</div>
                        </div>
                        <div>
                          <div><br>
                          </div>
                        </div>
                        <div>
                          <div>    // cancel the read thread<br>
                          </div>
                        </div>
                      </blockquote>
                      <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
                        <div>
                          <div>    int result = pthread_cancel(thread);</div>
                        </div>
                        <div>
                          <div><br>
                          </div>
                        </div>
                        <div>
                          <div>    return NULL;</div>
                        </div>
                      </blockquote>
                      <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
                        <div>
                          <div>}</div>
                        </div>
                      </blockquote>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
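                      <div>The gap check can also be expressed as a
                        comparison between consecutive clock samples, so
                        that only a single long pause trips it. A minimal
                        sketch, assuming the timestamps come from a wall
                        clock such as clock_gettime(CLOCK_REALTIME, ...);
                        elapsed_seconds and gap_detected are hypothetical
                        helper names, and the threshold is up to the
                        caller:</div>
                      <div><br>
                      </div>

```c
#include <stdbool.h>
#include <time.h>

/* Seconds elapsed from *prev to *now, as a double. */
double elapsed_seconds(const struct timespec *prev,
                       const struct timespec *now)
{
    return (double)(now->tv_sec - prev->tv_sec)
         + (double)(now->tv_nsec - prev->tv_nsec) / 1e9;
}

/* Returns true when the interval between two consecutive samples is
 * longer than threshold_seconds. *prev is advanced to *now on every
 * call, so each call measures exactly one polling interval rather
 * than the time since the thread started. */
bool gap_detected(struct timespec *prev, const struct timespec *now,
                  double threshold_seconds)
{
    bool gap = elapsed_seconds(prev, now) > threshold_seconds;

    *prev = *now;
    return gap;
}
```

                      <div>The polling thread would take a fresh sample
                        on each iteration, pass it to gap_detected(), and
                        break out of the loop to cancel the read thread
                        when it returns true.</div>
                      <div><br>
                      </div>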
                      <div>Elsewhere, in the code that actually does the
                        reading, we spawn this thread with a handle to
                        the read thread:</div>
                      <div><br>
                      </div>
                      <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
                        <div>
                          <div>pthread_create(&amp;cancelThread, NULL,
                            canceler, (void *)&amp;readThread);</div>
                        </div>
                      </blockquote>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
                      <div>The rest of our code understands how to deal
                        with a broken connection and is able to
                        seamlessly reconnect. This is all working well,
                        but it seems like there is probably a better way,
                        so I wanted to ask for suggestions. I also tried
                        getting things to work with a file-based socket
                        rather than a TCP socket, but that proved even
                        more difficult (and was far more complicated in
                        our architecture anyway, so I&#39;d prefer not to
                        go back down that path).</div>
                      <div><br>
                      </div>
                      <div>- Ross</div>
                      <div><br>
                      </div>
                      <div>[1] From my other email thread, this video
                        might help illustrate the actual process going
                        on, if my description isn&#39;t that clear: </div>
                      <div><br>
                      </div>
                      <div><a href="https://www.youtube.com/watch?v=F2L6JLFuFWs&amp;feature=youtu.be" target="_blank">https://www.youtube.com/watch?v=F2L6JLFuFWs&amp;feature=youtu.be</a><br>
                      </div>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
                    </div>
                    <br>
                    <br>
                  </div>
                </div>
                <pre>_______________________________________________
CRIU mailing list
<a href="mailto:CRIU@openvz.org" target="_blank">CRIU@openvz.org</a>
<a href="https://lists.openvz.org/mailman/listinfo/criu" target="_blank">https://lists.openvz.org/mailman/listinfo/criu</a>
</pre>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div>