[CRIU] Alternative to hacky resume detection

Pavel Emelyanov xemul at parallels.com
Wed May 13 05:04:38 PDT 2015


On 05/12/2015 09:59 PM, Ross Boucher wrote:
> In order to get support working in my application, I've resorted to a hack that works but
> is almost certainly not the best way to do things. I'm interested if anyone has suggestions
> for a better way. First, let me explain how it works. 
> 
> The process I'm checkpointing is a node.js process that opens a socket, and waits for a connection
> on that socket. Once established, the connecting process sends code for the node.js process to
> evaluate, in a loop. The node process is checkpointed between every message containing new code 
> to evaluate. 
> 
> Now, when we restore, it is always a completely new process sending code to the node.js process,

Wait a second. I understood from the previous paragraph that node.js is the process you
checkpoint and restore, isn't it? So why is the code-sending process "new" here?

> so the built-in TCP socket restoration won't work. We had lots of difficulty figuring out how to
> detect that the socket connection had been broken.

Isn't it enough that read() on the socket returns 0? Alternatively, poll() the socket for
reading: if it becomes read-ready at a point where no data is expected, it has been closed.
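
To make the suggestion concrete, here is a minimal sketch of that check. It is written in
Python rather than node.js only to keep it short, and it is an illustration, not tested
against a restored process. The helper name peer_closed() and its timeout_ms parameter are
made up for this example. It assumes a connected stream socket on Linux, and it assumes the
caller knows whether data is expected at that moment.

```python
import select
import socket


def peer_closed(sock, timeout_ms=0):
    """Return True if the peer end of a connected stream socket looks closed.

    Hypothetical helper: poll() for readability, then peek one byte.
    An empty result from recv() is EOF, i.e. read() returning 0.
    """
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    events = poller.poll(timeout_ms)
    if not events:
        # Not read-ready: no data and no EOF, so the connection looks alive.
        return False

    _, mask = events[0]
    if mask & (select.POLLERR | select.POLLNVAL):
        # Socket error or invalid descriptor: treat as broken.
        return True

    try:
        # MSG_PEEK leaves real data in the buffer for the normal read path.
        data = sock.recv(1, socket.MSG_PEEK)
    except ConnectionResetError:
        return True
    # b"" means the peer closed; anything else is pending data.
    return data == b""
```

For example, with a local socket pair standing in for the real connection, peer_closed(a)
is False while the other end b is open and becomes True after b.close(). If pending data
is found instead of EOF, the helper returns False and leaves the data to be read normally.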


-- Pavel


