[CRIU] p.haul and lxc

Fri Nov 14 08:04:50 PST 2014

On Fri, Nov 14, 2014 at 06:34:51PM +0400, Pavel Emelyanov wrote:
> On 11/14/2014 06:05 PM, Tycho Andersen wrote:
> > * The p.haul server uses the lxc python api to do ->restore(), which
> >   by default execs criu, so the p.haul server process is replaced by
> >   the criu process, which does CLONE_PARENT because of
> >   --restore-sibling, and everything is happy.
> 
> Wow, you're right. If we delegate the ->restore callback to the parent
> LXC process, then we've solved the issue with reattach :)

Yeah, I was just being lazy (or so I thought) when I wrote the API
this way, but it turns out it is a feature and not a bug :)
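
To make the exec-then-CLONE_PARENT behaviour concrete, here is a minimal
sketch in Python. The helper names and the bare-bones argument list are
assumptions for illustration only; the command line lxc really builds
carries more options than this.

```python
import os


def criu_restore_argv(images_dir, criu_bin="criu"):
    """Build a minimal argv for a sibling restore (hypothetical helper)."""
    # --restore-sibling asks criu to create the restored root task with
    # CLONE_PARENT, so it becomes a child of criu's parent, not of criu.
    return [criu_bin, "restore", "--restore-sibling", "-D", images_dir]


def exec_restore(images_dir):
    """Replace the calling process with criu; does not return on success."""
    argv = criu_restore_argv(images_dir)
    # execvp keeps the pid and the parent: whoever spawned the caller
    # ends up as the parent of the restored container.
    os.execvp(argv[0], argv)
```

Since the exec keeps the caller's parent, whatever process started the
p.haul server is the one the restored tasks get attached to.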

> > Yes, agreed. I think the best way is to allow users (perhaps via some
> > plugins to the library, if not just executable arguments) to spawn new
> > sockets, and then just have p.haul ask that plugin for a socket to the
> > server; maybe with some ordering, e.g. the first socket is the control
> > socket, and then the type of every later socket is negotiated over the
> > control socket. We're interested in spawning only authenticated (TLS)
> > sockets, so a simple connect() won't work for us.
> 
> Yes, I agree that a plain socket is not the way to go. It was done that way
> in p.haul just for simplicity. I couldn't quickly find any secure proxies,
> so I decided to build the proof of concept on a plain connect().

Yep, makes sense.

> If p.haul uses LXC's sockets and uses LXC as the "checkpoint-restore API",
> then the workflow would look like this.
> 
>   src p.haul tells the dst one: "start page server"
>   src p.haul tells the local "criu api (lxc daemon)": start pre-dump
> 
> After these two steps the criu page server on the dst and the criu pre-dump
> on the src should be connected. Can the LXC daemon provide this?

Yes, I think we can provide the authenticated socket (or just pass a
message for criu as a proxy). In fact, the proxy method might be the
easiest -- p.haul sends stuff to lxd, and then lxd forwards it on to the
other lxd, which sends it back to the other end's p.haul.
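
A rough sketch of that proxy path, in Python. Everything here is a
stand-in: socketpair() plays the role of both the local p.haul-to-lxd
sockets and the authenticated lxd-to-lxd link, and forward() is a
hypothetical helper, not anything lxd provides.

```python
import socket


def forward(src, dst, bufsize=4096):
    """Copy bytes from src to dst until src hits EOF; return the byte count."""
    total = 0
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)
        total += len(chunk)
    return total


def demo(message=b"start page server"):
    """Push one message through p.haul -> lxd -> lxd -> p.haul."""
    phaul_src, lxd_a_local = socket.socketpair()
    lxd_a_link, lxd_b_link = socket.socketpair()  # assumed TLS link in reality
    lxd_b_local, phaul_dst = socket.socketpair()
    socks = [phaul_src, lxd_a_local, lxd_a_link,
             lxd_b_link, lxd_b_local, phaul_dst]
    try:
        phaul_src.sendall(message)
        phaul_src.shutdown(socket.SHUT_WR)
        forward(lxd_a_local, lxd_a_link)   # source lxd relays to its peer
        lxd_a_link.shutdown(socket.SHUT_WR)
        forward(lxd_b_link, lxd_b_local)   # destination lxd hands it over
        lxd_b_local.shutdown(socket.SHUT_WR)
        received = b""
        while True:
            chunk = phaul_dst.recv(4096)
            if not chunk:
                break
            received += chunk
        return received
    finally:
        for s in socks:
            s.close()
```

The two p.haul ends only ever see a local socket; authentication stays
entirely between the two lxd hops.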

> Note that it would
> not be nice if a new socket were created for every such iteration;
> there can be several iterations.

I think the socket we give to p.haul would be for use exclusively by
p.haul, so since a new socket per iteration isn't necessary now, I don't
think it would be then either.

> Hmm...
> 
> I guess this can be solved if during the LXC-to-LXC migration handshake they
> open two (three in the FS migration case) sockets: one is fed to the
> p.haul-s, the second to criu pre-dump and criu page-server.
> 
> At the same time, fork() + exec() of criu on every iteration doesn't sound
> nice either (it can take long). We have the "swrk" mode of criu -- it's
> when criu gets a socket and reads RPC commands from it instead of parsing
> command line arguments. The page-server start, pre-dump, dump and restore
> work nicely through this mode. I guess we need to polish it in 1.4,
> document it and use _it_ in the migration case. Does this sound OK to you?

Ah, that's interesting. I hadn't thought about the multiple forks
being expensive. So we'd start lxc-checkpoint in some sort of daemon
mode, which would then read rpc commands over the socket from p.haul
until the final dump was done? Then on the restore side I guess it
would just be the same single command thing.
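
Roughly the loop I have in mind, sketched in Python. Note the big
assumption: criu's RPC speaks protobuf messages, while this uses
newline-delimited JSON purely as a stand-in, and the command names and
handlers are made up for illustration.

```python
import json
import socket


def serve_commands(conn, handlers):
    """Handle commands from conn until the final 'dump'; return what ran."""
    seen = []
    reader = conn.makefile("r")
    try:
        for line in reader:
            req = json.loads(line)
            cmd = req["cmd"]
            handlers[cmd](req)  # one long-lived process, no fork() + exec()
            seen.append(cmd)
            reply = json.dumps({"cmd": cmd, "ok": True}) + "\n"
            conn.sendall(reply.encode())
            if cmd == "dump":   # the final dump ends the session
                break
    finally:
        reader.close()
    return seen


def demo():
    """Queue two pre-dump iterations and a final dump, then serve them."""
    client, server = socket.socketpair()
    try:
        for cmd in ("pre-dump", "pre-dump", "dump"):
            client.sendall((json.dumps({"cmd": cmd}) + "\n").encode())
        handlers = {"pre-dump": lambda req: None, "dump": lambda req: None}
        return serve_commands(server, handlers)
    finally:
        client.close()
        server.close()
```

The point is just that every iteration reuses the same process and the
same socket, which is what avoids both the repeated exec and the
per-iteration socket setup.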

The only problem I see with this is that then lxc needs to depend on
protobuf-c-compiler, which isn't currently in ubuntu's 'main' repo and
would take some work to get it there.

Tycho