[CRIU] p.haul and lxc

Tycho Andersen tycho.andersen at canonical.com
Fri Nov 14 09:26:59 PST 2014


On Fri, Nov 14, 2014 at 07:47:15PM +0400, Pavel Emelyanov wrote:
> On 11/14/2014 08:04 PM, Tycho Andersen wrote:
> 
> >> If p.haul will use LXC's sockets and will use LXC as "checkpoint-restore API"
> >> then the workflow would look like this.
> >>
> >>   src p.haul says to dst one "start page server"
> >>   src p.haul says to local "criu api (lxc daemon)" -- start pre-dump
> >>
> >> After these two steps criu page server on dst and criu pre-dump on the
> >> src should be connected. Can LXC daemon provide this?
> > 
> > Yes, I think we can provide the authenticated socket (or just pass a
> > message for criu as a proxy). In fact, the proxy method might be the
> > easiest -- p.haul sends stuff to lxd, and then lxd forwards it on to the
> > other lxd, which sends it back to the other end's p.haul.
> 
> Wait, we seem to talk about different sockets :) Maybe not, but let me
> clarify the whole picture anyway :)
> 
> The socket I'm talking about is the socket which will be used by criu 
> pre-dump to send memory contents of tasks to the page server. Not the 
> one that will be used by p.haul ends to talk to each other.
> 
> The in-progress picture should look like this
> 
> src-LXD                                dst-LXD
>  `- p.haul --[ channel for commands ]-- `- p.haul-service
>  `- criu   --[  channel for memory  ]-- `- criu
>                                         `- init <-- will get CLONE_PARENT by criu
>                                             `- ...
> 
> There are two network channels and four local ones, via which both p.haul-s
> can talk to the LXD-s as a "CRIU API" and the LXD-s make calls to the criu-s.
> 
> As far as the network channels are concerned:
> 
> The 1st channel (for commands) can be implemented "via" LXDs, since it's
> nothing but pre-dump/dump/restore stage synchronization. But the 2nd
> channel (for memory) should be just a socket for data (authenticated and
> encrypted, but there's no need for a whole LXD in between, from my POV).
> 
> BTW, the same channel is currently used by p.haul to transfer non-memory 
> images at the very end, so p.haul-s should "know" about it too.

I see. My concern is about the auth, actually. We're building some
auth scheme (http-based certificates) into lxd, so we'll probably use
a websocket to be the p.haul command layer, and potentially the data
layer. I think we can just write a python module that understands
lxd's websocket scheme, and expose it as a file-like object in python
and that should be ok. (At least, based on my read of the current
p.haul code, it looks like it should work.)
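
A minimal sketch of that idea, assuming the consumer only needs read() and
write() on the object it is handed. MessageStream, LoopbackConn and the
send()/recv() interface are hypothetical names for illustration; LoopbackConn
is an in-memory stand-in, not lxd's actual websocket scheme.

```python
import io


class MessageStream(io.RawIOBase):
    """File-like wrapper over a message-oriented connection.

    `conn` is a hypothetical object exposing send(bytes) and
    recv() -> bytes (b"" at end of stream), standing in for a websocket.
    """

    def __init__(self, conn):
        super().__init__()
        self._conn = conn
        self._pending = b""  # received bytes not yet handed to the reader

    def readable(self):
        return True

    def writable(self):
        return True

    def readinto(self, buf):
        # Fetch the next message only when the previous one is used up.
        if not self._pending:
            self._pending = self._conn.recv()
        n = min(len(buf), len(self._pending))
        buf[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n  # 0 signals end of stream

    def write(self, data):
        data = bytes(data)
        self._conn.send(data)  # one message per write() call
        return len(data)


class LoopbackConn:
    """In-memory stand-in for the connection: returns what was sent."""

    def __init__(self):
        self._queue = []

    def send(self, msg):
        self._queue.append(msg)

    def recv(self):
        return self._queue.pop(0) if self._queue else b""


stream = MessageStream(LoopbackConn())
stream.write(b"pre-dump")
stream.write(b" done")
first = stream.read(5)  # a read may split a message
rest = stream.read()    # reads until the connection reports end of stream
```

The wrapper hides message boundaries from the reader, so code written against
a plain socket file object would not need to know a websocket is underneath.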

I agree, though, that in principle there is no reason to have an lxd in
the middle (and ideally we wouldn't). The only reason to do it would be
for some custom auth mechanism.

> >> Note that it would
> >> not be nice if a new socket were created for every such iteration;
> >> there can be several iterations.
> > 
> > I think the socket we give to p.haul would be for use exclusively by
> > p.haul, so since it's not necessary now, I don't think it would be.
> > 
> > 
> >> Hmm...
> >>
> >> I guess this can be solved if during LXC-to-LXC migration handshake they 
> >> open two (3 in FS migration case) sockets, one is fed to p.haul-s, the 2nd
> >> to criu pre-dump and criu page-server.
> >>
> >> At the same time fork() + exec() of criu on every iteration doesn't sound
> >> nice either (it can take long). We have the "swrk" mode of criu -- it's
> >> when criu gets a socket and reads RPC commands from it instead of parsing
> >> command line arguments. The page-server start, pre-dump, dump and restore
> >> work nicely through this mode. I guess we need to polish it in 1.4,
> >> document and use _it_ in the migration case. Does this sound OK to you?
> > 
> > Ah, that's interesting. I hadn't thought about the multiple forks
> > being expensive. 
> 
> Fork()-s -- no. Execve()-s will (can) be :)

Ah, ok. Is this because it has to remap the binary every time, or look
through the path?

> > So we'd start lxc-checkpoint in some sort of daemon
> > mode, which would then read rpc commands over the socket from p.haul
> > until the final dump was done? Then on the restore side I guess it
> > would just be the same single command thing.
> 
> Not single, unfortunately. During the iterations the destination LXD will
> have to ask CRIU to start page-servers to accept memory pages.
> 
> Can LXD fork criu in swrk mode and just forward to it anything that
> comes from p.haul? Without de/en-coding the contents.

Yes, that is one option. Ideally we'd be able to connect them
directly, via some custom auth implementation that we can plug into
(or wrap) p.haul with.
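
For reference, the forwarding option could be sketched roughly like this: a
relay that copies bytes from one socket to another without ever parsing them.
The socket pairs below are stand-ins for the p.haul and criu ends (no criu is
started here), forward() is a hypothetical helper name, and stream sockets are
an assumption; if the real swrk socket is message-oriented, the relay would
also have to preserve message boundaries.

```python
import socket
import threading


def forward(src, dst, bufsize=65536):
    """Copy raw bytes from src to dst until src reaches end of stream."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)  # contents are never decoded or inspected
    dst.shutdown(socket.SHUT_WR)  # propagate end of stream downstream


# Stand-ins: the first pair plays the p.haul side, the second the criu side.
phaul_end, relay_in = socket.socketpair()
relay_out, criu_end = socket.socketpair()

worker = threading.Thread(target=forward, args=(relay_in, relay_out))
worker.start()

payload = b"\x08\x01\x12\x06opaque"  # arbitrary bytes, opaque to the relay
phaul_end.sendall(payload)
phaul_end.shutdown(socket.SHUT_WR)

received = b""
while True:
    chunk = criu_end.recv(65536)
    if not chunk:
        break
    received += chunk

worker.join()
for s in (phaul_end, relay_in, relay_out, criu_end):
    s.close()
```

A real relay would need a second forward() in the opposite direction for the
replies, but the point is the same: the middle hop stays protocol-agnostic.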

Tycho

> > The only problem I see with this is that then lxc needs to depend on
> > protobuf-c-compiler, which isn't currently in ubuntu's 'main' repo and
> > would take some work to get it there.
> > 
> > Tycho
> > .
> > 
> 
> Thanks,
> Pavel
> 

