[CRIU] the new p.haul interface p.haul-wrap question

Mon Oct 12 10:30:14 PDT 2015

On Mon, Oct 12, 2015 at 07:24:02PM +0300, Pavel Emelyanov wrote:
> On 10/12/2015 05:14 PM, Adrian Reber wrote:
> > In the past (before 'implement migration over existing connections') I
> > have started p.haul-service on the destination machine and was able to
> > migrate any number of processes to the destination machine without restarting
> > p.haul-service. Using the newly introduced p.haul-wrap this is no longer
> > possible. The first connection works but the second connection just
> > hangs with:
> > 
> > # ./p.haul-wrap client host02 pid `pidof minimal` -v 4 -j
> > Establish connection...
> > Exec p.haul: ./p.haul pid 5678 -v 4 -j --to host02 --fdrpc 3 --fdmem 4 --fdfs 5
> > 14:04:33.753: Starting p.haul
> > 14:04:33.753: Use existing connections, fdrpc=3 fdmem=4 fdfs=5
> > 
> > This is a bit annoying, as p.haul-service (or ./p.haul-wrap service) has
> > to be restarted for every new migration request, even for failed
> > migration requests.
> > 
> > It also says pretty clearly that 'p.haul-wrap' is for testing purposes
> > only, which is a bit confusing as there is currently no other way to use
> > p.haul from the command line.
> 
> Well, it looks like the existing container engines (OpenVZ, LXC and Docker)
> cannot use p.haul when it's run as a service and uses only CRIU. The reason
> for that is simple -- at the very end we should do the engine's restore, not
> criu restore, to (at least) reattach the restored processes to the engine
> daemon (LXC daemon, LXD or Docker daemon).
> 
> Another reason for removing the standalone daemon is that connections between
> target and source nodes can be governed in a complex way that is very
> dependent on the infrastructure used.
> 
> So the utility of the standalone service became doubtful and we switched to a
> model where the service process is spawned by the engine with given
> connections. The connections themselves can then be anything the engine wants.

I understand.
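
If I read this right, the engine owns the connections and only hands the
descriptor numbers to p.haul, as in the "--fdrpc 3 --fdmem 4 --fdfs 5"
line above. A minimal sketch of that hand-over in Python follows. It is
not p.haul code: build_fd_args and spawn_with_fds are hypothetical helper
names, and the sockets are assumed to be connected already by whatever
means the engine prefers. POSIX only.

```python
import socket
import subprocess
import sys


def build_fd_args(rpc_sock, mem_sock, fs_sock):
    """Return the fd options for three already-connected sockets.

    The option names are the ones visible in the p.haul command line
    quoted above; the helper itself is only an illustration.
    """
    return [
        "--fdrpc", str(rpc_sock.fileno()),
        "--fdmem", str(mem_sock.fileno()),
        "--fdfs", str(fs_sock.fileno()),
    ]


def spawn_with_fds(cmd, socks):
    """Start cmd so that it inherits the given sockets.

    pass_fds keeps the listed descriptors open in the child under the
    same numbers, so the numbers placed on the command line stay valid.
    """
    fds = [s.fileno() for s in socks]
    return subprocess.Popen(cmd, pass_fds=fds)
```

The caller would then run something like
spawn_with_fds(["./p.haul", "pid", "5678"] + build_fd_args(rpc, mem, fs),
[rpc, mem, fs]), with the remaining options filled in as needed.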

> > So I am kind of missing the removed stand-alone p.haul mode as I do not
> > know how to set up the required file descriptors for the communication
> > between the p.haul processes.
> 
> Ah, so for your experiments you need a way to keep the service constantly up
> and running, right? Would a fixed p.haul-wrap that runs run_phaul_service in
> a loop be helpful?

That would be helpful, yes. The reason for my confusion was that
p.haul-wrap is marked as only for testing and there has been no further
documentation on how to use p.haul now. It just sounded to me as if the
wrapper should not actually be used. But now it sounds more like it is
an interface which can be used. So if the standalone mode can be
emulated with a loop in p.haul-wrap, that would be great.
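
To make clear what I mean by a loop, here is a rough sketch. It is not the
actual p.haul-wrap code: serve_forever and handle_migration are made-up
names, handle_migration stands in for whatever run_phaul_service does,
and for brevity it accepts one connection per request instead of the
three (rpc, mem, fs) that p.haul uses.

```python
import socket


def serve_forever(listener, handle_migration, max_requests=None):
    """Accept migration requests one after another.

    A failing request is reported and the loop continues, so the
    service does not need a restart after a failed migration.
    max_requests exists only so the loop can be bounded in tests.
    """
    served = 0
    while max_requests is None or served < max_requests:
        conn, _addr = listener.accept()
        try:
            handle_migration(conn)
        except Exception as exc:
            print("migration failed: %s" % exc)
        finally:
            conn.close()
        served += 1
    return served
```

The important part is only that the accept call sits inside the loop and
that an exception from one migration does not terminate the service.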

		Adrian