[CRIU] Using p.haul migration failure

Sun Oct 26 14:52:01 PDT 2014

After I rebuilt the kernel with VDSO disabled (to avoid the un-mappable
VM_* flags) and tried to migrate my container, I got a pre-dump failure.
As I reported earlier, 

> the src ("client") reports
> 
>     root@vm:~/src/p.haul# ./p.haul -v4 lxc iperfs 192.168.122.54
>         :
>     Starting iterations
>     * Iteration 0
>             Making directory /var/local/p.haul-fs/dmp-JUOEXm-14.10.20-09.34/img/1
>             Issuing pre-dump command to service
>     Traceback (most recent call last):
>       File "./p.haul", line 39, in <module>
>         worker.start_migration()
>       File "/home/sowmini/src/p.haul/p_haul_iters.py", line 122, in start_migration
>         raise Exception("Pre-dump failed")
>     Exception: Pre-dump failed

Looking at the pre-dump.log file in the src/client dir above did not
show anything remarkable, so I went and looked on the dst/server
side. There, criu_page_server.1.log had a few odd things.

First off, it reports

(00.024343) Accepted connection from 0.0.0.0:0

This is an accept() on the socket that is passed as argv[2]
from p_haul_service.py to "criu swrk <fd>", and I'm not enough
of a Python power-user to figure out why the accept() reports
this strange address. At that point, lsof for the criu process reports

  :
criu    22594 root    8u  unix 0xffff88001b5c5180      0t0  92664 socket
  :

This is an AF_UNIX socket (is that correct? connected to whom?).
At this point the client/sender/src-of-migration has already exited
with a pre-dump failure.
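
To make the AF_UNIX question concrete, here is a minimal Python sketch of
how one could check what kind of socket sits behind an inherited fd, and
what accept() reports for an unnamed AF_UNIX peer. The helper names
(describe_fd, accept_unix_peer) and the demo.sock path are made up for
illustration; they are not part of p.haul or CRIU. SO_DOMAIN is
Linux-specific. Whether this is actually why the page server logs
0.0.0.0:0 is only a guess: the sketch just shows that such a peer carries
no IP address or port at all.

```python
import os
import socket
import tempfile


def describe_fd(fd):
    """Return (address family, socket type) for a socket file descriptor."""
    # Work on a duplicate so closing the wrapper leaves the caller's fd open.
    dup = os.dup(fd)
    with socket.socket(fileno=dup) as s:
        family = s.getsockopt(socket.SOL_SOCKET, socket.SO_DOMAIN)
        kind = s.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)
    return socket.AddressFamily(family), socket.SocketKind(kind)


def accept_unix_peer():
    """Accept one AF_UNIX connection; return ((family, type), peer address)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
            srv.bind(path)
            srv.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
                # The client never calls bind(), so it has no name.
                cli.connect(path)
                conn, peer = srv.accept()
                with conn:
                    return describe_fd(conn.fileno()), peer


if __name__ == "__main__":
    (family, kind), peer = accept_unix_peer()
    print("family:", family.name)
    print("type:", kind.name)
    print("peer address:", repr(peer))
```

If the receiving code formats such a peer address as if it were an IPv4
sockaddr, the address and port fields would hold whatever happened to be
in the buffer, possibly all zeros.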

Letting it go further, it fails with

>     Starting page server for iter 1
>             Sending criu rpc req
>             Page server started at 5353
>     Disconnected
>     Error (cr-service.c:661): Can't recv request: Connection reset by peer
>     Closing images
>     Keeping images
>     Images are kept in /var/local/p.haul-fs/rst-J48TPi-14.10.20-09.34

Any hints on how to debug this further?

--Sowmini