[CRIU] Using p.haul migration failure

Pavel Emelyanov xemul at parallels.com
Mon Oct 27 04:54:52 PDT 2014


On 10/27/2014 01:52 AM, Sowmini Varadhan wrote:
> 
> 
> After I rebuild a kernel with VDSO disabled (to avoid the un-mappable
> VM_* flags) and try to migrate my container, I get a pre-dump failure.
> As I reported earlier, 
> 
>> the src ("client") reports
>>
>>     root@vm:~/src/p.haul# ./p.haul -v4 lxc iperfs 192.168.122.54
>           :
>>     Starting iterations
>>     * Iteration 0
>>             Making directory /var/local/p.haul-fs/dmp-JUOEXm-14.10.20-09.34/img/1
>>             Issuing pre-dump command to service
>>     Traceback (most recent call last):
>>       File "./p.haul", line 39, in <module>
>>         worker.start_migration()
>>       File "/home/sowmini/src/p.haul/p_haul_iters.py", line 122, in start_migration
>>         raise Exception("Pre-dump failed")
>>     Exception: Pre-dump failed
> 
> Looking at the pre-dump.log file in the src/client dir above did not
> show anything remarkable, so I went and looked on the dst/server
> side. There, the criu_page_server.1.log had a few odd things.
> 
> First off, it reports
> 
> (00.024343) Accepted connection from 0.0.0.0:0

This is a bogus log message. For p.haul migration, the page server doesn't
accept any connections; it just inherits the data socket from p.haul.
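
To illustrate what "inherits" means here, below is a minimal sketch in
Python. It is not the actual p.haul or CRIU code: the socketpair, the
child script and the helper name run_with_inherited_socket are all
assumptions made up for the example. The only point is that the child
receives an already connected descriptor and never calls listen() or
accept(), so there is no real peer address for it to log.

```python
import socket
import subprocess
import sys

# Code run by the child process: wrap the inherited descriptor and read
# from it. Note there is no listen() or accept() on this side.
CHILD_CODE = """
import socket, sys
fd = int(sys.argv[1])
sk = socket.socket(fileno=fd)   # family/type are detected from the fd
sys.stdout.write(sk.recv(64).decode())
"""


def run_with_inherited_socket(payload: bytes) -> str:
    """Send payload to a child over a socket the child merely inherits."""
    # Hypothetical stand-in for the data connection the parent already holds.
    parent_end, child_end = socket.socketpair()
    with parent_end, child_end:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(child_end.fileno())],
            pass_fds=(child_end.fileno(),),  # keep this fd open in the child
            stdout=subprocess.PIPE,
        )
        parent_end.sendall(payload)
        out, _ = proc.communicate()
    return out.decode()


if __name__ == "__main__":
    print(run_with_inherited_socket(b"pages"))
```

The descriptor number is passed on the command line, similar in spirit
to the "criu swrk <fd>" invocation quoted below.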

> this is an accept() on the socket that is passed as argv[2]
> from p_haul_service.py to "criu swrk <fd>", and I'm not enough
> of a python power-user to figure out why the accept() reports
> this strangeness. At that point, lsof for the criu process reports
>  
>   :
> criu    22594 root    8u  unix 0xffff88001b5c5180      0t0  92664 socket
>   :
> 
> this is an AF_UNIX socket (is that correct? to whom?)

Yes, this is a socket via which criu talks to p.haul(-service).
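
If you want to double-check what kind of socket sits behind a given
descriptor from the Python side, something like the sketch below should
do. The helper name describe_socket_fd is made up for this example, and
it assumes Python 3.7 or later, where socket.socket(fileno=...) detects
the family and type from the descriptor.

```python
import os
import socket


def describe_socket_fd(fd: int) -> str:
    """Return the address family and type of the socket behind fd."""
    # Work on a duplicate so that closing the wrapper object does not
    # close the caller's original descriptor.
    with socket.socket(fileno=os.dup(fd)) as sk:
        return f"{sk.family.name} {sk.type.name}"


if __name__ == "__main__":
    a, b = socket.socketpair()
    with a, b:
        print(describe_socket_fd(a.fileno()))
```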

> At this point the client/sender/src-of-migration has already exited
> with a pre-dump failure.
> 
> Letting it go further, it fails with
> 
>>     Starting page server for iter 1
>>             Sending criu rpc req
>>             Page server started at 5353
>>     Disconnected
>>     Error (cr-service.c:661): Can't recv request: Connection reset by peer
>>     Closing images
>>     Keeping images
>>     Images are kept in /var/local/p.haul-fs/rst-J48TPi-14.10.20-09.34
> 
> 
> any hints on how to debug this further?

Can you re-run the migration with the -v4 option and show all the CRIU
logs you collect? In addition, the output of p.haul and p.haul-service
would be useful too.

Thanks,
Pavel



