[Devel] Re: multi-threaded app fails to restart

Oren Laadan orenl at cs.columbia.edu
Mon Jul 19 20:24:13 PDT 2010


On 07/19/2010 04:27 PM, John Paul Walters wrote:
>>> Ghost state Success
>>> [ 3210.330285] [4031:4031:c/r:pgarr_release_pages:102] total pages 0
>>> [ 3210.330288] [4031:4031:c/r:do_restart:1451] sys_restart returns -512
>>>
>>> Any thoughts?
>>
>> There were two patches posted to the containers list on 11 July - "fix
>> task tree traversal for threads" and "save/restore 'sysenter_return' for
>> threads".  Can you try with those on top of ckpt-v22-dev?
>>
>>
>>
> 
> Hi Nathan,
> 
> Thanks for your help.  I applied the two patches as you suggested.
> They fixed the first of the two bad sys_restart return values, but the
> final one (quoted above, for what it's worth) still returns -512.
> When I use the -d -v switches to restart, it appears to work (no error
> messages are returned), but only the main thread is restored while the
> second thread is not.

Hi John,

I just pushed a few more fixes related to signals to ckpt-v22-dev.
Can you please check whether they fix your problem?

Also, can you please post the test program that you are using, so
that we can try to reproduce the problem?

Note that it is usually ok for sys_restart() to return -512
(-ERESTARTSYS) -- it means that the process/thread was interrupted
inside a syscall when the checkpoint was taken, and it will now retry
that same syscall.
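
For reading the c/r debug log, here is a small sketch that names the
kernel-internal restart codes. The values are the ones defined in the
kernel's include/linux/errno.h; the helper name
describe_restart_return() is made up for this illustration and is not
part of the checkpoint/restart tools.

```python
# Kernel-internal restart codes (include/linux/errno.h). They tell the
# signal-delivery path how to treat an interrupted syscall and are
# normally never visible to userspace.
KERNEL_RESTART_CODES = {
    512: "ERESTARTSYS",            # restart if the signal handler allows it
    513: "ERESTARTNOINTR",         # always restart
    514: "ERESTARTNOHAND",         # restart only if no handler runs
    516: "ERESTART_RESTARTBLOCK",  # restart via the restart_syscall mechanism
}


def describe_restart_return(ret):
    """Return a readable name for a negative return value seen in the log.

    Hypothetical helper for illustration only.
    """
    name = KERNEL_RESTART_CODES.get(-ret)
    if name is None:
        return "%d: not a kernel restart code" % ret
    return "%s (%d): interrupted syscall, to be retried" % (name, ret)


if __name__ == "__main__":
    print(describe_restart_return(-512))
```

So a line such as "sys_restart returns -512" in the log is, by itself,
not evidence of a failed restart.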

You can use the -F (--freezer) switch of restart(1) to freeze the
restarted tasks/threads before they are allowed to run in userspace.
That way you can tell whether the other thread dies immediately
after restart, or is never restarted at all.
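
Once the restarted tasks are frozen, one way to see which threads
actually exist is to look under /proc/<pid>/task. The sketch below is
only an illustration (the function name thread_states() is made up); it
relies on nothing but the standard procfs layout and assumes you
already know the pid of the restarted process.

```python
import os
import sys


def thread_states(pid):
    """Map each thread ID of `pid` to the State: field of its procfs status."""
    states = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open("%s/%s/status" % (task_dir, tid)) as f:
                for line in f:
                    if line.startswith("State:"):
                        states[int(tid)] = line.split(":", 1)[1].strip()
                        break
        except (FileNotFoundError, ProcessLookupError):
            # The thread exited between listdir() and the read.
            continue
    return states


if __name__ == "__main__":
    # Usage: python3 threads.py <pid>  (defaults to this process)
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for tid, state in sorted(thread_states(pid).items()):
        print(tid, state)
```

If the second thread shows up here while frozen but is gone shortly
after thawing, it was restored and then died; if it never shows up, it
was not restarted in the first place.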

Thanks,

Oren.
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers



