[Devel] Re: [C/R] sleepers don't wake up on restart
Oren Laadan
orenl at cs.columbia.edu
Wed Apr 29 14:45:32 PDT 2009
Hi,
Sukadev Bhattiprolu wrote:
> Oren Laadan [orenl at cs.columbia.edu] wrote:
> |
> | I just posted v14-rc3 which includes the c/r of restart-blocks.
> | That should improve the situation.
> |
> | However, depending on which syscalls one uses, process may still
> | seem "stuck" after restart because the current code still does
> | not save signals nor task timers; If a signal was pending (SIGALRM
> | for example) after freezing but before checkpoint, it will be lost.
> | If a timer was set at checkpoint, it will not be restored.
> |
> | So depending on your program, you may still experience issues
> | until I add patches to handle that.
>
> Ok, Just an fyi, the original program seemed to work fine, but when
> I try to restart a small process tree, I get stuck on restart again.
>
> I am running on v14-rc3 branch. Has this got anything to do with
> pending SIGCHLD ? Seems to be easier to repro with larger process
> trees (2 children per process, 4 or more levels deep).
Could be. You can verify by adding a couple of lines to the
checkpoint code that complain if any signals are pending on a
task that is being checkpointed.
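
As a userspace stand-in for that check (not the kernel-side patch itself), the pending masks of a frozen task can also be read from the SigPnd and ShdPnd lines of /proc/<pid>/status just before taking the checkpoint. The following is only a minimal sketch, assuming Linux procfs; the function names are hypothetical and are not part of the c/r patchset.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of /proc/<pid>/status. If it starts with "<key>:", store
 * the hex mask that follows in *mask and return 0; otherwise return -1. */
int parse_mask_line(const char *line, const char *key, unsigned long long *mask)
{
    size_t klen = strlen(key);

    if (strncmp(line, key, klen) != 0 || line[klen] != ':')
        return -1;
    return sscanf(line + klen + 1, "%llx", mask) == 1 ? 0 : -1;
}

/* Signal number signo corresponds to bit (signo - 1) of the mask. */
int mask_has_signal(unsigned long long mask, int signo)
{
    if (signo < 1 || signo > 64)
        return 0;
    return (int)((mask >> (signo - 1)) & 1ULL);
}

/* Read the per-thread (SigPnd) and process-wide (ShdPnd) pending masks
 * of a task. Returns 0 only if both lines were found. */
int read_pending(long pid, unsigned long long *priv, unsigned long long *shared)
{
    char path[64], line[256];
    int found = 0;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (parse_mask_line(line, "SigPnd", priv) == 0)
            found |= 1;
        else if (parse_mask_line(line, "ShdPnd", shared) == 0)
            found |= 2;
    }
    fclose(f);
    return found == 3 ? 0 : -1;
}

/* Print one complaint per pending signal; return how many were found. */
int complain_if_pending(FILE *out, long pid,
                        unsigned long long priv, unsigned long long shared)
{
    int signo, count = 0;

    for (signo = 1; signo <= 64; signo++) {
        if (mask_has_signal(priv | shared, signo)) {
            fprintf(out, "pid %ld: signal %d pending at checkpoint\n",
                    pid, signo);
            count++;
        }
    }
    return count;
}
```

One would call read_pending() for each pid in the tree while the tasks are frozen and pass the two masks to complain_if_pending(). On x86, SIGCHLD is signal 17 and SIGALRM is signal 14, so those are the numbers to look for in the complaints.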
BTW, the current code disregards zombie processes.
Support for both (signals and zombies) is in the queue.
Oren.
>
> Test programs (attached) (they need some cleanup though)
>
> ptree2.c
> p2.loop
>
> --------- Processes after restart:
>
> $ ps -ef|grep ptree
>
> root 10461 10459 0 22:07 pts/0 00:00:00 ./ptree2 -n 1 -d 2
> root 10465 10461 0 22:07 pts/0 00:00:00 ./ptree2 -n 1 -d 2
> root 10466 10465 0 22:07 pts/0 00:00:00 [ptree2] <defunct>
> root 10479 8220 0 22:09 pts/1 00:00:00 grep ptree
>
> ---------- Process stacks
>
> tree2 S f6270a90 0 10461 10459
> f5e59380 00000082 08048a86 f6270a90 f6270bfc c2b32260 00000000 0000d9d3
> f5f423b0 00000000 ffffffff 00000000 00000000 00000001 00000000 f6270a88
> 00000000 f6270a90 00000000 c02243aa 00000004 00000003 0000000c 00000006
> Call Trace:
> [<c02243aa>] do_wait+0x1dd/0x2f6
> [<c021cd14>] default_wake_function+0x0/0x8
> [<c0224542>] sys_wait4+0x7f/0x92
> [<c0224568>] sys_waitpid+0x13/0x17
> [<c0202ce5>] sysenter_do_call+0x12/0x25
> [<c0510000>] rtl8139_init_one+0x5ae/0x887
> ptree2 S f5f423b0 0 10465 10461
> f6002180 00000082 c2b265c8 f5f423b0 f5f4251c c2b29260 f67b1f44 e06d0177
> 00000282 c023363c c2b265c8 00000000 00000282 0000c350 00000001 0000c350
> 00000001 f67b1f44 0000c350 c051be99 00000000 00000001 0000c350 bf9d0e04
> Call Trace:
> [<c023363c>] hrtimer_start_range_ns+0x105/0x111
> [<c051be99>] do_nanosleep+0x54/0x8c
> [<c02336d7>] hrtimer_nanosleep+0x8f/0xee
> [<c02332b8>] hrtimer_wakeup+0x0/0x18
> [<c051be7f>] do_nanosleep+0x3a/0x8c
> [<c0233777>] sys_nanosleep+0x41/0x51
> [<c0202ce5>] sysenter_do_call+0x12/0x25
> ptree2 ? f6bee040 0 10466 10465
> f638cb80 00000046 00200200 f6bee040 f6bee1ac c2b17260 f6bee038 0000dd77
> 00000000 c022f576 ffffffff 00000303 00000000 00000001 00000000 00000012
> f5a61e84 f6bee040 f6bee038 c0224c29 f6270a90 00000001 f6bee038 f5a61f88
> Call Trace:
> [<c022f576>] wakeme_after_rcu+0x0/0x8
> [<c0224c29>] do_exit+0x638/0x63c
> [<c0224c87>] do_group_exit+0x5a/0x83
> [<c0224cbd>] sys_exit_group+0xd/0x10
> [<c0202ce5>] sysenter_do_call+0x12/0x25
>
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers