[CRIU] hang in ip tool

Tycho Andersen tycho.andersen at canonical.com
Mon Sep 8 06:17:30 PDT 2014


On Mon, Sep 08, 2014 at 02:53:36PM +0400, Pavel Emelyanov wrote:
> On 09/05/2014 11:22 PM, Tycho Andersen wrote:
> > Hi all,
> > 
> > On Wed, Sep 03, 2014 at 06:45:39PM +0400, Pavel Emelyanov wrote:
> >> On 09/03/2014 05:52 PM, Tycho Andersen wrote:
> >>> Hi all,
> >>>
> >>> Recently when restoring containers I have been getting a hang in the
> >>> ip tool when criu runs 'ip addr restore'. It prints,
> >>>
> >>> RTNETLINK answers: File exists
> >>> RTNETLINK answers: File exists
> >>
> >> I've seen this when it was putting 127.0.0.1 on lo, which was
> >> already there "automatically" (some kernels seem to do it by
> >> default).
> >>
> >>> and then seems to hang. Has anyone seen this behavior, or any ideas
> >>> what the problem is?
> >>
> >> Hanging is something new to me, I've never seen it. Can you 
> >> strace it to check where the problem is?
> > 
> > So this issue just resurfaced, and it turns out that it's not
> > actually hanging in the ip tool; it is hanging in cr_system, in the
> > sigprocmask call where it resets the signal mask. As I write this,
> > it seems to have gone away again. Any ideas what might cause this?
> 
> Hm... I've never seen a process hanging in procmask reset. Can you
> check the /proc/pid/stack file when it hangs for exact in-kernel
> calltrace?

Yes,

# the first criu process
criu2:~ sudo cat /proc/1537/stack
[<ffffffff810d7a8d>] futex_wait_queue_me+0xdd/0x140
[<ffffffff810d84f2>] futex_wait+0x182/0x290
[<ffffffff810daaee>] do_futex+0xde/0x760
[<ffffffff810db1e1>] SyS_futex+0x71/0x150
[<ffffffff8172adff>] tracesys+0xe1/0xe6
[<ffffffffffffffff>] 0xffffffffffffffff

# the only other forked() process
criu2:~ sudo cat /proc/1539/stack
[<ffffffff8107888b>] ptrace_stop+0x15b/0x2b0
[<ffffffff8107a25d>] get_signal_to_deliver+0x3dd/0x6f0
[<ffffffff81013448>] do_signal+0x48/0x960
[<ffffffff81013dc9>] do_notify_resume+0x69/0xb0
[<ffffffff8172aeaa>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff

I guess what is happening is that as soon as we unmask, some pending
signal is delivered and the process hangs?
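
To illustrate the mechanism behind that guess, here is a minimal sketch
of how a signal that arrives while blocked stays pending and is then
delivered as soon as the mask is lifted. It assumes a single-threaded
process and uses SIGUSR1 purely as a stand-in; the function name
unmask_delivers_pending() is made up for this example and is not taken
from CRIU. It does not reproduce the hang itself, it only shows that
unblocking is the point where delivery happens.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set from the handler; sig_atomic_t is safe to write in that context. */
static volatile sig_atomic_t handler_ran;

static void on_usr1(int signo)
{
	(void)signo;
	handler_ran = 1;
}

/*
 * Hypothetical helper for illustration only.
 * Returns 1 if SIGUSR1 stayed pending while blocked and its handler had
 * run by the time sigprocmask() returned from unblocking it, 0 if not,
 * and -1 if one of the calls failed.
 */
int unmask_delivers_pending(void)
{
	struct sigaction sa, old_sa;
	sigset_t blocked, pending;
	int was_pending, delivered;

	handler_ran = 0;

	sa.sa_handler = on_usr1;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, &old_sa) < 0)
		return -1;

	/* Block SIGUSR1, then send it to ourselves: it can only be queued. */
	sigemptyset(&blocked);
	sigaddset(&blocked, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &blocked, NULL) < 0)
		return -1;
	if (raise(SIGUSR1) != 0)
		return -1;

	if (sigpending(&pending) < 0)
		return -1;
	was_pending = sigismember(&pending, SIGUSR1) == 1 && !handler_ran;

	/* Unblocking is where the pending signal actually gets delivered. */
	if (sigprocmask(SIG_UNBLOCK, &blocked, NULL) < 0)
		return -1;
	delivered = handler_ran;

	/* Put the previous SIGUSR1 disposition back. */
	if (sigaction(SIGUSR1, &old_sa, NULL) < 0)
		return -1;

	return was_pending && delivered;
}
```

If the task is being ptraced at that moment, the same delivery path is
where it would stop in ptrace_stop until the tracer resumes it, which
would look like the second stack above.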

Tycho

> > Thanks,
> > 
> > Tycho
> > 
> >>> Thanks,
> >>>
> >>> Tycho

