[CRIU] zdtm/static/fd failure on aarch64

Dmitry Safonov 0x7f454c46 at gmail.com
Mon Jun 11 19:59:51 MSK 2018


2018-06-11 16:55 GMT+01:00 Adrian Reber <adrian at lisas.de>:
> On Mon, Jun 11, 2018 at 04:41:17PM +0100, Dmitry Safonov wrote:
>> 2018-06-11 16:34 GMT+01:00 Adrian Reber <adrian at lisas.de>:
>> [...]
>> > No, still endless loop:
>> >
>> > 11:32:31.668:  4109: ERR: ../lib/lock.h:148: futex (errno = 11 (Resource temporarily unavailable))
>> >
>> > [pid  4109] futex(0xffff87000000, FUTEX_WAIT, 2264924161, NULL) = -1 EAGAIN (Resource temporarily unavailable)
>> > [pid  4109] newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}, 0) = 0
>> > [pid  4109] write(2, "11:32:32.516:  4109: ERR: ../lib"..., 99) = 99
>>
>> Ugh, I adore this piece of code.
>> The third time, lucky time? %)
>
> :( Sorry. Still:
>
> 11:54:16.950:  4210: ERR: ../lib/lock.h:148: futex (errno = 11 (Resource temporarily unavailable))
>
> [pid  4210] futex(0xffff9f650000, FUTEX_WAIT, 2674196480, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> [pid  4210] newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}, 0) = 0
> [pid  4210] write(2, "11:54:16.950:  4210: ERR: ../lib"..., 99) = 99

Heh, could you attach 20-30 lines after the first
futex(..., FUTEX_WAIT, 2,...) call in the endless loop?

I do see that there is a possible race like this:

futex val   CPU0/task0              CPU1/task1
            ----------              ----------
1           atomic_inc()=2          atomic_inc()=3
3           futex(FUTEX_WAIT, 2)
3           EWOULDBLOCK
3           atomic_inc()=4
4                                   futex(FUTEX_WAIT, 3)
4                                   EWOULDBLOCK
...

But this seems quite unlikely to reproduce on every cycle, and I'm not
sure it's the problem you're observing.
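
To make the interleaving in the table concrete, here is a minimal
single-threaded sketch that replays it. It is not the code from
../lib/lock.h: model_atomic_inc(), model_futex_wait() and replay_race()
are hypothetical helpers, and the only assumption about the real lock is
that each task increments the futex word and then waits on the value it
got back. The model of FUTEX_WAIT follows the documented behaviour of
failing with EAGAIN (the same value as EWOULDBLOCK on Linux) when the
word no longer holds the expected value.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for atomic_inc(): bump the word, return the new value. */
static uint32_t model_atomic_inc(uint32_t *word)
{
	return ++*word;
}

/*
 * Hypothetical stand-in for futex(FUTEX_WAIT): the kernel sleeps only if
 * the word still equals the expected value, otherwise it fails with EAGAIN.
 */
static int model_futex_wait(const uint32_t *word, uint32_t expected)
{
	return (*word == expected) ? 0 : -EAGAIN;
}

/*
 * Replay the interleaving from the table: both tasks increment first,
 * then take turns waiting on a value the other task has already changed.
 * Returns the number of EAGAIN results; *final_val gets the futex word.
 */
static int replay_race(uint32_t start, int rounds, uint32_t *final_val)
{
	uint32_t word = start;
	uint32_t seen0 = model_atomic_inc(&word);	/* task0 sees start + 1 */
	uint32_t seen1 = model_atomic_inc(&word);	/* task1 sees start + 2 */
	int eagain = 0;

	for (int i = 0; i < rounds; i++) {
		/* task0 waits on a stale value, fails, and increments again */
		if (model_futex_wait(&word, seen0) == -EAGAIN) {
			eagain++;
			seen0 = model_atomic_inc(&word);
		}
		/* task1 now waits on a value that task0 has just invalidated */
		if (model_futex_wait(&word, seen1) == -EAGAIN) {
			eagain++;
			seen1 = model_atomic_inc(&word);
		}
	}

	*final_val = word;
	return eagain;
}
```

In this model, replay_race(1, 3, &v) reports six EAGAIN results and
leaves v at 9: neither task ever sleeps, and the futex word keeps
growing. The model forces the worst-case ordering on every round, which
real scheduling would not necessarily do.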

Thanks,
             Dmitry

