[CRIU] zdtm/static/fd failure on aarch64

Adrian Reber adrian at lisas.de
Mon Jun 11 21:42:41 MSK 2018


On Mon, Jun 11, 2018 at 05:59:51PM +0100, Dmitry Safonov wrote:
> 2018-06-11 16:55 GMT+01:00 Adrian Reber <adrian at lisas.de>:
> > On Mon, Jun 11, 2018 at 04:41:17PM +0100, Dmitry Safonov wrote:
> >> 2018-06-11 16:34 GMT+01:00 Adrian Reber <adrian at lisas.de>:
> >> [...]
> >> > No, still endless loop:
> >> >
> >> > 11:32:31.668:  4109: ERR: ../lib/lock.h:148: futex (errno = 11 (Resource temporarily unavailable))
> >> >
> >> > [pid  4109] futex(0xffff87000000, FUTEX_WAIT, 2264924161, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> >> > [pid  4109] newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}, 0) = 0
> >> > [pid  4109] write(2, "11:32:32.516:  4109: ERR: ../lib"..., 99) = 99
> >>
> >> Ugh, I adore this piece of code.
> >> The third time, lucky time? %)
> >
> > :( Sorry. Still:
> >
> > 11:54:16.950:  4210: ERR: ../lib/lock.h:148: futex (errno = 11 (Resource temporarily unavailable))
> >
> > [pid  4210] futex(0xffff9f650000, FUTEX_WAIT, 2674196480, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> > [pid  4210] newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}, 0) = 0
> > [pid  4210] write(2, "11:54:16.950:  4210: ERR: ../lib"..., 99) = 99
> 
> Heh, could you attach 20-30 lines after the first
> futex(..., FUTEX_WAIT, 2,...) call in the endless loop?

4508  dup3(4, 0, 0)                     = 0
4508  close(4)                          = 0
4508  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, 0, 0) = 0xffffa14a0000
4508  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xffffa1502dd0) = 4509
4508  futex(0xffffa14a0000, FUTEX_WAIT, 0, NULL <unfinished ...>
4509  setsid()                          = 4509
4509  rt_sigaction(SIGCHLD, {SIG_DFL, [], SA_RESTART}, NULL, 8) = 0
4509  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0xffffa1300000
4509  futex(0xffffa1300000, FUTEX_WAIT, 2704277504, NULL) = -1 EAGAIN (Resource temporarily unavailable)
4509  brk(NULL)                         = 0x1f8d0000
4509  brk(0x1f900000)                   = 0x1f900000
4509  brk(NULL)                         = 0x1f900000
4509  openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
4509  fstat(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
4509  fstat(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
4509  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffa12f0000
4509  read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 8192) = 3519
4509  lseek(4, -2252, SEEK_CUR)         = 1267
4509  read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 8192) = 2252
4509  close(4)                          = 0
4509  munmap(0xffffa12f0000, 65536)     = 0
4509  write(2, "14:41:27.635:  4509: ERR: ../lib"..., 99) = 99
4509  futex(0xffffa1300000, FUTEX_WAIT, 2704277504, NULL) = -1 EAGAIN (Resource temporarily unavailable)
4509  newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}, 0) = 0
4509  write(2, "14:41:27.636:  4509: ERR: ../lib"..., 99) = 99
4509  futex(0xffffa1300000, FUTEX_WAIT, 2704277504, NULL) = -1 EAGAIN (Resource temporarily unavailable)
4509  newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}, 0) = 0
4509  write(2, "14:41:27.636:  4509: ERR: ../lib"..., 99) = 99

> I do see that there is a possible race like this:
> 
> futex val           CPU0/task0                      CPU1/task1
>                         ---------------                      ---------------
> 1                      atomic_inc()=2                  atomic_inc()=3
> 3                      futex(FUTEX_WAIT, 2)
> 3                      EWOULDBLOCK
> 3                      atomic_inc()=4
> 4
> futex(FUTEX_WAIT, 3)
> 4                                                               EWOULDBLOCK
> ...
> 
> But it looks quite unlikely to reproduce each cycle and I'm not sure
> it's the problem you're observing.
> 
> Thanks,
>              Dmitry

		Adrian


More information about the CRIU mailing list