[CRIU] [PATCH v2] tests: fix builds on alpine and centos

Adrian Reber areber at redhat.com
Wed Jun 27 00:09:34 MSK 2018


On Tue, Jun 26, 2018 at 07:47:22PM +0100, Dmitry Safonov wrote:
> 2018-06-26 19:43 GMT+01:00 Dmitry Safonov <0x7f454c46 at gmail.com>:
> > 2018-06-26 19:29 GMT+01:00 Dmitry Safonov <0x7f454c46 at gmail.com>:
> >> 2018-06-26 18:00 GMT+01:00 Adrian Reber <areber at redhat.com>:
> >>> On Tue, Jun 26, 2018 at 09:37:08AM -0700, Andrei Vagin wrote:
> >>>> On Tue, Jun 26, 2018 at 08:24:08AM +0200, Adrian Reber wrote:
> >>>> > On Thu, Jun 21, 2018 at 11:39:19PM +0200, Adrian Reber wrote:
> >>>> > > On Thu, Jun 21, 2018 at 02:35:38PM -0700, Andrei Vagin wrote:
> >>>> > > > On Thu, Jun 21, 2018 at 09:10:38PM +0000, Adrian Reber wrote:
> >>>> > > > > From: Adrian Reber <areber at redhat.com>
> >>>> > > > >
> >>>> > > > > Install sudo, create test user with ID 1000, install bash,
> >>>> > > > > fix pidfile creation and pidfile chmod.
> >>>> > > > >
> >>>> > > > > v2:
> >>>> > > > >  * use sleep to give the criu daemon some time to start up
> >>>> > > >
> >>>> > > > Can we use --status-fd? It is designed for this.
> >>>> > >
> >>>> > > Oh, very good idea. Thanks. I will try that.
> >>>> >
> >>>> > Just to let you know. I have problems reading the result from the
> >>>> > status_fd on all our different travis targets. It works for most shells,
> >>>> > but not all of them. Still trying to understand how to correctly solve
> >>>> > this, so that it works everywhere.
> >>>>
> >>>> We already install bash in all docker containers, so you can create a
> >>>> bash script.
> >>>
> >>> I currently have the problem that 'read -n 1' hangs on Ubuntu and it is
> >>> not clear why. I have the following test case:
> >>>
> >>> bash -c 'rm -f status; mkfifo status; exec 201<>status;  ./a.out 201; read -n1 -u 201'
> >>>
> >>> The test program a.out (great name) is really simple:
> >>>
> >>> #include <stdio.h>
> >>> #include <stdlib.h>
> >>> #include <unistd.h>
> >>>
> >>> int main(int argc, char *argv[])
> >>> {
> >>>         int status_fd;
> >>>         char c = 0;
> >>>         int r;
> >>>
> >>>         status_fd = atoi(argv[1]);
> >>>
> >>>         printf("status_fd %d\n", status_fd);
> >>>         r = write(status_fd, &c, 1);
> >>>         printf("write %d\n", r);
> >>> }
> >>>
> >>> And strace tells me the following:
> >>>
> >>> fcntl(201, F_GETFD)                     = 0
> >>> ioctl(201, TCGETS, 0x7ffd97626ad0)      = -1 ENOTTY (Inappropriate ioctl for device)
> >>> lseek(201, 0, SEEK_CUR)                 = -1 ESPIPE (Illegal seek)
> >>> read(201, "\0", 1)                      = 1
> >>> read(201,
> >>>
> >>> On CentOS and alpine it does not hang and I get the following:
> >>>
> >>> fcntl(201, F_GETFD)                     = 0
> >>> ioctl(201, TCGETS, 0x7ffeeb6ee6a0)      = -1 ENOTTY (Inappropriate ioctl for device)
> >>> lseek(201, 0, SEEK_CUR)                 = -1 ESPIPE (Illegal seek)
> >>> read(201, "\0", 1)                      = 1
> >>> exit_group(0)                           = ?
> >>>
> >>> So on Ubuntu, for some reason, 'read -n 1' does not stop after reading a
> >>> single byte from the pipe and that seems to be the problem I have in
> >>> travis.
> >>>
> >>> Can you reproduce this behavior? Do you have an idea why this is
> >>> happening? Am I doing something wrong? I have been looking at the code
> >>> for a few days and it is not clear what is happening here.
> >>
> >> It looks like the bash `read' builtin doesn't count zero bytes with -n.
> >>
> >>> char c = 0;
> >
> > IOW,
> > bash -c 'read a -n1 < /dev/zero'
> > hangs for me indefinitely.
> 
> JFI:
> it looks like the bash folks say they fixed it somehow:
> https://lists.gnu.org/archive/html/bug-bash/2017-07/msg00039.html
> 
> So, probably, you've got a newer version on Alpine.

Thanks for finding this. This seems to be the problem I am seeing.

Unfortunately it does not really help, as we depend on the version of
bash in the Ubuntu travis image. At least I do not know how to install a
newer version of bash to fix it. Writing a character other than \0
to the status_fd would also work, but CRIU already defines \0 as the
character it writes, so this is also difficult to change.

This is really unfortunate and I am not sure how to correctly fix it. I
could temporarily patch CRIU to write something else during the test,
but this seems like the wrong approach.

Any other ideas how to check whether CRIU wrote something to the status fd?

Using python to read it seems to work:

bash -c 'rm -f status; mkfifo status; exec 201<>status; ./a.out 201; python -c "import os; f=open(\"status\") ; f.read(1); "'

But this really looks like overkill for a test case. I will try to get
it verified by travis on all test targets, if no one has a better idea.
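
A lighter-weight option might be dd, which copies raw bytes and so should
not care that the byte is \0. This is only a minimal sketch, not verified
on the travis targets. It assumes a Linux FIFO opened read-write as in the
test case above; printf stands in for a.out, and fd 9 is used instead of
201 only so that the snippet also runs in shells limited to single-digit
fds. Both are placeholders for illustration:

```shell
# Sketch: wait for one status byte (even NUL) without using `read -n1`.
tmpdir=$(mktemp -d)
fifo="$tmpdir/status"
mkfifo "$fifo"

# Open the FIFO read-write so that neither reader nor writer blocks on open
# (Linux behavior; fd 9 is an arbitrary choice for this example).
exec 9<>"$fifo"

# Stand-in for the daemon: write a single NUL byte to the status fd.
printf '\000' >&9

# dd blocks until one byte is available, copies it, and exits.
# wc -c counts the bytes dd copied; $((...)) strips any padding from wc.
n=$(dd bs=1 count=1 <&9 2>/dev/null | wc -c)
n=$((n))
echo "read $n byte(s) from status fd"

# Clean up the descriptor and the temporary FIFO.
exec 9>&-
rm -rf "$tmpdir"
```

If dd returns without copying a byte, the byte count would be 0 and the
test could treat that as a startup failure.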

		Adrian

