[CRIU] [PATCH v5 2/3] tests: fix builds on alpine and centos
Andrei Vagin
avagin at virtuozzo.com
Sat Jun 30 04:13:55 MSK 2018
On Fri, Jun 29, 2018 at 10:01:40AM +0200, Adrian Reber wrote:
> On Thu, Jun 28, 2018 at 03:20:36PM -0700, Andrei Vagin wrote:
> > On Thu, Jun 28, 2018 at 12:43:28PM +0000, Adrian Reber wrote:
> > > From: Adrian Reber <areber at redhat.com>
> > >
> > > Install sudo, create test user with ID 1000, install bash,
> > > fix pidfile creation and pidfile chmod.
> > >
> > > v2:
> > > * use sleep to give the criu daemon some time to start up
> > >
> > > v3:
> > > * Andrei is of course right and sleep is not a good solution.
> > > After adding --status-fd support to criu service, this
> > > is how we now detect that criu is ready.
> > >
> > > v4:
> > > * This was much more complicated than expected which is related
> > > to the different versions of the tools on the different travis
> > > test targets. There seems to be a bug in bash on Ubuntu
> > > https://lists.gnu.org/archive/html/bug-bash/2017-07/msg00039.html
> > > which prevents using 'read -n1' on Ubuntu. As a workaround
> > > the result from CRIU's status FD is now read via python.
> > >
> > > Another problem was discovered on alpine with the loop restore test.
> > > CRIU says to use setsid even if the process is already using setsid.
> > > As a workaround, still with setsid, this process is now using
> > > shell-job true for checkpoint and restore.
> > >
> > > Parts of v2 have been committed before. So the changes from this commit
> > > are partially already in another commit.
> > >
> > > Signed-off-by: Adrian Reber <areber at redhat.com>
> > > ---
> > > scripts/build/Dockerfile.centos | 4 ++++
> > > test/others/rpc/Makefile | 17 +++++++++++++----
> > > test/others/rpc/read.py | 18 ++++++++++++++++++
> > > test/others/rpc/restore-loop.py | 5 +++++
> > > test/others/rpc/run.sh | 4 +++-
> > > 5 files changed, 43 insertions(+), 5 deletions(-)
> > > create mode 100644 test/others/rpc/read.py
> > >
> > > diff --git a/scripts/build/Dockerfile.centos b/scripts/build/Dockerfile.centos
> > > index 0160b75..d8e70ac 100644
> > > --- a/scripts/build/Dockerfile.centos
> > > +++ b/scripts/build/Dockerfile.centos
> > > @@ -40,4 +40,8 @@ WORKDIR /criu
> > > ENV CCACHE_DIR=/tmp/.ccache CCACHE_NOCOMPRESS=1 $ENV1=yes
> > > RUN mv .ccache /tmp && make mrproper && ccache -sz && \
> > > date && make -j $(nproc) CC="$CC" && date && ccache -s
> > > +
> > > +# The rpc test cases are running as user #1000, let's add the user
> > > +RUN adduser -u 1000 test
> > > +
> > > RUN make -C test/zdtm -j $(nproc)
> > > diff --git a/test/others/rpc/Makefile b/test/others/rpc/Makefile
> > > index 2b15873..50cd063 100644
> > > --- a/test/others/rpc/Makefile
> > > +++ b/test/others/rpc/Makefile
> > > @@ -4,13 +4,22 @@ all: test-c rpc_pb2.py criu
> > > CFLAGS += -g -Werror -Wall -I.
> > > LDLIBS += -lprotobuf-c
> > >
> > > +PYTHON ?= python
> > > +
> > > run: all
> > > mkdir -p build
> > > chmod a+rwx build
> > > - @# need to start the criu daemon here to access the pidfile
> > > - sudo -g '#1000' -u '#1000' ./criu service -v4 -W build -o service.log --address criu_service.socket -d --pidfile pidfile
> > > - # Give the criu daemon some time to start up
> > > - sleep 0.5
> > > + rm -f build/status
> > > + sudo -g '#1000' -u '#1000' mkfifo build/status
> > > + @# Need to start the criu daemon here to access the pidfile.
> > > + @# The script read.py is used to wait until 'criu service'
> > > + @# is ready. As 'read -n 1' in some releases has a bug and does
> > > + @# not read correctly a \0, using python is a workaround.
> > > + sudo -g '#1000' -u '#1000' -- bash -c "exec 200<>build/status; \
> > > + ./criu service -v4 -W build --address criu_service.socket \
> > > + -d --pidfile pidfile -o service.log --status-fd 200; \
> > > + $(PYTHON) read.py build/status"
> >
> > criu service daemonizes after creating a socket, so I don't understand
> > why we need to wait for something or why we need to use status-fd.
>
> The problem is not the socket, but the pidfile. The pidfile is written
> after forking and sometimes I saw problems in the tests because the
> pidfile did not yet exist. The status_fd is closed after the pidfile is
> written.
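For illustration, the wait on the status FIFO could look roughly like the
sketch below. This is not the read.py from the patch (its contents are not
quoted here). The function name wait_for_status is hypothetical, and the
sketch assumes that criu signals readiness by writing a single \0 byte to
the descriptor passed via --status-fd, as the Makefile comment suggests.

```python
import os


def wait_for_status(path):
    """Block until one byte arrives on the FIFO at `path`.

    Returns True if that byte is the NUL readiness marker (an assumption
    taken from the Makefile comment), False on EOF or any other byte.
    """
    # Opening a FIFO read-only blocks until some process has it open
    # for writing.
    fd = os.open(path, os.O_RDONLY)
    try:
        # Read exactly one byte; a NUL byte is read like any other.
        data = os.read(fd, 1)
    finally:
        os.close(fd)
    return data == b"\0"
```

In the Makefile above the shell already holds the FIFO open read-write
(exec 200<>build/status), so the open() here would not block on its own;
only the read() waits for the daemon.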
The pidfile has to be written by the parent process before it exits. We
probably need to write our own implementation of glibc's daemon().
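
A minimal sketch of that ordering, in Python only to keep it short (the
real change would be in criu's C code). The name daemonize_with_pidfile and
its arguments are hypothetical. It assumes a single fork and the usual
daemon() steps (setsid, chdir to /, stdio redirected to /dev/null). The
point is only that the parent knows the child's PID right after fork() and
can write the pidfile before it exits, so whoever waits for the parent is
guaranteed to find the file.

```python
import os


def daemonize_with_pidfile(pidfile, child_main):
    """Fork a detached child; the parent writes the child's PID to `pidfile`.

    Returns the child's PID in the parent. Never returns in the child.
    """
    pid = os.fork()
    if pid == 0:
        # Child: detach roughly the way daemon() does, then do the work.
        try:
            os.setsid()
            os.chdir("/")
            devnull = os.open(os.devnull, os.O_RDWR)
            for fd in (0, 1, 2):
                os.dup2(devnull, fd)
            child_main()
        finally:
            # Never fall back into the caller's code in the child.
            os._exit(0)

    # Parent: write to a temporary name and rename, so a reader never
    # sees a half-written pidfile.
    tmp = pidfile + ".tmp"
    with open(tmp, "w") as f:
        f.write("%d\n" % pid)
    os.rename(tmp, pidfile)
    return pid
```

A launcher would call this and then exit; by the time its exit status is
visible to the caller, the pidfile already exists.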
>
> Adrian