[CRIU] Fwd: Checkpoint failure on arm64 platform

Vijay Kilari vijay.kilari at gmail.com
Mon Dec 21 05:10:57 PST 2015


On Mon, Dec 21, 2015 at 2:58 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
> On 12/21/2015 11:00 AM, Vijay Kilari wrote:
>> Reposting.
>>
>> ---------- Forwarded message ----------
>> From: Vijay Kilari <vijay.kilari at gmail.com>
>> Date: Mon, Dec 21, 2015 at 11:45 AM
>> Subject: Checkpoint failure on arm64 platform
>> To: criu at openvz.org
>>
>>
>> Hi,
>>
>> I am trying to do docker checkpoint/restore on an arm64 platform.
>> The checkpoint fails in the sys_readlink of /proc/self.
>> Below is the list of steps that I am trying.
>>
>> criu check returns the following.
>>
>> ubuntu@ubuntu:~/criu/criu-1.7$ sudo criu check
>> Error (cr-check.c:602): Kernel doesn't support PTRACE_O_SUSPEND_SECCOMP
>> Warn  (cr-check.c:619): Dirty tracking is OFF. Memory snapshot will not work.
>> Error (cr-check.c:749): CLONE_PARENT | CLONE_NEWPID don't work together
>> ubuntu@ubuntu:~/criu/criu-1.7$
>
> You're using criu-1.7, aren't you? Could you update to 1.8?

I see the same problem with version 1.8 as well.

>
>> Am I missing some kernel configuration or patches that criu requires
>> for kernel 4.2?
>>
>> ubuntu@ubuntu:~$ sudo docker run -d justinzh/arm64-vivid:latest tail -f /dev/null
>>
>> and with checkpoint, I get the error below: Cannot readlink /proc/self (-9)
>
> This is EBADF when accessing /proc/self from inside the container. Can
> you check what's really there?
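
For reference, the -9 in the error is a negated errno value. A minimal
sketch of decoding it with Python's standard errno module; the helper name
describe_errno is made up for this illustration and is not part of criu:

```python
import errno
import os


def describe_errno(ret):
    """Map a kernel-style negative return value to (symbolic name, message)."""
    code = -ret if ret < 0 else ret
    # errno.errorcode maps numeric codes to names for the running platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = describe_errno(-9)
print(name, "-", message)
```

On Linux, code 9 maps to EBADF, which matches the reading above.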

I dug a little. From the log below, the parasite in pid 1456 is switched to
daemon mode and criu issues the PARASITE_CMD_GET_PROC_FD command
to the parasite daemon. Pid 1456 is the 'tail' process running inside the
container.

ubuntu@ubuntu:~/criu/criu-1.8$ ps -eaf | grep tail
root      1456   884  0 10:56 ?        00:00:00 tail -f /dev/null
ubuntu    5341  1380  0 13:09 pts/0    00:00:00 grep --color=auto tail

dump.log
------------
(00.100990) Collecting fds (pid: 1456)
(00.100999) ----------------------------------------
(00.101035) Found 5 file descriptors
(00.101049) ----------------------------------------
(00.101077) Dump private signals of 1456
(00.101096) Dump shared signals of 1456
(00.101112) Parasite syscall_ip at 0x400000
(00.101239) Putting parasite blob into 0x3ffb1b00000->0x3ff82ea0000
(00.101279) Dumping GP/FPU registers for 1456
(00.101305) Putting tsock into pid 1456
(00.101419) Wait for parasite being daemonized...
pie: Running daemon thread leader
(00.101441) Wait for ack 2 on daemon socket
pie: __sent ack msg: 2 2 0
(00.101475) Fetched ack: 2 2 0
pie: Daemon waits for command
(00.101489) Parasite 1456 has been switched to daemon mode
(00.101518) Sent msg to daemon 15 0 0
pie: __fetched msg: 15 0 0
pie: In parasite_get_proc_fd
pie: Error (pie/parasite.c:293): Can't readlink /proc/self (-9)
pie: __sent ack msg: 15 15 -9
pie: Error (pie/parasite.c:636): Close the control socket for writing
(00.106975) Error (parasite-syscall.c:815): Can't retrieve FD from socket
pie: Daemon waits for command
(00.106999) Wait for ack 15 on daemon socket
(00.107036) Error (parasite-syscall.c:298): Message reply from daemon is trimmed (12/0)
(00.107047) Error (cr-dump.c:1216): Can't get proc fd (pid: 1456)
(00.107066) Waiting for 1456 to trap
(00.107080) Daemon 1456 exited trapping
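
To pull only the failures out of a log like this (similar to the
"grep Error" step that zdtm.sh prints), a small sketch follows. The regular
expression is an assumption based on the line format seen in this log, not a
documented criu format, and extract_errors is a hypothetical helper name:

```python
import re

# Assumed shape, taken from the log above: "Error (<file>:<line>): <message>"
ERROR_RE = re.compile(r"Error \((?P<file>[^:()]+):(?P<line>\d+)\): (?P<msg>.*)")


def extract_errors(log_text):
    """Return (source_file, line_number, message) for each 'Error (...)' line."""
    found = []
    for line in log_text.splitlines():
        m = ERROR_RE.search(line)
        if m:
            found.append((m.group("file"), int(m.group("line")), m.group("msg").strip()))
    return found


# Sample lines copied from the dump.log excerpt above.
SAMPLE = """\
(00.101489) Parasite 1456 has been switched to daemon mode
pie: Error (pie/parasite.c:293): Can't readlink /proc/self (-9)
(00.106975) Error (parasite-syscall.c:815): Can't retrieve FD from socket
(00.107047) Error (cr-dump.c:1216): Can't get proc fd (pid: 1456)
"""

for src, lineno, msg in extract_errors(SAMPLE):
    print(f"{src}:{lineno}: {msg}")
```

The first match is the one of interest here: the readlink failure in the
parasite comes before the errors reported on the criu side.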

I added a printk to the readlinkat syscall in the kernel to see the context
in which /proc/self is read. It shows the same process id, 1456, and the
name 'tail', which is the process running inside the container.

[ 6461.973166] In readlinkat error < 0 -9 pid 1456 name tail
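
As a user-space cross-check, the same readlink can be tried from an ordinary
process. This is only a sketch: it assumes a Python interpreter is available
where it runs, it does not execute in the parasite context, and
check_proc_self is a hypothetical helper name:

```python
import errno
import os


def check_proc_self(path="/proc/self"):
    """Return (target, None) on success or (None, errno_name) on failure."""
    try:
        return os.readlink(path), None
    except OSError as exc:
        # Report the symbolic errno name, e.g. EBADF or ENOENT.
        return None, errno.errorcode.get(exc.errno, str(exc.errno))


target, err = check_proc_self()
if err is None:
    print(f"/proc/self -> {target} (own pid is {os.getpid()})")
else:
    print(f"readlink /proc/self failed: {err}")
```

With procfs mounted for the caller's own pid namespace, the link target is
normally the caller's pid, so a failure here would point at the environment
rather than at criu.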

zdtm.sh shows that the dump succeeds, whereas the restore fails in the
clone syscall.
It looks like zdtm.sh does not test /proc/self.

ubuntu@ubuntu:~/criu/criu-1.8$ sudo ./test/zdtm.sh
================================= CRIU CHECK =================================
Error (cr-check.c:634): Kernel doesn't support PTRACE_O_SUSPEND_SECCOMP
Error (cr-check.c:572): read: Invalid argument
Error (cr-check.c:826): CLONE_PARENT | CLONE_NEWPID don't work together
============================= WARNING =============================
Not all features needed for CRIU are merged to upstream kernel yet,
so for now we maintain our own branch which can be cloned from:
git://git.kernel.org/pub/scm/linux/kernel/git/gorcunov/linux-cr.git
===================================================================
Execute static/pipe00
./pipe00 --pidfile=pipe00.pid --outfile=pipe00.out
Dump 5319
Restore
Test: zdtm/live/static/pipe00, Result: FAIL
==================================== ERROR ====================================
Test: zdtm/live/static/pipe00, Namespace:
Dump log   : /home/ubuntu/criu/criu-1.8/test/dump/static/pipe00/5319/1/dump.log
--------------------------------- grep Error ---------------------------------
------------------------------------- END -------------------------------------
Restore log: /home/ubuntu/criu/criu-1.8/test/dump/static/pipe00/5319/1/restore.log
--------------------------------- grep Error ---------------------------------
(00.012996) Error (cr-restore.c:1175): Can't fork for 5319: Invalid argument
(00.013053) Error (cr-restore.c:1995): Restoring FAILED.
------------------------------------- END -------------------------------------
================================= ERROR OVER =================================
ubuntu@ubuntu:~/criu/criu-1.8$ criu --version
Version: 1.8

>
> -- Pavel

