[CRIU] Docker HOWTO

Saied Kazemi saied at google.com
Fri Aug 29 17:38:51 PDT 2014


Hi Pavel,

I've put together the attached Docker HOWTO document for CRIU 1.3.  Hope
this helps.

Since the command line for checkpointing and restoring Docker containers is
very long, and there is also a manual step to set up the filesystem before
restore, I will soon send you a shell script that does the heavy lifting.

Cheers!

--Saied
-------------- next part --------------
Docker

This HOWTO page describes how to checkpoint and restore a Docker
container.

Background

Starting with CRIU 1.3, it's possible to checkpoint and restore a
process tree running inside a Docker container.  However, it's
important to note that Docker needs native support for checkpoint
and restore in order to maintain its parent-child relationship and
to correctly keep track of container states.  In other words, while
CRIU can checkpoint and restore (C/R) a process tree, the restored
tree will not become a child of Docker and, from Docker's point of
view, the container's state will remain "Exited" (even after a
successful restore).

Work is in progress to add native checkpoint and restore support
to Docker.  Once ready, specific commands (for example, "docker
checkpoint" and "docker restore") will use CRIU to do the actual
C/R operations while Docker continues to maintain its parent-child
relationship and container states.

It's important to re-emphasize that by checkpointing and restoring
a Docker container, we mean C/R of a process tree running inside a
container, excluding the Docker daemon itself.  As CRIU currently
does not support nested PID namespaces, the C/R process tree cannot
include the Docker daemon which runs in the global PID namespace.

Command Line Options

In addition to the usual CRIU command line options used when
checkpointing and restoring a process tree, the following command
line options are needed for Docker containers.

--root

In the past, this option was used only for restore operations that
needed to change the root of the mount namespace.  It was not used
for checkpoint operations.

However, Docker uses the AUFS graph driver by default, and the AUFS
module in the kernel reveals branch pathnames in
/proc/<pid>/map_files.  For Docker containers, --root is therefore
also needed during checkpoint to specify the root of the mount
namespace.  Once the kernel AUFS module is fixed, it will no longer
be necessary to specify this option.

--ext-mount-map

This option is used to specify the paths of external bind mounts.
Docker sets up /etc/{hostname,hosts,resolv.conf} as bind mount
targets whose source files are outside the container's mount
namespace.  Older versions of Docker also bind mount /.dockerinit.

For example, assuming the default Docker configuration, /etc/hostname
in the container's mount namespace is bind mounted from the source
at /var/lib/docker/containers/<container_id>/hostname.
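
The restore-time mappings follow a regular pattern, so they can be
generated rather than typed by hand.  Below is a minimal sketch,
assuming the default Docker layout described above.  The function
name ext_mount_map_args is hypothetical and is not part of CRIU or
Docker.  It leaves out /.dockerinit because the source path of that
mount depends on the Docker version.

```shell
#!/bin/sh
# Hypothetical helper (not part of CRIU or Docker): print one
# --ext-mount-map argument per line for "criu restore".
# Assumes the default layout /var/lib/docker/containers/<container_id>/.
ext_mount_map_args() {
    cid=$1
    src=/var/lib/docker/containers/$cid
    # Each target under /etc maps to a same-named file in the
    # container's directory on the host.
    for f in resolv.conf hosts hostname; do
        printf '%s /etc/%s:%s/%s\n' --ext-mount-map "$f" "$src" "$f"
    done
}
```

The output is meant to be spliced into a criu restore command line,
for example with an unquoted $(ext_mount_map_args <container_id>).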

--manage-cgroups

When a process tree exits after a checkpoint operation, the cgroups
that Docker had created for the container are removed.  This option
is needed during restore to move the process tree into its cgroups,
re-creating them if necessary.

--evasive-devices

Docker bind mounts /dev/null on /dev/stdin for detached containers
(i.e., docker run -d ...).  Since earlier versions of Docker used
/dev/null in the global namespace, this option tells CRIU to treat
the global /dev/null and the container /dev/null as the same device.

Restore Prework

As mentioned earlier, by default Docker uses AUFS to set up the
container's filesystem.  When Docker notices that the process has
exited (due to criu dump), it dismantles the filesystem.  We need
to set up the filesystem again before attempting to restore.
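
Setting up the filesystem again amounts to an aufs mount whose br=
option lists the container's branches.  Below is a minimal sketch of
assembling that option from a list of layer IDs.  The function name
aufs_branch_opt is hypothetical, and the caller is assumed to know
the layer IDs and to pass them in the order the branches should be
stacked, writable layer first.

```shell
#!/bin/sh
# Hypothetical helper: join layer IDs into an aufs "br=" mount option.
# $1 is the directory holding the layers (for the default Docker
# setup, /var/lib/docker/aufs/diff); the remaining arguments are
# layer IDs in stacking order.
aufs_branch_opt() {
    diff_root=$1
    shift
    opt=
    for id in "$@"; do
        # Separate branches with ":" but avoid a leading colon.
        opt="${opt:+$opt:}$diff_root/$id"
    done
    printf 'br=%s\n' "$opt"
}
```

The resulting string could be passed to mount -t aufs -o, with "none"
as the device and /var/lib/docker/aufs/mnt/<container_id> as the
mount point.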

An Example

Below is an example to show C/R operations for a shell script that
continuously appends a number to a file.  You can use tail -f to
see the process in action.

As you will see below, after restore, the process's parent is PID
1 (init), not Docker.  Also, although the process has been successfully
restored, Docker still thinks that the container has exited.

Setting up the container's AUFS filesystem before restore requires
its branch information, which would normally have to be saved before
checkpointing the container.  For convenience, however, AUFS branch
information is saved in the dump.log file, so we can examine dump.log
to set up the filesystem again.

For brevity, the 64-character long container ID is replaced by the
string <container_id> in the following lines.

$ docker run -d busybox:latest /bin/sh -c 'i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done'
<container_id>
$ 
$ docker ps
CONTAINER ID  IMAGE           COMMAND           CREATED        STATUS
168aefb8881b  busybox:latest  "/bin/sh -c 'i=0; 6 seconds ago  Up 4 seconds
$ 
$ sudo criu dump -o dump.log -v4 -t 17810 \
	-D /tmp/img/<container_id> \
	--root /var/lib/docker/aufs/mnt/<container_id> \
	--ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
	--ext-mount-map /etc/hosts:/etc/hosts \
	--ext-mount-map /etc/hostname:/etc/hostname \
	--ext-mount-map /.dockerinit:/.dockerinit \
	--manage-cgroups \
	--evasive-devices
$
$ sudo grep successful /tmp/img/<container_id>/dump.log
(00.020103) Dumping finished successfully
$
$ docker ps -a
CONTAINER ID  IMAGE           COMMAND           CREATED        STATUS
168aefb8881b  busybox:latest  "/bin/sh -c 'i=0; 6 minutes ago  Exited (-1) 4 minutes ago
$
$ sudo mount -t aufs -o br=\
/var/lib/docker/aufs/diff/<container_id>:\
/var/lib/docker/aufs/diff/<container_id>-init:\
/var/lib/docker/aufs/diff/a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721:\
/var/lib/docker/aufs/diff/120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16:\
/var/lib/docker/aufs/diff/42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229:\
/var/lib/docker/aufs/diff/511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158:\
none /var/lib/docker/aufs/mnt/<container_id>
$
$ sudo criu restore -o restore.log -v4 -d \
	-D /tmp/img/<container_id> \
	--root /var/lib/docker/aufs/mnt/<container_id> \
	--ext-mount-map /etc/resolv.conf:/var/lib/docker/containers/<container_id>/resolv.conf \
	--ext-mount-map /etc/hosts:/var/lib/docker/containers/<container_id>/hosts \
	--ext-mount-map /etc/hostname:/var/lib/docker/containers/<container_id>/hostname \
	--ext-mount-map /.dockerinit:/var/lib/docker/init/dockerinit-1.0.0 \
	--manage-cgroups \
	--evasive-devices
$
$ sudo grep successful /tmp/img/<container_id>/restore.log
(00.424428) Restore finished successfully. Resuming tasks.
$
$ ps -ef | grep /bin/sh
root     18580     1  0 12:38 ?        00:00:00 /bin/sh -c i=0; while true; do echo $i >> /foo; i=$(expr $i + 1); sleep 3; done
$
$ docker ps -a
CONTAINER ID  IMAGE           COMMAND           CREATED        STATUS
168aefb8881b  busybox:latest  "/bin/sh -c 'i=0; 7 minutes ago  Exited (-1) 5 minutes ago
$
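
The dump command above passes a PID (17810) with -t, which identifies
the root of the process tree to checkpoint.  One way to look up that
PID for a running container is sketched below, assuming the docker
command is available and supports inspect with --format.  The
function name container_pid is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the host PID of a container's first
# process.  Assumes the docker CLI is on PATH and the container is
# currently running.
container_pid() {
    docker inspect --format '{{.State.Pid}}' "$1"
}
```

Its output could then be given to criu dump as the argument of -t.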

