[CRIU] Re: docker restore from checkpoint - cgroup and mountpoints error

Pavel Emelyanov xemul at virtuozzo.com
Thu Jul 21 07:53:02 PDT 2016


On 07/17/2016 05:15 AM, vikram kaul wrote:
> Andrew, Ross, others

Ross has fallen off the cc list :) Putting him back.

>  I took the hint about 'container init scripts' and added the apache startup script to the container's init (CMD in the dockerfile), so that it starts the service instead of my having to docker exec it. With that change, the checkpoint and restore worked as I wanted.
> 
>  So, this led me to experiment with a simple setup where I instantiate the source container and then later exec a sample program (tcpdump in the background) as
> 
>       docker exec -d test-xenial-apache tcpdump -i any -nn -s 0 -w /tmp/f.pcap
> 
> Now, I try to just checkpoint it. It does work, but I don't see this process being checkpointed. What I have on the host is
> 
> root     10908 10884  0 21:32 ?        00:00:00 bash start.sh
> root     10978 10908  0 21:32 ?        00:00:00 /usr/sbin/apache2 -k start
> root     11053 10908  0 21:32 ?        00:00:00 tail -f /dev/null
> root     12499 12482  0 21:36 ?        00:00:00 tcpdump -i eth0 -nn -s 0 -w /tmp/f.pcap
> 
> And what I have in the container is:
> 
> root         1     0  0 01:32 ?        00:00:00 bash start.sh
> root        29     1  0 01:32 ?        00:00:00 /usr/sbin/apache2 -k start
> www-data    32    29  0 01:32 ?        00:00:00 /usr/sbin/apache2 -k start
> www-data    33    29  0 01:32 ?        00:00:00 /usr/sbin/apache2 -k start
> root        90     1  0 01:32 ?        00:00:00 tail -f /dev/null
> root        91     0  0 01:36 ?        00:00:00 tcpdump -i eth0 -nn -s 0 -w /tmp
> 
> But the only 'Dumping' entries in /var/lib/docker/containers/<CID>/checkpoints/<CHECKPT>/criu.work/dump.log are
> 
> (00.000097) Dumping processes (pid: 10908)
> (00.011147) Dumping path for -3 fd via self 9 [/bin/bash]
> (00.204961) Dumping path for -3 fd via self 9 [/usr/sbin/apache2]
> (00.250838) Dumping path for -3 fd via self 12 [/usr/bin/tail]
> 
> 
> So, why is tcpdump not getting dumped? Is this by design? Why is host PID 12499 (container PID 91) not being dumped? I am using the latest xemul/criu from github.com (ver 2.4).
> 
> What information can I provide to help you give me some pointers ?
> I can send you the entire dump.log, stats-dump and the dockerfile. From what I can see, docker is invoking criu with the following params
> 
>   persist open tcp connections = true (default in docker)
>   persist unix sockets  = true  (default in docker)
>   exit the container after checkpoint complete = false  (because I use --leave-running in docker checkpoint)
>   checkpoint shell jobs = false (default)
>   directory = /var/lib/containers/<CID>/checkpoints/<CHECKPT>/criu.work
>   create a namespace,.. = "network"
> 
> Could this be related to "checkpoint shell jobs" being set to false by docker ?
> 
> I could create a new topic for this specific query, if needed for clarity
> 
> Thanks
>  
> vikram
> 
> On Tue, Jul 12, 2016 at 11:19 PM, vikram kaul <kaul.vikram.kaul at gmail.com> wrote:
> 
>     I am trying to do a C/R on a docker container. In the past I have been working with lightweight containers derived from alpine. However, I now have to use Ubuntu xenial containers. I have created a stackoverflow question for this (link given), but I will provide a summary so that you can get some context 
> 
>     http://stackoverflow.com/questions/38341520/docker-restore-from-checkpoint-cgroup-and-mountpoints-error
> 
>     So, I am getting 
> 
>     |mount.c:2555): mnt: Unable to statfs ./HOME: No such file or directory|
> 
>     and
> 
>     |Error (cgroup.c:1152): cg: No set 1 found|
> 
>     errors when I try to create a docker container from a checkpoint of a currently running container. When creating the checkpoint, I keep the source container running. Note that if I checkpoint (and shut down the source container) and then restore the same container, it works.
> 
>     I upgraded to the latest criu/crit from source (ver 2.4) - seeing that there are a bunch of changes to cgroup handling - but that did not help.
> 
>     Since I have no trouble restoring alpine-derived containers to new ones while the source is still running, I presume it must be something related to Xenial-derived containers. But I really don't know where to look.
> 
>     Any help will be appreciated
>     Thanks
> 
> 
> 
> 
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
> 
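
One thing that can be checked directly from the quoted listings: dump.log reports "Dumping processes (pid: 10908)", and on the host tcpdump (12499) has parent 12482 rather than 10908. Assuming criu dumps the process tree rooted at the pid it is given (criu dump -t <pid>), a process outside that tree would not appear in the dump. That is a possible explanation, not something confirmed in this thread. The sketch below walks the PPID chain in the host-side listing; the listing is copied from the message above, and the helper name in_tree is made up for illustration.

```shell
#!/bin/sh
# Host-side listing copied from the quoted message (columns: UID PID PPID ...).
LISTING='root     10908 10884  0 21:32 ?        00:00:00 bash start.sh
root     10978 10908  0 21:32 ?        00:00:00 /usr/sbin/apache2 -k start
root     11053 10908  0 21:32 ?        00:00:00 tail -f /dev/null
root     12499 12482  0 21:36 ?        00:00:00 tcpdump -i eth0 -nn -s 0 -w /tmp/f.pcap'

# in_tree ROOT PID (hypothetical helper): reads a ps-style listing on stdin
# and prints "yes" if PID is ROOT or a descendant of ROOT, otherwise "no".
in_tree() {
    awk -v root="$1" -v pid="$2" '
        { parent[$2] = $3 }    # map each PID to its PPID
        END {
            # Walk up the parent chain until we reach root or leave the
            # listing; the step counter guards against cycles.
            while (pid != "" && pid != root && steps++ < NR + 1) {
                pid = (pid in parent) ? parent[pid] : ""
            }
            result = (pid == root) ? "yes" : "no"
            print result
        }'
}

# Check each process against the pid that dump.log says the dump started from.
for p in 10978 11053 12499; do
    printf '%s: %s\n' "$p" "$(printf '%s\n' "$LISTING" | in_tree 10908 "$p")"
done
```

On a live host the same function can be fed from ps -ef with the header line stripped, for example ps -ef | tail -n +2 | in_tree 10908 12499.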


