[CRIU] Restarting Process in an LXC Container

Tycho Andersen tycho.andersen at canonical.com
Wed Jan 21 21:42:39 PST 2015


On Tue, Jan 20, 2015 at 02:43:09PM +0300, Andrew Vagin wrote:
> On Mon, Jan 19, 2015 at 06:09:47PM +0100, Thouraya TH wrote:
> > Hello :) thanks a lot for help.
> > 
> > How do you enter into CT? Do you use screen or ssh?
> > 
> > Before the Dumping Process:
> > 
> > root at g-2:~# lxc-ls -f
> > NAME     STATE    IPV4  IPV6  GROUPS  AUTOSTART 
> > -----------------------------------------------
> > worker   STOPPED  -     -     -       NO          
> > root at g-2:~# lxc-start -n worker
> > root at g-2:~# lxc-attach -n worker
> 
> I've understood the problem and its cause. If you execute a process in
> the CT, it lives in the CT's pid namespace, but it belongs to another
> process tree, because its parent is outside of the CT.
> 
> Currently CRIU is able to dump only one process tree. I am not sure that
> we will fix this problem in the near future. I suggest you use screen
> or tmux; they should work around your problem.

If I've understood the problem correctly, it sounds like one thing we
could do to prevent this error is to have lxc-checkpoint make sure
nobody is attached to the container via lxc-attach? (Or ideally via
nsenter or whatever, although that may be harder.)
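
To make the idea concrete, here is a rough sketch of what such a check
could look like from the host side. It is not how lxc-checkpoint works
today; find_attached and proc_snapshot are hypothetical helper names, and
the only assumption is that an attached process shares the container
init's pid namespace while its parent lives in a different one.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LXC or CRIU.

# Print "pid ppid pidns" for every process visible in /proc.
# Reading other users' /proc/PID/ns/pid generally requires root.
proc_snapshot() {
    for d in /proc/[0-9]*; do
        pid=${d#/proc/}
        ns=$(readlink "$d/ns/pid" 2>/dev/null) || continue
        ppid=$(awk '/^PPid:/ {print $2}' "$d/status" 2>/dev/null)
        [ -n "$ppid" ] && echo "$pid $ppid $ns"
    done
}

# Read "pid ppid pidns" lines on stdin and print every pid that is in the
# same pid namespace as INIT_PID (the container's init, as seen from the
# host) but whose parent is in a different namespace.
find_attached() {
    awk -v init="$1" '
        { ns[$1] = $3; parent[$1] = $2 }
        END {
            target = (init in ns) ? ns[init] : ""
            if (target == "") exit
            for (p in ns) {
                if (p == init || ns[p] != target) continue
                pns = (parent[p] in ns) ? ns[parent[p]] : ""
                if (pns != target) print p
            }
        }' | sort -n
}

# Possible usage on the host (assumes the container is named "worker"):
#   proc_snapshot | find_attached "$(lxc-info -n worker -p -H)"
```

Any pid printed would be the root of a second process tree inside the
container, which is exactly the case CRIU cannot dump, so the checkpoint
could be refused with a clear message instead.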

Tycho

> Thanks,
> Andrew Vagin
> 
> > root at worker:/home# cat > test.sh <<-EOF
> > > #!/bin/sh
> > > while :; do
> > >     sleep 1
> > >     date
> > > done
> > > EOF
> > root at worker:/home# chmod +x test.sh
> > root at worker:/home# ./test.sh
> > Mon Jan 19 17:58:57 CET 2015
> > Mon Jan 19 17:58:58 CET 2015
> > ..........................
> > After the restart, I used ssh.
> > 
> > 
> > dump.log:
> > Warn  (fsnotify.c:183): fsnotify:       Handle 800003:9db17 cannot be opened
> > Warn  (fsnotify.c:183): fsnotify:       Handle 800003:9db1d cannot be opened
> > tar: ./udev/control: socket ignored
> > 
> > restore.log:
> > Warn  (cr-restore.c:996): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
> > RTNETLINK answers: File exists
> > RTNETLINK answers: File exists
> > RTNETLINK answers: File exists
> > 
> > Thank you so much for help.
> > Best Regards.
> > 
> > 2015-01-16 12:29 GMT+01:00 Andrew Vagin <avagin at parallels.com>:
> > 
> >     On Thu, Jan 15, 2015 at 02:02:56PM +0100, Thouraya TH wrote:
> >     > Hello,
> > 
> >     Hello,
> > 
> >     Adding Tycho to CC.
> >    
> >     >
> >     > I have a question about restarting a process in an LXC container.
> >     > I run this script http://criu.org/Simple_loop in a container:
> >     >
> >     > 1)
> >     > ubuntu at worker:/home$ ./test.sh
> >     > Wed Jan 14 22:30:23 CET 2015
> >     > Wed Jan 14 22:30:24 CET 2015
> >     > Wed Jan 14 22:30:25 CET 2015
> >     > Wed Jan 14 22:30:26 CET 2015
> >     > Wed Jan 14 22:30:35 CET 2015
> >     > .......................
> > 
> >     How do you enter into CT? Do you use screen or ssh?
> >    
> > 
> >     > 2) I dumped the container:
> >     >       root at g-3:/home# lxc-checkpoint -s -D /home/ImGLXC1Worker -n worker
> >     > 3) I restored the container:
> >     >     root at g-3:/home/ImGLXC1Worker# lxc-checkpoint -r -D /home/ImGLXC1Worker -n worker
> >     >      # lxc-ls -f
> >     > NAME     STATE    IPV4        IPV6  GROUPS  AUTOSTART
> >     > -----------------------------------------------------
> >     > worker   RUNNING  10.0.3.109  -     -       NO
> > 
> >     Thouraya, could you increase the verbosity level for criu and show us
> >     the dump and restore logs? Tycho, could you explain how to do this
> >     with lxc-checkpoint?
> > 
> >     Thanks,
> >     Andrew
> >    
> >     >
> >     > 4) ssh ubuntu@$(sudo lxc-info -n worker -H -i)
> >     > 5) ubuntu at worker:/home$ ps
> >     >   PID TTY          TIME CMD
> >     >   304 pts/0    00:00:00 bash
> >     >   318 pts/0    00:00:00 ps
> >     >
> >     > The process "test" didn't restart!
> >     >
> >     > I ran another test: ubuntu at worker:/home$ ./test.sh > Results.txt
> >     > Then I dumped the container and restored it. The process "test"
> >     > didn't restart, and the file Results.txt didn't change!
> >     >
> >     > Do you have any idea, please?
> >     >
> >     > Thanks a lot.
> >     > Best Regards.
> > 
> >     > _______________________________________________
> >     > CRIU mailing list
> >     > CRIU at openvz.org
> >     > https://lists.openvz.org/mailman/listinfo/criu
> > 
> > 
> > 

