[CRIU] Live migration of LXC containers using Criu?

Divjyot sethi dsethi at princeton.edu
Mon Jun 22 12:21:25 PDT 2015


Hello Pavel,
Thanks for the help -- I was finally able to do end-to-end live migration
with CRIU (version 1.5.2). I did not remove the cgroups, but just made sure
that the cgroup directory structure at the destination matched the one at
the source to make this work.
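
For anyone hitting the same wall: below is a minimal sketch of how the
cgroup directory tree could be mirrored on the destination before restore.
This is only an illustration, not part of p.haul -- the /sys/fs/cgroup
mount point is an assumption, and on a real host you would likely restrict
the copy to the controllers the container actually uses.

    # replicate_cgroup_tree.py -- hypothetical helper, not part of p.haul
    import os

    CGROOT = '/sys/fs/cgroup'   # assumed cgroup (v1) mount point

    def collect_tree(root=CGROOT):
        """On the source: list all cgroup directories relative to root."""
        dirs = []
        for path, subdirs, _files in os.walk(root):
            for d in subdirs:
                dirs.append(os.path.relpath(os.path.join(path, d), root))
        return sorted(dirs)

    def recreate_tree(dirs, root=CGROOT):
        """On the destination: mkdir any directory that is missing;
        mkdir on cgroupfs makes the kernel create the control files."""
        for rel in dirs:
            dst = os.path.join(root, rel)
            if not os.path.isdir(dst):
                os.makedirs(dst)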

One question: this works with CRIU version 1.5.2, but I had done some
development against CRIU version 1.3.1. To stay on a faster track, I was
hoping to use version 1.3.1 for live migration as well -- do you know if
live migration could be made to work with that version? Currently I get an
RPC error with it:

Error: Exception: CRIU RPC error (0/8)
Error (cr-service.c:705): Can't recv request: Connection reset by peer

Thanks!
Divjyot

On Mon, Jun 8, 2015 at 12:50 PM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 06/05/2015 09:17 PM, Divjyot sethi wrote:
> > I see -- thanks. Can you please let me know what change I should
> > specifically make in the p.haul code to make this work?
>
> Removing p_haul_cgroup.py (and fixing all the code linking to it) and
> adding the "manage_cgroups" option to criu_opts on both dump and restore.
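
For illustration, here is roughly what Pavel's suggestion looks like when
expressed through CRIU's RPC protobuf (the rpc_pb2 bindings that p.haul
already uses). A minimal sketch under that assumption -- field names are
as of CRIU 1.5.x and may differ in other versions:

    import rpc_pb2 as rpc   # CRIU's RPC protobuf bindings

    def make_req(req_type, images_dir_fd):
        """Build a dump or restore request that off-loads cgroups to CRIU."""
        req = rpc.criu_req()
        req.type = req_type               # rpc.DUMP or rpc.RESTORE
        req.opts.images_dir_fd = images_dir_fd
        req.opts.manage_cgroups = True    # CRIU dumps/restores cgroups itself
        return req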
>
> > Asking this as I am trying to replicate your setup, where you were able
> > to live migrate OpenVZ containers on Fedora (I am following:
> > https://github.com/xemul/p.haul/wiki/Live-migrating-OVZ-mainstream-container).
> > Thanks for the help!
> >
> > On Fri, Jun 5, 2015 at 1:47 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
> >
> >     On 06/05/2015 11:20 AM, Divjyot sethi wrote:
> >     > Ok, cool -- thanks. Let me add --manage-cgroups in my
> >     > p.haul-service at the point where it calls restore. Would that be
> >     > sufficient? Or, since I am also using the ovz driver, do I need to
> >     > do something for that as well?
> >
> >     Yes; since cgroup management is off-loaded to CRIU, the whole
> >     p_haul_cgroup.py should be removed (it used to be the code doing
> >     this).
> >
> >     > On Fri, Jun 5, 2015 at 1:05 AM, Pavel Emelyanov
> >     > <xemul at parallels.com> wrote:
> >     >
> >     >     On 06/05/2015 01:02 AM, Divjyot sethi wrote:
> >     >     > Thanks -- I was able to fix it. :) Now another problem:
> >     >     > apparently restore doesn't work on the destination. I get an
> >     >     > error saying:
> >     >     >
> >     >     > 1: Error (cgroup.c:907): cg: Can't move into
> >     >     > systemd//user.slice/user-1000.slice/session-3.scope/tasks
> >     >     > (-1/-1): No such file or directory.
> >     >
> >     >     Ah. This is because p.haul doesn't feed the --manage-cgroups
> >     >     option into criu on restore. And, if you're using the ovz haul
> >     >     driver, p.haul tries to mess with cgroups itself; you need to
> >     >     rip this piece out of p.haul.
> >     >
> >     >     > Error (cr-restore.c:1896): Restoring FAILED.
> >     >     >
> >     >     > It seems this error came up in a prior mailing-list thread,
> >     >     > where you asked to list the cgroups. I did that, and it seems
> >     >     > that session-3.scope doesn't exist in user-1000.slice at the
> >     >     > destination (it exists at the source). Is there some way of
> >     >     > creating this? The discussion in the prior thread doesn't seem
> >     >     > to list a solution to this problem...
> >     >     >
> >     >     > Thanks,
> >     >     > Divjyot
> >     >     >
> >     >     > On Thu, Jun 4, 2015 at 2:36 AM, Pavel Emelyanov
> >     >     > <xemul at parallels.com> wrote:
> >     >     >
> >     >     >     On 06/04/2015 04:50 AM, Kir Kolyshkin wrote:
> >     >     >     >
> >     >     >     >
> >     >     >     > On 06/03/2015 06:45 PM, Divjyot sethi wrote:
> >     >     >     >> Hey Pavel,
> >     >     >     >> After a bit of a hiatus, I finally got around to
> >     >     >     >> installing everything on my machines, and I am now
> >     >     >     >> trying to live migrate OpenVZ containers running CentOS
> >     >     >     >> with p.haul. However, I get an error at the CRIU dump
> >     >     >     >> stage -- the log file says "Error (sk-unix.c:222): Unix
> >     >     >     >> socket 0x6893 without peer 0xc5b". Any thoughts on this
> >     >     >     >> issue?
> >     >     >     >
> >     >     >     > The message essentially means that there is a UNIX
> >     >     >     > socket that has one end inside the container and the
> >     >     >     > other end outside of it. For example, a container is
> >     >     >     > running mysql and someone who's not inside that CT is
> >     >     >     > connected to it via a UNIX socket. CRIU warns you that
> >     >     >     > if you checkpoint a process at one end of such a socket,
> >     >     >     > the process at the other end might get disappointed. If
> >     >     >     > you know what you are doing, you can add --ext-unix-sk
> >     >     >     > to the criu command line to allow checkpointing of such
> >     >     >     > processes.
> >     >     >
> >     >     >     Yup. These are connections to the outer world;
> >     >     >     --ext-unix-sk should help, unless the connection is
> >     >     >     SOCK_STREAM. In the latter case you'll have to stop the
> >     >     >     process that holds one.
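
The same switch is exposed through the RPC options for anyone driving criu
from p.haul rather than from the command line. A minimal sketch, again
assuming the rpc_pb2 bindings, with Pavel's SOCK_STREAM caveat in mind:

    import rpc_pb2 as rpc   # CRIU's RPC protobuf bindings

    req = rpc.criu_req()
    req.type = rpc.DUMP
    # Allow dumping sockets whose peer lives outside the container.
    # Per the discussion above, this does not cover a connected
    # SOCK_STREAM peer -- that process has to be stopped instead.
    req.opts.ext_unix_sk = True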
> >     >     >
> >     >     >     > This is as much as I can tell without looking into
> >     >     >     > specifics.
> >     >     >
> >     >     >     -- Pavel
> >     >     >
> >     >     >
> >     >
> >     >
> >
> >
>
>