[CRIU] [PATCH 01/10] p.haul: implement migration over existing connections

Tycho Andersen tycho.andersen at canonical.com
Thu Oct 15 12:20:53 PDT 2015


On Thu, Oct 15, 2015 at 12:21:35PM +0300, Pavel Emelyanov wrote:
> On 10/14/2015 10:27 PM, Tycho Andersen wrote:
> > Hi Nikita,
> > 
> > Thanks for this work, it will be very useful for us.
> > 
> > On Fri, Oct 09, 2015 at 09:11:33PM +0400, Nikita Spiridonov wrote:
> >> Remove standalone mode, p.haul now can work only over existing
> >> connections specified via command line arguments as file
> >> descriptors.
> >>
> >> Three arguments are required: --fdrpc for rpc calls, --fdmem for c/r
> >> images migration and --fdfs for disk migration. Each file descriptor
> >> is expected to represent a socket opened in blocking mode with domain
> >> AF_INET and type SOCK_STREAM.
> > 
> > Do we have to require --fdfs here for anything? I haven't looked
> > through the code to see why exactly it is required.
> 
> The --fdfs socket is needed to copy the filesystem, but (!) only if fs
> migration is required. If the storage the container's files are on is
> shared, then this fd will effectively become unused.
> 
> I think we can do it like this: one can omit this parameter, but if the
> htype driver says that fs migration _is_ required, then p.haul will
> fail with error "no data channel for fs migration". Does this sound
> OK to you?

Yep, that sounds fine.
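
To make the agreed behaviour concrete, here is a minimal sketch. It is not
p.haul's actual code: the flag names come from the patch description, while
parse_args, check_fs_channel, wrap_fd and the fs_migration_required value
(standing in for whatever the htype driver reports) are hypothetical.

```python
import argparse
import socket


def parse_args(argv):
    # Flag names are from the patch description; the rest is illustrative.
    parser = argparse.ArgumentParser()
    parser.add_argument("--fdrpc", type=int, required=True, help="fd for rpc calls")
    parser.add_argument("--fdmem", type=int, required=True, help="fd for c/r images")
    # --fdfs may be omitted when the container's storage is shared.
    parser.add_argument("--fdfs", type=int, default=None, help="fd for disk migration")
    return parser.parse_args(argv)


def check_fs_channel(args, fs_migration_required):
    # fs_migration_required is an assumed boolean supplied by the htype driver.
    if fs_migration_required and args.fdfs is None:
        raise RuntimeError("no data channel for fs migration")


def wrap_fd(fd):
    # Duplicate the inherited descriptor into a socket object and make sure
    # it is in blocking mode, as the patch description expects.
    sock = socket.fromfd(fd, socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(True)
    return sock
```

With this shape, a caller that manages its own storage simply leaves --fdfs
out, and the error only fires when the driver actually needs the channel.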

> > In LXD (and I guess openvz as well, with your ploop patch) we are
> > managing our own storage backends, and have our own mechanism for
> > transporting the rootfs. 
> 
> Can you shed more light on this? :) If there's some backend that can
> be used by us as well, maybe it would make sense to put migration code
> into p.haul?

Right now we have backends for zfs, lvm, btrfs, and just a regular
directory on a filesystem. I'm not aware of us planning support for
any other backends right now, but it's not out of the question.
Additionally, we also want to migrate a container's snapshots when we
migrate the container, which requires something to know about how we
handle snapshotting for these various storage backends as well.

We also support copying containers non-live, so we need the code even
without p.haul, and ideally we would not maintain it in two places,
but,

> > Ideally, I could invoke p.haul over an fd to
> > just do the criu iterative piece, and potentially do some callbacks to
> > tell LXD when the process is stopped so that we can do a final fs
> > sync.
> 
> The issue with fs sync is tightly coupled with memory migration iterations,
> that's why I planned to put all this stuff into p.haul. If you do the
> final fs sync and while doing this the amount of memory to be copied
> increases, it might make sense to do one more iteration of pre-copy.
> Without full p.haul control over both (memory and fs) it's hardly possible.

What about passing p.haul a socket and inventing a messaging protocol?
Then p.haul could ask LXD (or whoever) to sync the filesystem, but
also report any errors during migration better than just exit(1).
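
As a rough sketch of what such a protocol could look like, assuming
length-prefixed JSON frames: the framing, the message types ("sync_fs",
"sync_fs_done", "error") and all function names below are invented for
illustration, nothing here exists in p.haul or LXD today.

```python
import json
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length


def send_msg(sock, msg):
    # Serialize one message and send it as a single length-prefixed frame.
    payload = json.dumps(msg).encode("utf-8")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    # recv() may return short reads on a stream socket, so loop until done.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the control socket")
        buf += chunk
    return buf


def recv_msg(sock):
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def handle_request(msg, sync_fs):
    # Manager side (LXD or whoever). sync_fs is a caller-supplied callable
    # that performs the filesystem sync with the manager's own backend.
    if msg.get("type") == "sync_fs":
        try:
            sync_fs()
            return {"type": "sync_fs_done"}
        except Exception as exc:
            return {"type": "error", "reason": str(exc)}
    return {"type": "error", "reason": "unknown request"}
```

The same "error" message could travel in the other direction too, so that
p.haul reports a failure reason instead of only exiting non-zero.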

Tycho

