[CRIU] [PATCH v3 12/12] p.haul: implement vz migration with shared ploops
Alexander Burluka
aburluka at virtuozzo.com
Tue Apr 12 01:36:42 PDT 2016
On Mon, Apr 11, 2016 at 09:38:47PM +0300, Pavel Emelyanov wrote:
> On 04/11/2016 07:29 PM, Alexander Burluka wrote:
> > Final commit that introduces vz migration with ploops
> > that are located on shared disks.
> > Brief algorithm description:
> > After freezing container source side creates copy of
> > every shared ploop DiskDescriptor.xml (named DiskDescriptor.xml.copy)
> > and makes snapshots delta on origin and copy.
> > Origin snapshot would be used on destination after successful
> > "vzctl restore" operation (this snapshot delta is created
> > on source and merged on destination so it is create as "offline
> > snapshot")
> > On restore fail case DiskDescriptor.xml.copy would replace origin file
> > and merged with copy snapshot delta.
> >
> > Signed-off-by: Alexander Burluka <aburluka at virtuozzo.com>
> > ---
> > phaul/iters.py | 32 ++++++++++++++++++++++----------
> > 1 file changed, 22 insertions(+), 10 deletions(-)
> >
> > diff --git a/phaul/iters.py b/phaul/iters.py
> > index cbac565..453360e 100644
> > --- a/phaul/iters.py
> > +++ b/phaul/iters.py
> > @@ -202,27 +202,36 @@ class phaul_iter_worker:
> > self.htype.final_dump(root_pid, self.img, self.criu_connection, self.fs)
> > self.target_host.end_iter()
> >
> > - # Handle final FS and images sync on frozen htype
> > - logging.info("Final FS and images sync")
> > - fsstats = self.fs.stop_migration()
> > - self.img.sync_imgs_to_target(self.target_host, self.htype,
> > - self.connection.mem_sk)
> > + try:
> > + # Handle final FS and images sync on frozen htype
> > + logging.info("Final FS and images sync")
> > + fsstats = self.fs.stop_migration()
> > +
> > + self.img.sync_imgs_to_target(self.target_host, self.htype,
> > + self.connection.mem_sk)
> >
> > - # Restore htype on target
> > - logging.info("Asking target host to restore")
> > - self.target_host.restore_from_images()
> > - logging.info("Restored on target host")
> > + # Restore htype on target
> > + logging.info("Asking target host to restore")
> > + self.target_host.restore_from_images()
> > + logging.info("Restored on target host")
> > + except:
>
> Which exception do we expect here?
>
Any kind of exception. We catch it, restore original ploop state and
reraise it. That is original phaul behavior but with additional step
> > + self.fs.restore_shared_backups()
>
> Why generic callback is called '_backups'?
>
It's just a draft name. Is it really confusing?
My thoughts were: ploop original state is snapshoted so you may say that
it is saved or backuped. "snapshot" will be way too specific as for me.
> > + raise
> >
> > # Ack previous dump request to terminate all frozen tasks
> > resp = self.criu_connection.ack_notify()
> > if not resp.success:
> > - raise Exception("Dump screwed up")
> > + logging.warning("Bad notification from target host")
> > +
> > + # cleanup shared disks backup
> > + self.fs.cleanup_shared_backups()
>
> Up above there was self.fs.stop_migration(). Why isn't this callback enough?
>
This operation makes source restore impossible. So it should be
somewhere after restore_shared_backups method. I made this cleanup as
soon as we got notification about success migration. It can be easily
done later.
> >
> > dstats = criu_api.criu_get_dstats(self.img)
> > migration_stats.handle_iteration(dstats, fsstats)
> >
> > logging.info("Migration succeeded")
> > self.htype.umount()
> > + self.target_host.final_cleanup(self.fs.prepare_src_data({}))
>
> This is probably comment for patch #10, but I'll ask one here -- why do we
> need to carry more args to final_cleanup()? All info needed for target_host
> should already be there.
>
Ploop snapshot guids are not stored on destination. I did not found more
natural way to deliver it there. Does it look weird?
> > migration_stats.handle_stop(self)
> > self.img.close()
> > self.criu_connection.close()
> > @@ -276,11 +285,14 @@ class phaul_iter_worker:
> > logging.info("Started on target host")
> >
> > except:
> > + self.fs.restore_shared_backups()
> > self.htype.start()
> > raise
> >
> > logging.info("Migration succeeded")
> > + self.fs.cleanup_shared_backups()
> > self.htype.umount()
>
> Can we do fs.cleanup_shared_backups() in htype.umount()?
>
We can but thereafter "umount" name would be wrong. What do you think
about "release" name?
> > + self.target_host.final_cleanup(self.fs.prepare_src_data({}))
> > migration_stats.handle_stop()
> >
> > def __check_live_iter_progress(self, index, dstats, prev_dstats):
> >
>
--
Regards,
Alexander Burluka
More information about the CRIU
mailing list