[CRIU] [PATCH v3 12/12] p.haul: implement vz migration with shared ploops

Pavel Emelyanov xemul at virtuozzo.com
Mon Apr 11 11:38:47 PDT 2016


On 04/11/2016 07:29 PM, Alexander Burluka wrote:
> Final commit that introduces vz migration with ploops
> that are located on shared disks.
> Brief algorithm description:
> After freezing container source side creates copy of
> every shared ploop DiskDescriptor.xml (named DiskDescriptor.xml.copy)
> and makes snapshots delta on origin and copy.
> Origin snapshot would be used on destination after successful
> "vzctl restore" operation (this snapshot delta is created
> on source and merged on destination so it is create as "offline
> snapshot")
> On restore fail case DiskDescriptor.xml.copy would replace origin file
> and merged with copy snapshot delta.
> 
> Signed-off-by: Alexander Burluka <aburluka at virtuozzo.com>
> ---
>  phaul/iters.py | 32 ++++++++++++++++++++++----------
>  1 file changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/phaul/iters.py b/phaul/iters.py
> index cbac565..453360e 100644
> --- a/phaul/iters.py
> +++ b/phaul/iters.py
> @@ -202,27 +202,36 @@ class phaul_iter_worker:
>  		self.htype.final_dump(root_pid, self.img, self.criu_connection, self.fs)
>  		self.target_host.end_iter()
>  
> -		# Handle final FS and images sync on frozen htype
> -		logging.info("Final FS and images sync")
> -		fsstats = self.fs.stop_migration()
> -		self.img.sync_imgs_to_target(self.target_host, self.htype,
> -			self.connection.mem_sk)
> +		try:
> +			# Handle final FS and images sync on frozen htype
> +			logging.info("Final FS and images sync")
> +			fsstats = self.fs.stop_migration()
> +
> +			self.img.sync_imgs_to_target(self.target_host, self.htype,
> +				self.connection.mem_sk)
>  
> -		# Restore htype on target
> -		logging.info("Asking target host to restore")
> -		self.target_host.restore_from_images()
> -		logging.info("Restored on target host")
> +			# Restore htype on target
> +			logging.info("Asking target host to restore")
> +			self.target_host.restore_from_images()
> +			logging.info("Restored on target host")
> +		except:

Which exception do we expect here?

> +			self.fs.restore_shared_backups()

Why generic callback is called '_backups'?

> +			raise
>  
>  		# Ack previous dump request to terminate all frozen tasks
>  		resp = self.criu_connection.ack_notify()
>  		if not resp.success:
> -			raise Exception("Dump screwed up")
> +			logging.warning("Bad notification from target host")
> +
> +		# cleanup shared disks backup
> +		self.fs.cleanup_shared_backups()

Up above there was self.fs.stop_migration(). Why isn't this callback enough?

>  
>  		dstats = criu_api.criu_get_dstats(self.img)
>  		migration_stats.handle_iteration(dstats, fsstats)
>  
>  		logging.info("Migration succeeded")
>  		self.htype.umount()
> +		self.target_host.final_cleanup(self.fs.prepare_src_data({}))

This is probably comment for patch #10, but I'll ask one here -- why do we
need to carry more args to final_cleanup()? All info needed for target_host
should already be there.

>  		migration_stats.handle_stop(self)
>  		self.img.close()
>  		self.criu_connection.close()
> @@ -276,11 +285,14 @@ class phaul_iter_worker:
>  			logging.info("Started on target host")
>  
>  		except:
> +			self.fs.restore_shared_backups()
>  			self.htype.start()
>  			raise
>  
>  		logging.info("Migration succeeded")
> +		self.fs.cleanup_shared_backups()
>  		self.htype.umount()

Can we do fs.cleanup_shared_backups() in htype.umount()?

> +		self.target_host.final_cleanup(self.fs.prepare_src_data({}))
>  		migration_stats.handle_stop()
>  
>  	def __check_live_iter_progress(self, index, dstats, prev_dstats):
> 



More information about the CRIU mailing list