[Debian] Re: Bug#494890: extra information

Ola Lundqvist ola at inguza.com
Thu Aug 14 07:32:52 EDT 2008


Hi Alexander

On Thu, Aug 14, 2008 at 01:04:10PM +0200, Alexander Prinsier wrote:
> Ola Lundqvist wrote:
> >Hi Alexander
> >
> >I think it would be good to know the following:
> >* A copy of your vzmigrate script as it was modified. It is hard to tell
> >  exactly which line caused the problem otherwise.
> 
> I attached the modified vzmigrate. Only one line was modified.
> 
> >* Details of your "unusual setup", such as /backup2. Is everything
> >  in /backup2, or are some parts in other places?
> 
> Well, the only unusual thing is that when I wanted to try out OpenVZ, I 
> had no separate partition to put it on, and /backup2 was the only one 
> with enough space.

Ok.

> From /etc/vz/vz.conf:
> 
> LOCKDIR=/backup2/OpenVZ/lock
> DUMPDIR=/backup2/OpenVZ/dump
> TEMPLATE=/backup2/OpenVZ/template
> VE_ROOT=/backup2/OpenVZ/root/$VEID
> VE_PRIVATE=/backup2/OpenVZ/private/$VEID

Looks ok to me.

> Just to rule this out: /backup2 doesn't have much free space left. It 
> shouldn't crash when it runs out of space, right?

Well, it is never supposed to crash, but running out of space could be
the cause.
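
To help rule the disk-space theory in or out, a rough check of the free
space on the filesystem holding the OpenVZ directories could look like the
sketch below. The helper name free_kb is made up for illustration, and the
path is assumed from the vz.conf settings quoted above; adjust it to the
real layout.

```shell
#!/bin/sh
# Sketch only: print the available space, in kilobytes, on the filesystem
# that holds a given directory. With POSIX "df -P -k", the second output
# line describes the filesystem and its fourth column is the free space.
free_kb() {
	df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Assumed path, taken from the quoted vz.conf; fall back to / when it
# does not exist on the machine running this sketch.
dir=/backup2/OpenVZ
[ -d "$dir" ] || dir=/
echo "Free on $dir: $(free_kb "$dir") KB"
```

Running it before and during a migration attempt would show whether the
partition is actually filling up at the time of the crash.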

> I'm running kernel 2.6.18-028stab035.1-ovz-smp from the debian.systs.org 
> repository.

Ok, thanks. I'll add Thorsten Schifferdecker, who built that version of
the kernel, to this discussion. The crash is in the kernel, so he may
know about it. I have also added the OpenVZ project people to see if
they know of any such issues.

I hope they can tell us whether this is a known fault.

Best regards,

// Ola

> Alexander

> #!/bin/sh
> # Copyright (C) 2000-2007 SWsoft. All rights reserved.
> #
> # This program is free software; you can redistribute it and/or
> # modify it under the terms of the GNU General Public License
> # as published by the Free Software Foundation; either version 2
> # of the License, or (at your option) any later version.
> #
> # This program is distributed in the hope that it will be useful,
> # but WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> # GNU General Public License for more details.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write to the Free Software
> # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> #
> #
> # vzmigrate is used for VE migration to another node
> #
> # Usage:
> # vzmigrate [-r yes|no] [--ssh=<options>] [--keep-dst] [--online] [-v]
> #           destination_address VEID
> # Options:
> #	-r, --remove-area yes|no
> #		Whether to remove VE on source HN for successfully migrated VE.
> #	--ssh=<ssh options>
> #		Additional options that will be passed to ssh while establishing
> #		connection to destination HN. Please be careful with options
> #		passed, DO NOT pass destination hostname.
> #	--keep-dst
> #		Do not clean synced destination VE private area in case of some
> #		error. It makes sense to use this option on big VE migration to
> #		avoid syncing VE private area again in case some error
> #		(on VE stop for example) occurs during first migration attempt.
> #	--online
> #		Perform online (zero-downtime) migration: during the migration the
> #		VE hangs for a while and after the migration it continues working
> #		as though nothing has happened.
> #	-v
> #		Verbose mode. Causes vzmigrate to print debugging messages about
> #		its progress (including some time statistics).
> #
> # Examples:
> #	Online migration of VE #101 to foo.com:
> #		vzmigrate --online foo.com 101
> #	Migration of VE #102 to foo.com with downtime:
> #		vzmigrate foo.com 102
> # NOTE:
> #	This program uses ssh as a transport layer. You need to put ssh
> #	public key to destination node and be able to connect without
> #	entering a password.
> 
> 
> ACT_SCRIPTS_SFX="start stop mount umount"
> SSH_OPTIONS=""
> SSH="ssh $SSH_OPTIONS"
> SCP_OPTIONS=""
> SCP="scp $SCP_OPTIONS"
> RSYNC_OPTIONS="-aH --delete --numeric-ids"
> RSYNC="rsync $RSYNC_OPTIONS"
> 
> online=0
> verbose=0
> remove_area=1
> keep_dst=0
> debug=0
> confdir="/etc/vz/conf"
> vzconf="/etc/vz/vz.conf"
> tmpdir="/var/tmp"
> act_scripts=
> 
> # Errors:
> MIG_ERR_USAGE=1
> MIG_ERR_VPS_IS_STOPPED=2
> MIG_ERR_CANT_CONNECT=4
> MIG_ERR_COPY=6
> MIG_ERR_START_VPS=7
> MIG_ERR_STOP_SOURCE=8
> MIG_ERR_EXISTS=9
> MIG_ERR_NOEXIST=10
> MIG_ERR_IP_IN_USE=12
> MIG_ERR_QUOTA=13
> MIG_ERR_CHECKPOINT=$MIG_ERR_STOP_SOURCE
> MIG_ERR_MOUNT_VPS=$MIG_ERR_START_VPS
> MIG_ERR_RESTORE_VPS=$MIG_ERR_START_VPS
> MIG_ERR_OVZ_NOT_RUNNING=14
> MIG_ERR_APPLY_CONFIG=15
> 
> usage() {
> 	cat >&2 <<EOF
> This program is used for VE migration to another node
> Usage:
> vzmigrate [-r yes|no] [--ssh=<options>] [--keep-dst] [--online] [-v]
> 	destination_address <VEID>
> Options:
> -r, --remove-area yes|no
> 	Whether to remove VE on source HN for successfully migrated VE.
> --ssh=<ssh options>
> 	Additional options that will be passed to ssh while establishing
> 	connection to destination HN. Please be careful with options
> 	passed, DO NOT pass destination hostname.
> --keep-dst
> 	Do not clean synced destination VE private area in case of some
> 	error. It makes sense to use this option on big VE migration to
> 	avoid syncing VE private area again in case some error
> 	(on VE stop for example) occurs during first migration attempt.
> --online
> 	Perform online (zero-downtime) migration: during the migration the
> 	VE hangs for a while and after the migration it continues working
> 	as though nothing has happened.
> -v
> 	Verbose mode. Causes vzmigrate to print debugging messages about
> 	its progress (including some time statistics).
> EOF
> 	exit $MIG_ERR_USAGE
> }
> 
> # Logs message
> # There are 3 types of messages:
> # 0 - error messages (print to stderr)
> # 1 - normal messages (print to stdout)
> # 2 - debug messages (print to stdout if in verbose mode)
> log () {
> 	if [ $1 -eq 0 ]; then
> 		shift
> 		echo -e "Error: " $@ >&2
> 	elif [ $1 -eq 1 ]; then
> 		shift
> 		echo -e $@
> 	elif [ $verbose -eq 1 ]; then
> 		shift
> 		echo -e "   " $@
> 	fi
> }
> 
> # Executes command and returns result of execution
> # There are 2 types of execution:
> # 1 - normal execution (all output will be printed)
> # 2 - debug execution (output will be printed if verbose mode is set,
> #     in other case stdout and stderr redirected to /dev/null)
> logexec () {
> 	if [ $1 -eq 1 -o $verbose -eq 1 ]; then
> 		shift
> 		$@
> 	else
> 		shift
> 		$@ >/dev/null 2>&1
> 	fi
> }
> 
> undo_conf () {
> 	$SSH "root@$host" "rm -f $vpsconf"
> }
> 
> undo_act_scripts () {
> 	if [ -n "$act_scripts" ] ; then
> 		$SSH "root@$host" "rm -f $act_scripts"
> 	fi
> 	undo_conf
> }
> 
> undo_private () {
> 	if [ $keep_dst -eq 0 ]; then
> 		$SSH "root@$host" "rm -rf $VE_PRIVATE"
> 	fi
> 	undo_act_scripts
> }
> 
> undo_root () {
> 	$SSH "root@$host" "rm -rf $VE_ROOT"
> 	undo_private
> }
> 
> undo_quota_init () {
> 	[ "${DISK_QUOTA}" = 'no' ] || $SSH "root@$host" "vzquota drop $VEID"
> 	undo_root
> }
> 
> undo_quota_on () {
> 	[ "${DISK_QUOTA}" = 'no' ] || $SSH "root@$host" "vzquota off $VEID"
> 	undo_quota_init
> }
> 
> undo_sync () {
> 	# Root will be destroyed in undo_root
> 	undo_quota_on
> }
> 
> undo_suspend () {
> 	logexec 2 vzctl chkpnt $VEID --resume
> 	undo_sync
> }
> 
> undo_dump () {
> 	if [ $debug -eq 0 ]; then
> 		rm -f "$VE_DUMPFILE"
> 	fi
> 	undo_suspend
> }
> 
> undo_copy_dump () {
> 	$SSH "root@$host" "rm -f $VE_DUMPFILE"
> 	undo_suspend
> }
> 
> undo_stop () {
> 	if [ "$state" = "running" ]; then
> 		vzctl start $VEID
> 	elif [ "$mounted" = "mounted" ]; then
> 		vzctl mount $VEID
> 	fi
> 	undo_sync
> }
> 
> undo_source_stage() {
> 	if [ $online -eq 1 ]; then
> 		undo_copy_dump
> 	else
> 		undo_stop
> 	fi
> }
> 
> undo_quota_dump () {
> 	rm -f "$VE_QUOTADUMP"
> 	undo_source_stage
> }
> 
> undo_copy_quota () {
> 	$SSH "root@$host" "rm -f $VE_QUOTADUMP"
> 	undo_quota_dump
> }
> 
> undo_undump () {
> 	logexec 2 $SSH root@$host vzctl restore $VEID --kill
> 	undo_copy_quota
> }
> 
> get_status() {
> 	exist=$3
> 	mounted=$4
> 	state=$5
> }
> 
> get_time () {
> 	awk -v t2=$2 -v t1=$1 'BEGIN{print t2-t1}'
> }
> 
> if [ $# -lt 2 ]; then
> 	usage
> fi
> 
> while [ ! -z "$1" ]; do
> 	log 1 "OPT:$1"
> 	case "$1" in
> 	--online)
> 		online=1
> 		;;
> 	-v)
> 		verbose=1
> 		;;
> 	--remove-area|-r)
> 		shift
> 		if [ "$1" = "yes" ]; then
> 			remove_area=1
> 		elif [ "$1" = "no" ]; then
> 			remove_area=0
> 		else
> 			usage
> 		fi
> 		;;
> 	--keep-dst)
> 		keep_dst=1
> 		;;
> 	--ssh=*)
> 		SSH_OPTIONS="$SSH_OPTIONS $(echo $1 | cut -c7-)"
> 		SSH="ssh $SSH_OPTIONS"
> 		SCP_OPTIONS="`echo $SSH_OPTIONS | sed 's/-p/-P/1'`"
> 		SCP="scp $SCP_OPTIONS"
> 		;;
> 	*)
> 		break
> 		;;
> 	esac
> 	shift
> done
> 
> host=$1
> shift
> VEID=$1
> shift
> 
> if [ -z "$host" -o -z "$VEID" -o $# -ne 0 ]; then
> 	usage
> fi
> 
> vpsconf="$confdir/$VEID.conf"
> 
> if [ ! -r "$vzconf" -o ! -r "$vpsconf" ]; then
> 	log 0 "Can't read global config or VE #$VEID config file"
> 	exit $MIG_ERR_NOEXIST
> fi
> 
> get_status $(vzctl status $VEID)
> if [ "$exist" = "deleted" ]; then
> 	log 0 "VE #$VEID doesn't exist"
> 	exit $MIG_ERR_NOEXIST
> fi
> 
> if [ $online -eq 1 ]; then
> 	log 1 "Starting online migration of VE $VEID on $host"
> else
> 	log 1 "Starting migration of VE $VEID on $host"
> fi
> 
> # Try to connect to destination
> if ! logexec 2 $SSH -o BatchMode=yes root@$host /bin/true; then
> 	log 0 "Can't connect to destination address using public key"
> 	log 0 "Please put your public key to destination node"
> 	exit $MIG_ERR_CANT_CONNECT
> fi
> 
> # Check if OpenVZ is running
> if ! logexec 2 $SSH -o BatchMode=yes root@$host /etc/init.d/vz status ; then
> 	log 0 "OpenVZ is not running on the target machine"
> 	log 0 "Can't continue migration"
> 	exit $MIG_ERR_OVZ_NOT_RUNNING
> fi
> 
> # Check if CPT modules are loaded for online migration
> if [ $online -eq 1 ]; then
> 	if [ ! -f /proc/cpt ]; then
> 		log 0 "vzcpt module is not loaded on the source node"
> 		log 0 "Can't continue online migration"
> 		exit $MIG_ERR_OVZ_NOT_RUNNING
> 	fi
> 	if ! logexec 2 $SSH -o BatchMode=yes root@$host "test -f /proc/rst";
> 	then
> 		log 0 "vzrst module is not loaded on the destination node"
> 		log 0 "Can't continue online migration"
> 		exit $MIG_ERR_OVZ_NOT_RUNNING
> 	fi
> fi
> 
> dst_exist=$($SSH "root@$host" "vzctl status $VEID" | awk '{print $3}')
> if [ "$dst_exist" = "exist" ]; then
> 	log 0 "VE #$VEID already exists on destination node"
> 	exit $MIG_ERR_EXISTS
> fi
> 
> if [ $online -eq 1 -a "$state" != "running" ]; then
> 	log 0 "Can't perform online migration of stopped VE"
> 	exit $MIG_ERR_VPS_IS_STOPPED
> fi
> 
> log 2 "Loading $vzconf and $vpsconf files"
> 
> . "$vzconf"
> . "$vpsconf"
> VE_DUMPFILE="$tmpdir/dump.$VEID"
> VE_QUOTADUMP="$tmpdir/quotadump.$VEID"
> 
> log 2 "Check IPs on destination node: $IP_ADDRESS"
> for IP in $IP_ADDRESS; do
> 	if [ $($SSH "root@$host" "grep -c \" $IP \" /proc/vz/veip") -gt 0 ];
> 	then
> 		log 0 "IP address $IP already in use on destination node"
> 		exit $MIG_ERR_IP_IN_USE
> 	fi
> done
> 
> log 1 "Preparing remote node"
> 
> log 2 "Copying config file"
> if ! logexec 2 $SCP $vpsconf root@$host:$vpsconf ; then
> 	log 0 "Failed to copy config file"
> 	exit $MIG_ERR_COPY
> fi
> 
> logexec 2 $SSH root@$host vzctl set $VEID --applyconfig_map name --save
> # vzctl return code 20 or 21 in case of unrecognized option
> #ALEX: see https://bugzilla.altlinux.org/show_bug.cgi?id=15085
> #if [ $? != 20 && $? != 21 && $? != 0 ]; then
> if [ $? != 20 ] && [ $? != 21 ] && [ $? != 0 ]; then
> 	log 0 "Failed to apply config on destination node"
> 	undo_conf
> 	exit $MIG_ERR_APPLY_CONFIG
> fi
> 
> for sfx in $ACT_SCRIPTS_SFX; do
> 	file="$confdir/$VEID.$sfx"
> 	if [ -f "$file" ]; then
> 		act_scripts="$act_scripts $file"
> 	fi
> done
> if [ -n "$act_scripts" ]; then
> 	log 2 "Copying action scripts"
> 	if ! logexec 2 $SCP $act_scripts root@$host:$confdir ; then
> 		log 0 "Failed to copy action scripts"
> 		undo_conf
> 		exit $MIG_ERR_COPY
> 	fi
> fi
> 
> log 2 "Creating remote VE root dir"
> if ! $SSH "root@$host" "mkdir -p $VE_ROOT"; then
> 	log 0 "Failed to make VE root"
> 	undo_act_scripts
> 	exit $MIG_ERR_COPY
> fi
> 
> log 2 "Creating remote VE private dir"
> if ! $SSH "root@$host" "mkdir -p $VE_PRIVATE"; then
> 	log 0 "Failed to make VE private area"
> 	undo_private
> 	exit $MIG_ERR_COPY
> fi
> 
> if [ "${DISK_QUOTA}" != "no" ]; then
> 	log 1 "Initializing remote quota"
> 
> 	log 2 "Quota init"
> 	if ! $SSH "root@$host" "vzctl quotainit $VEID"; then
> 		log 0 "Failed to initialize quota"
> 		undo_root
> 		exit $MIG_ERR_QUOTA
> 	fi
> 
> 	log 2 "Turning remote quota on"
> 	if ! $SSH "root@$host" "vzctl quotaon $VEID"; then
> 		log 0 "Failed to turn quota on"
> 		undo_quota_init
> 		exit $MIG_ERR_QUOTA
> 	fi
> else
> 	log 2 "VZ disk quota disabled -- skipping quota migration"
> fi
> 
> log 1 "Syncing private"
> if ! $RSYNC --progress \
> 		"$VE_PRIVATE" "root@$host:${VE_PRIVATE%/*}" |
> 		grep "% of" | awk -v ORS="\r" '{print $10}'; then
> 	log 0 "Failed to sync VE private areas"
> 	undo_quota_on
> 	exit $MIG_ERR_COPY
> fi
> 
> if [ $online -eq 1 ]; then
> 	log 1 "Live migrating VE"
> 
> 	log 2 "Suspending VE"
> 	time_suspend=$(date +%s.%N)
> 	if ! logexec 2 vzctl chkpnt $VEID --suspend ; then
> 		log 0 "Failed to suspend VE"
> 		undo_sync
> 		exit $MIG_ERR_CHECKPOINT
> 	fi
> 
> 	log 2 "Dumping VE"
> 	if ! logexec 2 vzctl chkpnt $VEID --dump --dumpfile $VE_DUMPFILE ; then
> 		log 0 "Failed to dump VE"
> 		undo_suspend
> 		exit $MIG_ERR_CHECKPOINT
> 	fi
> 
> 	log 2 "Copying dumpfile"
> 	time_copy_dump=$(date +%s.%N)
> 	if ! logexec 2 $SCP $VE_DUMPFILE root@$host:$VE_DUMPFILE ; then
> 		log 0 "Failed to copy dump"
> 		undo_dump
> 		exit $MIG_ERR_COPY
> 	fi
> else
> 	if [ "$state" = "running" ]; then
> 		log 1 "Stopping VE"
> 		if ! logexec 2 vzctl stop $VEID ; then
> 			log 0 "Failed to stop VE"
> 			undo_sync
> 			exit $MIG_ERR_STOP_SOURCE
> 		fi
> 	elif [ "$mounted" = "mounted" ]; then
> 		log 1 "Unmounting VE"
> 		if ! logexec 2 vzctl umount $VEID ; then
> 			log 0 "Failed to umount VE"
> 			undo_sync
> 			exit $MIG_ERR_STOP_SOURCE
> 		fi
> 	fi
> fi
> 
> if [ "$state" = "running" ]; then
> 	log 2 "Syncing private (2nd pass)"
> 	time_rsync2=$(date +%s.%N)
> 	if ! $RSYNC \
> 			"$VE_PRIVATE" "root@$host:${VE_PRIVATE%/*}"; then
> 		log 0 "Failed to sync VE private areas"
> 		undo_source_stage
> 		exit $MIG_ERR_COPY
> 	fi
> fi
> 
> if [ "${DISK_QUOTA}" != "no" ]; then
> 	log 1 "Syncing 2nd level quota"
> 
> 	log 2 "Dumping 2nd level quota"
> 	time_quota=$(date +%s.%N)
> 	if ! vzdqdump $VEID -U -G -T > "$VE_QUOTADUMP"; then
> 		log 0 "Failed to dump 2nd level quota"
> 		undo_quota_dump
> 		exit $MIG_ERR_QUOTA
> 	fi
> 
> 	log 2 "Copying 2nd level quota"
> 	if ! logexec 2 $SCP $VE_QUOTADUMP root@$host:$VE_QUOTADUMP ; then
> 		log 0 "Failed to copy 2nd level quota dump"
> 		undo_quota_dump
> 		exit $MIG_ERR_COPY
> 	fi
> 
> 	log 2 "Load 2nd level quota"
> 	if ! $SSH "root@$host" "(vzdqload $VEID -U -G -T < $VE_QUOTADUMP &&
> 			vzquota reload2 $VEID)"; then
> 		log 0 "Failed to load 2nd level quota"
> 		undo_copy_quota
> 		exit $MIG_ERR_QUOTA
> 	fi
> else
> 	log 2 "VZ disk quota disabled -- skipping quota migration"
> fi
> 
> if [ $online -eq 1 ]; then
> 	log 2 "Undumping VE"
> 	time_undump=$(date +%s.%N)
> 	if ! logexec 2 $SSH root@$host vzctl restore $VEID --undump \
> 			--dumpfile $VE_DUMPFILE --skip_arpdetect ; then
> 		log 0 "Failed to undump VE"
> 		undo_copy_quota
> 		exit $MIG_ERR_RESTORE_VPS
> 	fi
> 
> 	log 2 "Resuming VE"
> 	if ! logexec 2 $SSH root@$host vzctl restore $VEID --resume ; then
> 		log 0 "Failed to resume VE"
> 		undo_undump
> 		exit $MIG_ERR_RESTORE_VPS
> 	fi
> 	time_finish=$(date +%s.%N)
> 	log 2 "Times:"
> 	log 2 "\tSuspend + Dump:\t" $(get_time $time_suspend $time_copy_dump)
> 	log 2 "\tCopy dump file:\t" $(get_time $time_copy_dump $time_rsync2)
> 	log 2 "\tSecond rsync:\t" $(get_time $time_rsync2 $time_quota)
> 	log 2 "\t2nd level quota:\t" $(get_time $time_quota $time_undump)
> 	log 2 "\tUndump + Resume:\t" $(get_time $time_undump $time_finish)
> 	log 2 "Total time: " $(get_time $time_suspend $time_finish)
> 
> 	log 1 "Cleanup"
> 
> 	log 2 "Killing VE"
> 	logexec 2 vzctl chkpnt $VEID --kill
> 	logexec 2 vzctl umount $VEID
> 
> 	log 2 "Removing dumpfiles"
> 	rm -f "$VE_DUMPFILE"
> 	$SSH "root@$host" "rm -f $VE_DUMPFILE"
> else
> 	if [ "$state" = "running" ]; then
> 		log 1 "Starting VE"
> 		if ! logexec 2 $SSH root@$host vzctl start $VEID ; then
> 			log 0 "Failed to start VE"
> 			undo_copy_quota
> 			exit $MIG_ERR_START_VPS
> 		fi
> 	elif [ "$mounted" = "mounted" ]; then
> 		log 1 "Mounting VE"
> 		if ! logexec 2 $SSH root@$host vzctl mount $VEID ; then
> 			log 0 "Failed to mount VE"
> 			undo_copy_quota
> 			exit $MIG_ERR_MOUNT_VPS
> 		fi
> 	else
> 		log 1 "Turning quota off"
> 		if ! logexec 2 $SSH root@$host vzquota off $VEID ; then
> 			log 0 "failed to turn quota off"
> 			undo_copy_quota
> 			exit $MIG_ERR_QUOTA
> 		fi
> 	fi
> 
> 	log 1 "Cleanup"
> fi
> 
> if [ $remove_area -eq 1 ]; then
> 	log 2 "Destroying VE"
> 	logexec 2 vzctl destroy $VEID
> else
> 	# Move config as veid.migrated to allow backward migration
> 	mv -f $vpsconf $vpsconf.migrated
> fi
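
A note on the modified return-code check in the script above: in sh, every
[ ... ] test sets $? itself, so when several tests on $? are chained with
&&, only the first one sees the status of the vzctl call. A more robust
pattern is to save the status once and compare the saved value. The
following is only a sketch of that pattern, not the upstream fix;
apply_failed and fake_vzctl are made-up names, and the meaning of codes 20
and 21 is taken from the comment in the script.

```shell
#!/bin/sh
# Sketch only: save the exit status once, then test the saved value.

# Returns 0 (true) when the saved status is a real failure, that is,
# anything other than 0 (success) or 20/21 (unrecognized option,
# according to the comment in the quoted script).
apply_failed() {
	case "$1" in
	0|20|21) return 1 ;;
	*) return 0 ;;
	esac
}

# Hypothetical stand-in for the remote "vzctl set ..." call; it just
# returns the status it is given.
fake_vzctl() {
	return "$1"
}

for code in 0 20 21 3; do
	fake_vzctl "$code"
	rc=$?	# capture immediately, before any other command runs
	if apply_failed "$rc"; then
		echo "status $rc: failure"
	else
		echo "status $rc: ok"
	fi
done
```

In the script itself, the same idea would mean assigning $? to a variable
on the line directly after the logexec call and testing that variable.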


-- 
 --- Inguza Technology AB --- MSc in Information Technology ----
/  ola at inguza.com                    Annebergsslingan 37        \
|  opal at debian.org                   654 65 KARLSTAD            |
|  http://inguza.com/                Mobile: +46 (0)70-332 1551 |
\  gpg/f.p.: 7090 A92B 18FE 7994 0C36 4FE4 18A1 B1CF 0FE5 3DD9  /
 ---------------------------------------------------------------

