[Users] Container Migration - Ploop

Axton axton.grams at gmail.com
Mon Mar 4 19:00:34 EST 2013


On Sun, Mar 3, 2013 at 6:21 PM, Axton <axton.grams at gmail.com> wrote:
> I am attempting to migrate a container from one host to another.  I am
> running into issues that cause the migration to fail.
>
> Environment information:
> - Build: Debian 6 with updates, RHEL kernel installed
> - Kernel: 042stab072.10
>
> The only part that is failing is the live migration.  If I stop the
> container, I can migrate without any issue.
>
> I will try to look more into this, but wanted to see if it is a known issue.
>
> Axton Grams

The two machines were built identically, are patched to the same
level, and have the same hardware, drive, and file system
configuration.  I did some things with the file systems to put
things into an optimal arrangement.  Here are the details of the file
system configuration:

Looking further at this, it first appeared that the vzctl 4.1.2 build
located at http://download.openvz.org/utils/vzctl/4.1.2/vzctl-4.1.2-1.x86_64.rpm
does not support the chkpnt command.  I ran the following command
directly, and it returns the same error that the live migration returns:

root@cluster-01:/var/log# vzctl chkpnt 5000 --dumpfile /tmp/5000.dump
Setting up checkpoint...
        suspend...
        dump...
Can not dump container: Invalid argument
Checkpointing failed

/var/log/kern.log contains the following entry for each failed chkpnt:
Mar  4 17:44:57 cluster-01 kernel: [174834.884946] CT: checkpointing not supported yet for hidden pid namespaces.
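For anyone who wants to check whether a host has hit the same error,
something like the following could be used.  It is only a sketch: the
function name is made up for illustration, and the default log path is
simply the one mentioned above (other distributions may log kernel
messages elsewhere).

```shell
# Hypothetical helper (not part of vzctl): succeed if the given kernel
# log contains the hidden-pid-namespace checkpointing error.
has_hidden_pidns_error() {
    # Default to the log file named in this message; pass another path
    # as the first argument if kernel messages go elsewhere.
    log="${1:-/var/log/kern.log}"
    # Match only the tail of the message so the search does not depend
    # on the timestamp or the "CT:" prefix.
    grep -q 'not supported yet for hidden pid namespaces' "$log" 2>/dev/null
}
```

Running `has_hidden_pidns_error && echo "hidden pid namespace error found"`
after a failed chkpnt would confirm the failure is the same one.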

I found that the following kernel parameter caused the issue:
kernel.pid_ns_hide_child=1

Too bad hidden pid namespaces and checkpointing don't work together.
This is probably worth noting on the wiki pages or in the man pages.
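A pre-flight check along these lines could catch the conflict before
attempting a live migration.  This is a sketch, not anything vzctl or
vzmigrate provides: the function name and the optional path argument
are made up for illustration, and it assumes the setting is exposed at
/proc/sys/kernel/pid_ns_hide_child, which is the usual procfs mapping
for a kernel.* sysctl.

```shell
# Hypothetical pre-flight check: succeed only if hidden pid namespaces
# are not enabled, i.e. kernel.pid_ns_hide_child is 0 or absent.
can_checkpoint() {
    # Assumed procfs location of the sysctl; an alternate file can be
    # passed as the first argument.
    f="${1:-/proc/sys/kernel/pid_ns_hide_child}"
    # If the file does not exist, this kernel has no such setting, so
    # it cannot be what blocks checkpointing.
    [ -r "$f" ] || return 0
    # Command substitution strips the trailing newline from the value.
    [ "$(cat "$f")" = "0" ]
}
```

Calling `can_checkpoint || echo "kernel.pid_ns_hide_child is set"`
before running vzmigrate with --live would flag the problem early.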

Here is the output from a successful live migration:

root@cluster-01:~# vzmigrate -s -v -t --live cluster-02 5000
Starting live migration of CT 5000 to cluster-02
OpenVZ is running...
   Loading /etc/vz/vz.conf and /etc/vz/conf/5000.conf files
   Check if ploop is supported on destination node
Next unused minor: 1017904
   Check IPs on destination node:
Preparing remote node
   Copying config file
5000.conf
                              100% 1570     1.5KB/s   00:00
Name fs01 assigned
No changes in CT configuration, not saving
   Creating remote container root dir
   Creating remote container private dir
Creating a container snapshot
Creating snapshot {33d65014-8946-487c-93cc-4f17f4bd2527}
Storing /vz/private/5000/Snapshots.xml.tmp
Setting up checkpoint...
        suspend...
        get context...
Checkpointing completed successfully
Storing /vz/private/5000/root.hdd/DiskDescriptor.xml.tmp
Creating delta /vz/private/5000/root.hdd/root.hdd.{f5e57811-b586-4e06-8957-3d63ca6d7c47} bs=2048 size=4614144 sectors
Creating snapshot dev=/dev/ploop47831 img=/vz/private/5000/root.hdd/root.hdd.{f5e57811-b586-4e06-8957-3d63ca6d7c47}
ploop snapshot {33d65014-8946-487c-93cc-4f17f4bd2527} has been successfully created
Setting up checkpoint...
        join context..
        dump...
Checkpointing completed successfully
Resuming...
Snapshot {33d65014-8946-487c-93cc-4f17f4bd2527} has been successfully created
Syncing private
Live migrating container...
   Copying top ploop delta with CT suspend
Sending /vz/private/5000/root.hdd/root.hdd.{f5e57811-b586-4e06-8957-3d63ca6d7c47}
Setting up checkpoint...
        suspend...
        get context...
Checkpointing completed successfully
   Dumping container
Setting up checkpoint...
        join context..
        dump...
Checkpointing completed successfully
   Copying dumpfile
dump.5000
                              100% 8734KB   8.5MB/s   00:00
   Undumping container
Restoring container ...
Error in ploop_fsck (fsck_util.c:376): Dirty flag is set
Adding delta dev=/dev/ploop63619 img=/vz/private/5000/root.hdd/root.hdd (ro)
Adding delta dev=/dev/ploop63619 img=/vz/private/5000/root.hdd/root.hdd.{1cfb54ce-bade-4ca4-8882-ad322718974e} (ro)
Adding delta dev=/dev/ploop63619 img=/vz/private/5000/root.hdd/root.hdd.{f5e57811-b586-4e06-8957-3d63ca6d7c47} (rw)
Mounting /dev/ploop63619p1 at /vz/root/5000 fstype=ext4 data='balloon_ino=12,'
Container is mounted
        undump...
Setting CPU units: 1000
Configure veth devices: veth5000.20 veth5000.40
Adding interface veth5000.20 to bridge vmbr20 on CT0 for CT5000
Adding interface veth5000.40 to bridge vmbr40 on CT0 for CT5000
        get context...
Container start in progress...
Restoring completed successfully
   Resuming container
Resuming...

        Suspend + Dump:   0.42
   Pcopy after suspend:   0.50
        Copy dump file:   0.84
       Undump + Resume:   5.98
                        ------
  Total suspended time:   7.75

Cleanup
   Killing container
Killing...
Unmounting file system at /vz/root/5000
Unmounting device /dev/ploop47831
Container is unmounted
   Removing dumpfiles
   Destroying container
Destroying container private area: /vz/private/5000
Container private area was destroyed

Axton Grams