[Users] Live Migration Optimal execution

Nipun Arora nipunarora2512 at gmail.com
Sun Nov 23 19:13:38 PST 2014


Thanks, I will try your suggestions, and get back to you.
Btw, any idea how the base image could be shared between both containers?
Hardlink it in what way? Once both containers start, won't they have to
write to different locations?

I understand that some file systems have a copy-on-write mechanism, where
after a snapshot all future writes go to an additional linked disk. Does
ploop operate in a similar way?

http://wiki.qemu.org/Features/Snapshots
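For the QEMU case, the linked-disk mechanism is roughly the following
(file names here are made up for illustration):

    # create a copy-on-write overlay backed by a read-only base image;
    # all writes after this point go to overlay.qcow2 only
    qemu-img create -f qcow2 -b base.qcow2 overlay.qcow2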

Cloning with a modified vzmigrate script does work, by the way.

- Nipun

On Sun, Nov 23, 2014 at 5:29 PM, Kir Kolyshkin <kir at openvz.org> wrote:

>
> On 11/23/2014 04:59 AM, Nipun Arora wrote:
>
> Hi Kir,
>
>  Thanks for the response, I'll update it, and tell you about the results.
>
>  1. A follow-up question... I found that a write I/O rate of 500 Kbps-1
> Mbps increased the suspend time to several minutes (mostly the pcopy
> stage). This seems extremely high for a relatively low I/O workload, which
> is why I was wondering if there is anything special I need to take care of.
> (I ran fio (the flexible I/O tester) with a fixed throughput while doing
> the live migration.)
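>
> (For reference, a fixed-rate fio job looks roughly like this -- the values
> and path here are illustrative, not my exact setup:
>
>     fio --name=ct-writer --directory=/vz/private/101/fio-test \
>         --rw=write --bs=4k --size=1G --rate=1m \
>         --time_based --runtime=600
>
> where --rate caps the write throughput.)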
>
>
> Please retry with vzctl 4.8 and ploop 1.12.1 (make sure they are on both
> sides).
> There was a 5-second wait for the remote side to finish syncing the copied
> ploop data. It helped the case of a container with little I/O activity,
> but it ruined the case you are talking about.
>
> Newer ploop and vzctl implement a feedback channel for ploop copy that
> eliminates that wait time.
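>
> Conceptually, the change is from a fixed wait to an explicit ack
> (pseudo-shell; the function names are made up, see the commits below for
> the real implementation):
>
>     # before: after the ploop data is copied, wait a fixed 5 seconds
>     # for the remote side to finish syncing
>     copy_ploop_data; sleep 5
>
>     # after: the remote side explicitly acknowledges when it is done
>     copy_ploop_data; wait_for_remote_ack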
>
> http://git.openvz.org/?p=ploop;a=commit;h=20d754c91079165b
> http://git.openvz.org/?p=vzctl;a=commit;h=374b759dec45255d4
>
> There are some other major improvements as well, such as async send for
> ploop.
>
> http://git.openvz.org/?p=ploop;a=commit;h=a55e26e9606e0b
>
>
>  2. For my purposes, I have modified the live migration script to allow
> me to do cloning, i.e. I start both containers instead of deleting the
> original. I need to do this "cloning" from time to time for the same
> target container...
>
>         a. Let's say we cloned container C1 to container C2 and let both
> execute at time t0; this works with no apparent loss of service.
>
>         b. Now at time t1 I would like to clone C1 to C2 again, and would
> like to optimize the rsync process, as most of the ploop files for C1 and
> C2 should still be the same (i.e. less time to sync). Can anyone suggest
> the best way to realize this second point?
>
>
> You can create a ploop snapshot and use a shared base image for both
> containers (instead of copying the base delta, hardlink it). This is not
> supported by the tools (for example, since the base delta is now shared
> you can't merge down to it, but the tools are not aware of that), so you
> need to figure it out by yourself and be careful, but it should work.
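>
> A rough sketch of that flow (assumptions: CT IDs 101/102, the standard
> /vz/private layout, both private areas on the same filesystem; the actual
> delta file names depend on the snapshot UUIDs):
>
>     # snapshot the source CT so its base delta becomes read-only
>     vzctl snapshot 101
>
>     # hardlink the read-only base image into the clone's private area
>     # instead of copying it (hardlinks need the same filesystem)
>     mkdir -p /vz/private/102/root.hdd
>     ln /vz/private/101/root.hdd/root.hdd /vz/private/102/root.hdd/root.hdd
>
>     # copy the small top delta(s) and the descriptor as usual; on a
>     # later re-clone only these need to be transferred again
>     cp /vz/private/101/root.hdd/root.hdd.* /vz/private/102/root.hdd/
>     cp /vz/private/101/root.hdd/DiskDescriptor.xml /vz/private/102/root.hdd/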
>
>  Thanks
> Nipun
>
> On Sun, Nov 23, 2014 at 12:56 AM, Kir Kolyshkin <kir at openvz.org> wrote:
>
>>
>> On 11/22/2014 09:09 AM, Nipun Arora wrote:
>>
>> Hi All,
>>
>>  I was wondering if anyone can suggest the best way to do the following:
>>
>>  1. Can anyone clarify if ploop is the best layout for minimum suspend
>> time during live migration?
>>
>>
>>  Yes (due to ploop copy, which copies only the modified blocks).
>>
>>
>>  2. I tried migrating a ploop device where I had increased --diskspace
>> to 5G, and found that the suspend time during live migration increased
>> to 57 seconds (mainly the undump and restore stages grew)... whereas
>> with a 2G diskspace the suspend time was 2-3 seconds... Is this expected?
>>
>>
>>  No. Undump and restore times depend mostly on the amount of RAM used
>> by a container.
>>
>> Having said that, the live migration stages influence each other,
>> although less so in the latest vzctl release (I won't go into details
>> here, if you allow me -- just make sure you test with vzctl 4.8).
>>
>>
>>  3. I tried running a write-intensive workload and found that beyond
>> 100-150 Kbps the suspend time during live migration rapidly increased.
>> Is this an expected trend?
>>
>>
>>  Sure. With increased write speed, the amount of data that needs to be
>> copied after the CT is suspended increases.
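>>
>> (Back-of-envelope, with illustrative numbers only: blocks dirtied during
>> a 30-second sync pass at a 150 KB/s write rate amount to roughly 4.5 MB
>> that must be copied while the container is frozen.)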
>>
>>
>>  I am using vzctl 4.7 and ploop 1.11 on CentOS 6.5.
>>
>>
>>  You need to update vzctl and ploop and rerun your tests; there should
>> be some improvement (in particular with respect to issue #3).
>>
>>
>>  Thanks
>> Nipun
>>
>>

