[Users] Performance degradation on 042stab113.X

Wed Mar 30 08:38:17 PDT 2016

Hi Vasily,

I do indeed use simfs / ext4 / cfq. Only a backup of each containers
private areas is done with vzdump and then transferred to a backup server
with ncftpput.  Compressing the data is OK while transferring the dump over
local network peak the load so the issue is with (read) IO. I’m trying to
find out why it was fine before and cause problem now. Those nodes are in
heavy production so it’s hard to do testing (including downgrading kernel).

Thanks for all the information on futur roadmap. I’m glad that the work as
already begun on RHEL 6.8 rebase. I read the beta technical notes last week
and some upgrades seem great. Do you consider 042stab114.5 stable even if
it’s in the testing repo? I might try it tomorrow and see how it goes.

Regards,

Karl

On Wed, Mar 30, 2016 at 5:48 AM, Vasily Averin <vvs at virtuozzo.com> wrote:

> Dear Karl,
>
> thank you for explanation.
> however some details are still not it clear.
>
> I believe you use simfs containers (otherwise you can do not worry about
> PSBM-34244,
> using of 113.12 kernels also confirms it)
> but it isn't clear how exactly you backup your nodes.
> Do you dump whole partition with containers or just copy containers
> private areas somehow?
> What filesystem you have on partition with containers.
> What is backup storage in your case?
>
> Anyway seems you do not freeze filesystem with containers before backup.
> This functionality was broken in RHEL6 kernels quite long time,
> and Red Hat fixed it in 2.6.32-504.x and 573.x kernels.
>
> https://access.redhat.com/solutions/1506563
>
> Probably these fixes affect your testcase.
>
> I'm not sure of course,
> may be it isn't and some other fixes are guilty:
> Red Hat added >7000 new patches into 2.6.32-573.x kernels
> many our patches was changed during re-base,
> and many new patches was added.
> There was to many changes between 108.x and 113.x kernels.
>
> Our tests did not detected significant performance degradation,
> but it means nothing, most likely we just did not measured your testcase.
>
> I do not expect that situation will be changed on 113.21 kernel,
> seems we did not fixed similar issues last time.
>
> Yes, you-re right, our 042stab114.x kernels will be based
> on last released RHEL6.7 kernel 2.6.32-573.22.1.el6.
> its validation is in progress at present,
> and I hope we'll publish it in nearest future.
>
> However I did not found any related bugfixes in new RHEL6 kernels,
> and doubt that it helps you.
>
> Also we're going to make 115.x kernel based on RHEL6 update8 beta kernel
> 2.6.32-621.el6,
> it have no chances to be released in stable branch but its testing helps
> us to speed-up
> our rebase to RHEL6.8 release kernel (we expect RHEL6u8 will be released
> in end of May).
>
> The work on 115.x kernel is in progress, and I hope it should be done in
> next few days.
>
> So I would like to propose you following plan:
> please check how works 113.21, 114.x and 115.x kernels, (may be it works
> already)
> if issue will be still present, please reproduce the problem once again,
> crash affected host,
> create new bug in jira and push me again. I'll send you private link for
> vmcore uploading.
> Investigation of kernel crash dump file probably allows me to find
> bottleneck in your case.
>
> Thank you,
>         Vasily Averin
>
> On 29.03.2016 21:03, Karl Johnson wrote:
> > Hi Vasily,
> >
> > Every weekend I do backups of all CT which take a lot of IO. It
> > didn't affect much load average before 108 but as soon as I upgraded
> > to 113, load got very high and nodes became sluggish during backups.
> > It might be something else but I was looking for feedback if someone
> > else had the same issue. I will continue to troubleshoot this issue.
> > Meanwhile, I will upgrade them from 113.12 to 113.21 and see how it
> > goes even if there's nothing related to this in the changelog.
> >
> > Thanks for the reply,
> >
> > Karl
> >
> > On Tue, Mar 29, 2016 at 5:21 AM, Vasily Averin <vvs at virtuozzo.com
> <mailto:vvs at virtuozzo.com>> wrote:
> >
> >     Dear Karl,
> >
> >     no, we know nothing about possible performance degradation between
> >     042stab108.x and 042stab113.x kernels.
> >     High load average and CPU peaks  are not a problems per se,
> >     it can be caused by increased activity on your nodes.
> >
> >     Could you please explain in more details,
> >     why you believe you have a problem on your nodes?
> >
> >     Thank you,
> >             Vasily Averin
> >
> >     On 28.03.2016 20:28, Karl Johnson wrote:
> >     > Hello,
> >     >
> >     > Did anyone notice performance degradation after upgrading vzkernel
> to
> >     > 042stab113.X? I’ve been running 042stab108.5 on few nodes for a
> while
> >     > with no issue and upgraded to 042stab113.12 few weeks ago to fix an
> >     > important CVE and rebase to latest rhel6 kernel.
> >     >
> >     > Since the upgrade from 108.5 to 113.12, I noticed much higher load
> >     > average on those upgraded OpenVZ nodes, mostly when IO is heavily
> >     > used. High CPU peaks are much more frequent. I would be curious to
> >     > know if someone else has the same issue. I wouldn’t downgrade
> because
> >     > of security fix PSBM-34244.
> >     >
> >     > Regards,
> >     >
> >     > Karl
> >     >
> >     >
> >     > _______________________________________________
> >     > Users mailing list
> >     > Users at openvz.org <mailto:Users at openvz.org>
> >     > https://lists.openvz.org/mailman/listinfo/users
> >     >
> >     _______________________________________________
> >     Users mailing list
> >     Users at openvz.org <mailto:Users at openvz.org>
> >     https://lists.openvz.org/mailman/listinfo/users
> >
> >
> >
> >
> > _______________________________________________
> > Users mailing list
> > Users at openvz.org
> > https://lists.openvz.org/mailman/listinfo/users
> >
> _______________________________________________
> Users mailing list
> Users at openvz.org
> https://lists.openvz.org/mailman/listinfo/users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/users/attachments/20160330/2f2af061/attachment.html>