[Users] Performance degradation on 042stab113.X
Karl Johnson
karljohnson.it at gmail.com
Mon Apr 4 12:11:54 PDT 2016
Hi Vasily,
I upgraded two nodes last week from 113.12 to 113.21 and it seems
better. Backups last weekend took the same time as they did on <=108.8. I'll
still keep an eye on this and also on the development of 115 in the OpenVZ Jira.
Thanks!
Karl
On Thu, Mar 31, 2016 at 4:13 AM, Vasily Averin <vvs at virtuozzo.com> wrote:
> On 30.03.2016 18:38, Karl Johnson wrote:
> > Hi Vasily,
> >
> > I do indeed use simfs / ext4 / cfq. Only a backup of each container's
> > private area is done with vzdump and then transferred to a backup
> > server with ncftpput. Compressing the data is OK, but transferring
> > the dump over the local network peaks the load, so the issue is with
> > (read) IO. I'm trying to find out why it was fine before and causes
> > problems now. Those nodes are in heavy production, so it's hard to do
> > testing (including downgrading the kernel).
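The backup flow described above corresponds roughly to the following sketch (the CTID, credentials, hostname and paths are placeholders, not taken from the thread):

```shell
# Dump the container's private area with vzdump, compressed, into a
# local dump directory. CTID 101 and /backup are hypothetical.
vzdump --compress --dumpdir /backup 101

# Push the resulting archive to the backup server over FTP with ncftpput
# (usage: ncftpput [options] host remote-dir local-files).
# User, password, host and paths below are placeholders.
ncftpput -u backupuser -p secret backup.example.com /dumps \
    /backup/vzdump-101.tgz
```

As Karl notes, the compression step is cheap; it is the sequential read of every private area during the dump that generates the IO load.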
>
> A few lists of blocked processes, taken with the alt+sysrq+w "magic sysrq"
> key, can be useful:
> they show who is blocked and how the situation develops over time,
> but they do not explain who caused the traffic jam.
>
> I'm sorry, but the other ways of troubleshooting are much more destructive.
> Moreover, even a kernel crash dump does not guarantee success in your case.
> It shows the whole picture with all details,
> but it does not show the dynamics of the process.
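For reference, the sysrq-w dump suggested above can also be triggered from a shell on the affected node (a sketch using the standard Linux procfs interfaces; requires root):

```shell
# Enable the magic sysrq interface (1 = all functions; on some distros a
# bitmask is set instead -- check your node's policy first).
echo 1 > /proc/sys/kernel/sysrq

# 'w' dumps the stack traces of all tasks in uninterruptible (blocked)
# state to the kernel log, equivalent to pressing alt+sysrq+w.
echo w > /proc/sysrq-trigger

# Read the resulting traces from the kernel ring buffer.
dmesg | tail -n 100
```

Taking several of these dumps a few seconds apart during a slow backup shows which tasks stay blocked and on what, which is exactly the "dynamics" Vasily refers to.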
>
> > Thanks for all the information on the future roadmap. I'm glad that the
> > work has already begun on the RHEL 6.8 rebase. I read the beta technical
> > notes last week and some upgrades seem great. Do you consider
> > 042stab114.5 stable even though it's in the testing repo? I might try it
> > tomorrow and see how it goes.
>
> In fact we do not know yet.
>
> The 114.x kernels include ~30 new patches from Red Hat and ~10 of our own,
> and we had only a few minor rejects during the rebase.
> At first glance that should not cause problems,
> but the first 114.x kernel crashed on boot,
> and 114.4 crashed after CT suspend-resume.
> In both cases we had to rework our patches.
>
> The 042stab114.5 kernel works well on my test node right now,
> but it is not ready for production yet and requires careful re-testing.
> So if you have some specific workload, we would be very grateful
> for any testing and bug reports.
> That lets us learn about hidden bugs before release.
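Installing such a testing kernel is normally done through yum with the testing repository enabled for one transaction. The repo id below is an assumption, not confirmed by the thread; verify it against /etc/yum.repos.d/openvz.repo on the node:

```shell
# Install the candidate kernel from the OpenVZ RHEL6 testing repo.
# NOTE: the repo id 'openvz-kernel-rhel6-testing' is an assumption --
# check /etc/yum.repos.d/openvz.repo for the actual id on your system.
yum --enablerepo=openvz-kernel-rhel6-testing \
    install vzkernel-2.6.32-042stab114.5

# Confirm the grub default points at the new kernel before rebooting.
grep -A2 'title' /boot/grub/grub.conf | head
```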
>
> thank you,
> Vasily Averin
>
> > Regards,
> >
> > Karl
> >
> > On Wed, Mar 30, 2016 at 5:48 AM, Vasily Averin <vvs at virtuozzo.com> wrote:
> >
> > Dear Karl,
> >
> > thank you for the explanation;
> > however, some details are still not clear.
> >
> > I believe you use simfs containers (otherwise you would not need to worry
> > about PSBM-34244; using the 113.12 kernels also confirms it),
> > but it isn't clear how exactly you back up your nodes.
> > Do you dump the whole partition with containers, or just copy the
> > containers' private areas somehow?
> > What filesystem do you have on the partition with containers?
> > What is the backup storage in your case?
> >
> > Anyway, it seems you do not freeze the filesystem with containers before
> > backup.
> > This functionality was broken in RHEL6 kernels for quite a long time,
> > and Red Hat fixed it in the 2.6.32-504.x and 573.x kernels.
> >
> > https://access.redhat.com/solutions/1506563
> >
> > These fixes probably affect your testcase.
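The freeze-before-backup step Vasily refers to can be sketched with `fsfreeze` from util-linux (the mount point /vz is a placeholder; requires root, and all writes to the filesystem stall while it is frozen):

```shell
# Quiesce the filesystem holding the container private areas so the
# backup sees a consistent on-disk image. /vz is a hypothetical mount.
fsfreeze -f /vz

# Take the backup while writes are blocked (snapshot, dump, copy...).
# Keep this window as short as possible -- containers stall on writes.

# Thaw the filesystem immediately afterwards.
fsfreeze -u /vz
```

The linked Red Hat solution describes exactly this freeze/thaw path being broken in older RHEL6 kernels, which is why its fix in 504.x/573.x is a plausible behavior change between 108.x and 113.x.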
> >
> > I'm not sure, of course;
> > maybe that isn't it and some other fixes are to blame:
> > Red Hat added >7000 new patches to the 2.6.32-573.x kernels,
> > many of our patches were changed during the rebase,
> > and many new patches were added.
> > There were too many changes between the 108.x and 113.x kernels.
> >
> > Our tests did not detect significant performance degradation,
> > but that means nothing; most likely we just did not measure your
> > testcase.
> >
> > I do not expect the situation to change on the 113.21 kernel;
> > it seems we did not fix any similar issues last time.
> >
> > Yes, you're right, our 042stab114.x kernels will be based
> > on the last released RHEL6.7 kernel, 2.6.32-573.22.1.el6.
> > Its validation is in progress at present,
> > and I hope we'll publish it in the near future.
> >
> > However, I did not find any related bugfixes in the new RHEL6 kernels,
> > and I doubt it will help you.
> >
> > We are also going to make a 115.x kernel based on the RHEL6 update 8 beta
> > kernel, 2.6.32-621.el6.
> > It has no chance of being released in the stable branch, but testing it
> > helps us speed up our rebase to the RHEL6.8 release kernel
> > (we expect RHEL6u8 to be released at the end of May).
> >
> > The work on the 115.x kernel is in progress, and I hope it will be
> > done in the next few days.
> >
> > So I would like to propose the following plan:
> > please check how the 113.21, 114.x and 115.x kernels work (maybe it
> > works already);
> > if the issue is still present, please reproduce the problem once again,
> > crash the affected host, create a new bug in Jira and ping me again.
> > I'll send you a private link for vmcore uploading.
> > Investigating the kernel crash dump file will probably allow me to find
> > the bottleneck in your case.
> >
> > Thank you,
> > Vasily Averin
> >
> > On 29.03.2016 21:03, Karl Johnson wrote:
> > > Hi Vasily,
> > >
> > > Every weekend I do backups of all CTs, which takes a lot of IO. It
> > > didn't affect load average much before 108, but as soon as I upgraded
> > > to 113, load got very high and nodes became sluggish during backups.
> > > It might be something else, but I was looking for feedback on whether
> > > someone else had the same issue. I will continue to troubleshoot it.
> > > Meanwhile, I will upgrade them from 113.12 to 113.21 and see how it
> > > goes, even if there's nothing related to this in the changelog.
> > >
> > > Thanks for the reply,
> > >
> > > Karl
> > >
> > > On Tue, Mar 29, 2016 at 5:21 AM, Vasily Averin <vvs at virtuozzo.com> wrote:
> > >
> > > Dear Karl,
> > >
> > > no, we know nothing about possible performance degradation between
> > > the 042stab108.x and 042stab113.x kernels.
> > > A high load average and CPU peaks are not a problem per se;
> > > they can be caused by increased activity on your nodes.
> > >
> > > Could you please explain in more detail
> > > why you believe you have a problem on your nodes?
> > >
> > > Thank you,
> > > Vasily Averin
> > >
> > > On 28.03.2016 20:28, Karl Johnson wrote:
> > > > Hello,
> > > >
> > > > Did anyone notice performance degradation after upgrading vzkernel to
> > > > 042stab113.X? I've been running 042stab108.5 on a few nodes for a
> > > > while with no issues, and upgraded to 042stab113.12 a few weeks ago
> > > > to fix an important CVE and rebase to the latest RHEL6 kernel.
> > > >
> > > > Since the upgrade from 108.5 to 113.12, I have noticed a much higher
> > > > load average on those upgraded OpenVZ nodes, mostly when IO is heavily
> > > > used. High CPU peaks are much more frequent. I would be curious to
> > > > know if someone else has the same issue. I wouldn't downgrade because
> > > > of security fix PSBM-34244.
> > > >
> > > > Regards,
> > > >
> > > > Karl
> > > >
> > > >
> > > > _______________________________________________
> > > > Users mailing list
> > > > Users at openvz.org
> > > > https://lists.openvz.org/mailman/listinfo/users