[Users] Lots of interrupts since latest el6 kernel
Karl Johnson
karljohnson.it at gmail.com
Sun Jun 2 23:17:43 MSK 2019
Quick follow-up: I tried kernel 042stab137.1 with and without nohz=off and
got the same issue. Three cores sit at 100% each and the load is 3.00+:
[root@core1 ~]# ps auxf|grep ksoftirqd|grep 99
root         9 99.1  0.0      0     0 ?        R    14:54  13:10  \_ [ksoftirqd/1]
root        17 99.1  0.0      0     0 ?        R    14:54  13:10  \_ [ksoftirqd/3]
root        33 99.1  0.0      0     0 ?        R    14:54  13:10  \_ [ksoftirqd/7]
[root@core1 ~]# cat /proc/loadavg
3.22 3.61 2.83 4/975 20478
I've downgraded to 2.6.32-042stab133.2 and everything is fine: load at
0.00, no CPU usage. Something changed between kernels 042stab133.2 and
042stab137.1. I haven't tested every release in between.
Karl
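
To see which counter is running away on the busy cores, one option is to
diff two snapshots of /proc/interrupts (the file quoted further down in this
thread). The following is a minimal Python 3 sketch, not something that was
run in this thread. The names parse_percpu_table, hottest and watch are made
up for illustration, and the parser assumes the usual layout: a header line
of CPU columns followed by "label: count count ..." rows. Note that el6 ships
Python 2 by default, so a Python 3 interpreter is also an assumption.

```python
import time


def parse_percpu_table(text):
    """Parse /proc/interrupts-style text into {label: [count per CPU]}."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header line: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "label: ..." row
        counts = []
        for tok in rest.split()[:ncpu]:
            if not tok.isdigit():
                break  # reached the trailing description text
            counts.append(int(tok))
        table[label.strip()] = counts
    return table


def hottest(before, after, label):
    """Return (cpu_index, delta) pairs for one row, largest delta first."""
    deltas = [b - a for a, b in zip(before[label], after[label])]
    return sorted(enumerate(deltas), key=lambda pair: pair[1], reverse=True)


def watch(path="/proc/interrupts", label="0", interval=5.0):
    """Take two snapshots `interval` seconds apart and print per-CPU deltas."""
    with open(path) as f:
        first = parse_percpu_table(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_percpu_table(f.read())
    for cpu, delta in hottest(first, second, label):
        print("CPU%d: +%d" % (cpu, delta))
```

Calling watch() with the defaults looks at row "0", which is the timer line
in the /proc/interrupts excerpt from the original report. Running it on both
the good and the bad kernel would give per-CPU rates that can be compared
directly.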
On Fri, May 31, 2019 at 1:55 AM Vasily Averin <vvs at virtuozzo.com> wrote:
> On 5/30/19 10:39 PM, Karl Johnson wrote:
> > Hello,
> >
> > It's always related to swapper and ksoftirqd:
> "swapper" is the idle thread; it runs when a CPU has no active tasks.
> It would be interesting to look at the state of the "ksoftirqd" processes
> several times, to see whether anything changes.
>
> In the provided example I see that this process was captured while running
> the top-level functions that handle soft interrupts:
> do_softirq() -> call_softirq(). Usually these functions handle network
> packets, and I expected your example to contain deeper call traces.
> Perhaps that will show up next time.
>
> Anyway, these call traces show that the CPUs are NOT 100% busy processing
> timer interrupts,
> so in general the situation looks as expected: under the current theory,
> the ksoftirqd processes are handling network traffic.
>
> Thank you,
> Vasily Averin
>
> > Some examples here: https://pastebin.com/wn0nCwce
> >
> > Karl
> >
> > On Thu, May 30, 2019 at 3:11 PM Vasily Averin <vvs at virtuozzo.com> wrote:
> >
> > Dear Karl,
> > thank you for reporting the problem.
> >
> > No, it is not a known issue.
> > Moreover, I doubt it is related to real hardware interrupts;
> > soft interrupts handle deferred work such as processing of network packets.
> >
> > To troubleshoot, look at the stack of the affected running processes via /proc/<pid>/stack.
> > Alternatively, you can use the magic SysRq key:
> > # echo l > /proc/sysrq-trigger
> > It should dump the current state of all active CPUs.
> > You can do it a few times to monitor the state of the affected processes.
> >
> > Thank you,
> > Vasily Averin
> >
> >
> > On 5/30/19 7:54 PM, Karl Johnson wrote:
> > > Hello,
> > >
> > > I've upgraded from 2.6.32-042stab133.2 to 2.6.32-042stab138.1 and since boot, 2 cores are using 100% cpu on ksoftirqd:
> > >
> > > root        21 99.9  0.0      0     0 ?        R    May29 1178:07  \_ [ksoftirqd/4]
> > > root        25 99.9  0.0      0     0 ?        R    May29 1177:51  \_ [ksoftirqd/5]
> > >
> > > From /proc/interrupts I can see that it's caused by IR-IO-APIC-edge timer:
> > >
> > >        CPU0    CPU1   CPU2   CPU3       CPU4      CPU5   CPU6    CPU7
> > >   0: 136922  103603  26928  27528  112318229  71888343  73755  285735  IR-IO-APIC-edge timer
> > >
> > > kernel /vmlinuz-2.6.32-042stab138.1 ro root=UUID=7367aa0f-8216-44ca-9cc4-affed22bbd9c rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM nohz=off nopti
> > >
> > > Any way to troubleshoot this? Is it a known issue?
> > >
> > > Karl
> > >
> > >
> > >
> >
>
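
The suggestion above to read /proc/<pid>/stack several times can be scripted.
This is only a sketch in Python 3 under a few assumptions: it must run as
root, the kernel must expose /proc/<pid>/stack, and the helper names
(comm_from_stat, find_pids, sample_stacks) are hypothetical. The stack of a
task that is currently executing on another CPU may be incomplete, so the
SysRq dump mentioned above remains the more reliable source.

```python
import os
import time


def comm_from_stat(stat_line):
    """Extract the command name from a /proc/<pid>/stat line."""
    # The name is enclosed in parentheses and may itself contain ")".
    return stat_line[stat_line.index("(") + 1:stat_line.rindex(")")]


def find_pids(prefix, proc="/proc"):
    """Return PIDs whose command name starts with `prefix`."""
    pids = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                if comm_from_stat(f.readline()).startswith(prefix):
                    pids.append(int(entry))
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable
    return sorted(pids)


def sample_stacks(pids, rounds=5, interval=1.0, proc="/proc"):
    """Print the kernel stack of each PID `rounds` times."""
    for i in range(rounds):
        for pid in pids:
            try:
                with open(os.path.join(proc, str(pid), "stack")) as f:
                    stack = f.read()
            except OSError as err:
                stack = "unreadable: %s\n" % err
            print("--- round %d, pid %d ---" % (i + 1, pid))
            print(stack, end="")
        time.sleep(interval)
```

For example, sample_stacks(find_pids("ksoftirqd")) prints five rounds of
stacks for every ksoftirqd thread, which makes it easier to see whether the
traces ever go deeper than do_softirq() -> call_softirq().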