[Users] Lots of interrupts since latest el6 kernel

Sun Jun 2 23:17:43 MSK 2019

Quick follow-up: I tried kernel 042stab137.1 with and without nohz=off and
got the same issue, 3 cores at 100% each and load at 3.00+:

[root@core1 ~]# ps auxf|grep ksoftirqd|grep 99
root           9 99.1  0.0      0     0 ?        R    14:54  13:10  \_ [ksoftirqd/1]
root          17 99.1  0.0      0     0 ?        R    14:54  13:10  \_ [ksoftirqd/3]
root          33 99.1  0.0      0     0 ?        R    14:54  13:10  \_ [ksoftirqd/7]

[root@core1 ~]# cat /proc/loadavg
3.22 3.61 2.83 4/975 20478
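
Side note on the filter: `grep 99` also matches a 99 that happens to appear in
the PID or TIME columns. A minimal sketch that tests the %CPU column instead,
assuming `ps aux`-style columns; the function name `busy_ksoftirqd` and the
default threshold of 90 are made up for illustration:

```sh
# Print PID, %CPU and thread name for ksoftirqd threads at or above a
# %CPU threshold. Reads `ps aux`-style lines on stdin.
# $1 = threshold in percent (hypothetical default: 90).
busy_ksoftirqd() {
    awk -v min="${1:-90}" '
        /\[ksoftirqd\/[0-9]+\]/ && $3 + 0 >= min + 0 {
            print $2, $3, $NF    # PID, %CPU, [ksoftirqd/N]
        }'
}
```

It would be used as `ps aux | busy_ksoftirqd 90`.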

I've downgraded to 2.6.32-042stab133.2 and everything is fine: load at
0.00, no CPU usage. Something broke between kernels 042stab133.2 and
042stab137.1. I haven't tested the versions in between.

Karl

On Fri, May 31, 2019 at 1:55 AM Vasily Averin <vvs at virtuozzo.com> wrote:

> On 5/30/19 10:39 PM, Karl Johnson wrote:
> > Hello,
> >
> > It's always related to swapper and ksoftirqd:
> "swapper" is the idle thread; it runs when a CPU does not have any active
> tasks.
> It would be interesting to look at the state of the "ksoftirqd" processes
> several times, to see any changes.
>
> In the provided example I see that this process was captured inside the
> top-level functions that handle soft interrupts:
> do_softirq() -> call_softirq(). Usually these functions handle network
> packets, and I expected your example to contain deeper call traces.
> Perhaps that will show up next time.
>
> Anyway, these call traces show that the CPUs are NOT 100% busy processing
> timer interrupts,
> so in general the situation looks as expected: the current theory is that
> the ksoftirqd processes are handling network traffic.
>
> Thank you,
>         Vasily Averin
>
> > Some examples here: https://pastebin.com/wn0nCwce
> >
> > Karl
> >
> > On Thu, May 30, 2019 at 3:11 PM Vasily Averin <vvs at virtuozzo.com> wrote:
> >
> >     Dear Karl,
> >     thank you for reporting the problem.
> >
> >     No, it is not a known issue.
> >     Moreover, I doubt it is related to real hardware interrupts:
> >     soft interrupts handle deferred work such as processing of network packets.
> >
> >     For troubleshooting, look at the stack of the affected running processes via /proc/<pid>/stack.
> >     Alternatively, you can use the magic SysRq key:
> >     # echo l > /proc/sysrq-trigger
> >     It should dump the current state of all active CPUs.
> >     You can do it a few times to monitor the state of the affected processes.
> >
> >     Thank you,
> >             Vasily Averin
> >
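
The repeated stack sampling suggested above could be scripted roughly like
this. It is only a sketch: `sample_stacks` and the `PROC_ROOT` override are
hypothetical names (the override exists only so the loop can be tried against
a fake directory), and reading /proc/<pid>/stack normally requires root.

```sh
# Dump the kernel stack of the given PIDs several times.
# $1 = number of rounds, $2 = seconds between rounds, remaining args = PIDs.
# The body runs in a subshell, so its variables do not leak to the caller.
sample_stacks() (
    rounds=$1; delay=$2; shift 2
    proc=${PROC_ROOT:-/proc}    # hypothetical override, defaults to /proc
    i=1
    while [ "$i" -le "$rounds" ]; do
        for pid in "$@"; do
            echo "== round $i, pid $pid"
            cat "$proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
        done
        # Pause between rounds, but not after the last one.
        if [ "$i" -lt "$rounds" ]; then sleep "$delay"; fi
        i=$((i + 1))
    done
)
```

For example, `sample_stacks 5 2 $(pgrep ksoftirqd)` would take five samples,
two seconds apart, of every ksoftirqd thread.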
> >
> >     On 5/30/19 7:54 PM, Karl Johnson wrote:
> >     > Hello,
> >     >
> >     > I've upgraded from 2.6.32-042stab133.2 to 2.6.32-042stab138.1 and since boot, 2 cores are using 100% cpu on ksoftirqd:
> >     >
> >     > root          21 99.9  0.0      0     0 ?        R    May29 1178:07  \_ [ksoftirqd/4]
> >     > root          25 99.9  0.0      0     0 ?        R    May29 1177:51  \_ [ksoftirqd/5]
> >     >
> >     > From /proc/interrupts I can see that it's caused by IR-IO-APIC-edge timer:
> >     >
> >     >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
> >     >   0:     136922     103603      26928      27528  112318229   71888343      73755     285735  IR-IO-APIC-edge      timer
> >     >
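
The counters in /proc/interrupts are cumulative since boot, so a single
reading does not show whether the timer line is still firing at a high rate
right now. A rough sketch that subtracts two saved snapshots per CPU;
`irq_delta` and the snapshot paths are hypothetical, and it assumes the first
line of each snapshot is the CPU header row:

```sh
# Print the per-CPU increase of one /proc/interrupts row between two snapshots.
# $1 = row label without the colon (e.g. 0), $2 = older snapshot, $3 = newer.
irq_delta() {
    awk -v label="$1:" '
        FNR == 1 { ncpu = NF; next }    # header row: CPU0 CPU1 ...
        $1 == label {
            for (c = 1; c <= ncpu; c++) {
                if (NR == FNR)
                    before[c] = $(c + 1)    # first file: remember old count
                else
                    printf "CPU%d %d\n", c - 1, $(c + 1) - before[c]
            }
        }' "$2" "$3"
}
```

Usage would look like this:

    cat /proc/interrupts > /tmp/irq.1; sleep 10
    cat /proc/interrupts > /tmp/irq.2
    irq_delta 0 /tmp/irq.1 /tmp/irq.2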
> >     > kernel /vmlinuz-2.6.32-042stab138.1 ro root=UUID=7367aa0f-8216-44ca-9cc4-affed22bbd9c rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM nohz=off nopti
> >     >
> >     > Any way to troubleshoot this? Is it a known issue?
> >     >
> >     > Karl
> >     >
> >     >
> >     > _______________________________________________
> >     > Users mailing list
> >     > Users at openvz.org
> >     > https://lists.openvz.org/mailman/listinfo/users
> >     >
> >
>