<div dir="ltr"><div>Quick follow-up, I tried kernel 042stab137.1 with and without nohz=off, same issue, 3 cores taking 100% each and load at 3.00+:</div><div><br></div><div>[root@core1 ~]# ps auxf|grep ksoftirqd|grep 99<br>root 9 99.1 0.0 0 0 ? R 14:54 13:10 \_ [ksoftirqd/1]<br>root 17 99.1 0.0 0 0 ? R 14:54 13:10 \_ [ksoftirqd/3]<br>root 33 99.1 0.0 0 0 ? R 14:54 13:10 \_ [ksoftirqd/7]<br><br>[root@core1 ~]# cat /proc/loadavg <br>3.22 3.61 2.83 4/975 20478</div><div><br></div><div>I've downgraded to 2.6.32-042stab133.2 and everything is fine, load at 0.00, no CPU usage. There's something wrong between kernel 133.2 and 137.1. I haven't tested them all.</div><div><br></div><div>Karl<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, May 31, 2019 at 1:55 AM Vasily Averin <<a href="mailto:vvs@virtuozzo.com">vvs@virtuozzo.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 5/30/19 10:39 PM, Karl Johnson wrote:<br>
> Hello,<br>
> <br>
> It's always related to swapper and ksoftirqd:<br>
"swapper" is the idle thread; it runs when a CPU has no active tasks.<br>
It would be interesting to look at the state of the "ksoftirqd" processes several times, to see whether it changes.<br>
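A minimal sketch of how that repeated sampling could be scripted, assuming /proc/<pid>/stack is readable on this kernel and the script runs as root. The function name sample_ksoftirqd_stacks and its arguments are illustrative only, not an existing tool:<br>

```shell
# Dump the kernel stack of every ksoftirqd thread several times.
# Usage: sample_ksoftirqd_stacks [COUNT] [INTERVAL_SECONDS] [PROC_ROOT]
# PROC_ROOT defaults to /proc; it is a parameter only so the function
# can be tried against a copy of the files.
sample_ksoftirqd_stacks() {
    count="${1:-5}"
    interval="${2:-1}"
    proc="${3:-/proc}"
    n=1
    while [ "$n" -le "$count" ]; do
        for status in "$proc"/[0-9]*/status; do
            [ -r "$status" ] || continue
            # The thread name is on the "Name:" line of the status file.
            name=$(sed -n 's/^Name:[[:space:]]*//p' "$status" 2>/dev/null)
            case "$name" in
                ksoftirqd/*)
                    dir="${status%/status}"
                    echo "== sample $n: pid ${dir##*/} ($name) =="
                    cat "$dir/stack"
                    ;;
            esac
        done
        # Wait between samples, but not after the last one.
        if [ "$n" -lt "$count" ]; then
            sleep "$interval"
        fi
        n=$((n + 1))
    done
}
```

Called as "sample_ksoftirqd_stacks 10 2", it would print ten samples two seconds apart, so the traces can be compared side by side.<br>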
<br>
In the provided example I see that this process was captured inside the top-level function that handles soft interrupts:<br>
do_softirq() -> call_softirq(). Usually these functions handle network packets, and I expected your example to contain deeper call traces.<br>
Probably this will happen next time.<br>
<br>
Anyway, these call traces show that the CPUs are NOT 100% busy processing timer interrupts,<br>
so in general the situation looks as expected: under the current theory, the ksoftirqd processes are handling network traffic.<br>
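One possible way to check that theory, sketched under the assumption that /proc/softirqs is present on this kernel: take two snapshots a few seconds apart and see which counters grow on the busy CPUs. softirq_delta is a hypothetical helper name, not an existing command:<br>

```shell
# Print per-CPU differences between two saved /proc/softirqs snapshots.
# Usage: softirq_delta BEFORE_FILE AFTER_FILE
softirq_delta() {
    awk '
        # First file: remember every counter, keyed by row label and column.
        NR == FNR { if (FNR > 1) for (i = 2; i <= NF; i++) before[$1, i] = $i; next }
        # Second file: pass the CPU header line through unchanged.
        FNR == 1  { print; next }
        # Second file: print the row label followed by after-minus-before per CPU.
        {
            line = $1
            for (i = 2; i <= NF; i++) line = line " " ($i - before[$1, i])
            print line
        }
    ' "$1" "$2"
}
```

For example: cat /proc/softirqs > /tmp/before; sleep 10; cat /proc/softirqs > /tmp/after; softirq_delta /tmp/before /tmp/after<br>
A fast-growing NET_RX row on the busy CPUs would fit the network theory, while a fast-growing TIMER row would point back at timers.<br>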
<br>
Thank you,<br>
Vasily Averin<br>
<br>
> Some examples here: <a href="https://pastebin.com/wn0nCwce" rel="noreferrer" target="_blank">https://pastebin.com/wn0nCwce</a><br>
> <br>
> Karl<br>
> <br>
> On Thu, May 30, 2019 at 3:11 PM Vasily Averin <<a href="mailto:vvs@virtuozzo.com" target="_blank">vvs@virtuozzo.com</a>> wrote:<br>
> <br>
> Dear Karl,<br>
> thank you for reporting the problem.<br>
> <br>
> No, it is not a known issue.<br>
> Moreover, I doubt it is related to real hardware interrupts;<br>
> soft interrupts handle deferred work such as processing of network packets.<br>
> <br>
> To troubleshoot, look at the stack of the affected running processes via /proc/<pid>/stack.<br>
> Alternatively, you can use the magic SysRq key:<br>
> # echo l > /proc/sysrq-trigger<br>
> It should dump the current state of all running processors.<br>
> You can do it a few times to monitor the state of the affected processes.<br>
> <br>
> Thank you,<br>
> Vasily Averin<br>
> <br>
> <br>
> On 5/30/19 7:54 PM, Karl Johnson wrote:<br>
> > Hello,<br>
> ><br>
> > I've upgraded from 2.6.32-042stab133.2 to 2.6.32-042stab138.1 and since boot, 2 cores are using 100% cpu on ksoftirqd:<br>
> ><br>
> > root 21 99.9 0.0 0 0 ? R May29 1178:07 \_ [ksoftirqd/4]<br>
> > root 25 99.9 0.0 0 0 ? R May29 1177:51 \_ [ksoftirqd/5]<br>
> ><br>
> > From /proc/interrupts I can see that it's caused by IR-IO-APIC-edge timer:<br>
> ><br>
> > CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 <br>
> > 0: 136922 103603 26928 27528 112318229 71888343 73755 285735 IR-IO-APIC-edge timer<br>
> ><br>
> > kernel /vmlinuz-2.6.32-042stab138.1 ro root=UUID=7367aa0f-8216-44ca-9cc4-affed22bbd9c rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM nohz=off nopti<br>
> ><br>
> > Any way to troubleshoot this? Is it a known issue?<br>
> ><br>
> > Karl<br>
> ><br>
> ><br>
> > _______________________________________________<br>
> > Users mailing list<br>
> > <a href="mailto:Users@openvz.org" target="_blank">Users@openvz.org</a><br>
> > <a href="https://lists.openvz.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.openvz.org/mailman/listinfo/users</a><br>
> ><br>
> <br>
</blockquote></div>