[Users] Routing problems using SMP kernel

Steve Hodges shodges at iinet.net.au
Mon Aug 27 13:38:06 EDT 2007


On 27/08/2007 10:57 PM, Kirill Korotaev wrote:
> Steve,
>
> Sure, SMP shouldn't affect your routing and it is very strange. I guess >90% of people
> are running SMP kernels.
>
> From your report it is totally unclear what the OVZ kernel version is (e.g. something like 028stab039)
> and where this kernel came from. Have you built it yourself?
> Can you please provide a bit more detail on what is working and what is not?
> Why have you decided that routing is to blame?
>

It's 2.6.18-028stab035.1-ovz-smp, obtained from this APT source:

deb http://debian.systs.org/ stable openvz

When I use the normal kernel I can ping from the VE to the HN, to 
other VEs on this HN, to my other HN and to an external site (google.com).

When I use the smp kernel (no other change) I can ping from the VE to 
the HN and to other VEs on this HN, but not to the other HN or to external 
sites.

In all cases pinging from the HN is OK.

From the VE, if I do a traceroute to the HN it shows the HN as 
the first hop (with either the smp or the normal kernel).  If I traceroute to my 
other HN, I just get endless * * * lines with the smp kernel (it doesn't 
even show the HN as the first hop).  With the normal kernel it shows the 
HN, then the destination of the traceroute (the other HN in this case).
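
The ping tests described above can be repeated in one pass with a small script. This is only a sketch: the function name check_targets and the PING_CMD override are made up for illustration, and the addresses in the usage line are placeholders, not the real ones from this setup.

```shell
#!/bin/sh
# Hypothetical helper: send one ping to each target and report the result.
# PING_CMD can be overridden (e.g. for a dry run); it defaults to the real ping.
PING_CMD=${PING_CMD:-ping}

check_targets() {
    for target in "$@"; do
        # -c 1: send a single echo request; -W 2: wait at most 2 seconds
        if $PING_CMD -c 1 -W 2 "$target" >/dev/null 2>&1; then
            echo "$target ok"
        else
            echo "$target FAIL"
        fi
    done
}

# Example call from inside the VE (placeholder addresses):
#   check_targets 192.0.2.1 192.0.2.2 google.com
```

Running the same call under each kernel and comparing the two outputs should show exactly which destinations stop answering.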

Is that a routing issue?  I don't know, but it looks like it might be.  I was 
actually leaning toward it being a hardware fault until I noticed the 
anomaly in the traceroute.

I'm not sure if having 2 NICs in the box has any bearing on it.

With the smp kernel I also see checksum errors when I do a ping -R. I 
don't get those errors using the non-smp kernel.

OK, this gets extremely weird. I just checked the kernel I'm running and 
it is still the smp version, and that is after I executed:

aptitude install ovzkernel-2.6.18
aptitude remove ovzkernel-2.6.18-smp
shutdown -r now
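
A quick way to confirm which kernel is actually booted is to look at uname -r. The sketch below classifies the release string by its suffix; the function name kernel_flavour is made up for illustration, and the "-smp" suffix rule is an assumption based on the release string quoted earlier in this message.

```shell
#!/bin/sh
# Classify a kernel release string by its suffix. The "-smp" naming is an
# assumption based on the release string quoted in this message.
kernel_flavour() {
    case "$1" in
        *-smp) echo "SMP" ;;
        *)     echo "non-SMP" ;;
    esac
}

running=$(uname -r)   # release of the kernel that is actually booted
echo "running kernel: $running ($(kernel_flavour "$running"))"
```

On a Debian system, comparing this with the output of dpkg -l 'ovzkernel*' should show whether the booted kernel still matches an installed package.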

I am now concerned that this problem will recur if I am forced to reboot.  It can't 
be as simple as the reboot fixing it, as I rebooted several times while I was 
having the problem and it didn't go away.

I wonder if I have just entered the twilight zone?

Steve
> Thanks,
> Kirill
>
> Steve Hodges wrote:
>   
>> After getting most of my problems solved I decided to move my test 
>> environment onto the production server.
>>
>> The server is a dual xeon which, with hyperthreading, appears (to Linux) 
>> to have 4 processors.  So, when I built this machine I decided to use 
>> the ovzkernel-2.6.18-smp
>>
>> The rebuild caused me all sorts of routing problems which I have managed 
>> to track down to being caused by the kernel.  I just replaced the kernel 
>> with ovzkernel-2.6.18 
>>
>> aptitude install ovzkernel-2.6.18
>> aptitude remove ovzkernel-2.6.18-smp
>> shutdown -r now
>>
>> Problem solved!
>>
>> It seems pretty odd that the smp kernel should cause this, but I really 
>> don't know what is different about that kernel.
>>
>> The symptoms were similar to the ones I had before I set the netmask of 
>> the venets correctly, but more extreme.  Whereas the netmask issue 
>> seemed to cause packets to go out of the wrong interface, this problem 
>> seemed to stop packets getting out of the server at all.
>>
>> If there are any questions about the symptoms, I will be able to swap 
>> back to that kernel for the next day or so to test things out.
>>
>> What will the impact be of running the non-smp kernel on a 
>> multi-processor machine?  Will I effectively use only a single processor?
>>
>> Steve

