[Devel] atl1 (Attansic L1 Gigabit Ethernet) driver & TCP weirdness

Vasily Averin vvs at parallels.com
Sun Jan 30 00:18:15 PST 2011


On 01/30/2011 11:08 AM, Vasily Averin wrote:
> On 01/29/2011 08:50 PM, Solar Designer wrote:
>> On Sat, Jan 29, 2011 at 07:08:25PM +0300, Vasily Averin wrote:
>>> On 01/29/2011 06:52 PM, Solar Designer wrote:
>>>> It appears that linux-2.6.18-atl1-1.0.41.0.patch in RHEL5 branch OpenVZ
>>>> kernels is your addition, not Red Hat's, correct?
>>>
>>> yes, it's our patch, prepared 4 years ago, sources were taken from
>>> http://atl1.sourceforge.net/
>>
>> Thank you for confirming this.
>>
>> I've just tested - the problem is still seen with
>> 2.6.18-238.1.1.el5.028stab083.1.owl2 (our build of
>> 2.6.18-238.1.1.el5.028stab083.1 with minor changes).
>>
>> This is on x86_64.
>>
>> Do you have any machine with an atl1 NIC where you could try to
>> reproduce the issue (if you care)? So far, it's just my guess that the
>> issue is atl1-related - because it is seen on the only machine with
>> atl1, but not elsewhere. Testing on a second machine with atl1 could
>> confirm this.
>
> 4 years ago we had such hardware, I'll recheck it after the weekend.
> Could you please check the 2.1.3 driver version?
> ftp://ftp.hogchain.net/pub/linux/attansic/atl1/centos5.2/

btw, could you please try decreasing the MTU to 1492? Quoting from
  http://atl1.sourceforge.net/
...
5. If you see this message (or lots of them) in your system log:

atl1 0000:02:00.0: hw csum wrong, pkt_flag:1600, err_flag:80

it generally means you're encountering a bug in the L1 hardware that isn't 
handled well by the atl1 driver. Basically, the L1 hardware treats a fragmented 
IP packet as an error, when, in fact, it may not be erroneous at all. Fragmented 
packets can occur, for example, if your MTU size is set too large. In my own 
case, my DSL modem/router has its MTU size set to 1500 bytes, but it needs 8 of 
those bytes for its own use. If my Linux box also has its MTU size set to 1500, 
then inbound and outbound packets will be fragmented at the DSL router so it can 
add (or remove) its 8 bytes. Here's where the hardware bug comes into play. When 
the L1 hardware receives a fragmented packet, it sets an error flag in one of 
its registers, and the driver, upon seeing the error bit, spews the "hw csum 
wrong" message. This has been fixed in the 2.6.27 kernel and beyond. The 
fragmentation will still occur, but the atl1 driver won't spew the error message 
that contributes to network slowdown.
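
To check whether a box is actually hitting this, one option is to count the
messages per device in the system log. The following is only a rough sketch:
the pattern is modeled on the single sample line quoted above, so other driver
versions may word the message differently, and the function name
count_csum_errors is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one sample message quoted above (assumption:
# other driver versions may format this line differently).
CSUM_RE = re.compile(
    r"atl1 (?P<dev>[0-9a-fA-F:.]+): hw csum wrong, "
    r"pkt_flag:(?P<pkt>[0-9a-fA-F]+), err_flag:(?P<err>[0-9a-fA-F]+)"
)


def count_csum_errors(lines):
    """Return a Counter mapping PCI device address to matching line count."""
    counts = Counter()
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts
```

It takes any iterable of text lines, for example an open file object for
whichever log file your distribution writes kernel messages to.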

You can avoid this condition and significantly improve network performance by 
adjusting the MTU size downward on your atl1 box. I use an MTU size of 1492 
bytes on all my systems that sit behind the DSL modem/router. This MTU size 
leaves room for the DSL modem/router to add its 8 bytes and avoid fragmentation 
altogether.
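
The arithmetic behind the 1492 figure is just the router's MTU minus the bytes
it reserves for itself. A small sketch, assuming an interface named eth0 (not
stated anywhere in this thread) and the iproute2 "ip link set" syntax; the
helper names are hypothetical:

```python
def mtu_without_fragmentation(link_mtu, overhead_bytes):
    """Largest local MTU that still fits after the router adds its overhead."""
    if overhead_bytes < 0 or overhead_bytes >= link_mtu:
        raise ValueError("overhead must be in the range [0, link_mtu)")
    return link_mtu - overhead_bytes


def ip_mtu_command(iface, mtu):
    """Build (but do not run) the iproute2 command that sets the MTU."""
    return ["ip", "link", "set", "dev", iface, "mtu", str(mtu)]


# Values from the text above: router MTU 1500, 8 bytes of router overhead.
mtu = mtu_without_fragmentation(1500, 8)
# "eth0" is an assumed interface name; substitute the atl1 interface.
print(mtu, " ".join(ip_mtu_command("eth0", mtu)))
```

The command is only printed here; running it needs root, and the setting does
not survive a reboot unless it is also put into the distribution's network
configuration.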



