[Devel] atl1 (Attansic L1 Gigabit Ethernet) driver & TCP weirdness
Vasily Averin
vvs at parallels.com
Sun Jan 30 00:18:15 PST 2011
On 01/30/2011 11:08 AM, Vasily Averin wrote:
> On 01/29/2011 08:50 PM, Solar Designer wrote:
>> On Sat, Jan 29, 2011 at 07:08:25PM +0300, Vasily Averin wrote:
>>> On 01/29/2011 06:52 PM, Solar Designer wrote:
>>>> It appears that linux-2.6.18-atl1-1.0.41.0.patch in RHEL5 branch OpenVZ
>>>> kernels is your addition, not Red Hat's, correct?
>>>
>>> yes, it's our patch, prepared 4 years ago, sources were taken from
>>> http://atl1.sourceforge.net/
>>
>> Thank you for confirming this.
>>
>> I've just tested - the problem is still seen with
>> 2.6.18-238.1.1.el5.028stab083.1.owl2 (our build of
>> 2.6.18-238.1.1.el5.028stab083.1 with minor changes).
>>
>> This is on x86_64.
>>
>> Do you have any machine with an atl1 NIC where you could try to
>> reproduce the issue (if you care)? So far, it's just my guess that the
>> issue is atl1-related - because it is seen on the only machine with
>> atl1, but not elsewhere. Testing on a second machine with atl1 could
>> confirm this.
>
> 4 years ago we had such hardware, I'll recheck it after weekend.
> could you please check 2.1.3 driver version?
> ftp://ftp.hogchain.net/pub/linux/attansic/atl1/centos5.2/
BTW, could you please try decreasing the MTU to 1492?
http://atl1.sourceforge.net/
...
5. If you see this message (or lots of them) in your system log:
atl1 0000:02:00.0: hw csum wrong, pkt_flag:1600, err_flag:80
it generally means you're encountering a bug in the L1 hardware that isn't
handled well by the atl1 driver. Basically, the L1 hardware treats a fragmented
IP packet as an error, when, in fact, it may not be erroneous at all. Fragmented
packets can occur, for example, if your MTU size is set too large. In my own
case, my DSL modem/router has its MTU size set to 1500 bytes, but it needs 8 of
those bytes for its own use. If my Linux box also has its MTU size set to 1500,
then inbound and outbound packets will be fragmented at the DSL router so it can
add (or remove) its 8 bytes. Here's where the hardware bug comes into play. When
the L1 hardware receives a fragmented packet, it sets an error flag in one of
its registers, and the driver, upon seeing the error bit, spews the "hw csum
wrong" message. This has been fixed in the 2.6.27 kernel and beyond. The
fragmentation will still occur, but the atl1 driver won't spew the error message
that contributes to network slowdown.
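
To check whether a machine is hitting this condition, the system log can be scanned for the message quoted above. The following is only a minimal sketch: the regular expression is modelled on that single sample line, the function name find_csum_errors is hypothetical, and other driver versions may format the message differently.

```python
import re

# Pattern modelled on the one sample line quoted in the text; this is an
# assumption, not a format guaranteed by the driver.
CSUM_RE = re.compile(
    r"atl1 (?P<dev>\S+): hw csum wrong, "
    r"pkt_flag:(?P<pkt_flag>\w+), err_flag:(?P<err_flag>\w+)"
)


def find_csum_errors(log_text):
    """Return a list of (device, pkt_flag, err_flag) tuples found in log_text."""
    return [
        (m.group("dev"), m.group("pkt_flag"), m.group("err_flag"))
        for m in CSUM_RE.finditer(log_text)
    ]


sample = "atl1 0000:02:00.0: hw csum wrong, pkt_flag:1600, err_flag:80\n"
print(find_csum_errors(sample))
```

A large number of matches would suggest that fragmented packets are reaching the NIC, which is the case the MTU advice addresses.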
You can avoid this condition and significantly improve network performance by
adjusting the MTU size downward on your atl1 box. I use an MTU size of 1492
bytes on all my systems that sit behind the DSL modem/router. This MTU size
leaves room for the DSL modem/router to add its 8 bytes and avoid fragmentation
altogether.
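
The arithmetic behind the 1492 figure can be sketched as follows. The function names host_mtu and would_fragment are hypothetical, and the 1500/8 values are simply the ones from the example above; other links may have different overhead.

```python
def host_mtu(router_mtu, router_overhead):
    """Largest host MTU that still leaves room for the router's own bytes."""
    if not 0 <= router_overhead < router_mtu:
        raise ValueError("overhead must be non-negative and below the router MTU")
    return router_mtu - router_overhead


def would_fragment(packet_size, router_mtu, router_overhead):
    """True if a packet of packet_size bytes no longer fits once the router
    adds its overhead."""
    return packet_size + router_overhead > router_mtu


# Values from the text: router MTU of 1500 bytes, 8 bytes used by the router.
print(host_mtu(1500, 8))
print(would_fragment(1500, 1500, 8))
print(would_fragment(1492, 1500, 8))
```

On Linux the resulting value can then be applied with something like "ip link set dev eth0 mtu 1492", where eth0 is only a placeholder for the atl1 interface name.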