[Devel] atl1 (Attansic L1 Gigabit Ethernet) driver & TCP weirdness

Solar Designer solar at openwall.com
Sat Jan 29 07:52:23 PST 2011


Hi,

It appears that linux-2.6.18-atl1-1.0.41.0.patch in the RHEL5-branch
OpenVZ kernels is your addition, not Red Hat's, correct?

I have a machine with this NIC chip (onboard).  It uses this driver,
which mostly works fine.  However, there's an OpenVZ-specific bug where
TCP data transfer throughput from the host system is extremely poor
(like 30 KB/sec), whereas the throughput from OpenVZ containers on the
same system is OK (11 MB/sec, which is just right for 100 Mbps).
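As a rough sanity check on the 11 MB/sec figure, here is a minimal sketch of the best-case TCP payload rate on a 100 Mbps link.  It assumes standard Ethernet framing, 20-byte IPv4 and TCP headers with no options, and a 1460-byte MSS; the function and constant names are made up for illustration.

```python
# Best-case TCP payload throughput on 100 Mbps Ethernet (illustrative only).
LINK_BPS = 100_000_000           # nominal link speed, bits per second
MSS = 1460                       # TCP payload bytes per full-sized segment
TCP_HDR = 20                     # assumes no TCP options (e.g. no timestamps)
IP_HDR = 20                      # assumes IPv4 without options
ETH_OVERHEAD = 14 + 4 + 8 + 12   # header, FCS, preamble+SFD, inter-frame gap


def max_tcp_payload_rate(link_bps=LINK_BPS, mss=MSS):
    """Return the payload rate in bytes/sec if the wire is kept fully busy."""
    wire_bytes = mss + TCP_HDR + IP_HDR + ETH_OVERHEAD
    return link_bps / 8 * mss / wire_bytes


rate = max_tcp_payload_rate()
print(f"{rate / 1e6:.2f} MB/s ({rate / 2**20:.2f} MiB/s)")
```

Under these assumptions the ceiling comes out a little under 12 MB/sec, so 11 MB/sec from the containers is essentially wire speed.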

Investigating this with tcpdump, I see that when I transfer data from
the host system, the sending machine (the one with atl1) sometimes skips
a 4-byte block between TCP segments.  Specifically, it sends a 1456-byte
segment followed by a 1460-byte one, with 4 bytes skipped between them.
The connection then stalls for 200 ms.  Then the buggy system retransmits
the segment (or maybe both, I don't recall) at the full 1460 bytes, and
the transfer continues until this happens again after just a few packets.
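The pattern described above can be spotted mechanically in a capture.  This is a minimal sketch, assuming the segments have already been extracted as (relative sequence number, payload length) pairs in the order they were sent; the function name and the sample numbers are hypothetical, not taken from the actual capture, and sequence-number wraparound is ignored.

```python
def find_gaps(segments):
    """Find holes in the byte stream of one TCP direction.

    segments: iterable of (seq, length) pairs in send order.
    Returns a list of (expected_seq, actual_seq, missing_bytes).
    """
    gaps = []
    expected = None  # next sequence number we expect to see
    for seq, length in segments:
        if expected is not None and seq > expected:
            # The sender jumped ahead: bytes [expected, seq) never appeared.
            gaps.append((expected, seq, seq - expected))
        end = seq + length
        # Retransmissions start at or below 'expected' and must not move it back.
        expected = end if expected is None else max(expected, end)
    return gaps


# Hypothetical trace: a full segment, a short 1456-byte one, then a
# 1460-byte segment that starts 4 bytes too late.
trace = [(1, 1460), (1461, 1456), (2921, 1460)]
for expected, actual, missing in find_gaps(trace):
    print(f"expected seq {expected}, got {actual}: {missing} bytes skipped")
```

A clean transfer, such as the ones from the containers, should produce no output from a check like this.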

tcpdump on transfers from OpenVZ containers on the same system shows all
1460-byte segments.

This happens with at least 2.6.18-194.26.1.el5.028stab079.1.owl2 (the
kernel build we had in the Owl 3.0 release).  The exact same kernel build
works just fine on plenty of other machines.  This is the only machine
exhibiting the problem, and it is also the only one with an atl1 NIC.

Any ideas?  I have not tried matching this against the code yet.

Thanks,

Alexander



