[CRIU] [PATCH] tcp: fix retransmission in repair mode

Andrew Vagin avagin at parallels.com
Wed Nov 14 05:40:02 EST 2012


On Tue, Nov 13, 2012 at 11:48:58PM +0400, Pavel Emelyanov wrote:
> On 11/13/2012 10:53 PM, Andrew Vagin wrote:
> > On Tue, Nov 13, 2012 at 10:44:32PM +0400, Andrew Vagin wrote:
> >> On Tue, Nov 13, 2012 at 07:39:26PM +0400, Pavel Emelyanov wrote:
> >>> On 11/13/2012 07:29 PM, Andrey Vagin wrote:
> >>>> From: Andrey Vagin <avagin at parallels.com>
> >>>>
> >>>> Currently if a socket has a few packet in a write queue,
> >>>> a kernel bug may be triggered:
> >>>>
> >>>> kernel BUG at net/ipv4/tcp_output.c:2330!
> >>>> RIP: 0010:[<ffffffff8155784f>] tcp_retransmit_skb+0x5ff/0x610
> >>>>
> >>>> According to the initial realization v3.4-rc2-963-gc0e88ff,
> >>>> all skb-s should look like already posted. This patch fixes code
> >>>> according with this sentence.
> >>>>
> >>>> Here are three points, which were not done in the initial patch:
> >>>> 1. A tcp send head should not be changed
> >>>> 2. Initialize TSO state of a skb
> >>>> 3. Reset the retransmission time
> >>>>
> >>>> Signed-off-by: Andrey Vagin <avagin at openvz.org>
> >>>> ---
> >>>>  include/net/tcp.h     | 2 ++
> >>>>  net/ipv4/tcp.c        | 6 ++++++
> >>>>  net/ipv4/tcp_output.c | 3 ++-
> >>>>  3 files changed, 10 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/include/net/tcp.h b/include/net/tcp.h
> >>>> index 6feeccd..7582d4a 100644
> >>>> --- a/include/net/tcp.h
> >>>> +++ b/include/net/tcp.h
> >>>> @@ -544,6 +544,8 @@ extern bool tcp_syn_flood_action(struct sock *sk,
> >>>>  extern void tcp_push_one(struct sock *, unsigned int mss_now);
> >>>>  extern void tcp_send_ack(struct sock *sk);
> >>>>  extern void tcp_send_delayed_ack(struct sock *sk);
> >>>> +extern int tcp_init_tso_segs(const struct sock *sk, struct sk_buff *skb,
> >>>> +			     unsigned int mss_now);
> >>>>  
> >>>>  /* tcp_input.c */
> >>>>  extern void tcp_cwnd_application_limited(struct sock *sk);
> >>>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> >>>> index f32c02e..11d070e 100644
> >>>> --- a/net/ipv4/tcp.c
> >>>> +++ b/net/ipv4/tcp.c
> >>>> @@ -1196,6 +1196,12 @@ new_segment:
> >>>>  			TCP_SKB_CB(skb)->end_seq += copy;
> >>>>  			skb_shinfo(skb)->gso_segs = 0;
> >>>>  
> >>>> +			if (tp->repair) {
> >>>> +				/* skb should look like already posted */
> >>>> +				tcp_init_send_head(sk);
> >>>
> >>> When does new skb occur in the send_head
> >> tcp_sendmsg
> >> 	sk_stream_alloc_skb
> >> 	skb_entail(sk, skb);
> >> 		tcp_add_write_queue_tail(sk, skb);
> >> 			sk->sk_send_head = skb;
> > I think all hacks should be in a one place, it is easier for reading.
> 
> The skb_entail is called from ->sendpages, so either the check whether or
> not to update the sk_send_head should be there, or the call to reset the
> send_head should be in the repair-off code.
I have a better idea. Your variant requered to hack sendpages and all
place which call tcp_push.

I suggest to fix tcp_write_xmit. If a socket is in repair mode, this
function will not send data to network, but will do all other action.

It's a usuall way how a packet in queued in a retransit queue.

What do you think about this?
> 
> >>>
> >>>> +				tcp_init_tso_segs(sk, skb, mss_now);
> >>>> +				tp->packets_out += tcp_skb_pcount(skb);
> >>>> +			}
> >>>>  			from += copy;
> >>>>  			copied += copy;
> >>>>  			if ((seglen -= copy) == 0 && iovlen == 0)
> >>>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> >>>> index cfe6ffe..3e95bc7 100644
> >>>> --- a/net/ipv4/tcp_output.c
> >>>> +++ b/net/ipv4/tcp_output.c
> >>>> @@ -1579,7 +1579,7 @@ static inline unsigned int tcp_cwnd_test(const struct tcp_sock *tp,
> >>>>   * This must be invoked the first time we consider transmitting
> >>>>   * SKB onto the wire.
> >>>>   */
> >>>> -static int tcp_init_tso_segs(const struct sock *sk, struct sk_buff *skb,
> >>>> +int tcp_init_tso_segs(const struct sock *sk, struct sk_buff *skb,
> >>>>  			     unsigned int mss_now)
> >>>>  {
> >>>>  	int tso_segs = tcp_skb_pcount(skb);
> >>>> @@ -3140,6 +3140,7 @@ void tcp_send_window_probe(struct sock *sk)
> >>>>  		tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
> >>>>  		tcp_sk(sk)->snd_nxt = tcp_sk(sk)->write_seq;
> >>>>  		tcp_xmit_probe_skb(sk, 0);
> >>>> +		tcp_rearm_rto(sk);
> >>>
> >>> This should be done in setsockopt after calling tcp_send_window_probe
> >>> since it has nothing to do with sending the probe.
> >>
> >> Ok
> >>>
> >>>>  	}
> >>>>  }
> >>>>  
> >>>>
> >>>
> >>>
> > .
> > 
> 
> 


More information about the CRIU mailing list