[CRIU] criu - Restoring TCP connections and timestamps

Dilip Daya dilip.daya at hp.com
Tue Nov 20 11:21:57 EST 2012


Re: tcp_time_stamp when restoring TCP connections

tcp_time_stamp:
This is the low-order 32 bits of the jiffies counter and is used to
generate the timestamp in the TCP timestamp option.  The timestamp
option is included in almost every outgoing TCP packet and is used for
two purposes.

For PAWS, it acts something like a high-order extension of the packet
sequence number.  If the receiver sees a timestamp value less than what
it previously got on the connection (modulo 2^31), it assumes the
sequence number is from a previous wrap and discards the packet.  The
timestamp generator must therefore never decrease.
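
To make the PAWS rule above concrete, here is a minimal userspace sketch of the comparison, assuming the usual wrap-around (serial number) arithmetic on 32-bit timestamps. The names ts_before(), paws_accept() and struct paws_state are hypothetical and are not kernel symbols; the real check also has idle-time and zero-timestamp exceptions that are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if timestamp a is older than b when both are
 * compared in wrap-around arithmetic. The cast to int32_t relies on the
 * two's complement behaviour of common compilers. */
static bool ts_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Hypothetical receiver-side state for one connection. */
struct paws_state {
    uint32_t ts_recent;   /* newest timestamp accepted so far */
};

/* Simplified PAWS test: reject a segment whose timestamp went backwards,
 * otherwise accept it and remember its timestamp. */
static bool paws_accept(struct paws_state *st, uint32_t tsval)
{
    if (ts_before(tsval, st->ts_recent))
        return false;
    st->ts_recent = tsval;
    return true;
}
```

With ts_recent at 1000, a segment carrying 1001 is accepted and one carrying 500 is rejected, which is what would happen to a restored sender whose clock is behind the original one.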

The jiffies counters on separate systems can be any value when restore
occurs, so the restore system has a good chance of generating timestamps
less than those of the original system, and the peer will then start
dropping packets.  This problem was discussed in the paper "TCP Connection
Passing" by Werner Almesberger, which is the basis of previous
connection migration code for Linux.  The solution given in that paper
is to save an offset for each connection that is applied to
tcp_time_stamp when generating the TCP timestamp option.  The offset is
calculated to ensure the timestamps never decrease.
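
One way such a per-connection offset might be computed is sketched below. This is an illustration only, not code taken from the paper or from the kernel: struct ts_checkpoint, ts_offset_at_restore() and ts_generate() are hypothetical names, and the sketch assumes the last timestamp sent by the original host was saved at checkpoint time.

```c
#include <stdint.h>

/* Hypothetical per-connection record saved at checkpoint time. */
struct ts_checkpoint {
    uint32_t last_tsval;  /* last timestamp sent by the original host */
};

/* Pick the offset so that the first timestamp generated on the restore
 * host is exactly one tick after the last one the peer has seen.
 * Unsigned arithmetic wraps modulo 2^32, which is what we want here. */
static uint32_t ts_offset_at_restore(const struct ts_checkpoint *cp,
                                     uint32_t local_clock_now)
{
    return cp->last_tsval + 1u - local_clock_now;
}

/* Timestamp placed in outgoing segments: local clock plus saved offset. */
static uint32_t ts_generate(uint32_t local_clock, uint32_t offset)
{
    return local_clock + offset;
}
```

If the original host last sent 5000000 and the restore host's clock reads 100, the first restored timestamp is 5000001 and later ones keep advancing with the local clock, so the peer never sees the value go backwards.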

Connection Repair does not appear to do anything special for timestamp
generation and I'm wondering how it gets away with that.

=> So looking at Linux kernel code snippets, does the following take
   care of this issue?

As per:
- http://lxr.linux.no/linux+v3.6.7/net/ipv4/tcp_output.c#L3094
...
...
/* This routine sends a packet with an out of date sequence
 * number. It assumes the other end will try to ACK it.
 *
 * Question: what should we make while urgent mode?
 * 4.4BSD forces sending single byte of data. We cannot send
 * out of window data, because we have SND.NXT==SND.MAX...
 *
 * Current solution: to send TWO zero-length segments in urgent mode:
 * one is with SEG.SEQ=SND.UNA to deliver urgent pointer, another is
 * out-of-date with SND.UNA-1 to probe window.
 */
static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
{
        struct tcp_sock *tp = tcp_sk(sk);
        struct sk_buff *skb;

        /* We don't queue it, tcp_transmit_skb() sets ownership. */
        skb = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC));
        if (skb == NULL)
                return -1;

        /* Reserve space for headers and set control bits. */
        skb_reserve(skb, MAX_TCP_HEADER);
        /* Use a previous sequence.  This should cause the other
         * end to send an ack.  Don't queue or clone SKB, just
         * send it.
         */
        tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
        TCP_SKB_CB(skb)->when = tcp_time_stamp;          <<<<<<<<<<<<<<<
        return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
}
...


=> When TCP_REPAIR mode is turned OFF, the timestamp for the outgoing
   packet on the restored connection is simply set to the current
   jiffies value before the packet is passed to tcp_transmit_skb().

   - http://lxr.linux.no/linux+v3.6.7/include/net/tcp.h#L654
     ...
     #define tcp_time_stamp          ((__u32)(jiffies))



and


=> http://lxr.linux.no/linux+v3.6.7/include/net/tcp.h#L1153
...
static inline bool tcp_paws_check(const struct tcp_options_received *rx_opt,
                                  int paws_win)
{
        if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win)
                return true;
        if (unlikely(get_seconds() >= rx_opt->ts_recent_stamp + TCP_PAWS_24DAYS))
                return true;
        /*
         * Some OSes send SYN and SYNACK messages with tsval=0 tsecr=0,
         * then following tcp messages have valid values. Ignore 0 value,
         * or else 'negative' tsval might forbid us to accept their packets.
         */
        if (!rx_opt->ts_recent)
                return true;
        return false;
}
...
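
To see how the check quoted above would treat a restored sender, here is a small userspace model of it. This is only a sketch: struct rx_opt_model and paws_check_model() are hypothetical stand-ins, the current time is passed in as a parameter instead of calling get_seconds(), and PAWS_24DAYS_SECS is assumed to match the kernel's TCP_PAWS_24DAYS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to equal the kernel's TCP_PAWS_24DAYS (24 days in seconds). */
#define PAWS_24DAYS_SECS (60 * 60 * 24 * 24)

/* Userspace stand-in for the fields of struct tcp_options_received
 * that the check reads. */
struct rx_opt_model {
    uint32_t ts_recent;        /* last timestamp accepted from the peer */
    uint32_t rcv_tsval;        /* timestamp carried by the incoming segment */
    long     ts_recent_stamp;  /* second at which ts_recent was stored */
};

/* Mirrors the three conditions of the quoted function, in the same order. */
static bool paws_check_model(const struct rx_opt_model *rx, int paws_win,
                             long now_secs)
{
    /* Incoming timestamp is not older than ts_recent by more than paws_win. */
    if ((int32_t)(rx->ts_recent - rx->rcv_tsval) <= paws_win)
        return true;
    /* Connection has been idle long enough that ts_recent is stale. */
    if (now_secs >= rx->ts_recent_stamp + PAWS_24DAYS_SECS)
        return true;
    /* No valid timestamp has been recorded yet. */
    if (!rx->ts_recent)
        return true;
    return false;
}
```

With ts_recent = 5000000 recorded from the original host and a restored host whose jiffies give rcv_tsval = 120 a few seconds later, the model returns false, so the segment would fail PAWS unless the connection had been idle for 24 days.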



Thanking you in advance.
-DilipD.


