[CRIU] [Users] socket will take at least 0.5 seconds to recovery after docker restore done

Yanbao Cui yygcui at gmail.com
Tue Jul 14 06:15:23 PDT 2015


Sorry for the mistake.
For UDP, I mean that the server can receive packets from the client again.
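
One way to quantify this UDP recovery time is to log the arrival time of each packet on the server and look for the largest gap between consecutive packets. The following is only a minimal sketch: the function name `longest_gap` and the sample timestamps are hypothetical, not measurements from this test.

```python
def longest_gap(arrivals_ms):
    """Return (gap, start, end) for the largest interval between
    consecutive packet arrival times, or None if fewer than two
    arrivals were recorded. All values are in milliseconds."""
    if len(arrivals_ms) < 2:
        return None
    ordered = sorted(arrivals_ms)
    # Pair each arrival with the next one and keep the widest interval.
    gaps = [(later - earlier, earlier, later)
            for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps)


if __name__ == "__main__":
    # Hypothetical arrival times (ms) for packets sent every 100 ms.
    sample = [0, 100, 200, 800, 900]
    print(longest_gap(sample))  # (600, 200, 800)
```

With a client that sends at a fixed, short interval, the largest gap seen by the server approximates how long the restored socket was unreachable.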

Actually, I have analyzed the tcpdump output. In my case, the client tries to
reconnect to the server but does not receive a SYN+ACK, so it retransmits the
SYN after 1 second, according to the client's retransmission rule, and then
tries again.
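
To put the 1-second retry in context, here is a minimal sketch of a SYN retransmission schedule. It assumes an initial timeout of 1 second that doubles on every retry; both defaults and the function name `syn_retransmit_times` are assumptions for illustration, and the real timers depend on the client's TCP stack.

```python
def syn_retransmit_times(initial_rto=1.0, retries=6):
    """Return the offsets, in seconds after the first SYN, at which
    each retransmitted SYN would be sent under exponential backoff.

    initial_rto and retries are assumed values, not read from any system.
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto   # wait one timeout, then retransmit
        times.append(elapsed)
        rto *= 2         # the timeout doubles after every retry
    return times


if __name__ == "__main__":
    print(syn_retransmit_times())  # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
```

Under these assumptions, a single lost SYN or SYN+ACK around the restore already costs the client a full second before its next attempt.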

Pavel Emelyanov <xemul at parallels.com> wrote on Tue, Jul 14, 2015 at 19:37:

> On 07/13/2015 03:58 PM, Yanbao Cui wrote:
> > Hi,
> >
> > My scenario in detail is as follows:
> >
> > 1. Run a docker container on node A; the applications in it are a VNC
> > server and a simple UDP test program.
> >
> > 2. Checkpoint it, and then restore it on another node B.
> >
> > 3. After the restore succeeds, we found that the UDP test program needs
> > at least 0.5 seconds (if it is the only program in the container) to
> > reconnect to the peer.
>
> UDP to reconnect?
>
> > Also, for the TCP connection used by VNC, tcpdump shows that it needs
> > more time to reconnect.
>
> Can you check with tcpdump what the packet flow is? For TCP this can also
> be due to the window probe packet being lost :( This probably should be
> fixed in the kernel.
>
> > In this scenario, it needs about 1.5 seconds to recover.
> >
> > 4. We ping the container IP with an interval of 0.1 during
> > checkpoint/restore, and find that it only hangs for about 20 ms during
> > this procedure.
> >
> > To summarize my test results:
> >
> > The restore needs about 0.5 seconds.
> >
> > After the restore succeeds, it also needs at least 0.5 seconds to recover
> > the established connections that were created before the checkpoint.
> > Newly created connections, such as ping, respond immediately.
> >
> > And I think no particular system call hangs.
> >
> >
> > On Mon, Jul 13, 2015 at 8:12 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
> >
> >     On 07/12/2015 11:57 AM, Vasily Averin wrote:
> >     > Re-addressed to criu mailing list
> >     >
> >     > On 12.07.2015 10:48, Yanbao Cui wrote:
> >     >> Hi,
> >     >>
> >     >> I am working on docker checkpoint/restore using CRIU, and the
> >     >> time it takes is our key concern.
> >     >>
> >     >> I found a strange phenomenon: an existing socket needs at least
> >     >> an additional 0.5 seconds to reconnect after the docker restore
> >     >> is done.
> >
> >     Can you shed more light on this -- is there any particular system
> >     call that hangs for 0.5 seconds or is it just a "criu restore" time
> >     you observe?
> >
> >     >>
> >     >> But if I create a new socket after restoring the docker container,
> >     >> it connects to the peer immediately.
> >     >>
> >     >> This happens whether the connection uses TCP or UDP.
> >     >>
> >     >> From the CRIU source code, I found that it creates a new socket
> >     >> with SO_REUSEADDR and changes the original fd to this new one, but
> >     >> I have no idea why it needs more time to recover.
> >     >>
> >     >> Could someone help explain it or give me some pointers?
> >     >>
> >     >> Thanks very much!
> >     >>
> >     >> --
> >     >> Best Regards
> >     >> Cui Yanbao | 崔言宝
> >     >> --
> >     >> 龍生玖天,豈能安於凡塵!
> >     > _______________________________________________
> >     > CRIU mailing list
> >     > CRIU at openvz.org <mailto:CRIU at openvz.org>
> >     > https://lists.openvz.org/mailman/listinfo/criu
> >     >
> >
> >
> >
> >
> > --
> > Best Regards
> > Cui Yanbao | 崔言宝
> > --
> > 龍生玖天,豈能安於凡塵!
>
>