[CRIU] [Users] socket will take at least 0.5 seconds to recovery after docker restore done

Pavel Emelyanov xemul at parallels.com
Tue Jul 14 04:37:03 PDT 2015


On 07/13/2015 03:58 PM, Yanbao Cui wrote:
> Hi,
> 
> My scenario in detail:
> 
> 1. Run a Docker container on node A; the applications in it are a VNC server and a simple UDP test program.
> 
> 2. Checkpoint it, and then restore it on another node B.
> 
> 3. After a successful restore, we found that the UDP test program needs at least 0.5 seconds (when it is the
> only test program in the container) to reconnect to its peer.

UDP to reconnect?

> Also, for the TCP connection used by VNC, tcpdump shows that it needs even more time to reconnect.

Can you check with tcpdump what the packet flow is? For TCP this can also be due to the
window probe packet being lost :( This should probably be fixed in the kernel.
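
Alongside a tcpdump capture, the stall can also be quantified from the peer's side. The following is only a minimal sketch, assuming the UDP test program inside the container sends datagrams at a fixed interval to a receiver running outside it; the function names and the port argument are hypothetical and do not come from the report.

```python
import socket
import time


def longest_gap(arrivals):
    """Return the longest interval (seconds) between consecutive arrival times."""
    return max((b - a for a, b in zip(arrivals, arrivals[1:])), default=0.0)


def record_arrivals(port, count, timeout=5.0):
    """Receive up to `count` datagrams on `port` and return their arrival times.

    Times come from time.monotonic(), so wall-clock adjustments on the
    receiving host do not distort the measured gaps.
    """
    arrivals = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        sock.settimeout(timeout)
        try:
            while len(arrivals) < count:
                sock.recvfrom(2048)
                arrivals.append(time.monotonic())
        except socket.timeout:
            # The sender went quiet for longer than `timeout`; stop recording.
            pass
    return arrivals
```

If the sender emits one datagram every 0.1 seconds, a longest gap well above that across the checkpoint/restore window gives a rough figure for the recovery delay as seen by the peer.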

> In this scenario, it needs about 1.5 seconds to recover.
> 
> 4. We pinged the container IP at a 0.1-second interval during checkpoint/restore and found that it hangs for only about 20 ms during this procedure.
> 
> To summarize my test results:
> 
> The restore takes about 0.5 seconds.
> 
> After a successful restore, it takes at least another 0.5 seconds to recover established connections that were created before the checkpoint. Newly created traffic, such as ping, gets a response immediately.
> 
> And I think no particular system call hangs.
> 
> 
> On Mon, Jul 13, 2015 at 8:12 PM, Pavel Emelyanov <xemul at parallels.com> wrote:
> 
>     On 07/12/2015 11:57 AM, Vasily Averin wrote:
>     > Re-addressed to criu mailing list
>     >
>     > On 12.07.2015 10:48, Yanbao Cui wrote:
>     >> Hi,
>     >>
>     >> I am working on Docker checkpoint/restore using CRIU, and the time it takes is our key concern.
>     >>
>     >> I found a strange phenomenon: an already created socket needs at least an additional 0.5 seconds to
>     >> reconnect after the Docker restore is done.
> 
>     Can you shed more light on this -- is there any particular system call that hangs for 0.5
>     seconds or is it just a "criu restore" time you observe?
> 
>     >>
>     >> But if I create a new socket after restoring the container, it connects to the peer immediately.
>     >>
>     >> This is the case whether the connection uses TCP or UDP.
>     >>
>     >> From the CRIU source code, I found that it creates a new socket with SO_REUSEADDR and switches the original fd to this new one, but I have no idea why it needs more time to recover.
>     >>
>     >> Could someone help explain this or give me some pointers?
>     >>
>     >> Thanks very much!
>     >>
>     >> --
>     >> Best Regards
>     >> Cui Yanbao | 崔言宝
>     >> --
>     >> 龍生玖天,豈能安於凡塵!
> 
> 
> 
> 
> -- 
> Best Regards
> Cui Yanbao | 崔言宝
> --
> 龍生玖天,豈能安於凡塵!
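
For readers following the thread, the socket re-creation described in the original report (a new socket created with SO_REUSEADDR, then installed at the original fd number) can be sketched roughly as below. This is an illustration of the general pattern only, not CRIU's actual code; the function name `recreate_udp_socket` is hypothetical, and a UDP socket is assumed for simplicity.

```python
import os
import socket


def recreate_udp_socket(old_fd, addr):
    """Bind a fresh UDP socket to `addr` and install it at descriptor `old_fd`.

    Hypothetical helper: it mimics "create a new socket with SO_REUSEADDR
    and change the original fd to this new one" as worded in the report.
    """
    new_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow binding even if the address is still considered in use.
    new_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    new_sock.bind(addr)
    # dup2 closes whatever old_fd referred to and points it at the new socket.
    os.dup2(new_sock.fileno(), old_fd)
    # Drop the temporary descriptor; old_fd keeps the socket alive.
    new_sock.close()
    # Wrap the descriptor so the caller gets a normal socket object back.
    return socket.socket(fileno=old_fd)
```

Nothing in this pattern waits or sleeps, which is consistent with the report that no particular system call hangs; the sketch only shows how the application keeps seeing the same fd number after the swap.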


