[CRIU] Options when restoring a socket
Ross Boucher
rboucher at gmail.com
Tue Apr 21 11:54:38 PDT 2015
TCP restore relies on the other side of the TCP connection still being
open though, right? In my case that won't be very easy. It would be much
easier for me to just re-establish the connection if I could somehow notify
myself that the process has been restored. (I'm playing around right now with
running a timer in another thread to try to do that...)
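
A rough sketch of that timer idea, in Python. It assumes that wall-clock
time keeps advancing while the process is not running, so a long gap between
two ticks suggests that a restore happened in between. The class name
RestoreWatcher, the on_gap callback and the interval/slack values are made up
for illustration; none of this is a CRIU API, and the heuristic also fires on
a machine suspend or a large clock adjustment.

```python
import threading
import time


class RestoreWatcher:
    """Heuristic watcher: reports unexpectedly long gaps between ticks.

    Hypothetical helper, not part of CRIU. A gap much longer than the tick
    interval is treated as a hint that the process was frozen and restored.
    """

    def __init__(self, on_gap, interval=0.5, slack=2.0, clock=time.time):
        self.on_gap = on_gap        # called with the observed gap in seconds
        self.interval = interval    # seconds between ticks
        self.slack = slack          # tolerated scheduling delay in seconds
        self.clock = clock          # injectable so the logic can be tested
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def gap_detected(self, previous, current):
        # True when more time passed than one tick plus the allowed slack.
        return (current - previous) > self.interval + self.slack

    def _run(self):
        previous = self.clock()
        # Event.wait returns False on timeout, True once stop() is called.
        while not self._stop.wait(self.interval):
            current = self.clock()
            if self.gap_detected(previous, current):
                self.on_gap(current - previous)
            previous = current

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

The on_gap callback is where the process would drop the stale socket and
reconnect. The slack has to be larger than any normal scheduling delay,
otherwise a busy machine triggers false reconnects.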
On Tue, Apr 21, 2015 at 11:52 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:
> On 04/21/2015 07:46 PM, Ross Boucher wrote:
> > Thanks, this is all very interesting. How does the story change for TCP sockets?
>
> For TCP we can "lock" the connection so that the peer doesn't see the
> socket being closed. We either add an iptables rule that blocks all the
> packets or (in the case of containers) unplug the container's virtual NIC
> from the network. So while the connection is locked we can kill the socket,
> then create it again. And the TCP-repair mechanism helps us "connect" the
> socket back without actually doing the connect.
>
> For unix sockets we don't have the ability to "lock" the connection in the
> first place. So once we have dumped the task we cannot keep the peer from
> noticing this. This was the main reason for not implementing it.
>
> -- Pavel
>
> > On Tue, Apr 21, 2015 at 5:03 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
> >
> > On 04/19/2015 04:48 AM, Ross Boucher wrote:
> > > Hey everyone,
> >
> > Hi, Ross.
> >
> > > I've been trying to figure out both what happens when you checkpoint an open socket
> > > and what my options are for restoring that socket (or maybe doing something else at
> > > that point in time). It might be best to just describe the program I have and what
> > > I want to accomplish.
> > >
> > > I have two programs communicating over a socket. Program A opens a socket and listens
> > > for connections, and then program B connects to it. They essentially exchange messages
> > > forever in a pattern something like:
> > >
> > > A -> B send next message
> > > B -> A ok, here's the next message
> > >
> > > Obviously, in between, A performs some actions. The goal is to checkpoint A after each
> > > message is processed and before the next is received (while leaving the process
> > > running), so that we can restore to any previous state and reprocess possibly changed
> > > messages.
> >
> > The first thing that comes to mind is that --track-mem definitely makes sense for
> > such frequent C/R-s. But that's a side note :)
> >
> > > It's completely fine for our use case to have to re-establish that socket connection,
> > > we don't actually need or want to try and magically use the same socket (since program
> > > B has probably moved on to other things in between).
> >
> > Hm.. OK.
> >
> > > Is this a use case for a criu plugin?
> >
> > Well, I would say it is, but there are two things about it. First is that we don't have any
> > hooks in CRIU code for sockets, so patching will be required. And the second is -- many
> > people are asking for handling the connected unix socket, so I think we'd better patch
> > criu itself to provide some sane API rather than make everybody invent their own plugins :)
> >
> > So, I see two options for such an API.
> >
> > The first is to extend the --ext-unix-sk to accept socket ID as an argument that would
> > force even stream sockets to be marked as "external" and not block the dump. On restore
> > the --ext-unix-sk with $ID would make CRIU connect the restored unix socket back to its
> > original path. Optionally we can make the argument look like $ID:$PATH and connect to
> > $PATH instead.
> >
> > The other option would be to still teach dump to accept the --ext-unix-sk $ID parameter. On
> > restore you can create the connection yourself and then pass it into CRIU with the --inherit-fd
> > argument. We already have fd inheritance for pipes and files [1], but we can patch CRIU
> > to support it for sockets too.
> >
> > [1] http://criu.org/Inheriting_FDs_on_restore
> >
> > > I've tried playing around with the ext-unix-sk flag but I haven't quite figured anything
> > > out yet.
> >
> > The --ext-unix-sk is for datagram sockets as they are stateless and we can just close and
> > re-open one back w/o any risk that the peer notices it.
> >
> > > Any help would be appreciated. Thanks!
> >
> > -- Pavel
> >
> >
>
>