[CRIU] [PATCH v1 05/12] gc: implement unlocking of tcp connections

Tycho Andersen tycho.andersen at canonical.com
Mon Aug 8 07:42:43 PDT 2016


On Mon, Aug 08, 2016 at 04:13:39PM +0300, Pavel Emelyanov wrote:
> On 08/04/2016 07:26 PM, Tycho Andersen wrote:
> > Hi Eugene,
> > 
> > On Thu, Aug 04, 2016 at 07:01:36PM +0300, Eugene Batalov wrote:
> >> 2016-08-04 18:53 GMT+03:00 Tycho Andersen <tycho.andersen at canonical.com>:
> >>
> >>> On Thu, Aug 04, 2016 at 06:42:06PM +0300, Eugene Batalov wrote:
> >>>> 2016-08-04 18:08 GMT+03:00 Tycho Andersen <tycho.andersen at canonical.com>:
> >>>>
> >>>>> Hi Eugene,
> >>>>>
> >>>>> On Thu, Aug 04, 2016 at 02:49:23PM +0300, Eugene Batalov wrote:
> >>>>>> Hi Tycho,
> >>>>>>
> >>>>>>>> +int gc_network_unlock(void)
> >>>>>>>> +{
> >>>>>>>> +     /*
> >>>>>>>> +      * Unshared ps tree net ns is destroyed after successful dump.
> >>>>>>>> +      * No need to call network_unlock_internal.
> >>>>>>>> +      * Also don't call ACT_NET_UNLOCK script because we don't
> >>>>>>>> +      * resume/restore ps tree - this call would break
> >>>>>>>> +      * ACT_NET_UNLOCK semantics.
> >>>>>>>> +      */
> >>>>>>>> +     return rst_unlock_tcp_connections();
> >>>>>>>
> >>>>>>> What about cpt_unlock_tcp_connections()? IIUC this list is not
> >>>>>>> persisted, and so if we leave around the network lock stuff, we would
> >>>>>>> never turn off TCP repair mode.
> >>>>>>>
> >>>>>>> Tycho
> >>>>>>>
> >>>>>> Let's consider the moment when we start criu gc or criu restore. At
> >>>>>> this moment dumpee ps tree doesn't exist and ps tree sockets don't
> >>>>>> exist.
> >>>>>
> >>>>> Hmm. Perhaps I misunderstood then. I thought the point was to be able
> >>>>> to use --leave-stopped (so the ps tree would still exist), and then we
> >>>>> could at a later time run `criu gc` to unlock things again and clean
> >>>>> this up.
> >>>>>
> >>>> Looks like this is the use case you need to support. Let's look at the
> >>>> current implementation. Does it satisfy your needs?
> >>>
> >>> I worked around my issue in another way, so I think we can ignore my
> >>> needs for the purpose of `criu gc`. If we wanted to clean up after a
> >>> dump that left the network locked, I don't think this set would do it
> >>> completely, but it sounds like we may not care about that.
> >>>
> >> Could you propose an example when this patch set doesn't unlock the network?
> > 
> > Sorry, I was speaking of the case I already mentioned, about trying to
> > use criu gc to clean up after a criu dump --leave-stopped. I think
> > this patchset doesn't handle that case because of the missing
> > cpt_unlock_tcp_connections(); but it sounds like we don't care about
> > that, so never mind :)
> 
> Well, yes, _cleaning_ the stopped tasks would be just killing them :)
> Otherwise it should be called 'resuming' and I guess I know what you
> need it for ;) -- live migration?
> 
> Actually, if you look at how p.haul works, it doesn't make criu exit
> after final dump, it gets the final notification from it and goes on
> the restore node for criu restore. If it fails, the notification
> is aborted and criu just rolls back.

Yep, that's what I ended up doing :)

Tycho
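
For readers unfamiliar with the "TCP repair mode" mentioned in the quoted
review: on Linux it is a per-socket state toggled with the TCP_REPAIR socket
option, and changing it requires CAP_NET_ADMIN. The sketch below only
illustrates what "turning repair mode off" means at the socket level. It is
not CRIU code; the helper names tcp_repair_set() and tcp_repair_off_all() are
hypothetical, and the body of cpt_unlock_tcp_connections() is not shown
anywhere in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_REPAIR
#define TCP_REPAIR 19 /* value from linux/tcp.h, for older libc headers */
#endif

/*
 * Hypothetical helper (not a CRIU function): switch TCP repair mode
 * on (1) or off (0) for one socket. Returns 0 on success or a
 * negative errno value, e.g. -EPERM without CAP_NET_ADMIN.
 */
static int tcp_repair_set(int fd, int on)
{
	assert(on == 0 || on == 1);

	if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on)) < 0)
		return -errno;

	return 0;
}

/*
 * Hypothetical helper: leave repair mode on every socket in fds[].
 * Keeps going after a failure so that one bad descriptor does not
 * leave the remaining sockets stuck in repair mode; returns the
 * first error seen, or 0 if all calls succeeded.
 */
static int tcp_repair_off_all(const int *fds, int n)
{
	int i, ret, first_err = 0;

	for (i = 0; i < n; i++) {
		ret = tcp_repair_set(fds[i], 0);
		if (ret < 0 && !first_err)
			first_err = ret;
	}

	return first_err;
}
```

The sketch assumes the caller still holds the socket descriptors. That is the
crux of the concern raised above: if the dump-side list is not persisted, a
later process has nothing to iterate over.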

