[Devel] Re: C/R without "leaks"

Oren Laadan orenl at cs.columbia.edu
Fri Apr 17 02:48:09 PDT 2009



Greg Kurz wrote:
> On Thu, 2009-04-16 at 14:39 -0400, Oren Laadan wrote:
>> Any connection in that case is, of course, lost, and it's up to the
>> application to do something about it. If the application relies on
>> the state of the connection, it will have to give up (e.g. sshd, and
>> ssh, die).
>>
> 
> And that's a good thing since that's exactly what users expect from
> sshd : to give up the connection when something goes wrong. I wouldn't
> trust a sshd with the ability to initiate connections on its own...
> 
> And anyway, I still don't see the scenario where C/R a sshd is useful...

You probably mean an sshd with an open connection; being able to c/r
the server itself is clearly useful.

> Please someone (Alexey ?), provide a detailed use case where people
> would want to checkpoint or migrate live TCP connections... Discussion
> on containers@ is very interesting but really lacks
> what-is-the-bigger-picture arguments... These huge patchsets are very
> tricky and intrusive... who wants them mainline ? what's the use of
> C/R ?
> 

A canonical example would be a virtual private server: instead of doing
server consolidation with virtual machines, you do it with containers.
In a sense, containers let you chop the OS into independent, isolated
pieces. You can use a Linux box to run multiple virtual execution
environments (containers), each running services of your choice. They
could range from an sshd for users, to apache servers, to database
servers, to users' vnc sessions, etc.

Now comes the time when you really need to take the machine down, for
whatever reason. With c/r of live connections you can live-migrate
these containers to another machine (on the same subnet) that will
"steal" the IP as well, and voila - no service disruption.

Such scenarios are the focus of Alexey.

I'm also very interested in these scenarios, and I'm _also_ thinking
of other scenarios, where either (a) an entire container is not
necessary (example: a user runs a long computation on a laptop and
wants to save it before a reboot), or (b) the program would like to
adjust its state relative to when it was saved (example: change the
location of an output log file depending on the machine on which
it is running).

Unfortunately, if we plan for and require, as per Alexey, that c/r
work only for whole containers, these two cases will not be
addressed.

Oren.

>> However, there are many applications that can withstand connection
>> loss without crashing. They simply retry (web browser, irc client,
>> db clients). With time, there may be more applications that are
>> 'c/r-aware'.
>>
> 
> HPC jobs are definitely good candidates.
> 
>> Moreover, in some cases you could, on restart, use a wrapper to
>> create a new connection to somewhere (*), then ask restart(2) to
>> use that socket instead of the original, such that from the user
>> point of view things continue to work well, transparently.
>>
> 
> Yes.
> 
>> (*) that somewhere, could be the original peer, or another server,
>> if it has a way to somehow continue a cut connection, or a special
>> wrapper server that you write for that purpose.
>>
>> Oren.
>>