[Devel] Re: [RFC v14][PATCH 00/54] Kernel based checkpoint/restart

Serge E. Hallyn serue at us.ibm.com
Tue May 5 06:49:20 PDT 2009


Quoting Louis Rilling (Louis.Rilling at kerlabs.com):
> On 04/05/09  8:01 -0500, Serge E. Hallyn wrote:
> > Quoting Oren Laadan (orenl at cs.columbia.edu):
> > > > I see one drawback with this approach if you allow checkpointing of an
> > > > application that is not isolated in a container. In that case, you may
> > > > want to select which IPC objects to dump, rather than dump all the IPC
> > > > objects living in the system. Indeed, this is why we have chosen in Kerrighed to
> > > > checkpoint IPC objects independently of tasks, since we have no
> > > > container/namespaces support currently.
> > > 
> > > I assume that in this case it will be the application itself that
> > > will somehow tell the system which specific sysvipc objects (ids) it
> > > cares about.
> > > 
> > > (I'm not sure how the system would otherwise know what to dump and
> > > what to leave out).
> > > 
> > > I originally proposed the construct of cradvise() syscall to handle
> > > exactly those cases where the application would like to advise the
> > > kernel about certain resources. So, extending the previous example,
> > > a task may call something like:
> > > 
> > >    cradvise(CHECKPOINT_SYSVIPC_SHM, false);  /* generally skip shm */
> > >    cradvise(CHECKPOINT_SYSVIPC_SHMID, id, true);  /* but include this */
> > > 
> > > or:
> > >    cradvise(CHECKPOINT_SYSVIPC_SHM, true);  /* generally include shm */
> > >    cradvise(CHECKPOINT_SYSVIPC_SHMID, id, false);  /* but skip this */
> > > 
> > > Anyway, these are just examples of the concept and what sort of generic
> > > interface can be used to implement it; don't pick on the details...
> > > 
> > > Oren.
> > 
> > Oren, I have to be honest:  I could of course be wrong, but imo there
> > is 0 chance of such a bigger-and-uglier-than-ioctl syscall as cradvise
> > being accepted upstream.  There may be good uses for it, but I think
> > it's worthwhile thinking of ways around it whenever possible.
> > 
> > In this particular case, wouldn't it be better to do something like:
> > 
> > 	1. freeze + checkpoint full application + container (== C1)
> > 	2. continue application, which does a clone(CLONE_COPYIPC) (*1)
> > 	3. application removes all shms except the one to be
> > 	checkpointed
> > 	4. freeze + checkpoint application again ( == C2)
> > 	5. restart application from C1
> > 
> 
> Besides COW issues mentioned by Oren in his reply, this approach does not
> seem to provide the required flexibility. The point is to avoid checkpointing
> some IPC objects together with the application,

... avoided at step 3 ...

> but we still need those IPC
> objects, and the application still uses them.

... step 5 ...

> Moreover, on restart the
> administrator should be able to first install the required IPC objects, e.g.
> re-create them from scratch, or restore them from another checkpoint, and second
> restart the application, linking it to the previously
> re-created/restored/whatever SHMs.

Of course he can do that.

Anyway I'm not setting off to implement the clone(CLONE_COPYIPC)
functionality, and Oren might be right that cradvise would
be deemed different from ioctl.  I just thought I'd give a
warning, and (being a productive type :) give an alternative...

By the way, another alternative to all of the cradvise()
stuff is to have userspace programs carve up your checkpoint
images.  It's been talked about before, but I believe Nathan
in particular is worried about what this implies for the
kernel-user API.
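
To make the "carve up the image in userspace" idea a bit more concrete,
here is a rough sketch in C.  It is purely illustrative: the record
layout (struct rec), the policy structure (struct shm_policy) and the
function names are all made up for this mail and do not correspond to
the image format in the patch set.  The policy deliberately mirrors the
two cradvise() examples quoted above: a default for shm objects plus a
list of ids that get the opposite treatment.

```c
/*
 * Hypothetical sketch only: none of these types or names exist in the
 * checkpoint/restart patches.  A checkpoint image is modelled as a flat
 * array of records so that the filtering logic can be shown in isolation.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rec_type { REC_TASK, REC_SHM };	/* hypothetical record kinds */

struct rec {
	enum rec_type type;
	int id;			/* SysV shm id for REC_SHM, unused otherwise */
};

struct shm_policy {
	bool include_by_default;	/* "generally include/skip shm" */
	const int *except_ids;		/* ids that get the opposite treatment */
	size_t n_except;
};

/* Decide whether a single record survives the carve-up. */
bool keep_rec(const struct rec *r, const struct shm_policy *p)
{
	size_t i;

	if (r->type != REC_SHM)
		return true;		/* only shm records are filtered */

	for (i = 0; i < p->n_except; i++)
		if (p->except_ids[i] == r->id)
			return !p->include_by_default;

	return p->include_by_default;
}

/* Compact recs[] in place, preserving order; returns the new count. */
size_t carve_image(struct rec *recs, size_t n, const struct shm_policy *p)
{
	size_t in, out = 0;

	assert(p != NULL);
	assert(recs != NULL || n == 0);

	for (in = 0; in < n; in++)
		if (keep_rec(&recs[in], p))
			recs[out++] = recs[in];

	return out;
}
```

With include_by_default = false and except_ids = { id }, this keeps only
that one segment, like the first cradvise() pair above; flipping the
default gives the second pair.  A real tool would of course have to parse
whatever the actual image format turns out to be, which is exactly where
the kernel-user API worry comes in.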

-serge
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers