[Devel] Re: [RFC v14][PATCH 00/54] Kernel based checkpoint/restart
Serge E. Hallyn
serue at us.ibm.com
Tue May 5 06:49:20 PDT 2009
Quoting Louis Rilling (Louis.Rilling at kerlabs.com):
> On 04/05/09 8:01 -0500, Serge E. Hallyn wrote:
> > Quoting Oren Laadan (orenl at cs.columbia.edu):
> > > > I see one drawback with this approach if you allow checkpoint of an
> > > > application that is not isolated in a container. In that case, you may
> > > > want to select which IPC objects to dump, rather than dumping every IPC
> > > > object living in the system. Indeed, this is why we have chosen in
> > > > Kerrighed to checkpoint IPC objects independently of tasks, since we
> > > > currently have no container/namespace support.
> > >
> > > I assume that in this case it will be the application itself that
> > > will somehow tell the system which specific sysvipc objects (ids) it
> > > cares about.
> > >
> > > (I'm not sure how the system would otherwise know what to dump and
> > > what to leave out).
> > >
> > > I originally proposed the construct of cradvise() syscall to handle
> > > exactly those cases where the application would like to advise the
> > > kernel about certain resources. So, extending the previous example,
> > > a task may call something like:
> > >
> > > cradvise(CHECKPOINT_SYSVIPC_SHM, false); /* generally skip shm */
> > > cradvise(CHECKPOINT_SYSVIPC_SHMID, id, true); /* but include this */
> > >
> > > or:
> > > cradvise(CHECKPOINT_SYSVIPC_SHM, true); /* generally include shm */
> > > cradvise(CHECKPOINT_SYSVIPC_SHMID, id, false); /* but skip this */
> > >
> > > Anyway, these are just examples of the concept and what sort of generic
> > > interface can be used to implement it; don't pick on the details...
> > >
> > > Oren.
> >
> > Oren, I have to be honest: I could of course be wrong, but imo there
> > is 0 chance of such a bigger-and-uglier-than-ioctl syscall as cradvise
> > being accepted upstream. There may be good uses for it, but I think
> > it's worthwhile thinking of ways around it whenever possible.
> >
> > In this particular case, wouldn't it be better to do something like:
> >
> > 1. freeze + checkpoint full application + container (== C1)
> > 2. continue application, which does a clone(CLONE_COPYIPC) (*1)
> > 3. application removes all shms except the one to be
> > checkpointed
> > 4. freeze + checkpoint application again ( == C2)
> > 5. restart application from C1
> >
>
> Besides COW issues mentioned by Oren in his reply, this approach does not
> seem to provide the required flexibility. The point is to avoid checkpointing
> some IPC objects together with the application,
... avoided at step 3 ...
> but we still need those IPC
> objects, and the application still uses them.
... step 5 ...
> Moreover, on restart the
> administrator should be able to first install the required IPC objects, e.g.
> re-create them from scratch, or restore them from another checkpoint, and second
> restart the application, linking it to the previously
> re-created/restored/whatever SHMs.
Of course he can do that.
Anyway I'm not setting off to implement the clone(CLONE_COPYIPC)
functionality, and Oren might be right that cradvise would
be deemed different from ioctl. I just thought I'd give a
warning, and (being a productive type :) give an alternative...

By the way, another alternative to all of the cradvise()
stuff is to have userspace programs carve up your checkpoint
images. It's been talked about before, but I believe Nathan
in particular is worried about what this says about the
kernel-user API.
-serge
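
For readers following the cradvise() examples quoted above, the "class-wide
default plus per-id exception" semantics can be modelled in plain user-space
C. This is only a sketch of the selection logic, not a kernel interface:
cradvise() was a proposal, and every name below (struct shm_policy,
policy_set_default, policy_set_id, policy_should_checkpoint, MAX_OVERRIDES)
is hypothetical. It also assumes that, absent any advice, every segment is
included, and that changing the default leaves earlier per-id advice in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OVERRIDES 16  /* arbitrary limit, chosen for the sketch */

/* Hypothetical user-space model of the advice a task might give. */
struct shm_policy {
    bool include_by_default;  /* models cradvise(CHECKPOINT_SYSVIPC_SHM, flag) */
    size_t n_overrides;
    struct {
        int shmid;
        bool include;         /* models cradvise(CHECKPOINT_SYSVIPC_SHMID, id, flag) */
    } overrides[MAX_OVERRIDES];
};

/* Assumption: with no advice given, every segment is checkpointed. */
void policy_init(struct shm_policy *p)
{
    assert(p != NULL);
    p->include_by_default = true;
    p->n_overrides = 0;
}

/* Set the class-wide default; existing per-id advice is kept. */
void policy_set_default(struct shm_policy *p, bool include)
{
    assert(p != NULL);
    p->include_by_default = include;
}

/* Record or update advice for one id. Returns 0, or -1 if the table is full. */
int policy_set_id(struct shm_policy *p, int shmid, bool include)
{
    assert(p != NULL);
    for (size_t i = 0; i < p->n_overrides; i++) {
        if (p->overrides[i].shmid == shmid) {
            p->overrides[i].include = include;
            return 0;
        }
    }
    if (p->n_overrides == MAX_OVERRIDES)
        return -1;
    p->overrides[p->n_overrides].shmid = shmid;
    p->overrides[p->n_overrides].include = include;
    p->n_overrides++;
    return 0;
}

/* Per-id advice wins; otherwise fall back to the class-wide default. */
bool policy_should_checkpoint(const struct shm_policy *p, int shmid)
{
    assert(p != NULL);
    for (size_t i = 0; i < p->n_overrides; i++) {
        if (p->overrides[i].shmid == shmid)
            return p->overrides[i].include;
    }
    return p->include_by_default;
}
```

With this model, the first quoted example corresponds to
policy_set_default(&p, false) followed by policy_set_id(&p, id, true), and
the second to the same two calls with the flags swapped.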