[Devel] Re: checkpoint/restart ABI
Serge E. Hallyn
serue at us.ibm.com
Tue Aug 12 07:49:05 PDT 2008
Quoting Peter Chubb (peterc at gelato.unsw.edu.au):
> >>>>> "Jeremy" == Jeremy Fitzhardinge <jeremy at goop.org> writes:
>
> Jeremy> Dave Hansen wrote:
> >> Arnd, Jeremy and Oren,
> >>
>
>
> Jeremy> * multiple processes * pipes * UNIX domain sockets * INET
> Jeremy> sockets (both inter and intra machine) * unlinked open files *
> Jeremy> checkpointing file content * closed files (ie, files which
> Jeremy> aren't currently open, but will be soon, esp tmp files) *
> Jeremy> shared memory * (Peter, what have I forgotten?)
>
> File sharing; multiple threads with weird sharing arrangements (think:
> clone with various parameters, followed by exec in some of the threads
> but not others); MERT/System V shared memory, semaphores and message
> queues; devices (audio, framebuffer, etc.), HugeTLBFS, NUMA issues
> (pinning, memory layout), processes being debugged (so,
> checkpoint/restart a gdb/target pair), futexes, etc., etc. Linux
> process state keeps expanding.
>
> Jeremy> Having gone through this before, I don't think an all-kernel
> Jeremy> solution can work except for the most simple cases.
>
> I agree ... it's better to put mechanisms into the kernel that can
> then be used by a user-space programme to actually do the
> checkpointing and restarting.
>
> Beefing up ptrace or fixing /proc to be a real debugging interface
> would be a start ... when you can get at *all* the info you need,
Except we don't really want to export all the info you need for a
complete restartable checkpoint. And especially not make it
generally writable.
We have also started down that path using ptrace (see cryo, at
git://git.sr71.net/~hallyn/cryodev.git).
Right before the containers mini-summit, where the general agreement was
that a complete in-kernel solution ought to be pursued, I had tried
a restart using a binary format that read a checkpoint file and used
cryo (userspace using ptrace) for the rest of the restart. The binary
format was there only because there was no other reasonable way to set
tsk->did_exec on restart.
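
To make the /proc side of this discussion concrete, here is a minimal
sketch of what a userspace tool can already read without any kernel
changes. It is not how cryo works; the function names and the shape of
the returned dictionary are assumptions made for illustration. It covers
only memory mappings and open descriptors, a small part of the state
listed in this thread, and it only reads.

```python
import os


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a dict.

    Line layout: address-range perms offset dev inode [pathname]
    """
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "offset": int(fields[2], 16),
        # Anonymous mappings have no pathname field.
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


def snapshot(pid):
    """Collect mappings and open file descriptors for one process (Linux only)."""
    base = "/proc/%d" % pid
    with open(base + "/maps") as f:
        mappings = [parse_maps_line(l) for l in f if l.strip()]
    fds = {}
    for name in os.listdir(base + "/fd"):
        try:
            fds[int(name)] = os.readlink(os.path.join(base, "fd", name))
        except OSError:
            continue  # descriptor went away between listdir and readlink
    return {"pid": pid, "mappings": mappings, "fds": fds}


if __name__ == "__main__":
    state = snapshot(os.getpid())
    print(len(state["mappings"]), "mappings,", len(state["fds"]), "open fds")
```

Anything that cannot be read this way is where ptrace or new kernel
interfaces would have to come in.
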
> quickly and easily, the userspace checkpoint falls out fairly
> naturally. You still have to work out an extensible file format to
> store stuff, and how to restore all that state you've so lovingly
> collected.
>
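
One common way to get an extensible file format is a stream of tagged,
length-prefixed records, so that a reader can skip record types it does
not understand. The sketch below is only an illustration of that idea,
not the format of any tool mentioned in this thread; the magic value,
the tags and the function names are all made up.

```python
import io
import struct

MAGIC = b"CKPT"                   # hypothetical magic value
VERSION = 1                       # hypothetical format version
HEADER = struct.Struct("<4sI")    # magic, format version
RECORD = struct.Struct("<4sI")    # 4-byte tag, payload length in bytes


def _read_exact(stream, n):
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("truncated checkpoint image")
    return data


def write_checkpoint(stream, records):
    """Write a header followed by (tag, payload) records."""
    stream.write(HEADER.pack(MAGIC, VERSION))
    for tag, payload in records:
        if len(tag) != 4:
            raise ValueError("tag must be exactly 4 bytes")
        stream.write(RECORD.pack(tag, len(payload)))
        stream.write(payload)


def read_checkpoint(stream, known_tags):
    """Return (version, restored, skipped).

    Records whose tag is not in known_tags are skipped using their
    length prefix, which is what lets newer writers add record types
    without breaking older readers.
    """
    magic, version = HEADER.unpack(_read_exact(stream, HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a checkpoint image")
    restored, skipped = {}, []
    while True:
        head = stream.read(RECORD.size)
        if not head:
            break  # clean end of stream
        if len(head) != RECORD.size:
            raise ValueError("truncated record header")
        tag, length = RECORD.unpack(head)
        payload = _read_exact(stream, length)
        if tag in known_tags:
            restored.setdefault(tag, []).append(payload)
        else:
            skipped.append(tag)
    return version, restored, skipped


if __name__ == "__main__":
    buf = io.BytesIO()
    write_checkpoint(buf, [(b"REGS", b"\x01\x02"), (b"FUTX", b"abc")])
    buf.seek(0)
    print(read_checkpoint(buf, {b"REGS"}))
```

This only addresses storing the state. Restoring it, which is the
harder half of the problem raised above, is not covered here.
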
> Jeremy> Lightweight filesystem checkpointing, such as btrfs provides,
> Jeremy> would seem like a powerful mechanism for handling a lot of the
> Jeremy> filesystem state problems. It would have been useful when we
> Jeremy> did this...
>
> And how! Saving bits of files was very time-consuming.
Yes, we're looking forward to using btrfs' snapshots :)
-serge
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers