[Devel] Re: build breaks when checkpoint unimplemented by arch
Oren Laadan
orenl at cs.columbia.edu
Tue Jul 7 12:03:22 PDT 2009
Nathan Lynch wrote:
> Oren Laadan <orenl at cs.columbia.edu> writes:
>> On Tue, 7 Jul 2009, Nathan Lynch wrote:
>>
>>> Oren Laadan <orenl at cs.columbia.edu> writes:
>>>> That's what I tried initially, but the problem is that sigset_t may
>>>> be defined differently for userspace - see /usr/include/asm/sigset_t.h.
>>>> In fact, for x86_32, it is different, defined as 'unsigned long'
>>>> (and NSIG defined as 32, so only 32 bits).
>>> I noticed this, but I figured only the kernel definition was salient.
>>> Apart from debugging checkpoint/restart, why would userspace need the
>>> definition of struct ckpt_hdr_sigset?
>> I expect user space tools to at least:
>>
>> - Assist in debugging c/r
>>
>> - Assist users in reporting problems with c/r (especially since they
>> themselves do not debug or hack)
>>
>> - Convert checkpoint images from one kernel version to another
>>
>> - Provide information about a checkpoint image, and even allow its
>> manipulation. This can assist developers in debugging their programs
>> (e.g. to debug a crash you need to run a program for 30 minutes so it
>> sets up its state; instead of repeatedly running it, you run it once,
>> checkpoint, and then debug from a restarted version. A tool could
>> allow you to peek/poke inside the checkpoint and even modify data in
>> it).
>>
>> - Or a tool that converts a checkpoint image to a core dump so it
>> can be inspected with gdb.
>>
>> I'm pretty sure others will find other uses for it...
>
> But I asked specifically about ckpt_hdr_sigset.
>
>
>>> For that matter, why would userspace need the definitions of most of the
>>> structures in checkpoint_hdr.h? (Again, debugging purposes don't count:
>>> ckptinfo or similar developer utilities can be included with the
>>> kernel.)
>> Keeping the checkpoint header format understandable by user space (and
>> immune to 32-64 variations) has been a requirement since day 1.
>
> I guess I wasn't around that day. It seems backwards to expose the
> format of every checkpoint record in the ABI regardless of whether
> plausible use cases exist. Linux has a well-established pattern of
> introducing interfaces without sufficient testing or documentation[1],
> and I expect C/R will adhere to tradition. Making the ABI obese in the
> hope of anticipating every conceivable use will just provide more
> opportunities to screw up.
>
> [1] http://userweb.kernel.org/~mtk/papers/lce2007/What_we_lose_without_words.pdf

I could not agree more!

The intent of exposure to userspace is not to establish an ABI, but
solely to allow *specialized* c/r-related user tools to understand
such data, per kernel version.

On the contrary: it is expected to change between kernel versions
and break compatibility with older versions, on a regular basis.
That is why we plan to do conversion of checkpoint images between
kernel versions in userspace.

I view it as a "window" for userspace to glance at how the checkpoint
image for a specific kernel version is defined. And it comes as is,
no-strings-attached, with nothing but a promise to likely break it
on the next release.

This raises the question: how do we make sure that this message is
clear and is not misinterpreted? Or (and I'm no API expert) perhaps
there is a better way...
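
To make the sigset_t concern from earlier in the thread concrete, here
is a minimal sketch of a signal-set record whose layout does not depend
on how sigset_t is defined for the code that reads it. Every name in it
(ckpt_sketch_sigset, CKPT_SKETCH_NSIG, ckpt_sketch_pack, ckpt_sketch_has)
is hypothetical and chosen only for illustration; this is not the actual
definition of struct ckpt_hdr_sigset, and the 64-signal width is an
assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the image always stores 64 signal bits, on every arch. */
#define CKPT_SKETCH_NSIG 64

/* Hypothetical on-image record: signal n is stored in bit (n - 1). */
struct ckpt_sketch_sigset {
	uint8_t bits[CKPT_SKETCH_NSIG / 8];
};

/* The record size must not follow sizeof(long) or sizeof(sigset_t). */
_Static_assert(sizeof(struct ckpt_sketch_sigset) == 8,
	       "record must be exactly 8 bytes");

/* Copy the native set into the fixed-width record, one signal at a time. */
static void ckpt_sketch_pack(struct ckpt_sketch_sigset *out,
			     const sigset_t *in)
{
	memset(out, 0, sizeof(*out));
	for (int sig = 1; sig <= CKPT_SKETCH_NSIG; sig++) {
		/* sigismember() returns 1 only for valid, present signals */
		if (sigismember(in, sig) == 1)
			out->bits[(sig - 1) / 8] |=
				(uint8_t)(1u << ((sig - 1) % 8));
	}
}

/* Test one signal in the record without touching sigset_t at all. */
static int ckpt_sketch_has(const struct ckpt_sketch_sigset *set, int sig)
{
	assert(sig >= 1 && sig <= CKPT_SKETCH_NSIG);
	return (set->bits[(sig - 1) / 8] >> ((sig - 1) % 8)) & 1;
}
```

A tool reading such a record only needs ckpt_sketch_has(), so it gets
the same answer whether it was built as 32-bit or 64-bit, and whatever
its own sigset_t looks like.
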
Oren.