[Devel] Re: [PATCH 1/3] powerpc: bare minimum checkpoint/restart implementation
Nathan Lynch
ntl at pobox.com
Thu Jan 29 13:40:35 PST 2009
Hey Oren, thanks for taking a look.
Oren Laadan wrote:
>
> Nathan Lynch wrote:
> >
> > What doesn't work:
> > * restarting a 32-bit task from a 64-bit task and vice versa
>
> Is there a test to bail if we attempt to checkpoint such tasks ?
No, but I'll add one if it looks too hard to fix for the next round.
> > +struct cr_hdr_cpu {
> > + struct pt_regs pt_regs;
>
> It has been suggested (as done in x86/32 code) not to use 'struct pt_regs'
> because it "can (and has) changed on x86" and because "it only container
> the registers that the kernel trashes, not all usermode registers".
>
> https://lists.linux-foundation.org/pipermail/containers/2008-August/012355.html
Yeah, I considered that discussion, but the situation is different for
powerpc (someone on linuxppc-dev smack me if I'm wrong here :)
pt_regs is part of the ABI, and it encompasses all user mode registers
except for floating point, which are handled separately.
> > + /* relevant fields from thread_struct */
> > + double fpr[32][TS_FPRWIDTH];
>
> Can TS_FPRWIDTH change between sub-archs or kernel versions ? If so, it
> needs to be stated explicitly.
>
> > + unsigned int fpscr;
> > + int fpexc_mode;
> > + /* unsigned int align_ctl; this is never updated? */
> > + unsigned long dabr;
>
> Are these fields always guarantee to compile to the same number of bytes
> regardless of 32/64 bit choice of compiler (or sub-arch?) ?
>
> In the x86(32/64) architecture we use types with explicit size such as
> __u32 and the like to ensure that it always compiled to the same
> size.
Yeah, I'll have to fix these up.
> > +static void cr_hdr_init(struct cr_hdr *hdr, __s16 type, __s16 len, __u32 parent)
> > +{
> > + hdr->type = type;
> > + hdr->len = len;
> > + hdr->parent = parent;
> > +}
> > +
>
> This function is rather generic and useful to non-arch-dependent and other
> architectures code. Perhaps put in a separate patch ?
Alright. By the way, why are cr_hdr->type and cr_hdr->len signed
types?
> > +int cr_write_cpu(struct cr_ctx *ctx, struct task_struct *t)
> > +{
> > + struct cr_hdr_cpu *cpu_hdr;
> > + struct pt_regs *pt_regs;
> > + struct cr_hdr cr_hdr;
> > + u32 parent;
> > + int ret;
> > +
> > + cpu_hdr = cr_hbuf_get(ctx, sizeof(*cpu_hdr));
> > + if (!cpu_hdr)
> > + return -ENOMEM;
> > +
> > + parent = task_pid_vnr(t);
> > +
> > + cr_hdr_init(&cr_hdr, CR_HDR_CPU, sizeof(*cpu_hdr), parent);
> > +
> > + /* pt_regs: GPRs, MSR, etc */
> > + pt_regs = task_pt_regs(t);
> > + cpu_hdr->pt_regs = *pt_regs;
> > +
> > + /* FP state */
> > + memcpy(cpu_hdr->fpr, t->thread.fpr, sizeof(cpu_hdr->fpr));
>
> As note above, is sizeof(cpu_hdr->fpr) the same on all chips ?
It can differ depending on kernel configuration.
> > +/* restart APIs */
> > +
>
> The restart APIs belong in a separate file: arch/powerpc/mm/restart.c
Explain why, please? This isn't a lot of code, and it seems likely
that checkpoint and restart paths will share data structures and tend
to be modified together over time.
> > + pr_debug("%s: unexpected thread_hdr contents: 0x%lx\n",
> > + __func__, (unsigned long)thread_hdr->unimplemented);
>
> Given the macro for 'pr_fmt' in include/linux/checkpoint.h, the use of
> __func__ is redunant.
It seems to me that defining your own pr_fmt in a "public" header like
that is inappropriate, or at least unconventional. Any file that
happens to include linux/checkpoint.h will have any prior definitions
of pr_fmt overridden, no?
> > + regs = task_pt_regs(current);
> > + *regs = cpu_hdr->pt_regs;
> > +
> > + regs->msr = sanitize_msr(regs->msr);
> > +
> > + /* FP state */
> > + memcpy(current->thread.fpr, cpu_hdr->fpr, sizeof(current->thread.fpr));
> > + current->thread.fpscr.val = cpu_hdr->fpscr;
> > + current->thread.fpexc_mode = cpu_hdr->fpexc_mode;
> > +
> > + /* debug registers */
> > + current->thread.dabr = cpu_hdr->dabr;
>
> I'm unfamiliar with powerpc; is it necessary to sanitize any of the registers
> here ? For instance, can the user cause harm with specially crafted values
> of some registers ?
I had this in mind with the treatment of MSR, but I'll check on the
others, thanks.
> > +int cr_read_mm_context(struct cr_ctx *ctx, struct mm_struct *mm, int rparent)
> > +{
> > + struct cr_hdr_mm_context *mm_hdr;
> > + int ret;
> > +
> > + mm_hdr = cr_hbuf_get(ctx, sizeof(*mm_hdr));
> > + if (!mm_hdr)
> > + return -ENOMEM;
> > +
> > + ret = cr_read_obj_type(ctx, mm_hdr, sizeof(*mm_hdr),
> > + CR_HDR_MM_CONTEXT);
> > + if (ret != rparent)
> > + goto out;
>
> Seems like 'ret' isn't set to an error value if the 'goto' executes.
It returns whatever error value cr_read_obj_type() returns. Hrm. I
guess if the image is garbage, cr_read_obj_type can potentially return
a non-error value that still isn't the desired value, is that right?
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
More information about the Devel
mailing list