[Devel] Re: [PATCH 1/1] fill vdso with syscall32_setup_pages if TIF_IA32 on x86_64

Serge E. Hallyn serue at us.ibm.com
Tue Feb 9 06:54:24 PST 2010


Quoting Oren Laadan (orenl at cs.columbia.edu):
> 
> 
> Serge E. Hallyn wrote:
> >Quoting Oren Laadan (orenl at cs.columbia.edu):
> >>
> >>Serge E. Hallyn wrote:
> >>>Quoting Oren Laadan (orenl at cs.columbia.edu):
> >>>>Serge E. Hallyn wrote:
> >>>>>Quoting Oren Laadan (orenl at cs.columbia.edu):
> >>>>>>Serge E. Hallyn wrote:
> >>>>>>>Quoting Oren Laadan (orenl at cs.columbia.edu):
> >>>>>>>>Serge E. Hallyn wrote:
> >>>>>>>>>Quoting Oren Laadan (orenl at cs.columbia.edu):
> >>>>>>>>>>Cool !
> >>>>>>>>>>
> >>>>>>>>>>So what do we have working now for 64 bit kernel (for 32 bit kernel
> >>>>>>>>>>we know it works...):
> >>>>>>>>>>
> >>>>>>>>>>	'restart'	checkpointed
> >>>>>>>>>>	 program	  program
> >>>>>>>>>>	----------------------------------------
> >>>>>>>>>>	  64bit		  64bit		-> works
> >>>>>>>>>>	  32bit		  32bit		-> works
> >>>>>>>>>>
> >>>>>>>>>>	  64bit		  32bit		-> ?????
> >>>>>s/?????/Rejected/
> >>>>>
> >>>>>CKPT_ARCH_ID is of course different for X86_32 than X86_64, so
> >>>>>we refuse restart in restore_read_header().
> >>>>>
> >>>>>-serge
> >>>>>
> >>>>lol ... that's actually funny !
> >>>>
> >>>>Anyway, in light of the IRC discussions, here are the cases again:
> >>>>
> >>>>
> >>>>original	original	restart		target
> >>>>program		kernel		program		kernel
> >>>>--------	---------	--------	--------
> >>>>64 bit		64 bit		64 bit		64 bit	  [0] works
> >>>>
> >>>>32 bit		32 bit		32 bit		32 bit	  [0] works
> >>>>32 bit		64 bit		32 bit		64 bit	  [0] works
> >>>>
> >>>>32 bit		32 bit		32 bit		64 bit	  [1]
> >>>>32 bit		64 bit		32 bit		32 bit	  [1]
> >>>>
> >>>>32 bit		any		64 bit		64 bit	  [2]
> >>>>64 bit		64 bit		32 bit		64 bit	  [2]
> >>>>
> >>>>[0] The first 3 cases are "homogeneous", with conditions equal at
> >>>>checkpoint and restart. AFAIK, they work.
> >>>>
> >>>>[1] The next two cases consider 32 bit program, and vary only the
> >>>>environment - the kernel may change from 32 to 64 or back. We want
> >>>>them to work.
> >>>>
> >>>>IIUC, your comment above means that they don't work because the
> >>>>CKPT_ARCH_ID is a mismatch. The fix should be trivial - either
> >>>>make 'restart' modify it, or make the kernel tolerate it.
>      ^^^^^^^^^^^^^^^^^^^^^^^^
> ---->
> 
> >>>Well, you'd think so, but we also check for uts->machine, and want
> >>>to eventually check for kernel config, both of which are obviously
> >>>different.
> >>Then we'll have to take that into account when we get to also
> >>check those other fields.
> >>
> >>>After I comment out the obvious offending checks, it still fails to
> >>>restart from x86-32->x86-64.  I can spend some time next week figuring
> >>>out what we're not quite doing right as there shouldn't be a
> >>>problem really.  But do we definitely want to go out of our way to try
> >>>and mask out the differences in this case, while trying to detect
> >>>cpu differences between two x86-32's for instance?
> >>I agree, there shouldn't be a problem really, and I expect this to
> >>be a very useful feature for migration/fault-tolerance.
> >
> >Maybe, but then perhaps this is the first case where we should be
> >using a userspace checkpoint image rewriter to help us out.  Otherwise
> >we'll need to hardcode in the kernel that a task which was
> >checkpointed on X86_32 should, on x86_64, have TIF_IA32 added to
> >the thread_flags but may be restarted;  etc.  Should be doable, but
> >kind of ugly...
> 
> Indeed. I offered that path above :)
> 
> Since we are going to need the bit-ness of a task for the tree
> creation as well, how about:
> 
> 1) Add the bit-ness property to the pids_arr[], e.g. as a flags
> field (we may need to use it for other stuff later).
> 
> 2) 'restart' already examines and possibly modifies pids_arr[],
> so in transition from 32->64 it will add that flag, and in the
> opposite transition it will check/remove that flag.
>
> 3) 'restart' will also change the header architecture as needed.
> 
> 4) The kernel will verify that the bitness reported in pids_arr[]
> is the same as the actual process. (This is just a sanity check,
> of course).
> 
> Later we'll also make 'restart' use that bit-ness information to
> decide whether an exec() is needed to change its own bit-ness.

It'll mean yet another arch-dependent hook used early in the
checkpoint path, but if we want to restart mixed-bit containers
I guess it's what we'll need.
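
To make the four quoted steps concrete, here is a rough userspace model of
them, not actual patch code. Every name in it (PID_FLAG_COMPAT, PidEntry,
Image, rewrite_for_target, kernel_verify) is made up for illustration, and
the arch strings merely stand in for CKPT_ARCH_ID:

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Hypothetical flag: the task is a 32-bit task on a 64-bit kernel
# (the role TIF_IA32 plays in thread_flags). Not a real constant.
PID_FLAG_COMPAT = 0x1


@dataclass(frozen=True)
class PidEntry:
    """Stand-in for one pids_arr[] element, with the new flags field (step 1)."""
    vpid: int
    flags: int = 0


@dataclass(frozen=True)
class Image:
    """Stand-in for the parts of a checkpoint image that 'restart' rewrites."""
    arch: str                      # "x86_32" or "x86_64"
    pids_arr: Tuple[PidEntry, ...]


def rewrite_for_target(image: Image, target_arch: str) -> Image:
    """Steps 2 and 3: adjust the per-task flags and the header architecture."""
    if image.arch == target_arch:
        return image
    if image.arch == "x86_32" and target_arch == "x86_64":
        # 32 -> 64: every task becomes a compat task, so add the flag.
        pids = tuple(replace(p, flags=p.flags | PID_FLAG_COMPAT)
                     for p in image.pids_arr)
    elif image.arch == "x86_64" and target_arch == "x86_32":
        # 64 -> 32: check that every task was compat, then remove the flag.
        native = [p.vpid for p in image.pids_arr
                  if not p.flags & PID_FLAG_COMPAT]
        if native:
            raise ValueError(f"64-bit tasks cannot restart on x86_32: {native}")
        pids = tuple(replace(p, flags=p.flags & ~PID_FLAG_COMPAT)
                     for p in image.pids_arr)
    else:
        raise ValueError(f"unsupported transition {image.arch} -> {target_arch}")
    return Image(arch=target_arch, pids_arr=pids)


def kernel_verify(entry: PidEntry, task_is_compat: bool) -> bool:
    """Step 4: sanity check that the recorded bit-ness matches the real task."""
    return bool(entry.flags & PID_FLAG_COMPAT) == task_is_compat
```

The 64->32 direction is the only one that can fail in userspace, which is
why the check sits before the flag is removed.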

Still I really don't think it's all that mean to just say we
don't support it:  at checkpoint we refuse with a meaningful
log message including the pids of tasks which are COMPAT, and the
end-user can use that info to checkpoint those applications
separately as subtrees, kill them, then checkpoint the container,
then restart the applications.
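
As a sketch of what that refusal could look like (a userspace model only;
check_no_compat and the message wording are invented here, and a real check
would live in the kernel checkpoint path):

```python
from typing import Iterable, Tuple


def check_no_compat(tasks: Iterable[Tuple[int, bool]]) -> Tuple[bool, str]:
    """Refuse a container checkpoint if any task is COMPAT.

    `tasks` yields (pid, is_compat) pairs. Returns (ok, log_message); the
    message lists the offending pids so the user can checkpoint those
    applications separately as subtrees.
    """
    compat = sorted(pid for pid, is_compat in tasks if is_compat)
    if not compat:
        return True, ""
    pids = ", ".join(str(pid) for pid in compat)
    return False, f"checkpoint refused: COMPAT tasks with pids {pids}"
```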

If to my surprise there turn out to be people who care, then
we can make the necessary changes to accommodate them.  But IMO
we have enough to worry about right now.

-serge