[CRIU] [PATCH 1/3] rst: Don't allocate page for child stack

Pavel Emelyanov xemul at parallels.com
Fri Sep 12 08:01:07 PDT 2014


On 09/12/2014 06:22 PM, Andrew Vagin wrote:
> Look at this code from glibc:
> ENTRY (__clone)
>         /* Sanity check arguments.  */
>         movq    $-EINVAL,%rax
>         testq   %rdi,%rdi               /* no NULL function pointers */
>         jz      SYSCALL_ERROR_LABEL
>         testq   %rsi,%rsi               /* no NULL stack pointers */
>         jz      SYSCALL_ERROR_LABEL
> 
>         /* Insert the argument onto the new stack.  */
>         subq    $16,%rsi
>         movq    %rcx,8(%rsi)

So? It will put arguments at the stack head. In my case the ca is
on top of the stack anyway, and the task pointer is on the top
of ca, so nothing will get corrupted.

> On Fri, Sep 12, 2014 at 06:13:29PM +0400, Andrew Vagin wrote:
>> On Fri, Sep 12, 2014 at 04:42:43PM +0400, Pavel Emelyanov wrote:
>>> When clone-ing kids we can set their stack on current, as
>>> it will anyway be COW-ed later.
>>
>> glibc() will modify a new stack. Look at this:
>>
>> [avagin at localhost ~]$ cat test.c 
>> #define _GNU_SOURCE
>> #include <unistd.h>
>> #include <sys/mman.h>
>> #include <signal.h>
>> #include <stdlib.h>
>>
>> int child(void *args)
>> {
>> 	exit(0);
>> }
>>
>> int main()
>> {
>> 	void *stack;
>>
>> 	stack = mmap(NULL, 4096, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>> 	clone(child, stack, SIGCHLD, 0);
>>
>> 	return 0;
>> }
>> [avagin at localhost ~]$ strace ./a.out 
>> execve("./a.out", ["./a.out"], [/* 60 vars */]) = 0
>> ...
>> mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa331ab5000
>> --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7fa331ab4ff8} ---
>> +++ killed by SIGSEGV (core dumped) +++
>> Segmentation fault (core dumped)
>>
>>
>>>
>>> Signed-off-by: Pavel Emelyanov <xemul at parallels.com>
>>> ---
>>>  cr-restore.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/cr-restore.c b/cr-restore.c
>>> index 4d5ccd5..ee15b6c 100644
>>> --- a/cr-restore.c
>>> +++ b/cr-restore.c
>>> @@ -945,7 +945,6 @@ static int restore_one_task(int pid, CoreEntry *core)
>>>  
>>>  /* All arguments should be above stack, because it grows down */
>>>  struct cr_clone_arg {
>>> -	char stack[PAGE_SIZE] __attribute__((aligned (8)));
>>>  	char stack_ptr[0];
>>>  	struct pstree_item *item;
>>>  	unsigned long clone_flags;
>>> @@ -993,8 +992,8 @@ static void maybe_clone_parent(struct pstree_item *item,
>>>  
>>>  static inline int fork_with_pid(struct pstree_item *item)
>>>  {
>>> -	int ret = -1, fd;
>>>  	struct cr_clone_arg ca;
>>> +	int ret = -1, fd;
>>>  	pid_t pid = item->pid.virt;
>>>  
>>>  	if (item->state != TASK_HELPER) {
>>> @@ -1037,6 +1036,8 @@ static inline int fork_with_pid(struct pstree_item *item)
>>>  	ca.item = item;
>>>  	ca.clone_flags = item->rst->clone_flags;
>>>  
>>> +	BUG_ON(ca.clone_flags & CLONE_VM);
>>> +
>>>  	pr_info("Forking task with %d pid (flags 0x%lx)\n", pid, ca.clone_flags);
>>>  
>>>  	if (!(ca.clone_flags & CLONE_NEWPID)) {
>>> -- 
>>> 1.8.4.2
>>>
>>> _______________________________________________
>>> CRIU mailing list
>>> CRIU at openvz.org
>>> https://lists.openvz.org/mailman/listinfo/criu
>> _______________________________________________
>> CRIU mailing list
>> CRIU at openvz.org
>> https://lists.openvz.org/mailman/listinfo/criu
> .
> 



More information about the CRIU mailing list