[CRIU] [PATCH] aio: Restore aio ring content

Pavel Emelyanov xemul at virtuozzo.com
Thu Mar 10 11:51:51 PST 2016


On 03/10/2016 05:39 PM, Kirill Tkhai wrote:
> 
> 
> On 10.03.2016 16:46, Pavel Emelyanov wrote:
>>
>>> @@ -1038,8 +1039,10 @@ long __export_restore_task(struct task_restore_args *args)
>>>  			goto core_restore_end;
>>>  		}
>>>  
>>> -		if (ctx == raio->addr) /* Lucky bastards we are! */
>>> -			continue;
>>> +		count = raio->len/sizeof(unsigned long);
>>> +		for (i = 0; i < count; i++)
>>> +			((unsigned long *)ctx)[i] = ring[i];
>>> +		sys_munmap(ring, raio->len);
>>
>> Ring pages are connected to in-kernel structures. Where's the guarantee that
>> unmap + mmap of new stuff keeps this linkage?
> 
> Ring pages contain a header and an array of statuses. The header is:
> 
> struct aio_ring {
>         unsigned        id;     /* kernel internal index number */
>         unsigned        nr;     /* number of io_events */
>         unsigned        head;   /* Written to by userland or under ring_lock
>                                  * mutex by aio_read_events_ring(). */
>         unsigned        tail;
> 
>         unsigned        magic;
>         unsigned        compat_features;
>         unsigned        incompat_features;
>         unsigned        header_length;  /* size of aio_ring */
> 
> 
>         struct io_event         io_events[0];
> };
> 
> It does not contain any linkage information. All pointers are relative, id is
> the user address of the ring buffer. nr and the last 4 members are not used by
> the kernel.

fs/aio.c's aio_setup_ring():

        ctx->mmap_base = do_mmap_pgoff(ctx->aio_ring_file, 0, ctx->mmap_size,
                                       PROT_READ | PROT_WRITE,
                                       MAP_SHARED, 0, &unused);

So the ring's mapping is a shared file mapping of the special aio file, with all
the consequences of that.
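
For reference, the copy step from the hunk above can be written as a standalone
helper. This is only a sketch: the name copy_ring_words() is made up here and is
not CRIU code, and it assumes both buffers are at least len bytes long and
aligned for unsigned long. It writes into the existing mapping in place, so the
mapping itself is never replaced:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of CRIU): copy saved ring content into
 * the live ring one unsigned long at a time, like the loop in the patch.
 * Bytes past the last whole word are not copied.
 * Returns the number of words copied.
 */
static size_t copy_ring_words(void *live, const void *saved, size_t len)
{
	unsigned long *dst = live;
	const unsigned long *src = saved;
	size_t i, count = len / sizeof(unsigned long);

	/* Assumption: the caller passes valid, non-overlapping buffers. */
	assert(live != NULL && saved != NULL);

	for (i = 0; i < count; i++)
		dst[i] = src[i];

	return count;
}
```

In the hunk above the destination would be ctx and the source ring, with ring
unmapped afterwards.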

> The array of statuses, which follows the header in the ring buffer, does not
> contain linkage information either.
> 
> The patch just reinitializes the ring buffer. A task may do this during its
> normal life: write anything to the ring buffer, and the kernel is OK with this.
> The kernel uses ->head only.
> 
>> Other than this, why can't we write directly into the region created by io_setup?
> 
> We have to restore the ring buffer content from somewhere. The patch makes the
> ring buffer memory get saved like anonymous memory and restored the same way,
> so we don't need to restore the ring buffer from a file on disk.
> 
>>>  
>>>  		/*
>>>  		 * If we failed to get the proper nr_req right and
>>>