[CRIU] [PATCH] aio: Restore aio ring content
Pavel Emelyanov
xemul at virtuozzo.com
Thu Mar 10 11:51:51 PST 2016
On 03/10/2016 05:39 PM, Kirill Tkhai wrote:
>
>
> On 10.03.2016 16:46, Pavel Emelyanov wrote:
>>
>>> @@ -1038,8 +1039,10 @@ long __export_restore_task(struct task_restore_args *args)
>>>  		goto core_restore_end;
>>>  	}
>>>
>>> -	if (ctx == raio->addr) /* Lucky bastards we are! */
>>> -		continue;
>>> +	count = raio->len/sizeof(unsigned long);
>>> +	for (i = 0; i < count; i++)
>>> +		((unsigned long *)ctx)[i] = ring[i];
>>> +	sys_munmap(ring, raio->len);
>>
>> Ring pages are connected to in-kernel structures. Where's the guarantee that
>> unmap + mmap of new stuff keeps this linkage?
>
> Ring pages contain a header and an array of statuses. The header is:
>
> struct aio_ring {
> 	unsigned	id;	/* kernel internal index number */
> 	unsigned	nr;	/* number of io_events */
> 	unsigned	head;	/* Written to by userland or under ring_lock
> 				 * mutex by aio_read_events_ring(). */
> 	unsigned	tail;
>
> 	unsigned	magic;
> 	unsigned	compat_features;
> 	unsigned	incompat_features;
> 	unsigned	header_length;	/* size of aio_ring */
>
>
> 	struct io_event	io_events[0];
> };
>
> It does not contain any linkage information. All pointers are relative, id is
> the user address of the ring buffer. nr and the last 4 members are not used by the kernel.

fs/aio.c's aio_setup_ring():

	ctx->mmap_base = do_mmap_pgoff(ctx->aio_ring_file, 0, ctx->mmap_size,
				       PROT_READ | PROT_WRITE,
				       MAP_SHARED, 0, &unused);

The ring's mapping is a file mapping of the special aio file, with all the
consequences of such a thing.

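
To illustrate what "consequences" means for a MAP_SHARED file mapping in general, here is a minimal userspace sketch. It does not touch aio at all: an ordinary temporary file stands in for the aio ring file (an assumption made purely for illustration), and linkage_survives_remap() is a hypothetical helper name. It shows that once a shared file mapping is unmapped and anonymous memory is mapped at the same address, writes to that address no longer reach the file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not CRIU or kernel code.
 * Returns 1 if a write at the original address still reaches the backing
 * file after munmap + anonymous mmap, 0 if it does not, -1 on setup error.
 * A plain temp file stands in for the aio ring file (assumption).
 */
int linkage_survives_remap(void)
{
	char path[] = "/tmp/ring-demo-XXXXXX";
	long psz = sysconf(_SC_PAGESIZE);
	char seen = 0;
	char *ring, *anon;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);		/* the file lives on while fd is open */
	if (ftruncate(fd, psz) < 0) {
		close(fd);
		return -1;
	}

	/* Shared file mapping: stores go to the file's page cache. */
	ring = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (ring == MAP_FAILED) {
		close(fd);
		return -1;
	}
	ring[0] = 'A';

	/* Replace it with anonymous memory at the very same address. */
	munmap(ring, psz);
	anon = mmap(ring, psz, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (anon == MAP_FAILED) {
		close(fd);
		return -1;
	}
	anon[0] = 'B';		/* same address, but no longer the file */

	/* Read the file itself to see which store it received. */
	if (pread(fd, &seen, 1, 0) != 1)
		seen = 0;

	munmap(anon, psz);
	close(fd);
	return seen == 'B';
}
```

On Linux I'd expect this to return 0: the file keeps the first store only. Whether the aio ring behaves the same way in every respect is exactly the open question here.
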
> The array of statuses, which follows the header in the ring buffer, does not
> contain linkage information either.
>
> The patch just reinitializes the ring buffer. A task may do this during its normal life:
> write anything to the ring buffer, and the kernel is OK with this. The kernel uses ->head only.
>
>> Other than this, why can't we write directly into the region created by io_setup?
>
> We have to restore the ring buffer content from somewhere. The patch makes ring buffer
> memory be saved like anonymous memory and restored the same way, so we don't need to
> restore the ring buffer from a file on disk.
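
For reference, the copy step from the quoted hunk can be sketched in plain userspace C as follows. This is only an approximation under stated assumptions: restore_ring_content() is a hypothetical name, munmap() replaces the restorer's sys_munmap(), and in the example both buffers are ordinary anonymous mappings rather than a real ring created by io_setup.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper mirroring the loop in the quoted hunk.
 * Copies len bytes, one unsigned long at a time, from the temporary copy
 * (saved) into the live region (live), then unmaps the temporary copy.
 * Trailing bytes beyond a whole number of words are not copied.
 * Returns the result of munmap(): 0 on success, -1 on error.
 */
int restore_ring_content(void *live, unsigned long *saved, size_t len)
{
	size_t i, count = len / sizeof(unsigned long);

	for (i = 0; i < count; i++)
		((unsigned long *)live)[i] = saved[i];

	/* The temporary copy is not needed once its content is in place. */
	return munmap(saved, len);
}
```

A caller would pass the address of the live ring as live and the mapping holding the restored content as saved. After the call, saved must not be touched again.
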
>
>>>
>>> /*
>>> * If we failed to get the proper nr_req right and
>>>