[CRIU] [PATCH v2 3/3] aio: Restore aio ring content
Pavel Emelyanov
xemul at virtuozzo.com
Mon Mar 21 02:49:08 PDT 2016
On 03/21/2016 11:06 AM, Kirill Tkhai wrote:
>
>
> On 21.03.2016 09:19, Pavel Emelyanov wrote:
>> On 03/18/2016 01:31 PM, Kirill Tkhai wrote:
>>>
>>>
>>> On 17.03.2016 22:34, Pavel Emelyanov wrote:
>>>>
>>>>>>>> I'm not sure this is safe. How would pre-dumps act on rings?
>>>>>>>
>>>>>>> Could you please explain what kind of problems are possible here?
>>>>>>> I don't see a memory predump.
>>>>>>
>>>>>> The vma_entry_is_private() check is too generic. E.g. such vmas are being
>>>>>> soft-dirty-tracked. Do we want the same for AIO rings? I bet we don't :)
>>>>>
>>>>> To userspace, the AIO ring buffer looks like anonymous memory. There is no
>>>>> difference between them: it's writable and modifiable. So if we track
>>>>> anonymous memory, we have to track the AIO ring buffer too.
>>>>
>>>> Will it get tracked by the kernel's soft-dirty bits? I heavily doubt it.
>>>
>>> It's tracked. Below is the proof.
>>>
>>> #define _GNU_SOURCE
>>> #include <stdio.h>
>>> #include <unistd.h>
>>> #include <sys/syscall.h>
>>> #include <linux/aio_abi.h>
>>> #include <fcntl.h>
>>> #include <inttypes.h>
>>>
>>> /* Raw syscall wrapper; "static" avoids C99 inline linkage problems. */
>>> static inline int io_setup(unsigned nr, aio_context_t *ctxp)
>>> {
>>> 	return syscall(__NR_io_setup, nr, ctxp);
>>> }
>>>
>>> #define PME_SOFT_DIRTY (1ULL << 55)
>>> #define PAGE_SHIFT 12
>>> #define PAGE_SIZE (1UL << PAGE_SHIFT)
>>> #define u64 uint64_t
>>>
>>> int main()
>>> {
>>> 	aio_context_t ctx = 0;
>>> 	int ret, fd, pm2;
>>> 	u64 pmap = 0;
>>>
>>> 	ret = io_setup(128, &ctx);
>>> 	if (ret < 0) {
>>> 		perror("io_setup error");
>>> 		return -1;
>>> 	}
>>>
>>> 	/* Writing "4" clears the soft-dirty bits of the whole task. */
>>> 	fd = open("/proc/self/clear_refs", O_WRONLY);
>>> 	if (fd < 0) {
>>> 		perror("clear_refs open");
>>> 		return -1;
>>> 	}
>>>
>>> 	if (write(fd, "4", 1) != 1) {
>>> 		perror("clear_refs write");
>>> 		return -1;
>>> 	}
>>> 	close(fd);
>>>
>>> 	pm2 = open("/proc/self/pagemap", O_RDONLY);
>>> 	if (pm2 < 0) {
>>> 		perror("Can't open pagemap file");
>>> 		return -1;
>>> 	}
>>>
>>> 	/* Touch the first ring page from userspace, then read its pagemap entry. */
>>> 	((char *)ctx)[0] = '\0';
>>> 	lseek(pm2, ctx / PAGE_SIZE * sizeof(u64), SEEK_SET);
>>> 	ret = read(pm2, &pmap, sizeof(pmap));
>>> 	if (ret != sizeof(pmap))
>>> 		perror("Read pmap err!");
>>> 	close(pm2);
>>> 	if (pmap & PME_SOFT_DIRTY)
>>> 		printf("Dirty tracking exists on aio\n");
>>> 	else
>>> 		printf("Shit happens\n");
>>
>> That's not proof. The kernel also updates the ring when completing requests,
>> but you don't check this case.
>
> Those are "in-flight" requests. We did not handle them before, and the patch
> does not change anything in that respect.
No, it's not in-flight. You make a pre-dump and pick up the ring page, then
the app does an aio req, it gets completed and the kernel updates the ring.
You go with the 2nd pre-dump or dump and the ring page is _not_ marked as
soft-dirty. Boom! You've just lost the completed request.
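
The scenario above can be checked with a variation of the quoted test. What
follows is only a minimal sketch, not part of the patch: it clears the
soft-dirty bits (standing in for the first pre-dump), lets the kernel complete
one request through raw io_submit/io_getevents syscalls without userspace ever
writing to the ring, and then reads the ring page's pagemap entry. The function
names, the use of /proc/self/exe as the file to read, and the
AIO_RING_DEMO_MAIN guard are assumptions made for illustration. The result
depends on the running kernel, so no expected output is given.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define PME_SOFT_DIRTY (1ULL << 55)	/* soft-dirty bit of a pagemap entry */

/* True if a raw /proc/<pid>/pagemap entry has the soft-dirty bit set. */
static int pme_soft_dirty(uint64_t pme)
{
	return (pme & PME_SOFT_DIRTY) != 0;
}

/* Byte offset inside the pagemap file of the entry describing addr. */
static off_t pagemap_offset(uintptr_t addr, long page_size)
{
	assert(page_size > 0);
	return (off_t)(addr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/*
 * Returns 1 if the first ring page is soft-dirty after a kernel-side
 * completion, 0 if it is not, -1 if the experiment could not be run.
 */
int ring_dirty_after_completion(void)
{
	static char buf[512];
	aio_context_t ctx = 0;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	uint64_t pme = 0;
	int fd, cr, pm, ret = -1;

	if (syscall(__NR_io_setup, 128, &ctx) < 0)
		return -1;

	fd = open("/proc/self/exe", O_RDONLY);	/* any readable regular file */
	cr = open("/proc/self/clear_refs", O_WRONLY);
	pm = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0 || cr < 0 || pm < 0)
		goto out;

	/* Reset soft-dirty bits; this stands in for the first pre-dump. */
	if (write(cr, "4", 1) != 1)
		goto out;

	/* Submit one read and reap it; only the kernel writes to the ring. */
	memset(&cb, 0, sizeof(cb));
	cb.aio_lio_opcode = IOCB_CMD_PREAD;
	cb.aio_fildes = fd;
	cb.aio_buf = (uintptr_t)buf;
	cb.aio_nbytes = sizeof(buf);
	if (syscall(__NR_io_submit, ctx, 1, cbs) != 1)
		goto out;
	if (syscall(__NR_io_getevents, ctx, 1, 1, &ev, NULL) != 1)
		goto out;

	/* The context id is the userspace address of the ring. */
	if (pread(pm, &pme, sizeof(pme),
		  pagemap_offset((uintptr_t)ctx, sysconf(_SC_PAGESIZE))) != sizeof(pme))
		goto out;
	ret = pme_soft_dirty(pme);
out:
	if (fd >= 0)
		close(fd);
	if (cr >= 0)
		close(cr);
	if (pm >= 0)
		close(pm);
	syscall(__NR_io_destroy, ctx);
	return ret;
}

#ifdef AIO_RING_DEMO_MAIN
int main(void)
{
	int r = ring_dirty_after_completion();

	if (r < 0)
		fprintf(stderr, "experiment could not be run here\n");
	else
		printf("ring page soft-dirty after completion: %s\n", r ? "yes" : "no");
	return r < 0;
}
#endif
```

Building with -DAIO_RING_DEMO_MAIN gives a standalone program. A "no" answer
would match the concern raised here: a completion written by the kernel is
invisible to soft-dirty tracking.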
> I'm going to add a plugin to parasite_check_aios() and wait for in-flight
> requests from there.
>
>> Anyway, I don't think treating the aio ring buffer as regular anonymous
>> memory is a good idea.
>
> What do you suggest? Add vma_entry_is_private(entry) | vma_entry_is(entry, VMA_AREA_AIORING)
> in every place we use it?
Not every. My current opinion is that soft-dirty tracking should NOT
be done for AIO rings.
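
One way to picture that split is sketched below, under loud assumptions: the
struct, the flag values and every sk_* name are hypothetical and are not
CRIU's real definitions. The idea is that AIO rings count as areas whose
content is dumped, but their soft-dirty bits are never trusted, so a ring page
is written out on every pass.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status bits; the real CRIU flags and values differ. */
enum {
	SK_VMA_ANON_PRIVATE = 1u << 0,
	SK_VMA_AIORING      = 1u << 1,
};

/* Hypothetical stand-in for a VMA entry. */
struct sk_vma {
	unsigned int status;
};

/* Areas whose page content goes into the image at all. */
static bool sk_vma_has_content(const struct sk_vma *v)
{
	return (v->status & (SK_VMA_ANON_PRIVATE | SK_VMA_AIORING)) != 0;
}

/* Soft-dirty bits are trusted for private memory only, never for rings. */
static bool sk_vma_trust_soft_dirty(const struct sk_vma *v)
{
	if (v->status & SK_VMA_AIORING)
		return false;
	return (v->status & SK_VMA_ANON_PRIVATE) != 0;
}

/* Decide whether one page has to be written out on this pass. */
static bool sk_page_must_be_dumped(const struct sk_vma *v, bool soft_dirty,
				   bool first_pass)
{
	if (!sk_vma_has_content(v))
		return false;
	if (first_pass || !sk_vma_trust_soft_dirty(v))
		return true;
	return soft_dirty;
}

#ifdef SK_VMA_DEMO_MAIN
int main(void)
{
	struct sk_vma ring = { SK_VMA_AIORING };
	struct sk_vma anon = { SK_VMA_ANON_PRIVATE };

	/* Second pass, page not marked soft-dirty in either area. */
	printf("ring: %d, anon: %d\n",
	       sk_page_must_be_dumped(&ring, false, false),
	       sk_page_must_be_dumped(&anon, false, false));
	return 0;
}
#endif
```

With this shape, only the places that consult soft-dirty state need to know
about rings; the other users of the private-memory check stay as they are.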
-- Pavel