[Devel] [RFC PATCH vz9 v6 56/62] dm-ploop: support REQ_FUA for data pios

Alexander Atanasov alexander.atanasov at virtuozzo.com
Fri Dec 20 16:13:57 MSK 2024


On 20.12.24 14:51, Andrey Zhadchenko wrote:
>
>
> On 12/13/24 15:15, Alexander Atanasov wrote:
>> On 13.12.24 15:22, Andrey Zhadchenko wrote:
>>>
>>>
>>> On 12/13/24 14:12, Alexander Atanasov wrote:
>>>> On 13.12.24 14:58, Andrey Zhadchenko wrote:
>>>>> No, REQ_FUA does not oblige us to make sync. REQ_FUA is translated 
>>>>> into IOCB_DSYNC. It is IOCB layer job to datasync this exact request.
>>>>> Also REQ_FUA usually is coupled with REQ_PREFLUSH and we already 
>>>>> handle that.
>>>>>
>>>>
>>>> We do this only for metadata in previous patches - pure data pios 
>>>> do not handle it. And you are right about REQ_PREFLUSH but i think 
>>>> the vfs layer will handle it along the sync - which is translated 
>>>> to PREFLUSH+FUA
>>>>
>>>
>>> I think in "[RFC PATCH vz9 v6 17/62] dm-ploop: set IOCB_DSYNC on all 
>>> FUA requests" we just save rq->cmd_flags and set IOCB_DSYNC if 
>>> initial request contained REQ_FUA. Does this not cover pure data pios?
>>>
>>> I suppose you are talking about an extra code to mark piwb if one of 
>>> data pios with REQ_FUA triggered metadata change.
>>
>> I checked now and i think that patch would stay but i need to fix the
>> patch that copies to cmd_flags to bi_op.
>> I see REQ_FUA is not cleared anywhere but it should be. why?
>> split_and_chain_pio will clone it to all resulting split pios,
>> in turn ploop_call_rw_iter will set IOCB_DSYNC on all pios from that 
>> split, which results in a vfs_fsync (in an aio worker, so one by one).
>> But we want to do only one vfs_sync for the whole split, right before 
>> we call endio_cb.
>
> Are you sure that aio worker does the sync on every FUA request? I 
> think it is up to the driver in the end.
>

iirc I tracked it down to which was called on every iocb that was sync, 
which was set via DSYNC.

static void aio_fsync_work(struct work_struct *work)

         iocb->ki_res.res = vfs_fsync(iocb->fsync.file, 
iocb->fsync.datasync);



However this is more important :


fs/iomap/direct-io.c:


                 /* for data sync or sync, we need sync completion 
processing */
                 if (iocb->ki_flags & IOCB_DSYNC)
                         dio->flags |= IOMAP_DIO_NEED_SYNC;



ssize_t iomap_dio_complete(struct iomap_dio *dio)

{

        ....

         if (ret > 0) {
                 iocb->ki_pos += ret;

                 /*
                  * If this is a DSYNC write, make sure we push it to stable
                  * storage now that we've written data.
                  */
                 if (dio->flags & IOMAP_DIO_NEED_SYNC)
                         ret = generic_write_sync(iocb, ret);
                 if (ret > 0)
                         ret += dio->done_before;
         }

}


Which is resorting to sync_write and we want to do only dio all the way 
to avoid having dirty pages from buffered io. (we get ENOTBLKDEV due to 
dirty page we get otherwise)


> But if if does, we probably need to add sync() for split requests and 
> do not add anything for single-pio (just leave REQ_FUA).


What i did is to maskout REQ_FUA on the bi_op that is used to init a 
split pio.

We need to sync before we call endio on the initial PIO which is done 
after the whole split is ready, i.e. the original pio is first.


-- 
Regards,
Alexander Atanasov



More information about the Devel mailing list