[Devel] [RFC PATCH vz9 v6 59/62] dm-ploop: fix how ENOTBLK is handled

Alexander Atanasov alexander.atanasov at virtuozzo.com
Tue Dec 17 16:59:24 MSK 2024


On 13.12.24 15:24, Andrey Zhadchenko wrote:
> 
> 
> On 12/13/24 14:20, Alexander Atanasov wrote:
>> On 13.12.24 15:17, Andrey Zhadchenko wrote:
>>>
>>>
>>> On 12/5/24 22:56, Alexander Atanasov wrote:
>>>> A direct IO write result of ENOTBLK or 0 (in the ext4 case) means
>>>> "retry the IO in buffered mode". We wrongly assumed that it was
>>>> a short write and handled it incorrectly.
>>>>
>>>> We can not retry in buffered mode because the code is not ready
>>>> for it, so take a different route. This error happens if
>>>> page invalidation fails, which is a very rare situation.
>>>> So call synchronize_rcu() and just resubmit the pio.
>>>>
>>>> https://virtuozzo.atlassian.net/browse/VSTOR-91821
>>>> Suggested-by: Alexey Kuznetsov <kuznet at virtuozzo.com>
>>>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>>>> ---
>>>>   drivers/md/dm-ploop-map.c | 25 ++++++++++++++++---------
>>>>   1 file changed, 16 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>>>> index a03e1af3fd87..482022d6b60b 100644
>>>> --- a/drivers/md/dm-ploop-map.c
>>>> +++ b/drivers/md/dm-ploop-map.c
>>>> @@ -1312,24 +1312,31 @@ static void ploop_data_rw_complete(struct pio *pio)
>>>>       bool completed;
>>>>       if (pio->ret != pio->bi_iter.bi_size) {
>>>> -        if (pio->ret >= 0) {
>>>> -            /* Partial IO */
>>>> -            WARN_ON_ONCE(pio->ret == 0);
>>>> -            /* Do not resubmit zero length pio */
>>>> +        if (pio->ret >= 0 || pio->ret == -ENOTBLK) {
>>>> +            /* Partial IO or request to retry in buffered mode */
>>>> +            if (pio->ret == 0 || pio->ret == -ENOTBLK) {
>>>> +                /*
>>>> +                 * ENOTBLK means we should retry in buffered io
>>>> +                 * but we can not, so try again in DIO
>>>> +                 * ext4 returns 0 for ENOTBLK
>>>> +                 */
>>>> +                struct ploop *ploop = pio->ploop;
>>>> +
>>>> +                PL_ERR("ret = 0 bi_size=%d\n", pio->bi_iter.bi_size);
>>>> +                synchronize_rcu();
>>>> +                ploop_queue_resubmit(pio);
>>>
>>> Can this re-occur the second or third time? Should we limit how many 
>>> times we can resubmit?
>>
>>
>> It should not; it happens very rarely. Even if we add a limit, there is
>> nothing else we can do in that case: we can not do buffered IO, which
>> would result in dirty pages and make this appear even more often.
>>
> 
> Well, I think we should end the pio with an error then. Otherwise we may
> have a request stuck forever.


I've added retry logic: the pio is ended with an IO error if it fails
3 times, so no request can get stuck.
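
Roughly, the decision looks like the sketch below. This is a standalone
user-space illustration, not the actual patch: every name in it
(sketch_pio, sketch_rw_complete, SKETCH_MAX_DIO_FAILURES, the action enum)
is hypothetical, and the exact way the 3 failures are counted is an
assumption. In the driver, the resubmit branch is where synchronize_rcu()
and ploop_queue_resubmit() would be called.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical limit: give up on the third "retry buffered" result. */
#define SKETCH_MAX_DIO_FAILURES 3

/* Hypothetical outcomes of a completed read/write request. */
enum sketch_action {
	SKETCH_COMPLETE,  /* whole request transferred */
	SKETCH_PARTIAL,   /* short transfer, 0 < ret < size */
	SKETCH_RESUBMIT,  /* "retry buffered" result, retry in direct IO */
	SKETCH_FAIL_EIO,  /* too many such results, end with -EIO */
	SKETCH_FAIL_ERR,  /* any other error, passed through unchanged */
};

/* Hypothetical stand-in for the fields of struct pio used here. */
struct sketch_pio {
	long ret;               /* bytes transferred or negative errno */
	size_t size;            /* bytes requested */
	unsigned int failures;  /* "retry buffered" results seen so far */
};

static enum sketch_action sketch_rw_complete(struct sketch_pio *pio)
{
	if (pio->ret >= 0 && (size_t)pio->ret == pio->size)
		return SKETCH_COMPLETE;

	/* ext4 reports the "retry in buffered mode" case as 0. */
	if (pio->ret == 0 || pio->ret == -ENOTBLK) {
		pio->failures++;
		if (pio->failures >= SKETCH_MAX_DIO_FAILURES) {
			/* Bounded: never loop forever on the same request. */
			pio->ret = -EIO;
			return SKETCH_FAIL_EIO;
		}
		return SKETCH_RESUBMIT;
	}

	return pio->ret > 0 ? SKETCH_PARTIAL : SKETCH_FAIL_ERR;
}
```

With this shape a request that keeps returning -ENOTBLK is resubmitted
twice and ended with -EIO on the third failure.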

-- 
Regards,
Alexander Atanasov


