[Devel] [RFC PATCH vz9 v6 59/62] dm-ploop: fix how ENOTBLK is handled

Andrey Zhadchenko andrey.zhadchenko at virtuozzo.com
Fri Dec 13 16:24:25 MSK 2024



On 12/13/24 14:20, Alexander Atanasov wrote:
> On 13.12.24 15:17, Andrey Zhadchenko wrote:
>>
>>
>> On 12/5/24 22:56, Alexander Atanasov wrote:
>>> A direct IO write result of ENOTBLK, or 0 in the ext4 case, means
>>> "retry the IO in buffered mode". We wrongly assumed that it was
>>> a short write and handled it incorrectly.
>>>
>>> We cannot retry in buffered mode because the code is not ready
>>> for it, so take a different route. This error happens if
>>> page invalidation fails, which is a very rare situation.
>>> So call synchronize_rcu() and just resubmit the pio.
>>>
>>> https://virtuozzo.atlassian.net/browse/VSTOR-91821
>>> Suggested-by: Alexey Kuznetsov <kuznet at virtuozzo.com>
>>> Signed-off-by: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>>> ---
>>>   drivers/md/dm-ploop-map.c | 25 ++++++++++++++++---------
>>>   1 file changed, 16 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
>>> index a03e1af3fd87..482022d6b60b 100644
>>> --- a/drivers/md/dm-ploop-map.c
>>> +++ b/drivers/md/dm-ploop-map.c
>>> @@ -1312,24 +1312,31 @@ static void ploop_data_rw_complete(struct pio *pio)
>>>       bool completed;
>>>       if (pio->ret != pio->bi_iter.bi_size) {
>>> -        if (pio->ret >= 0) {
>>> -            /* Partial IO */
>>> -            WARN_ON_ONCE(pio->ret == 0);
>>> -            /* Do not resubmit zero length pio */
>>> +        if (pio->ret >= 0 || pio->ret == -ENOTBLK) {
>>> +            /* Partial IO or request to retry in buffered mode */
>>> +            if (pio->ret == 0 || pio->ret == -ENOTBLK) {
>>> +                /*
>>> +                 * ENOTBLK means we should retry in buffered io
>>> +                 * but we can not, so try again in DIO
>>> +                 * ext4 returns 0 for ENOTBLK
>>> +                 */
>>> +                struct ploop *ploop = pio->ploop;
>>> +
>>> +                PL_ERR("ret = 0 bi_size=%d\n", pio->bi_iter.bi_size);
>>> +                synchronize_rcu();
>>> +                ploop_queue_resubmit(pio);
>>
>> Can this recur a second or third time? Should we limit how many
>> times we can resubmit?
> 
> 
> It should not; it happens very rarely. Even if we add a limit, there is
> nothing else we can do in that case: we cannot do buffered IO, which would
> result in dirty pages and make this happen even more often.
> 

Well, I think we should end the pio with an error then. Otherwise we may
end up with a request that is stuck forever.
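
As a rough illustration of that idea, here is a userspace sketch of a bounded resubmit. It is not the dm-ploop code: `struct fake_pio`, its `retries` counter, `MAX_DIO_RETRIES`, `enum pio_action`, and `complete_rw()` are all hypothetical names, the cap of 3 is an arbitrary placeholder, and -EIO is only one possible choice of final error. It assumes a Linux `<errno.h>` that defines ENOTBLK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cap; the thread does not propose a specific value. */
#define MAX_DIO_RETRIES 3

/* Stand-in for the few pio fields the completion check looks at. */
struct fake_pio {
	long ret;             /* result of the last direct IO attempt */
	unsigned int size;    /* requested size, like bi_iter.bi_size */
	unsigned int retries; /* hypothetical per-pio resubmit counter */
};

enum pio_action { PIO_DONE, PIO_RESUBMIT, PIO_FAIL };

/*
 * Decide what to do with a completed pio.
 * A result of 0 or -ENOTBLK ("retry in buffered mode") is resubmitted
 * as direct IO, but only up to MAX_DIO_RETRIES times; after that the
 * pio is ended with an error instead of looping forever.
 */
static enum pio_action complete_rw(struct fake_pio *pio)
{
	assert(pio != NULL);

	if (pio->ret == (long)pio->size)
		return PIO_DONE;

	if (pio->ret == 0 || pio->ret == -ENOTBLK) {
		if (pio->retries >= MAX_DIO_RETRIES) {
			pio->ret = -EIO; /* assumed final error code */
			return PIO_FAIL;
		}
		pio->retries++;
		return PIO_RESUBMIT;
	}

	/*
	 * Partial IO (ret > 0) is assumed to be resubmitted for the
	 * remainder, which is not modelled here; any other negative
	 * value is treated as a plain error.
	 */
	return pio->ret > 0 ? PIO_RESUBMIT : PIO_FAIL;
}
```

With a counter like this, a pio that keeps coming back with 0 or -ENOTBLK is resubmitted at most MAX_DIO_RETRIES times and then completes with an error, so the request cannot stay stuck indefinitely.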

