[Devel] [PATCH vz9 v1 06/63] dm-ploop: convert enospc handling to use lockless lists

Tue Feb 4 10:55:13 MSK 2025

On 2/4/25 15:44, Pavel Tikhomirov wrote:
> 
> 
> On 2/3/25 15:57, Alexander Atanasov wrote:
>> On 3.02.25 9:49, Pavel Tikhomirov wrote:
>>>
>>>
>>> On 2/3/25 15:42, Alexander Atanasov wrote:
>>>> On 3.02.25 9:27, Pavel Tikhomirov wrote:
>>>>>
>>>>>
>>>>> On 2/3/25 14:45, Alexander Atanasov wrote:
>>>>>> On 3.02.25 8:01, Pavel Tikhomirov wrote:
>>>>>>>
>>>>>>>> @@ -166,7 +171,6 @@ static bool ploop_try_delay_enospc(struct 
>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>       bool delayed = true;
>>>>>>>>       unsigned long flags;
>>>>>>>> -    spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>       if (unlikely(ploop->wants_suspend)) {
>>>>>>>>           delayed = false;
>>>>>>>>           goto unlock;
>>>>>>>> @@ -176,10 +180,11 @@ static bool ploop_try_delay_enospc(struct 
>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>       pr_err_once(PL_FMT("underlying disk is almost full"),
>>>>>>>>           ploop_device_name(ploop));
>>>>>>>> +    spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>       ploop->event_enospc = true;
>>>>>>>> -    list_add_tail(&pio->list, &ploop->enospc_pios);
>>>>>>>> -unlock:
>>>>>>>>       spin_unlock_irqrestore(&ploop->deferred_lock, flags);
>>>>>>>> +    llist_add((struct llist_node *)(&pio->list), &ploop- 
>>>>>>>> >enospc_pios);
>>>>>>>> +unlock:
>>>>>>>>       if (delayed)
>>>>>>>>           mod_timer(&ploop->enospc_timer, jiffies + 
>>>>>>>> PLOOP_ENOSPC_TIMEOUT);
>>>>>>>
>>>>>>> Can you please explain why we need to take defered_lock around 
>>>>>>> ploop- >event_enospc setting after your patch? (It looks that 
>>>>>>> this lock does not do anything now.)
>>>>>>>
>>>>>>
>>>>>> see static int ploop_get_event(...), without lock event can be missed
>>>>>
>>>>> That is not an explanation. How exactly can it be missed?
>>>>
>>>>
>>>
>>> Other cpu can set "ploop->event_enospc = true" here before lock (i.e. 
>>> it was set to true twice), that would lead to emiting only one event 
>>> for two sets.
>>>
>>>>          spin_lock_irq(&ploop->deferred_lock);
>>>>          if (ploop->event_enospc) {                    <- while emit 
>>>> other cpu can set
>>>>                  ret = (DMEMIT("event_ENOSPC\n")) ? 1 : 0;
>>>>                  if (ret)
>>>>                          ploop->event_enospc = false; <- next 
>>>> cleaars here -> second event lost
>>>
>>> Not lost, you just emit once for two sets, which as you can see from 
>>> above comment is possible even with locks everywhere. So your 
>>> explanation does not prove that we need this lock AFAICS.
>>
>>
>> It is okay to emit once for two events set befere we EMIT.
>> It is not okay to miss an event set after we emit.
> 
> What is the difference between event set before EMIT and event set just 
> after emit before reseting event_enospc to false? Why should they be 
> handled differently?
> 
>>
>> 1. Event(s) one or more
>> 2. Start to EMIT
>> 3. Userspace gets first event does what it does to handle it.
>> 4. A new event comes - userspace is not notified
> 
> You would probably agree if all happens without concurrency exactly as 
> you've wrote, a new event will also be notified.
> 
> So we talk about the case of concurrency, that means that in one thread 
> event_enospc = true in ploop_try_delay_enospc was set after DMEMIT in 
> ploop_get_event and before event_enospc = false in ploop_get_event in 
> other thread.
> 
> In this case userspace is notified about both events in one emit, just 
> about one event it is notified slightly earlier when it actually set 
> true to variable. The question is - Why is it bad? I don't see why.

Ok in case ploop_get_event hangs after emit and before setting false, 
without lock multiple ploop_try_delay_enospc can happen much later but 
don't trigger emit.

I agree that lock helps with that case, and makes all algorithm kinda 
more interactive.

> 
>> 5. End of EMIT
>>
>> at 4. we have a lost event - an event userspace will not receive.
>> Cost is minimal as we do not expect to see much of this so the 
>> correctness is more important here.
>>
>>
>> And this can happen for sure - we have a lots of writes , they get 
>> queued. Userspace frees some space , writes are retried but again 
>> space is over because they need more space than it is freed. Without 
>> the second event we get stuck here, userspace will not free more space.
>> And pios will hang an retry.
>>
>>>
>>>>          }
>>>>          spin_unlock_irq(&ploop->deferred_lock);
>>>
>>>
>>>
>>>>
>>>>
>>>>
>>>
>>
> 

-- 
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.