[Devel] [PATCH vz9 v1 06/63] dm-ploop: convert enospc handling to use lockless lists

Alexander Atanasov alexander.atanasov at virtuozzo.com
Tue Feb 4 11:37:39 MSK 2025


On 4.02.25 10:19, Pavel Tikhomirov wrote:
> 
> 
> On 2/4/25 15:57, Alexander Atanasov wrote:
>> On 4.02.25 9:44, Pavel Tikhomirov wrote:
>>>
>>>
>>> On 2/3/25 15:57, Alexander Atanasov wrote:
>>>> On 3.02.25 9:49, Pavel Tikhomirov wrote:
>>>>>
>>>>>
>>>>> On 2/3/25 15:42, Alexander Atanasov wrote:
>>>>>> On 3.02.25 9:27, Pavel Tikhomirov wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2/3/25 14:45, Alexander Atanasov wrote:
>>>>>>>> On 3.02.25 8:01, Pavel Tikhomirov wrote:
>>>>>>>>>
>>>>>>>>>> @@ -166,7 +171,6 @@ static bool ploop_try_delay_enospc(struct 
>>>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>>>       bool delayed = true;
>>>>>>>>>>       unsigned long flags;
>>>>>>>>>> -    spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>>>       if (unlikely(ploop->wants_suspend)) {
>>>>>>>>>>           delayed = false;
>>>>>>>>>>           goto unlock;
>>>>>>>>>> @@ -176,10 +180,11 @@ static bool 
>>>>>>>>>> ploop_try_delay_enospc(struct ploop_rq *prq, struct pio *pio)
>>>>>>>>>>       pr_err_once(PL_FMT("underlying disk is almost full"),
>>>>>>>>>>           ploop_device_name(ploop));
>>>>>>>>>> +    spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>>>       ploop->event_enospc = true;
>>>>>>>>>> -    list_add_tail(&pio->list, &ploop->enospc_pios);
>>>>>>>>>> -unlock:
>>>>>>>>>>       spin_unlock_irqrestore(&ploop->deferred_lock, flags);
>>>>>>>>>> +    llist_add((struct llist_node *)(&pio->list), &ploop- 
>>>>>>>>>> >enospc_pios);
>>>>>>>>>> +unlock:
>>>>>>>>>>       if (delayed)
>>>>>>>>>>           mod_timer(&ploop->enospc_timer, jiffies + 
>>>>>>>>>> PLOOP_ENOSPC_TIMEOUT);
>>>>>>>>>
>>>>>>>>> Can you please explain why we need to take defered_lock around 
>>>>>>>>> ploop- >event_enospc setting after your patch? (It looks that 
>>>>>>>>> this lock does not do anything now.)
>>>>>>>>>
>>>>>>>>
>>>>>>>> see static int ploop_get_event(...), without lock event can be 
>>>>>>>> missed
>>>>>>>
>>>>>>> That is not an explanation. How exactly can it be missed?
>>>>>>
>>>>>>
>>>>>
>>>>> Other cpu can set "ploop->event_enospc = true" here before lock 
>>>>> (i.e. it was set to true twice), that would lead to emiting only 
>>>>> one event for two sets.
>>>>>
>>>>>>          spin_lock_irq(&ploop->deferred_lock);
>>>>>>          if (ploop->event_enospc) {                    <- while 
>>>>>> emit other cpu can set
>>>>>>                  ret = (DMEMIT("event_ENOSPC\n")) ? 1 : 0;
>>>>>>                  if (ret)
>>>>>>                          ploop->event_enospc = false; <- next 
>>>>>> cleaars here -> second event lost
>>>>>
>>>>> Not lost, you just emit once for two sets, which as you can see 
>>>>> from above comment is possible even with locks everywhere. So your 
>>>>> explanation does not prove that we need this lock AFAICS.
>>>>
>>>>
>>>> It is okay to emit once for two events set befere we EMIT.
>>>> It is not okay to miss an event set after we emit.
>>>
>>> What is the difference between event set before EMIT and event set 
>>> just after emit before reseting event_enospc to false? Why should 
>>> they be handled differently?
>>
>>
>> it is a _new event_ and it needs to be emitted.
>>
>>
>>>> 1. Event(s) one or more
>>>> 2. Start to EMIT
>>>> 3. Userspace gets first event does what it does to handle it.
>>>> 4. A new event comes - userspace is not notified
>>>
>>> You would probably agree if all happens without concurrency exactly 
>>> as you've wrote, a new event will also be notified.
>>
>> We have concurency - two+ threads and userspace
>>>
>>> So we talk about the case of concurrency, that means that in one 
>>> thread event_enospc = true in ploop_try_delay_enospc was set after 
>>> DMEMIT in ploop_get_event and before event_enospc = false in 
>>> ploop_get_event in other thread.
>>>
>>> In this case userspace is notified about both events in one emit, 
>>> just about one event it is notified slightly earlier when it actually 
>>> set true to variable. The question is - Why is it bad? I don't see why.
>>
>> Why do you think it is okay to notify only once for both events, even 
>> after second ocured after we send the notification?
>> We can not make assumptions of how it will be processed.
> 
> Because DMEMIT is just a scnprintf to result, which is AFAICS a buffer 
> inside dm_ioctl structure and it is allocated in ctl_ioctl->copy_params 
> via kvmalloc on the same call-stack. So I think that this emit is only 
> actually acknowleged after ploop_get_event had finished. So even for 
> later event the emit is acknodleged after this event had set 
> event_enospc to true even without lock.


If we are going to rely on this then we can remove all locks around
using event_enospc. I prefer to keep them since it is not a hot path.

> 
>>
>>>
>>>> 5. End of EMIT
>>>>
>>>> at 4. we have a lost event - an event userspace will not receive.
>>>> Cost is minimal as we do not expect to see much of this so the 
>>>> correctness is more important here.
>>>>
>>>>
>>>> And this can happen for sure - we have a lots of writes , they get 
>>>> queued. Userspace frees some space , writes are retried but again 
>>>> space is over because they need more space than it is freed. Without 
>>>> the second event we get stuck here, userspace will not free more space.
>>>> And pios will hang an retry.
>>>>
>>>>>
>>>>>>          }
>>>>>>          spin_unlock_irq(&ploop->deferred_lock);
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

-- 
Regards,
Alexander Atanasov



More information about the Devel mailing list