[Devel] [PATCH vz9 v1 06/63] dm-ploop: convert enospc handling to use lockless lists

Tue Feb 4 11:55:30 MSK 2025

On 2/4/25 16:37, Alexander Atanasov wrote:
> On 4.02.25 10:19, Pavel Tikhomirov wrote:
>>
>>
>> On 2/4/25 15:57, Alexander Atanasov wrote:
>>> On 4.02.25 9:44, Pavel Tikhomirov wrote:
>>>>
>>>>
>>>> On 2/3/25 15:57, Alexander Atanasov wrote:
>>>>> On 3.02.25 9:49, Pavel Tikhomirov wrote:
>>>>>>
>>>>>>
>>>>>> On 2/3/25 15:42, Alexander Atanasov wrote:
>>>>>>> On 3.02.25 9:27, Pavel Tikhomirov wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/3/25 14:45, Alexander Atanasov wrote:
>>>>>>>>> On 3.02.25 8:01, Pavel Tikhomirov wrote:
>>>>>>>>>>
>>>>>>>>>>> @@ -166,7 +171,6 @@ static bool ploop_try_delay_enospc(struct 
>>>>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>>>>       bool delayed = true;
>>>>>>>>>>>       unsigned long flags;
>>>>>>>>>>> -    spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>>>>       if (unlikely(ploop->wants_suspend)) {
>>>>>>>>>>>           delayed = false;
>>>>>>>>>>>           goto unlock;
>>>>>>>>>>> @@ -176,10 +180,11 @@ static bool 
>>>>>>>>>>> ploop_try_delay_enospc(struct ploop_rq *prq, struct pio *pio)
>>>>>>>>>>>       pr_err_once(PL_FMT("underlying disk is almost full"),
>>>>>>>>>>>           ploop_device_name(ploop));
>>>>>>>>>>> +    spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>>>>       ploop->event_enospc = true;
>>>>>>>>>>> -    list_add_tail(&pio->list, &ploop->enospc_pios);
>>>>>>>>>>> -unlock:
>>>>>>>>>>>       spin_unlock_irqrestore(&ploop->deferred_lock, flags);
>>>>>>>>>>> +    llist_add((struct llist_node *)(&pio->list), &ploop- 
>>>>>>>>>>> >enospc_pios);
>>>>>>>>>>> +unlock:
>>>>>>>>>>>       if (delayed)
>>>>>>>>>>>           mod_timer(&ploop->enospc_timer, jiffies + 
>>>>>>>>>>> PLOOP_ENOSPC_TIMEOUT);
>>>>>>>>>>
>>>>>>>>>> Can you please explain why we need to take defered_lock around 
>>>>>>>>>> ploop- >event_enospc setting after your patch? (It looks that 
>>>>>>>>>> this lock does not do anything now.)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> see static int ploop_get_event(...), without lock event can be 
>>>>>>>>> missed
>>>>>>>>
>>>>>>>> That is not an explanation. How exactly can it be missed?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Other cpu can set "ploop->event_enospc = true" here before lock 
>>>>>> (i.e. it was set to true twice), that would lead to emiting only 
>>>>>> one event for two sets.
>>>>>>
>>>>>>>          spin_lock_irq(&ploop->deferred_lock);
>>>>>>>          if (ploop->event_enospc) {                    <- while 
>>>>>>> emit other cpu can set
>>>>>>>                  ret = (DMEMIT("event_ENOSPC\n")) ? 1 : 0;
>>>>>>>                  if (ret)
>>>>>>>                          ploop->event_enospc = false; <- next 
>>>>>>> cleaars here -> second event lost
>>>>>>
>>>>>> Not lost, you just emit once for two sets, which as you can see 
>>>>>> from above comment is possible even with locks everywhere. So your 
>>>>>> explanation does not prove that we need this lock AFAICS.
>>>>>
>>>>>
>>>>> It is okay to emit once for two events set befere we EMIT.
>>>>> It is not okay to miss an event set after we emit.
>>>>
>>>> What is the difference between event set before EMIT and event set 
>>>> just after emit before reseting event_enospc to false? Why should 
>>>> they be handled differently?
>>>
>>>
>>> it is a _new event_ and it needs to be emitted.
>>>
>>>
>>>>> 1. Event(s) one or more
>>>>> 2. Start to EMIT
>>>>> 3. Userspace gets first event does what it does to handle it.
>>>>> 4. A new event comes - userspace is not notified
>>>>
>>>> You would probably agree if all happens without concurrency exactly 
>>>> as you've wrote, a new event will also be notified.
>>>
>>> We have concurency - two+ threads and userspace
>>>>
>>>> So we talk about the case of concurrency, that means that in one 
>>>> thread event_enospc = true in ploop_try_delay_enospc was set after 
>>>> DMEMIT in ploop_get_event and before event_enospc = false in 
>>>> ploop_get_event in other thread.
>>>>
>>>> In this case userspace is notified about both events in one emit, 
>>>> just about one event it is notified slightly earlier when it 
>>>> actually set true to variable. The question is - Why is it bad? I 
>>>> don't see why.
>>>
>>> Why do you think it is okay to notify only once for both events, even 
>>> after second ocured after we send the notification?
>>> We can not make assumptions of how it will be processed.
>>
>> Because DMEMIT is just a scnprintf to result, which is AFAICS a buffer 
>> inside dm_ioctl structure and it is allocated in ctl_ioctl- 
>> >copy_params via kvmalloc on the same call-stack. So I think that this 
>> emit is only actually acknowleged after ploop_get_event had finished. 
>> So even for later event the emit is acknodleged after this event had 
>> set event_enospc to true even without lock.
> 
> 
> If we are going to rely on this then we can remove all locks around
> using event_enospc. I prefer to keep them since it is not a hot path.

OK.

> 
>>
>>>
>>>>
>>>>> 5. End of EMIT
>>>>>
>>>>> at 4. we have a lost event - an event userspace will not receive.
>>>>> Cost is minimal as we do not expect to see much of this so the 
>>>>> correctness is more important here.
>>>>>
>>>>>
>>>>> And this can happen for sure - we have a lots of writes , they get 
>>>>> queued. Userspace frees some space , writes are retried but again 
>>>>> space is over because they need more space than it is freed. 
>>>>> Without the second event we get stuck here, userspace will not free 
>>>>> more space.
>>>>> And pios will hang an retry.
>>>>>
>>>>>>
>>>>>>>          }
>>>>>>>          spin_unlock_irq(&ploop->deferred_lock);
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

-- 
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.