[Devel] [PATCH vz9 v1 06/63] dm-ploop: convert enospc handling to use lockless lists
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Tue Feb 4 11:19:26 MSK 2025
On 2/4/25 15:57, Alexander Atanasov wrote:
> On 4.02.25 9:44, Pavel Tikhomirov wrote:
>>
>>
>> On 2/3/25 15:57, Alexander Atanasov wrote:
>>> On 3.02.25 9:49, Pavel Tikhomirov wrote:
>>>>
>>>>
>>>> On 2/3/25 15:42, Alexander Atanasov wrote:
>>>>> On 3.02.25 9:27, Pavel Tikhomirov wrote:
>>>>>>
>>>>>>
>>>>>> On 2/3/25 14:45, Alexander Atanasov wrote:
>>>>>>> On 3.02.25 8:01, Pavel Tikhomirov wrote:
>>>>>>>>
>>>>>>>>> @@ -166,7 +171,6 @@ static bool ploop_try_delay_enospc(struct
>>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>> bool delayed = true;
>>>>>>>>> unsigned long flags;
>>>>>>>>> - spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>> if (unlikely(ploop->wants_suspend)) {
>>>>>>>>> delayed = false;
>>>>>>>>> goto unlock;
>>>>>>>>> @@ -176,10 +180,11 @@ static bool ploop_try_delay_enospc(struct
>>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>> pr_err_once(PL_FMT("underlying disk is almost full"),
>>>>>>>>> ploop_device_name(ploop));
>>>>>>>>> + spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>> ploop->event_enospc = true;
>>>>>>>>> - list_add_tail(&pio->list, &ploop->enospc_pios);
>>>>>>>>> -unlock:
>>>>>>>>> spin_unlock_irqrestore(&ploop->deferred_lock, flags);
>>>>>>>>> + llist_add((struct llist_node *)(&pio->list), &ploop-
>>>>>>>>> >enospc_pios);
>>>>>>>>> +unlock:
>>>>>>>>> if (delayed)
>>>>>>>>> mod_timer(&ploop->enospc_timer, jiffies +
>>>>>>>>> PLOOP_ENOSPC_TIMEOUT);
>>>>>>>>
>>>>>>>> Can you please explain why we need to take defered_lock around
>>>>>>>> ploop- >event_enospc setting after your patch? (It looks that
>>>>>>>> this lock does not do anything now.)
>>>>>>>>
>>>>>>>
>>>>>>> see static int ploop_get_event(...), without lock event can be
>>>>>>> missed
>>>>>>
>>>>>> That is not an explanation. How exactly can it be missed?
>>>>>
>>>>>
>>>>
>>>> Other cpu can set "ploop->event_enospc = true" here before lock
>>>> (i.e. it was set to true twice), that would lead to emiting only one
>>>> event for two sets.
>>>>
>>>>> spin_lock_irq(&ploop->deferred_lock);
>>>>> if (ploop->event_enospc) { <- while
>>>>> emit other cpu can set
>>>>> ret = (DMEMIT("event_ENOSPC\n")) ? 1 : 0;
>>>>> if (ret)
>>>>> ploop->event_enospc = false; <- next
>>>>> cleaars here -> second event lost
>>>>
>>>> Not lost, you just emit once for two sets, which as you can see from
>>>> above comment is possible even with locks everywhere. So your
>>>> explanation does not prove that we need this lock AFAICS.
>>>
>>>
>>> It is okay to emit once for two events set befere we EMIT.
>>> It is not okay to miss an event set after we emit.
>>
>> What is the difference between event set before EMIT and event set
>> just after emit before reseting event_enospc to false? Why should they
>> be handled differently?
>
>
> it is a _new event_ and it needs to be emitted.
>
>
>>> 1. Event(s) one or more
>>> 2. Start to EMIT
>>> 3. Userspace gets first event does what it does to handle it.
>>> 4. A new event comes - userspace is not notified
>>
>> You would probably agree if all happens without concurrency exactly as
>> you've wrote, a new event will also be notified.
>
> We have concurency - two+ threads and userspace
>>
>> So we talk about the case of concurrency, that means that in one
>> thread event_enospc = true in ploop_try_delay_enospc was set after
>> DMEMIT in ploop_get_event and before event_enospc = false in
>> ploop_get_event in other thread.
>>
>> In this case userspace is notified about both events in one emit, just
>> about one event it is notified slightly earlier when it actually set
>> true to variable. The question is - Why is it bad? I don't see why.
>
> Why do you think it is okay to notify only once for both events, even
> after second ocured after we send the notification?
> We can not make assumptions of how it will be processed.
Because DMEMIT is just a scnprintf to result, which is AFAICS a buffer
inside dm_ioctl structure and it is allocated in ctl_ioctl->copy_params
via kvmalloc on the same call-stack. So I think that this emit is only
actually acknowleged after ploop_get_event had finished. So even for
later event the emit is acknodleged after this event had set
event_enospc to true even without lock.
>
>>
>>> 5. End of EMIT
>>>
>>> at 4. we have a lost event - an event userspace will not receive.
>>> Cost is minimal as we do not expect to see much of this so the
>>> correctness is more important here.
>>>
>>>
>>> And this can happen for sure - we have a lots of writes , they get
>>> queued. Userspace frees some space , writes are retried but again
>>> space is over because they need more space than it is freed. Without
>>> the second event we get stuck here, userspace will not free more space.
>>> And pios will hang an retry.
>>>
>>>>
>>>>> }
>>>>> spin_unlock_irq(&ploop->deferred_lock);
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.
More information about the Devel
mailing list