[Devel] [PATCH vz9 v1 06/63] dm-ploop: convert enospc handling to use lockless lists
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Tue Feb 4 11:55:30 MSK 2025
On 2/4/25 16:37, Alexander Atanasov wrote:
> On 4.02.25 10:19, Pavel Tikhomirov wrote:
>>
>>
>> On 2/4/25 15:57, Alexander Atanasov wrote:
>>> On 4.02.25 9:44, Pavel Tikhomirov wrote:
>>>>
>>>>
>>>> On 2/3/25 15:57, Alexander Atanasov wrote:
>>>>> On 3.02.25 9:49, Pavel Tikhomirov wrote:
>>>>>>
>>>>>>
>>>>>> On 2/3/25 15:42, Alexander Atanasov wrote:
>>>>>>> On 3.02.25 9:27, Pavel Tikhomirov wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/3/25 14:45, Alexander Atanasov wrote:
>>>>>>>>> On 3.02.25 8:01, Pavel Tikhomirov wrote:
>>>>>>>>>>
>>>>>>>>>>> @@ -166,7 +171,6 @@ static bool ploop_try_delay_enospc(struct
>>>>>>>>>>> ploop_rq *prq, struct pio *pio)
>>>>>>>>>>> bool delayed = true;
>>>>>>>>>>> unsigned long flags;
>>>>>>>>>>> - spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>>>> if (unlikely(ploop->wants_suspend)) {
>>>>>>>>>>> delayed = false;
>>>>>>>>>>> goto unlock;
>>>>>>>>>>> @@ -176,10 +180,11 @@ static bool
>>>>>>>>>>> ploop_try_delay_enospc(struct ploop_rq *prq, struct pio *pio)
>>>>>>>>>>> pr_err_once(PL_FMT("underlying disk is almost full"),
>>>>>>>>>>> ploop_device_name(ploop));
>>>>>>>>>>> + spin_lock_irqsave(&ploop->deferred_lock, flags);
>>>>>>>>>>> ploop->event_enospc = true;
>>>>>>>>>>> - list_add_tail(&pio->list, &ploop->enospc_pios);
>>>>>>>>>>> -unlock:
>>>>>>>>>>> spin_unlock_irqrestore(&ploop->deferred_lock, flags);
>>>>>>>>>>> + llist_add((struct llist_node *)(&pio->list), &ploop-
>>>>>>>>>>> >enospc_pios);
>>>>>>>>>>> +unlock:
>>>>>>>>>>> if (delayed)
>>>>>>>>>>> mod_timer(&ploop->enospc_timer, jiffies +
>>>>>>>>>>> PLOOP_ENOSPC_TIMEOUT);
>>>>>>>>>>
>>>>>>>>>> Can you please explain why we need to take defered_lock around
>>>>>>>>>> ploop- >event_enospc setting after your patch? (It looks that
>>>>>>>>>> this lock does not do anything now.)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> see static int ploop_get_event(...), without lock event can be
>>>>>>>>> missed
>>>>>>>>
>>>>>>>> That is not an explanation. How exactly can it be missed?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Other cpu can set "ploop->event_enospc = true" here before lock
>>>>>> (i.e. it was set to true twice), that would lead to emiting only
>>>>>> one event for two sets.
>>>>>>
>>>>>>> spin_lock_irq(&ploop->deferred_lock);
>>>>>>> if (ploop->event_enospc) { <- while
>>>>>>> emit other cpu can set
>>>>>>> ret = (DMEMIT("event_ENOSPC\n")) ? 1 : 0;
>>>>>>> if (ret)
>>>>>>> ploop->event_enospc = false; <- next
>>>>>>> cleaars here -> second event lost
>>>>>>
>>>>>> Not lost, you just emit once for two sets, which as you can see
>>>>>> from above comment is possible even with locks everywhere. So your
>>>>>> explanation does not prove that we need this lock AFAICS.
>>>>>
>>>>>
>>>>> It is okay to emit once for two events set befere we EMIT.
>>>>> It is not okay to miss an event set after we emit.
>>>>
>>>> What is the difference between event set before EMIT and event set
>>>> just after emit before reseting event_enospc to false? Why should
>>>> they be handled differently?
>>>
>>>
>>> it is a _new event_ and it needs to be emitted.
>>>
>>>
>>>>> 1. Event(s) one or more
>>>>> 2. Start to EMIT
>>>>> 3. Userspace gets first event does what it does to handle it.
>>>>> 4. A new event comes - userspace is not notified
>>>>
>>>> You would probably agree if all happens without concurrency exactly
>>>> as you've wrote, a new event will also be notified.
>>>
>>> We have concurency - two+ threads and userspace
>>>>
>>>> So we talk about the case of concurrency, that means that in one
>>>> thread event_enospc = true in ploop_try_delay_enospc was set after
>>>> DMEMIT in ploop_get_event and before event_enospc = false in
>>>> ploop_get_event in other thread.
>>>>
>>>> In this case userspace is notified about both events in one emit,
>>>> just about one event it is notified slightly earlier when it
>>>> actually set true to variable. The question is - Why is it bad? I
>>>> don't see why.
>>>
>>> Why do you think it is okay to notify only once for both events, even
>>> after second ocured after we send the notification?
>>> We can not make assumptions of how it will be processed.
>>
>> Because DMEMIT is just a scnprintf to result, which is AFAICS a buffer
>> inside dm_ioctl structure and it is allocated in ctl_ioctl-
>> >copy_params via kvmalloc on the same call-stack. So I think that this
>> emit is only actually acknowleged after ploop_get_event had finished.
>> So even for later event the emit is acknodleged after this event had
>> set event_enospc to true even without lock.
>
>
> If we are going to rely on this then we can remove all locks around
> using event_enospc. I prefer to keep them since it is not a hot path.
OK.
>
>>
>>>
>>>>
>>>>> 5. End of EMIT
>>>>>
>>>>> at 4. we have a lost event - an event userspace will not receive.
>>>>> Cost is minimal as we do not expect to see much of this so the
>>>>> correctness is more important here.
>>>>>
>>>>>
>>>>> And this can happen for sure - we have a lots of writes , they get
>>>>> queued. Userspace frees some space , writes are retried but again
>>>>> space is over because they need more space than it is freed.
>>>>> Without the second event we get stuck here, userspace will not free
>>>>> more space.
>>>>> And pios will hang an retry.
>>>>>
>>>>>>
>>>>>>> }
>>>>>>> spin_unlock_irq(&ploop->deferred_lock);
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.
More information about the Devel
mailing list