[Devel] [PATCH vz9 v2] ploop: port and fix the standby mode feature.
Kui Liu
Kui.Liu at acronis.com
Wed Nov 2 05:32:11 MSK 2022
-----Original Message-----
From: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
Date: Tuesday, 1 November 2022 at 11:57 PM
To: Konstantin Khorenko <khorenko at virtuozzo.com>, "Denis V. Lunev" <den at virtuozzo.com>, Devel <devel at openvz.org>
Cc: Kui Liu <Kui.Liu at acronis.com>, Alexey Kuznetsov <kuznet at acronis.com>
Subject: Re: [PATCH vz9 v2] ploop: port and fix the standby mode feature.
On 1.11.22 17:27, Konstantin Khorenko wrote:
> On 01.11.2022 08:36, Alexander Atanasov wrote:
>> On 31.10.22 21:27, Denis V. Lunev wrote:
>>> On 10/31/22 19:12, Alexander Atanasov wrote:
>
> [snip]
>
>>>> @@ -1163,6 +1166,26 @@ static void ploop_queue_resubmit(struct pio
>>>> *pio)
>>>> queue_work(ploop->wq, &ploop->worker);
>>>> }
>>>> +static void ploop_check_standby_mode(struct ploop *ploop, long res)
>>>> +{
>>>> + struct request_queue *q = ploop_blk_queue(ploop);
>>>> + int prev;
>>>> +
>>>> + if (!blk_queue_standby_en(q))
>>>> + return;
>>>> +
>>>> + /* move to standby if delta lease was stolen or mount is gone */
>>>> + if (res != -EBUSY && res != -ENOTCONN && res != -EIO)
>>>> + return;
>>>> +
>>>> + spin_lock_irq(&q->queue_lock);
>>>> + prev = blk_queue_flag_test_and_set(QUEUE_FLAG_STANDBY, q);
>>>> + spin_unlock_irq(&q->queue_lock);
>>>> +
>>>> + if (!prev)
>>>> + pr_info("ploop: switch into standby mode\n");
>
> i think we should have same messages both in vz7 and vz9.
>
> In vz7 it is:
> if (!prev)
> printk("ploop%d was switched into "
> "the standby mode\n", plo->index);
With the patch for the log messages it will be.
ploop: dm-NNNN: was switched into standby mode
i will change it to match the wording in vz7.
>
> Well, i understand that ploop id part of the message has changed - it's ok,
> but i suggest to keep the rest of the message the same.
>
> Or if we want to change them - to change them in both versions
> synchronously.
Let's make them like in vz7. I don't know why it was changed in the
original port to vz9 but if there is a reason we can change it too.
>
>>>
>>> some ploop identifier is needed here to determine device name
>>
>> I'll add the device name here. dm-ploop logs without device name
>> everywhere - i'll check that too.
>>
>
> [snip]
>
>>>> @@ -406,6 +408,14 @@ static int ploop_ctr(struct dm_target *ti,
>>>> unsigned int argc, char **argv)
>>>> ti->private = ploop;
>>>> ploop->ti = ti;
>>>> + if (blk_queue_standby_en(ploop_blk_queue(ploop))) {
>>>> + blk_queue_flag_clear(QUEUE_FLAG_STANDBY,
>>>> ploop_blk_queue(ploop));
>>>> + if (static_branch_unlikely(&ploop_standby_check)) {
>>> this point should NOT be reached. If it is reached - SCST has violated
>>> the protocol at my opinion
>>
>> If any other device wants to user the flag and it sets it before _ctr it
>> will catch it and enable the key. I can change it to a WARN_ON if _en is
>> set but key is not.
>
> 1. i think the behavior should be the same both in vz7 and vz9.
> In vz7 we don't have a semantic "well, under certain circumstances
> if you set QUEUE_FLAG_STANDBY_EN queue bit, it will automatically enable
> the global static key", so let's not implement this logic in vz9 as well.
the difference between ploop and dm-ploop is how the queue is created.
vz7 ploop creates the queue itself , so no one can pass the flag
vz9 receives an already created queue - so we must take care of the
flags, how to do so is still under question. But since the creation of
devices is different we can not keep it 1:1.
> 2. i do not think we need to clear QUEUE_FLAG_STANDBY bit in the request
> queue, because it should be zeroed on allocation:
>
> struct request_queue *blk_alloc_queue(int node_id)
> {
> ...
> q = kmem_cache_alloc_node(blk_requestq_cachep,
> GFP_KERNEL | __GFP_ZERO, node_id);
>
>
For both 1. and 2. - we do not know where the queue comes from - with
device mapper stacking devices one over other i think we might receive a
queue that is from another device and it can have the _en flag set.
Thinking about this
ploop0 over scst
ploop1 over ploop0
i have to verify if this is possible but i think it is.
If a queue gets moved it might not be freshly allocated and have the bit
set, so to be safe clear the flag - this is from the original patch and
i guess that is why it was there.
[LIU]: QUEUE_FLAG_STANDBY bit MUST be cleared here, as it is the only place where
the ploop device recovers from standby mode. In our use case, once a ploop device enters standby
mode, the userspace will initiate recovery by replacing the top delta file without destroying
the device, which is why the bit is clear in 'repalce_delta' in vz7.
And in vz9, replace delta is achieved by table reload, which reallocates a new ploop instance,
but keeps underlying mapped_device unchanged, thus request_queue unchanged, so we have
to clear the bit here.
If we receive a queue with the flag set and the static key not set
there are some options:
- enable the key and go on
- print a warning and clear the _en bit
- other ?
> 3. BTW, we talk here about standby mode for dm-ploop,
> and in vz9 we have dm-qcow2 as well (which is going to be default BTW),
> so should we introduce the standby mode for it as well?
If it is required for dm-qcow2 too we should think how to implement it.
[LIU] I think it depends whether we are going to use dm-qcow2 instead of ploop in SCST. I don’t see
why we need to move to dm-qcow2 unless ploop can and will be completely replaced by dm-qcow2.
--
Regards,
Alexander Atanasov
More information about the Devel
mailing list