[Devel] [PATCH vz9 v2] ploop: port and fix the standby mode feature.

Konstantin Khorenko khorenko at virtuozzo.com
Thu Nov 3 21:21:28 MSK 2022


On 02.11.2022 12:19, Konstantin Khorenko wrote:
> On 02.11.2022 03:32, Kui Liu wrote:
>> -----Original Message-----
>> From: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
>> Date: Tuesday, 1 November 2022 at 11:57 PM
>> To: Konstantin Khorenko <khorenko at virtuozzo.com>, "Denis V. Lunev" <den at virtuozzo.com>, Devel <devel at openvz.org>
>> Cc: Kui Liu <Kui.Liu at acronis.com>, Alexey Kuznetsov <kuznet at acronis.com>
>> Subject: Re: [PATCH vz9 v2] ploop: port and fix the standby mode feature.
>>
>>       On 1.11.22 17:27, Konstantin Khorenko wrote:
>>       > On 01.11.2022 08:36, Alexander Atanasov wrote:
>>       >> On 31.10.22 21:27, Denis V. Lunev wrote:
>>       >>> On 10/31/22 19:12, Alexander Atanasov wrote:
> 
> [snip]
> 
>>       >>>> @@ -406,6 +408,14 @@ static int ploop_ctr(struct dm_target *ti,
>>       >>>> unsigned int argc, char **argv)
>>       >>>>        ti->private = ploop;
>>       >>>>        ploop->ti = ti;
>>       >>>> +    if (blk_queue_standby_en(ploop_blk_queue(ploop))) {
>>       >>>> +        blk_queue_flag_clear(QUEUE_FLAG_STANDBY,
>>       >>>> ploop_blk_queue(ploop));
>>       >>>> +        if (static_branch_unlikely(&ploop_standby_check)) {
>>       >>> this point should NOT be reached. If it is reached - SCST has violated
>>       >>> the protocol at my opinion
>>       >>
>>       >> If any other device wants to user the flag and it sets it before _ctr it
>>       >> will catch it and enable the key. I can change it to a WARN_ON if _en is
>>       >> set but key is not.
>>       >
>>       > 1. i think the behavior should be the same both in vz7 and vz9.
>>       >     In vz7 we don't have a semantic "well, under certain circumstances
>>       > if you set QUEUE_FLAG_STANDBY_EN queue bit, it will automatically enable
>>       > the global static key", so let's not implement this logic in vz9 as well.
>>
>>       the difference between ploop and dm-ploop is how the queue is created.
>>       vz7 ploop creates the queue itself , so no one can pass the flag
>>       vz9 receives an already created queue - so we must take care of the
>>       flags, how to do so is still under question. But since the creation of
>>       devices is different we can not keep it 1:1.
>>
>>
>>       > 2. i do not think we need to clear QUEUE_FLAG_STANDBY bit in the request
>>       > queue, because it should be zeroed on allocation:
>>       >
>>       > struct request_queue *blk_alloc_queue(int node_id)
>>       > {
>>       > ...
>>       >          q = kmem_cache_alloc_node(blk_requestq_cachep,
>>       >                                  GFP_KERNEL | __GFP_ZERO, node_id);
>>       >
>>       >
>>
>>       For both 1. and 2. - we do not know where the queue comes from - with
>>       device mapper stacking devices one over other i think we might receive a
>>       queue that is from another device and it can have the _en flag set.
>>       Thinking about this
>>
>>       ploop0 over scst
>>       ploop1 over ploop0
>>
>>       i have to verify if this is possible but i think it is.
>>       If a queue gets moved it might not be freshly allocated and have the bit
>>       set, so to be safe clear the flag - this is from the original patch and
>>       i guess that is why it was there.
>>
>> [LIU]:  QUEUE_FLAG_STANDBY bit MUST be cleared here, as it is the only place where
>> the ploop device recovers from standby mode.  In our use case, once a ploop device enters standby
>>    mode, the userspace will initiate recovery by replacing the top delta file without destroying
>> the device, which is why the bit is clear in 'repalce_delta' in vz7.
>> And in vz9, replace delta is achieved by  table reload, which reallocates a new ploop instance,
>> but keeps underlying mapped_device unchanged, thus request_queue unchanged,   so we have
>> to clear the bit here.
> 
> Sasha, Liu, thank you for the explanation, now the necessity to drop the QUEUE_FLAG_STANDBY flag is
> clear to me.
> 
> Let's do the following:
> 
> * if we found QUEUE_FLAG_STANDBY set, we
>     - clear the QUEUE_FLAG_STANDBY flag
>     - if the ploop_standby_check static key is FALSE,
>       - switch ploop_standby_check static key to TRUE
>       - print a WARN_ON "XXX"
>     - if the QUEUE_FLAG_STANDBY_ON is not set
>       - set the QUEUE_FLAG_STANDBY_ON flag
>       - print a WARN_ON "YYY"
> 
> * if we found QUEUE_FLAG_STANDBY unset
>     - if (the QUEUE_FLAG_STANDBY_ON is set &&
>          ploop_standby_check static key is FALSE)
>       - switch ploop_standby_check static key to TRUE
>       - print a WARN_ON "ZZZ"

JFYI: had a talk with Sasha, decided:
if we detect an inconsistency with QUEUE_FLAG_STANDBY{,_EN} bits / global static key,
   - issue a warning
   - do not touch any bits / static key value

Rationale: in case we don't have SCST-based ploops over vStorage and
somehow we missed the fact that some other code used same bits QUEUE_FLAG_STANDBY{,_EN}, we will not 
break anything.

Well, this is not a robust solution though - in this case SCST will rewrite QUEUE_FLAG_STANDBY{,_EN} 
bits anyway, but
it looks like it does not matter much if we fix inconsistency or not - the things will go wrong anyway.

> Messages XXX/YYY/ZZZ should be different so we can fully understand the situation upon a message.
> 
>>       If we receive a queue with the flag set and the static key not set
>>       there are some options:
>>       - enable the key and go on
>>       - print a warning and clear the _en bit
>>       - other ?
>>
>>     
>>       > 3. BTW, we talk here about standby mode for dm-ploop,
>>       >     and in vz9 we have dm-qcow2 as well (which is going to be default BTW),
>>       >     so should we introduce the standby mode for it as well?
>>
>>       If it is required for dm-qcow2  too we should think how to implement it.
>>
>> [LIU] I think it depends whether we are going to use dm-qcow2  instead of ploop  in SCST. I don’t see
>> why we need to move to dm-qcow2 unless ploop can and will be completely replaced by dm-qcow2.
> 
> Well, in vz9 we'll have both dm-qcow2 and dm-ploop supported with default dm-qcow2.
> In vz10 we will probably drop dm-ploop, at least - we will try hard.
> 
> So please decide on your side when you are going to start supporting dm-qcow2 for iSCSI implementation.
> 
> Thank you.
> 
> --
> Konstantin


More information about the Devel mailing list