[Devel] [PATCH vz9 v2] ploop: port and fix the standby mode feature.

Kui Liu Kui.Liu at acronis.com
Wed Nov 2 05:32:11 MSK 2022



-----Original Message-----
From: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
Date: Tuesday, 1 November 2022 at 11:57 PM
To: Konstantin Khorenko <khorenko at virtuozzo.com>, "Denis V. Lunev" <den at virtuozzo.com>, Devel <devel at openvz.org>
Cc: Kui Liu <Kui.Liu at acronis.com>, Alexey Kuznetsov <kuznet at acronis.com>
Subject: Re: [PATCH vz9 v2] ploop: port and fix the standby mode feature.

    On 1.11.22 17:27, Konstantin Khorenko wrote:
    > On 01.11.2022 08:36, Alexander Atanasov wrote:
    >> On 31.10.22 21:27, Denis V. Lunev wrote:
    >>> On 10/31/22 19:12, Alexander Atanasov wrote:
    > 
    > [snip]
    > 
    >>>> @@ -1163,6 +1166,26 @@ static void ploop_queue_resubmit(struct pio 
    >>>> *pio)
    >>>>        queue_work(ploop->wq, &ploop->worker);
    >>>>    }
    >>>> +static void ploop_check_standby_mode(struct ploop *ploop, long res)
    >>>> +{
    >>>> +    struct request_queue *q = ploop_blk_queue(ploop);
    >>>> +    int prev;
    >>>> +
    >>>> +    if (!blk_queue_standby_en(q))
    >>>> +        return;
    >>>> +
    >>>> +    /* move to standby if delta lease was stolen or mount is gone */
    >>>> +    if (res != -EBUSY && res != -ENOTCONN && res != -EIO)
    >>>> +        return;
    >>>> +
    >>>> +    spin_lock_irq(&q->queue_lock);
    >>>> +    prev = blk_queue_flag_test_and_set(QUEUE_FLAG_STANDBY, q);
    >>>> +    spin_unlock_irq(&q->queue_lock);
    >>>> +
    >>>> +    if (!prev)
    >>>> +        pr_info("ploop: switch into standby mode\n");
    > 
    > i think we should have same messages both in vz7 and vz9.
    > 
    > In vz7 it is:
    >          if (!prev)
    >                  printk("ploop%d was switched into "
    >                         "the standby mode\n", plo->index);

    With the patch for the log messages it will be.

    ploop: dm-NNNN: was switched into standby mode

    i will change it to match the wording in vz7.

    > 
    > Well, i understand that ploop id part of the message has changed - it's ok,
    > but i suggest to keep the rest of the message the same.
    > 
    > Or if we want to change them - to change them in both versions 
    > synchronously.

    Let's make them like in vz7. I don't know why it was changed in the 
    original port to vz9 but if there is a reason we can change it too.

    > 
    >>>
    >>> some ploop identifier is needed here to determine device name
    >>
    >> I'll add the device name here. dm-ploop logs without device name
    >> everywhere - i'll check that too.
    >>
    > 
    > [snip]
    > 
    >>>> @@ -406,6 +408,14 @@ static int ploop_ctr(struct dm_target *ti,
    >>>> unsigned int argc, char **argv)
    >>>>        ti->private = ploop;
    >>>>        ploop->ti = ti;
    >>>> +    if (blk_queue_standby_en(ploop_blk_queue(ploop))) {
    >>>> +        blk_queue_flag_clear(QUEUE_FLAG_STANDBY, 
    >>>> ploop_blk_queue(ploop));
    >>>> +        if (static_branch_unlikely(&ploop_standby_check)) {
    >>> this point should NOT be reached. If it is reached - SCST has violated
    >>> the protocol at my opinion
    >>
    >> If any other device wants to user the flag and it sets it before _ctr it
    >> will catch it and enable the key. I can change it to a WARN_ON if _en is
    >> set but key is not.
    > 
    > 1. i think the behavior should be the same both in vz7 and vz9.
    >     In vz7 we don't have a semantic "well, under certain circumstances 
    > if you set QUEUE_FLAG_STANDBY_EN queue bit, it will automatically enable 
    > the global static key", so let's not implement this logic in vz9 as well.

    the difference between ploop and dm-ploop is how the queue is created.
    vz7 ploop creates the queue itself , so no one can pass the flag
    vz9 receives an already created queue - so we must take care of the 
    flags, how to do so is still under question. But since the creation of 
    devices is different we can not keep it 1:1.


    > 2. i do not think we need to clear QUEUE_FLAG_STANDBY bit in the request 
    > queue, because it should be zeroed on allocation:
    > 
    > struct request_queue *blk_alloc_queue(int node_id)
    > {
    > ...
    >          q = kmem_cache_alloc_node(blk_requestq_cachep,
    >                                  GFP_KERNEL | __GFP_ZERO, node_id);
    > 
    > 

    For both 1. and 2. - we do not know where the queue comes from - with 
    device mapper stacking devices one over other i think we might receive a 
    queue that is from another device and it can have the _en flag set.
    Thinking about this

    ploop0 over scst
    ploop1 over ploop0

    i have to verify if this is possible but i think it is.
    If a queue gets moved it might not be freshly allocated and have the bit 
    set, so to be safe clear the flag - this is from the original patch and 
    i guess that is why it was there.

[LIU]:  QUEUE_FLAG_STANDBY bit MUST be cleared here, as it is the only place where 
the ploop device recovers from standby mode.  In our use case, once a ploop device enters standby
 mode, the userspace will initiate recovery by replacing the top delta file without destroying 
the device, which is why the bit is clear in 'repalce_delta' in vz7. 
And in vz9, replace delta is achieved by  table reload, which reallocates a new ploop instance, 
but keeps underlying mapped_device unchanged, thus request_queue unchanged,   so we have
to clear the bit here.

    If we receive a queue with the flag set and the static key not set
    there are some options:
    - enable the key and go on
    - print a warning and clear the _en bit
    - other ?

  
    > 3. BTW, we talk here about standby mode for dm-ploop,
    >     and in vz9 we have dm-qcow2 as well (which is going to be default BTW),
    >     so should we introduce the standby mode for it as well?

    If it is required for dm-qcow2  too we should think how to implement it.

[LIU] I think it depends whether we are going to use dm-qcow2  instead of ploop  in SCST. I don’t see 
why we need to move to dm-qcow2 unless ploop can and will be completely replaced by dm-qcow2.  

    -- 
    Regards,
    Alexander Atanasov





More information about the Devel mailing list