[Devel] [PATCH RH9] dm-ploop: port the standby mode feature.

Fri Oct 14 08:57:36 MSK 2022

-----Original Message-----
From: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
Date: Thursday, 13 October 2022 at 11:26 PM
To: Kui Liu <Kui.Liu at acronis.com>, Konstantin Khorenko <khorenko at virtuozzo.com>, Alexey Kuznetsov <kuznet at acronis.com>
Cc: Devel <devel at openvz.org>, Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
Subject: Re: [Devel] [PATCH RH9] dm-ploop: port the standby mode feature.

    On 13.10.22 16:33, Kui Liu wrote:
    > Hello,
    > 
    > First of all, please bear in mind that this patch is a port of a patchset currently present in VZ 7 kernel.
    > Original patches were added just to address a problem that would only happen in our particular use case,
    > where we need to use ploop devices as the back store for  iSCSI target, while ploop devices are backed
    > by files stored in vstorage filesystem.

    Ok, i don't quite get this yet. Since it is specific to vstorage can we 
    check the queue on init and see if it is backed by that iSCSI target and 
    set a static key or flag to enable the checks only if that is the case? 
    check can be done via queue->disk->major/minor/diskname/fops/pick-one 
    from a quick look.

[LIU]: Do you mean that you want a 'check standby mode enable' flag so that the check 
is done only when the enable flag is true, and the flag should be set by iSCSI target when the 
ploop device is attached iSCSI target?  
Well, it can be done, but I'm not sure why that's necessary?  Current code is kind copied 
directly from original implementation, where there wasn't such 'enable' flag, the question would 
be why it wasn't implemented back then?  

    [snip]

    >      >> +    /* move to standby if delta lease was stolen or mount is gone */
    >      >> +    if (res != -EBUSY && res != -ENOTCONN && res != -EIO) {
    >      >> +        return;
    >      >> +    }
    >      >> +
    >      >> +    spin_lock_irq(&q->queue_lock);
    >      >> +    prev = blk_queue_flag_test_and_set(QUEUE_FLAG_STANDBY, q);
    >      >> +    spin_unlock_irq(&q->queue_lock);
    >      >> +
    >      >> +    if (!prev)
    >      >> +        pr_info("ploop: switch into standby mode\n");
    >      >> +}
    > 
    > 
    >      What guarantees that we got EBUSY, EIO or ENOTCONN for the reasons
    >      listed in the description - "This mode shows that a delta lease was
    >      stolen and it is impossible to handle any requests." - what if we got
    >      such an error for other reason? If the problem comes from nfs why not
    >      suspend device mapper from there ?
    > 
    > [LIU]: I would assume vstorage filesystem guarantees that only EBUSY, EIO, or ENOTCONN
    > will be returned when the described event happens, and it doesn't matter whether we could
    > get these errors for other reasons.  And what matters is, in our use case, once the 3 errors are
    > returned, the ploop device need to be flagged as 'standby mode' which needs to be passed to
    >   iSCSI target driver.  As for what happens when used with other filesystems, we actually don't care,
    >   however you can see that the changes don't affect ploop's normal behaviour either. >
    >      Also do this errors actually reach dm-ploop ? From my recent tests i
    >      observed that if you have and EIO the filesystem gets it and remounts RO
    >      - it might not reach ploop code at all. Even if you suspend/resume the
    >      queue it would not help with the RO fs.
    > 
    > [LIU]:  When used with vstorage filesystem, these errors do reach dm-ploop.  Again, we really
    > don’t care about other filesystems as long as it doesn't break anything.

    If there is no check for the specific target and some of the errors 
    comes from other device - ploop will say it is into standby but no 
    device will check the flags - so it will not be true. My point is that 
    it is hard to say it won't break anything.

[LIU]: Currently only iSCSI target driver is aware of this flag, and of course we have code in iSCSI target driver 
to deal with this flag.  However for any other ploop users who are not aware of the standby flag,  the flag is just
 transparent, and it doesn't matter whether the flag is set or clear anyway, then how does it break anything?

Or is your concern that there may be users that would test the entire flag without masking out uninterested bits?
I don’t believe such use exists in the kernel code. 

    [snip]

    >      >>        INIT_WORK(&ploop->event_work, do_ploop_event_work);
    >      >>        init_completion(&ploop->inflight_bios_ref_comp);
    >      >>    +    ploop->queue = ti->table->md->queue;
    >      >
    >      > This looks wrong, ploop struct alread have a ptr to ti.
    >      > Tables are reloaded in some cases and it is possible that table/md/queue
    >      > change.
    >      >
    >      >> +    blk_queue_flag_clear(QUEUE_FLAG_STANDBY, ploop->queue);
    >      >> +
    >      >>        for (i = 0; i < 2; i++) {
    >      >>            release = i ? ploop_inflight_bios_ref_exit1 :
    >      >>                      ploop_inflight_bios_ref_exit0;
    >      >> diff --git a/drivers/md/dm-ploop.h b/drivers/md/dm-ploop.h
    >      >> index 5d953278a976..2502718c131f 100644
    >      >> --- a/drivers/md/dm-ploop.h
    >      >> +++ b/drivers/md/dm-ploop.h
    >      >> @@ -229,6 +229,8 @@ struct ploop {
    >      >>         struct timer_list enospc_timer;
    >      >>        bool event_enospc;
    >      >> +
    >      >> +    struct request_queue *queue;
    >      >>    };

    There is ploop->ti->table->md->queue, why not use it but cache queue ptr 
    here ? Is it guaranteed that queue won't change and leave ploop with 
    dangling ptr?

[LIU]: Because of convenience,  and yes, it is guaranteed that the queue won't change 
during ploop's lifespan. Looking at dm-mapper's code, apparently  the 'md', hence 'md->queue', 
outlives 'ploop', and md->queue can't be changed while ploop is still alive. 
In case of table reload,  a new ploop instance will allocated and initialized,  the old one will be destroyed.    

    >      >>     struct ploop_rq {
    [snip]

    >      >> +#define blk_queue_standby(q)    test_bit(QUEUE_FLAG_STANDBY,
    >      >> &(q)->queue_flags)
    >      >>     extern void blk_set_pm_only(struct request_queue *q);
    >      >>    extern void blk_clear_pm_only(struct request_queue *q);
    > 
    > 
    >      Why introduce a new queue flag?
    >      Why can't device mapper suspend be used ?
    >      Why not use the existing ploop code to defer and retry pios?
    > 
    > [LIU]:  The original commit does give some explanation on why  introduce the new queue flag, because we need to
    > transit to another state, aka standby, once the 3 errors are returned, then the state needs to be aware by iSCSI target
    > driver.  while we cannot use the bio error code to pass the event to iSCSI target driver (adding new error code is not an
    > option here  because it would break other ploop users) . Since the only data struct that are shared between dm-ploop
    > and iSCSI target driver is 'struct request_queue', therefore use a new flag is the most convenient way, this is my understanding.
    > 
    > We can't use device mapper suspend flag because it is not aware by iSCSI target driver.

    If i understand correct - no way to get from iSCSI back to device 
    mapper. So it makes sense to use the shared queue.

    > Defer and retry pios will be unnecessary once these errors is returned in our used case.

    -- 
    Regards,
    Alexander Atanasov