[Devel] [PATCH RH9] dm-ploop: port the standby mode feature.

Fri Oct 14 12:42:06 MSK 2022

Hello,

On 14.10.22 8:57, Kui Liu wrote:
> 
> 
> -----Original Message-----
> From: Alexander Atanasov <alexander.atanasov at virtuozzo.com>
> Date: Thursday, 13 October 2022 at 11:26 PM
> To: Kui Liu <Kui.Liu at acronis.com>, Konstantin Khorenko <khorenko at virtuozzo.com>, Alexey Kuznetsov <kuznet at acronis.com>
> Cc: Devel <devel at openvz.org>, Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
> Subject: Re: [Devel] [PATCH RH9] dm-ploop: port the standby mode feature.
> 
>      On 13.10.22 16:33, Kui Liu wrote:
>      > Hello,
>      >
>      > First of all, please bear in mind that this patch is a port of a patchset currently present in VZ 7 kernel.
>      > Original patches were added just to address a problem that would only happen in our particular use case,
>      > where we need to use ploop devices as the back store for  iSCSI target, while ploop devices are backed
>      > by files stored in vstorage filesystem.
> 
> 
>      Ok, i don't quite get this yet. Since it is specific to vstorage can we
>      check the queue on init and see if it is backed by that iSCSI target and
>      set a static key or flag to enable the checks only if that is the case?
>      check can be done via queue->disk->major/minor/diskname/fops/pick-one
>      from a quick look.
> 
> [LIU]: Do you mean that you want a 'check standby mode enable' flag so that the check
> is done only when the enable flag is true, and the flag should be set by iSCSI target when the
> ploop device is attached iSCSI target?
> Well, it can be done, but I'm not sure why that's necessary?  Current code is kind copied
> directly from original implementation, where there wasn't such 'enable' flag, the question would
> be why it wasn't implemented back then?

Yes, I think it is better to be explicit and not blindly assume that it 
wouldn't affect anything. Old ploop code might not needed it. But 
dm-ploop is new code and it might need that flag - point is that it is 
better be safe and we know that we run on exactly that target.

> 
>      [snip]
> 
>      >      >> +    /* move to standby if delta lease was stolen or mount is gone */
>      >      >> +    if (res != -EBUSY && res != -ENOTCONN && res != -EIO) {
>      >      >> +        return;
>      >      >> +    }
>      >      >> +
>      >      >> +    spin_lock_irq(&q->queue_lock);
>      >      >> +    prev = blk_queue_flag_test_and_set(QUEUE_FLAG_STANDBY, q);
>      >      >> +    spin_unlock_irq(&q->queue_lock);
>      >      >> +
>      >      >> +    if (!prev)
>      >      >> +        pr_info("ploop: switch into standby mode\n");
>      >      >> +}
>      >
>      >
>      >      What guarantees that we got EBUSY, EIO or ENOTCONN for the reasons
>      >      listed in the description - "This mode shows that a delta lease was
>      >      stolen and it is impossible to handle any requests." - what if we got
>      >      such an error for other reason? If the problem comes from nfs why not
>      >      suspend device mapper from there ?
>      >
>      > [LIU]: I would assume vstorage filesystem guarantees that only EBUSY, EIO, or ENOTCONN
>      > will be returned when the described event happens, and it doesn't matter whether we could
>      > get these errors for other reasons.  And what matters is, in our use case, once the 3 errors are
>      > returned, the ploop device need to be flagged as 'standby mode' which needs to be passed to
>      >   iSCSI target driver.  As for what happens when used with other filesystems, we actually don't care,
>      >   however you can see that the changes don't affect ploop's normal behaviour either. >
>      >      Also do this errors actually reach dm-ploop ? From my recent tests i
>      >      observed that if you have and EIO the filesystem gets it and remounts RO
>      >      - it might not reach ploop code at all. Even if you suspend/resume the
>      >      queue it would not help with the RO fs.
>      >
>      > [LIU]:  When used with vstorage filesystem, these errors do reach dm-ploop.  Again, we really
>      > don’t care about other filesystems as long as it doesn't break anything.
> 
>      If there is no check for the specific target and some of the errors
>      comes from other device - ploop will say it is into standby but no
>      device will check the flags - so it will not be true. My point is that
>      it is hard to say it won't break anything.
> 
> [LIU]: Currently only iSCSI target driver is aware of this flag, and of course we have code in iSCSI target driver
> to deal with this flag.  However for any other ploop users who are not aware of the standby flag,  the flag is just
>   transparent, and it doesn't matter whether the flag is set or clear anyway, then how does it break anything?
> 
> Or is your concern that there may be users that would test the entire flag without masking out uninterested bits?
> I don’t believe such use exists in the kernel code.

My concern is more about the debug message in case some other driver 
return one of the errors.

> 
> 
> 
>      [snip]
>      There is ploop->ti->table->md->queue, why not use it but cache queue ptr
>      here ? Is it guaranteed that queue won't change and leave ploop with
>      dangling ptr?
> 
> [LIU]: Because of convenience,  and yes, it is guaranteed that the queue won't change
> during ploop's lifespan. Looking at dm-mapper's code, apparently  the 'md', hence 'md->queue',
> outlives 'ploop', and md->queue can't be changed while ploop is still alive.
> In case of table reload,  a new ploop instance will allocated and initialized,  the old one will be destroyed.

Ok, i double checked and you are right about this.

-- 
Regards,
Alexander Atanasov