[Devel] [PATCH VZ10] vhost-blk: stop fetching descriptors on VHOST_BLK_SET_BACKEND(-1)

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Mon Jun 1 12:26:08 MSK 2026



On 5/29/26 21:35, Andrey Drobyshev wrote:
> Saving or migrating a vhost-blk guest under disk load can fail to load on
> the destination:
> 
>   qemu-kvm: VQ 0 size 0x100 < last_avail_idx 0xb8ab - used_idx 0xb934
>   qemu-kvm: Failed to load vhost-blk:virtio
>   qemu-kvm: error while loading state for instance 0x0 of device vhost-blk
>   load of migration failed: Operation not permitted
> 
> virtio_load() rejects the device because the saved used_idx is ahead of
> last_avail_idx, which is impossible for a coherent vring.
> 
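For illustration, the arithmetic behind the "VQ 0 size 0x100 < last_avail_idx 0xb8ab - used_idx 0xb934" message can be sketched as below. This is a minimal standalone sketch, not QEMU code: the helper names vring_inuse() and vring_state_coherent() are hypothetical, and it assumes the load-time check amounts to comparing the 16-bit difference of the two indices against the queue size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): number of descriptors that
 * were fetched but not yet completed.  Both indices are free-running
 * 16-bit counters, so the subtraction is done modulo 2^16.
 */
static uint16_t vring_inuse(uint16_t last_avail_idx, uint16_t used_idx)
{
	return (uint16_t)(last_avail_idx - used_idx);
}

/*
 * Hypothetical helper: a saved vring is coherent only if the number of
 * in-use descriptors fits in the queue.  If used_idx is ahead of
 * last_avail_idx, the 16-bit difference wraps to a huge value and the
 * check fails.
 */
static bool vring_state_coherent(uint16_t num, uint16_t last_avail_idx,
				 uint16_t used_idx)
{
	return vring_inuse(last_avail_idx, used_idx) <= num;
}
```

With the values from the log, vring_inuse(0xb8ab, 0xb934) wraps to 0xff77, which is far larger than the queue size 0x100, so the state is rejected. With the two indices swapped the difference would be 0x89 and the state would pass.
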
> The root cause is that vhost-blk has no "stop fetching" step before the
> device is stopped.  On stop, QEMU's vhost_dev_stop() reads last_avail_idx
> via VHOST_GET_VRING_BASE, but the vhost worker is still running: it keeps
> pulling the avail-ring backlog and completing those requests, advancing
> the guest used->idx past the last_avail_idx that was just sampled.  The
> saved state is therefore incoherent.
> 
> vhost-net does not hit this because it detaches the backend
> (VHOST_NET_SET_BACKEND, fd == -1) before VHOST_GET_VRING_BASE, so its
> worker stops fetching.  vhost-blk had no equivalent operation.
> 
> Teach VHOST_BLK_SET_BACKEND to treat a negative fd as "stop the device":
> detach the backend from every vq (vhost_blk_handle_guest_kick() bails on a
> NULL backend), drain in-flight requests with vhost_blk_flush(), and release
> the backing file.  After this the worker no longer advances the rings, so
> the subsequent VHOST_GET_VRING_BASE reports a final, coherent
> last_avail_idx.  The unconsumed avail backlog stays in the ring and is
> reprocessed once the device is restarted.  The companion QEMU change issues
> this stop before vhost_dev_stop().
> 
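The ordering problem described above can be modelled with a toy single-threaded simulation. This is only a sketch under simplifying assumptions: struct toy_vq, toy_worker_run() and toy_stop_and_save() are made-up names, the "worker" completes every request instantly, and the real locking, flush and ioctl paths are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy virtqueue state; all names here are hypothetical. */
struct toy_vq {
	bool     has_backend;
	uint16_t num;             /* queue size */
	uint16_t avail_idx;       /* published by the guest */
	uint16_t last_avail_idx;  /* next descriptor the worker fetches */
	uint16_t used_idx;        /* completions published to the guest */
};

struct toy_saved {
	uint16_t last_avail_idx;
	uint16_t used_idx;
};

/* Worker: consume the whole avail backlog unless the backend is detached. */
static void toy_worker_run(struct toy_vq *vq)
{
	if (!vq->has_backend)
		return;           /* mirrors "bail on a NULL backend" */
	while (vq->last_avail_idx != vq->avail_idx) {
		vq->last_avail_idx++;
		vq->used_idx++;   /* request completes immediately */
	}
}

/* Stop path: optionally detach first, then sample, then let the worker run. */
static struct toy_saved toy_stop_and_save(struct toy_vq *vq, bool detach_first)
{
	struct toy_saved s;

	if (detach_first)
		vq->has_backend = false;
	s.last_avail_idx = vq->last_avail_idx;  /* sampled, like GET_VRING_BASE */
	toy_worker_run(vq);                     /* worker scheduled after the sample */
	s.used_idx = vq->used_idx;              /* used index as seen at save time */
	return s;
}

/* Same 16-bit in-use check that makes the load fail in the report above. */
static bool toy_saved_coherent(const struct toy_vq *vq, struct toy_saved s)
{
	return (uint16_t)(s.last_avail_idx - s.used_idx) <= vq->num;
}
```

Starting from last_avail_idx == used_idx == 10 with avail_idx == 20, stopping without detaching saves last_avail_idx 10 and used_idx 20, which fails the check. Detaching first saves 10 and 10, and the 10 unconsumed entries stay in the avail ring for the restarted device.
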
> https://virtuozzo.atlassian.net/browse/VSTOR-133464
> Fixes: 40a5928ec730 ("drivers/vhost: vhost-blk accelerator for virtio-blk guests")
> Signed-off-by: Andrey Drobyshev <andrey.drobyshev at virtuozzo.com>

Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>

> ---
>  drivers/vhost/blk.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
> index b11f08f878f4..1b073011c445 100644
> --- a/drivers/vhost/blk.c
> +++ b/drivers/vhost/blk.c
> @@ -744,6 +744,24 @@ static long vhost_blk_set_backend(struct vhost_blk *blk, int fd)
>  	if (ret)
>  		goto out_dev;
>  
> +	/*
> +	 * fd < 0 means "stop the device".  Detach the backend from every vq so
> +	 * vhost_blk_handle_guest_kick() stops fetching descriptors, drain the
> +	 * in-flight requests, and release the backing file.
> +	 */
> +	if (fd < 0) {
> +		if (!blk->backend) {
> +			mutex_unlock(&blk->dev.mutex);
> +			return 0;		/* already stopped */
> +		}
> +		vhost_blk_drop_backends(blk);
> +		vhost_blk_flush(blk);
> +		fput(blk->backend);
> +		blk->backend = NULL;
> +		mutex_unlock(&blk->dev.mutex);
> +		return 0;
> +	}
> +
>  	if (blk->backend) {
>  		ret = -EBUSY;
>  		goto out_dev;

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


