[Devel] [PATCH RH7] virtio_net: fix hang on vhost backend setting

Tue May 17 19:43:55 MSK 2022

On 17.05.2022 18:29, Pavel Tikhomirov wrote:
> We can see a hang for 40 sec in (vz7.185.3) in vhost_net_set_backend:
>
> crash> ps -m | grep UN
> [0 00:00:40.157] [UN]  PID: 38937  TASK: ffff9b7f54ca8000  CPU: 2   COMMAND: "qemu-kvm"
>
> [<ffffffffc09c04a5>] vhost_net_ubuf_put_and_wait+0x65/0xb0 [vhost_net]
> [<ffffffffc09c2451>] vhost_net_ioctl+0x661/0x890 [vhost_net]
> [<ffffffff8fc89d80>] do_vfs_ioctl+0x3b0/0x5b0
> [<ffffffff8fc8a021>] SyS_ioctl+0xa1/0xc0
>
> while waiting for ubufs->refcount to get to zero.
>
> In crash we see an inprogress ubuf_info generated by handle_tx(), which
> is holding a refcount on ubufs, and should release it from
> vhost_zerocopy_callback() but yet didn't to it. We also see the
> skb_shared_info referencing this ubuf_info with zerocopy flag set. And
> finally we see an sk_buff referencing this skb_shared_info. This sk_buff
> had passed through validate_xmit_skb_list() as it has ->next == NULL and
> ->prev == skb. So it also passed dev_hard_start_xmit() already in
> sch_direct_xmit(), and by looking on sk_buff's dev's netdev_queue's
> trans_start we can see that it happened exactly 40 sec earlier. The
> netdev_queue has virtnet_netdev ops so everything happens in virtio_net.
>
> Considering all the above I believe that we have the same thing as in
> tun devices, seems that after virtio_net xmit the sk_buff can be held
> for long and thus we need to orphan frags to release ubufs earlier.
>
> See similar patches for tun:
> 868eefeb17d4 ("tun: orphan frags on xmit")
> 149d36f7187c ("tun: report orphan frags errors to zero copy callback")
>
> https://pmc.acronis.com/browse/VSTOR-52993
>
> Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
> ---
>   drivers/net/virtio_net.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 652f5b1b218c..5d22e19ad161 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -790,16 +790,21 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>   	/* Free up any pending old buffers before queueing new ones. */
>   	free_old_xmit_skbs(sq);
>   
> +	if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
> +		goto drop;
> +

This smells like the SKB data will be copied for EACH skb and thus
"zero copy" will not be zero copy anymore.

Am I correct?

>   	/* Try to transmit */
>   	err = xmit_skb(sq, skb);
>   
>   	/* This should not happen! */
>   	if (unlikely(err)) {
> +drop:
>   		dev->stats.tx_fifo_errors++;
>   		if (net_ratelimit())
>   			dev_warn(&dev->dev,
>   				 "Unexpected TXQ (%d) queue failure: %d\n", qnum, err);
>   		dev->stats.tx_dropped++;
> +		skb_tx_error(skb);
>   		kfree_skb(skb);
>   		return NETDEV_TX_OK;
>   	}