[Devel] [PATCH] fuse kio: Fix deadlock at pcs_fuse_submit() error path

Pavel Butsykin pbutsykin at virtuozzo.com
Wed Oct 17 18:06:07 MSK 2018



On 17.10.2018 16:57, Kirill Tkhai wrote:
> request_end() takes fc->lock, so we in case of error we bump
> into deadlock:
> 
> Call Trace:
>    [<ffffffffb3bb63f5>] _raw_spin_lock+0x75/0xc0
>    [<ffffffffc170871b>] spin_lock+0x18/0x1b [fuse]
>    [<ffffffffc170ba63>] request_end+0x265/0x72b [fuse]
>    [<ffffffffc18a1b8d>] pcs_fuse_submit+0x9fb/0xaa3 [fuse_kio_pcs]
>    [<ffffffffc18a35c4>] kpcs_req_send+0x793/0xa60 [fuse_kio_pcs]
>    [<ffffffffc170b6ca>] flush_bg_queue+0x14f/0x283 [fuse]
>    [<ffffffffc170d4d4>] fuse_request_send_background_locked+0x50b/0x512 [fuse]
>    [<ffffffffc170d844>] fuse_request_send_background+0x369/0x43f [fuse]
>    [<ffffffffc173028b>] fuse_send_readpages+0x372/0x3b5 [fuse]
>    [<ffffffffc1730c3c>] fuse_readpages+0x28c/0x2f0 [fuse]
>    [<ffffffffb296ba58>] __do_page_cache_readahead+0x518/0x6d0
> 
> Fix this by unlocking fc->lock before request_end() call. Note,
> that it may look strange to have two same lk parameters in
> pcs_fuse_submit(pfc, req, lk, lk), but the current design
> interprets requests submitted with locked lk as async and
> we keep this logic.
> 
> Generally, I feel we need to improve design in a thing
> of queueing requests and locking, but we need more
> inverstigation and thinking here, so let's delay this
> to next VZ update.
> 
> https://pmc.acronis.com/browse/VSTOR-16246
> 
> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> ---
>   fs/fuse/kio/pcs/pcs_fuse_kdirect.c |   10 +++++++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/fuse/kio/pcs/pcs_fuse_kdirect.c b/fs/fuse/kio/pcs/pcs_fuse_kdirect.c
> index b286a956a751..61415e029c45 100644
> --- a/fs/fuse/kio/pcs/pcs_fuse_kdirect.c
> +++ b/fs/fuse/kio/pcs/pcs_fuse_kdirect.c
> @@ -883,7 +883,7 @@ static int pcs_fuse_prep_rw(struct pcs_fuse_req *r)
>   	return ret;
>   }
>   
> -static void pcs_fuse_submit(struct pcs_fuse_cluster *pfc, struct fuse_req *req, int async)
> +static void pcs_fuse_submit(struct pcs_fuse_cluster *pfc, struct fuse_req *req, bool async, bool lk)
>   {
>   	struct pcs_fuse_req *r = pcs_req_from_fuse(req);
>   	struct fuse_inode *fi = get_fuse_inode(req->io_inode);
> @@ -963,7 +963,11 @@ static void pcs_fuse_submit(struct pcs_fuse_cluster *pfc, struct fuse_req *req,
>   error:
>   	DTRACE("do fuse_request_end req:%p op:%d err:%d\n", &r->req, r->req.in.h.opcode, r->req.out.h.error);
>   
> +	if (lk)
> +		spin_unlock(&pfc->fc->lock);

We can't unlock fc->lock inside fuse_request_send_background_locked(),
because it breaks compatibility with fuse_set_nowrite(). We must
ensure that no one pending requests should not be between
fuse_set_nowrite() and fuse_release_nowrite(). But since fc unlock
inside fuse_request_send_background_locked() this promise can be broken.

>   	request_end(pfc->fc, &r->req);
> +	if (lk)
> +		spin_lock(&pfc->fc->lock);
>   	return;
>   
>   submit:
> @@ -1027,7 +1031,7 @@ static void _pcs_shrink_end(struct fuse_conn *fc, struct fuse_req *req)
>   
>   		TRACE("resubmit %p\n", &r->req);
>   		list_del_init(&ireq->list);
> -		pcs_fuse_submit(pfc, &r->req, 1);
> +		pcs_fuse_submit(pfc, &r->req, true, false);
>   	}
>   }
>   
> @@ -1174,7 +1178,7 @@ static int kpcs_req_send(struct fuse_conn* fc, struct fuse_req *req, bool bg, bo
>   	}
>   	__clear_bit(FR_PENDING, &req->flags);
>   
> -	pcs_fuse_submit(pfc, req, lk);
> +	pcs_fuse_submit(pfc, req, lk, lk);
>   	if (!bg)
>   		wait_event(req->waitq,
>   			   test_bit(FR_FINISHED, &req->flags) && !req->end);
> 



More information about the Devel mailing list