[Devel] [PATCH RH9] fuse: pcs: do not use sendpage on pages which cannot be sendpaged

Kui Liu Kui.Liu at acronis.com
Wed Apr 19 10:12:53 MSK 2023


Hello,

How about use WARN_ONCE()  just in case to catch any unexpected changes in the future?
Though page array is not used in bvec now, we can't guarantee it won't change in the future. 

Wrt whether it is necessary to check page ref = 0 here, I have same doubts, I also don't quite 
understand why it is possible that 0-ref page would end up here, but if that is possible, it should 
be handled correctly with this patch. 

BTW, do you think we need to replace iov_iter_get_pages() in do_recv_one_seg() with similar implementation?

-----Original Message-----
From: Alexey Kuznetsov <kuznet at acronis.com <mailto:kuznet at acronis.com>>
Date: Wednesday, 19 April 2023 at 12:09 AM
To: Kui Liu <Kui.Liu at acronis.com <mailto:Kui.Liu at acronis.com>>
Cc: Devel <devel at openvz.org <mailto:devel at openvz.org>>, Konstantin Khorenko <khorenko at virtuozzo.com <mailto:khorenko at virtuozzo.com>>
Subject: Re: [Devel] [PATCH RH9] fuse: pcs: do not use sendpage on pages which cannot be sendpaged


Hello!


Ack.


Disgusting at the first sight, but I like this! :-)


Indeed, this iterator is our internal thing, we created it ourselves and
actually we have no reasons to use dubious spoiled system implementation.


One note, formally, BUG_ON is incorrect. We should assert
it->iov_offset + i->bvec->bv_offset < PAGE_SIZE.
I do not understand this place in system implementation, I have never
seen anyone
using bvecs in this way, they allow bv_page pointing to array of
pages, this looks so scary.
But this is our internal iter, so that we may omit assertions.


BTW this does not answer my doubts, the problem is not in our code. I
do not understand
_why_ sendpage_ok is necessary. The check page_count(page) >= 1 wants
to tell that
get_page on page with 0 refcnt is illegal. But fuse already did
get_page at top level,
it uses iov_iter_get_pages in fuse_get_user_pages(), so that:


1. If it is not allowed it is already broken by the time the page arrives to us
2. It is impossible to have page_count == 0 in kernel_sendpage, it is
grabbed by top level.


All this looks as big smelly pile of shit, mainstream not us. :-)






On Tue, Apr 18, 2023 at 10:22 PM Kui Liu <Kui.Liu at acronis.com <mailto:Kui.Liu at acronis.com>> wrote:
>
>
> It looks like the new API iov_iter_get_pages is not safe for use
> when trying to get a page from a bvec to be sent by kernel_sendpage().
> So just revert back our own implentation where we check the page
> before making get_page(), if the page can't be sendpage, fall back
> to copy mode.
>
> Affects: #PSBM-146821, #PSBM-146846
> https://jira.vzint.dev/browse/PSBM-146821 <https://jira.vzint.dev/browse/PSBM-146821>
> https://jira.vzint.dev/browse/PSBM-146846 <https://jira.vzint.dev/browse/PSBM-146846>
> Signed-off-by: Liu Kui <Kui.Liu at acronis.com <mailto:Kui.Liu at acronis.com>>
> ---
> fs/fuse/kio/pcs/pcs_sock_io.c | 28 ++++++++++++++++++++++------
> 1 file changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/fs/fuse/kio/pcs/pcs_sock_io.c b/fs/fuse/kio/pcs/pcs_sock_io.c
> index 552b82ab398b..20f088d36d6d 100644
> --- a/fs/fuse/kio/pcs/pcs_sock_io.c
> +++ b/fs/fuse/kio/pcs/pcs_sock_io.c
> @@ -143,16 +143,32 @@ static int do_send_one_seg(struct socket *sock, struct iov_iter *it, size_t left
> }
>
> if (iov_iter_is_bvec(it)) {
> - /* Zerocopy */
> size_t offset;
> ssize_t len;
> struct page *page;
>
> - len = iov_iter_get_pages(it, &page, size, 1, &offset);
> - BUG_ON(len <= 0);
> -
> - ret = kernel_sendpage(sock, page, offset, len, flags);
> - put_page(page);
> + /* Only support single page bvec here */
> + BUG_ON(it->bvec->bv_len > PAGE_SIZE);
> +
> + page = it->bvec->bv_page;
> + offset = it->bvec->bv_offset + it->iov_offset;
> + len = min(size, it->bvec->bv_len - it->iov_offset);
> +
> + if (sendpage_ok(page)) {
> + /* Zero copy */
> + get_page(page);
> + ret = kernel_sendpage(sock, page, offset, len, flags);
> + put_page(page);
> + } else {
> + /* Fall back to copy mode */
> + struct msghdr msg = { .msg_flags = flags };
> + struct kvec kv;
> +
> + kv.iov_base = kmap(page) + offset;
> + kv.iov_len = len;
> + ret = kernel_sendmsg(sock, &msg, &kv, 1, size);
> + kunmap(page);
> + }
> }
>
> out:
> --
> 2.32.0 (Apple Git-132)
>
> -----Original Message-----
> From: Alexey Kuznetsov <kuznet at acronis.com <mailto:kuznet at acronis.com> <mailto:kuznet at acronis.com <mailto:kuznet at acronis.com>>>
> Date: Monday, 17 April 2023 at 10:31 PM
> To: Devel <devel at openvz.org <mailto:devel at openvz.org> <mailto:devel at openvz.org <mailto:devel at openvz.org>>>, Konstantin Khorenko <khorenko at virtuozzo.com <mailto:khorenko at virtuozzo.com> <mailto:khorenko at virtuozzo.com <mailto:khorenko at virtuozzo.com>>>, Kui Liu <Kui.Liu at acronis.com <mailto:Kui.Liu at acronis.com> <mailto:Kui.Liu at acronis.com <mailto:Kui.Liu at acronis.com>>>
> Subject: [PATCH RH9] fuse: pcs: do not use sendpage on pages which cannot be sendpaged
>
>
> Use the same approach as in iscsi iscsi_tcp_segment_map() which makes
> the same operation
> in the same context.
>
>
> In vz7 we added iov_iter_get_pages() infrastructure ourselves and we
> did it in safe way.
> We checked page before doing any manipulations on it right in core of
> iov_iter_get_pages()
> and terminated iov_iter_get_pages() if the page looked dubious. We
> cannot make this now because
> this function is used in many places over kernel and not all the
> places are sensitive
> to origin of pages but all of them definitely will crash if
> iov_iter_get_pages() returns error.
>
>
> This patch will work for now. I still believe this place deserves
> further investigation.
> It smells like mainstream could be wrong. Look at my comment in code
> for the reasons of doubts.
>
>
> Affects: #PSBM-146821, #PSBM-146846
> https://jira.vzint.dev/browse/PSBM-146821 <https://jira.vzint.dev/browse/PSBM-146821> <https://jira.vzint.dev/browse/PSBM-146821> <https://jira.vzint.dev/browse/PSBM-146821&gt;>
> https://jira.vzint.dev/browse/PSBM-146846 <https://jira.vzint.dev/browse/PSBM-146846> <https://jira.vzint.dev/browse/PSBM-146846> <https://jira.vzint.dev/browse/PSBM-146846&gt;>
>
>
> Signed-off-by: Alexey Kuznetsov <kuznet at acronis.com <mailto:kuznet at acronis.com> <mailto:kuznet at acronis.com <mailto:kuznet at acronis.com>>>
>
>
>
> _______________________________________________
> Devel mailing list
> Devel at openvz.org <mailto:Devel at openvz.org>
> https://lists.openvz.org/mailman/listinfo/devel <https://lists.openvz.org/mailman/listinfo/devel>




More information about the Devel mailing list