[Devel] [PATCH v2 VZ9 3/4] iov_iter: Add a function to extract a page list from an iterator
Pavel Tikhomirov
ptikhomirov at virtuozzo.com
Mon Dec 30 08:39:50 MSK 2024
From: David Howells <dhowells at redhat.com>
Add a function, iov_iter_extract_pages(), to extract a list of pages from
an iterator. The pages may be returned with a pin added or nothing,
depending on the type of iterator.
Add a second function, iov_iter_extract_will_pin(), to determine how the
cleanup should be done.
There are two cases:
(1) ITER_IOVEC or ITER_UBUF iterator.
Extracted pages will have pins (FOLL_PIN) obtained on them so that a
concurrent fork() will forcibly copy the page so that DMA is done
to/from the parent's buffer and is unavailable to/unaffected by the
child process.
iov_iter_extract_will_pin() will return true for this case. The
caller should use something like unpin_user_page() to dispose of the
page.
(2) Any other sort of iterator.
No refs or pins are obtained on the pages; the assumption is made that
the caller will manage page retention.
iov_iter_extract_will_pin() will return false. The pages don't need
additional disposal.
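The cleanup rule for the two cases above can be modelled in user space. The sketch below is not kernel code: every `model_*` name, the `model_page` struct and its pin counter are hypothetical stand-ins. Only the decision rule (a pin for ITER_IOVEC or ITER_UBUF, nothing otherwise) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's iterator types. */
enum model_iter_type {
	MODEL_ITER_UBUF,
	MODEL_ITER_IOVEC,
	MODEL_ITER_KVEC,
	MODEL_ITER_BVEC,
	MODEL_ITER_XARRAY,
};

/* Hypothetical page with a pin counter; this is not struct page. */
struct model_page {
	int pin_count;
};

/* Models iov_iter_extract_will_pin(): true only for case (1). */
static bool model_extract_will_pin(enum model_iter_type type)
{
	return type == MODEL_ITER_UBUF || type == MODEL_ITER_IOVEC;
}

/* Models extraction: take a pin on each page only in case (1). */
static void model_extract_pages(enum model_iter_type type,
				struct model_page **pages, size_t nr)
{
	size_t k;

	assert(pages != NULL || nr == 0);
	if (!model_extract_will_pin(type))
		return;
	for (k = 0; k < nr; k++)
		pages[k]->pin_count++;
}

/* Models cleanup: drop the pins only if extraction took them. */
static void model_release_pages(enum model_iter_type type,
				struct model_page **pages, size_t nr)
{
	size_t k;

	if (!model_extract_will_pin(type))
		return;
	for (k = 0; k < nr; k++) {
		assert(pages[k]->pin_count > 0);
		pages[k]->pin_count--;
	}
}
```

In the kernel, the release step for case (1) would use something like unpin_user_page(), as noted above. The model only shows that the caller asks the same question at cleanup time as the extraction did.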
Signed-off-by: David Howells <dhowells at redhat.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
Reviewed-by: Jens Axboe <axboe at kernel.dk>
cc: Al Viro <viro at zeniv.linux.org.uk>
cc: John Hubbard <jhubbard at nvidia.com>
cc: David Hildenbrand <david at redhat.com>
cc: Matthew Wilcox <willy at infradead.org>
cc: linux-fsdevel at vger.kernel.org
cc: linux-mm at kvack.org
Signed-off-by: Steve French <stfrench at microsoft.com>
--------
iov_iter: Fix iov_iter_extract_pages() with zero-sized entries
iov_iter_extract_pages() doesn't correctly handle skipping over initial
zero-length entries in ITER_KVEC and ITER_BVEC-type iterators.
The problem is that it accidentally reduces maxsize to 0 while skipping,
and thus runs to the end of the array and returns 0.
Fix this by sticking the calculated size-to-copy in a new variable
rather than back in maxsize.
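The difference between the two variants can be reproduced with a small user-space model. This is a sketch under assumptions: `struct model_kvec` and both `model_first_chunk_*` functions are hypothetical names, and they imitate only the skip loop, not page extraction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct kvec. */
struct model_kvec {
	void *base;
	size_t len;
};

static size_t model_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Buggy variant: the clamped size is written back into maxsize. After
 * one zero-length entry maxsize is 0, so every later min() is 0 too
 * and the loop runs off the end of the array.
 */
static size_t model_first_chunk_buggy(const struct model_kvec *v, size_t nr,
				      size_t skip, size_t maxsize)
{
	size_t idx = 0;

	for (;;) {
		if (idx == nr)
			return 0;
		assert(skip <= v[idx].len);
		maxsize = model_min(maxsize, v[idx].len - skip);
		if (maxsize)
			break;
		idx++;
		skip = 0;
	}
	return maxsize;
}

/* Fixed variant: the clamped size lives in its own variable. */
static size_t model_first_chunk_fixed(const struct model_kvec *v, size_t nr,
				      size_t skip, size_t maxsize)
{
	size_t idx = 0, size;

	for (;;) {
		if (idx == nr)
			return 0;
		assert(skip <= v[idx].len);
		size = model_min(maxsize, v[idx].len - skip);
		if (size)
			break;
		idx++;
		skip = 0;
	}
	return size;
}
```

With two leading zero-length entries followed by a 4096-byte entry, the buggy variant reports 0 bytes while the fixed one reports the size of the first non-empty entry, capped by maxsize.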
Fixes: 7d58fe731028 ("iov_iter: Add a function to extract a page list from an iterator")
Signed-off-by: David Howells <dhowells at redhat.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
Cc: Christian Brauner <brauner at kernel.org>
Cc: Jens Axboe <axboe at kernel.dk>
Cc: Al Viro <viro at zeniv.linux.org.uk>
Cc: David Hildenbrand <david at redhat.com>
Cc: John Hubbard <jhubbard at nvidia.com>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
----
We want bio_iov_iter_get_pages() to accept kvec iterators. Mainline already
has patches for this. Backport only the part that adds
iov_iter_extract_kvec_pages(), together with its fix.
https://virtuozzo.atlassian.net/browse/PSBM-157752
(cherry picked from commit 7d58fe731028128f3a7e20b9c492be48aae133ee)
(cherry picked from commit f741bd7178c95abd7aeac5a9d933ee542f9a5509)
Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
lib/iov_iter.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5df03dedfc016..de88467a4ea75 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1432,6 +1432,11 @@ static struct page *first_bvec_segment(const struct iov_iter *i,
 	return page;
 }
 
+static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i,
+					   struct page ***pages, size_t maxsize,
+					   unsigned int maxpages,
+					   size_t *offset0);
+
 static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   unsigned int maxpages, size_t *start)
@@ -1493,6 +1498,8 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
 		return pipe_get_pages(i, pages, maxsize, maxpages, start);
 	if (iov_iter_is_xarray(i))
 		return iter_xarray_get_pages(i, pages, maxsize, maxpages, start);
+	if (iov_iter_is_kvec(i))
+		return iov_iter_extract_kvec_pages(i, pages, maxsize, maxpages, start);
 	return -EFAULT;
 }
 
@@ -1907,3 +1914,58 @@ void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
 		i->__iov -= state->nr_segs - i->nr_segs;
 	i->nr_segs = state->nr_segs;
 }
+
+/*
+ * Extract a list of virtually contiguous pages from an ITER_KVEC iterator.
+ * This does not get references on the pages, nor does it get a pin on them.
+ */
+static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i,
+					   struct page ***pages, size_t maxsize,
+					   unsigned int maxpages,
+					   size_t *offset0)
+{
+	struct page **p, *page;
+	const void *kaddr;
+	size_t skip = i->iov_offset, offset, len, size;
+	int k;
+
+	for (;;) {
+		if (i->nr_segs == 0)
+			return 0;
+		size = min(maxsize, i->kvec->iov_len - skip);
+		if (size)
+			break;
+		i->iov_offset = 0;
+		i->nr_segs--;
+		i->kvec++;
+		skip = 0;
+	}
+
+	kaddr = i->kvec->iov_base + skip;
+	offset = (unsigned long)kaddr & ~PAGE_MASK;
+	*offset0 = offset;
+
+	maxpages = want_pages_array(pages, size, offset, maxpages);
+	if (!maxpages)
+		return -ENOMEM;
+	p = *pages;
+
+	kaddr -= offset;
+	len = offset + size;
+	for (k = 0; k < maxpages; k++) {
+		size_t seg = min_t(size_t, len, PAGE_SIZE);
+
+		if (is_vmalloc_or_module_addr(kaddr))
+			page = vmalloc_to_page(kaddr);
+		else
+			page = virt_to_page(kaddr);
+
+		p[k] = page;
+		len -= seg;
+		kaddr += PAGE_SIZE;
+	}
+
+	size = min_t(size_t, size, maxpages * PAGE_SIZE - offset);
+	iov_iter_advance(i, size);
+	return size;
+}
--
2.47.0
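
The offset and size arithmetic in iov_iter_extract_kvec_pages() can be checked in isolation with a user-space model. This is a sketch under assumptions: `MODEL_PAGE_SIZE`, `struct model_span` and `model_page_span()` are hypothetical, and the page count assumes that want_pages_array(), which is not shown in this patch, returns the rounded-up number of pages clamped to maxpages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed page size for the model; the kernel uses PAGE_SIZE. */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical result type: what the extraction would report. */
struct model_span {
	size_t offset;		/* offset of the data within the first page */
	unsigned int npages;	/* number of pages covered */
	size_t size;		/* bytes actually covered by those pages */
};

static struct model_span model_page_span(uintptr_t addr, size_t size,
					 unsigned int maxpages)
{
	struct model_span s;
	size_t count, room;

	assert(size > 0 && maxpages > 0);

	/* Same as (unsigned long)kaddr & ~PAGE_MASK in the patch. */
	s.offset = addr & (MODEL_PAGE_SIZE - 1);

	/* Assumed want_pages_array() behaviour: round up, then clamp. */
	count = (s.offset + size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	s.npages = count < maxpages ? (unsigned int)count : maxpages;

	/* Trim the size to what the returned pages can hold. */
	room = (size_t)s.npages * MODEL_PAGE_SIZE - s.offset;
	s.size = size < room ? size : room;
	return s;
}
```

For example, 8192 bytes starting 100 bytes into a page span three pages. If only two pages are allowed, the reported size shrinks to 2 * 4096 - 100 = 8092 bytes, which is the amount the iterator would then be advanced by.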