[Devel] [PATCH RHEL9 COMMIT] dm-qcow2: switch allocation of compression buffers to kvmalloc

Konstantin Khorenko khorenko at virtuozzo.com
Thu Nov 21 19:14:31 MSK 2024


The commit is pushed to "branch-rh9-5.14.0-427.44.1.vz9.80.x-ovz" and will appear at git at bitbucket.org:openvz/vzkernel.git
after rh9-5.14.0-427.44.1.vz9.80.1
------>
commit e6aaeb68d4769cf75e448054119560dd74024dea
Author: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
Date:   Tue Nov 19 15:36:42 2024 +0800

    dm-qcow2: switch allocation of compression buffers to kvmalloc
    
    We see high-order allocation warnings:
    
    kernel: order 10 >= 10, gfp 0x40c00
    kernel: WARNING: CPU: 5 PID: 182 at mm/page_alloc.c:5630 __alloc_pages+0x1d7/0x3f0
    kernel: process_compressed_read+0x6f/0x590 [dm_qcow2]
    
    This is because we have 1M clusters, and with zstd compression the
    buffer size used for decompression is (clu_size + sizeof(ZSTD_DCtx) +
    ZSTD_BLOCKSIZE_MAX + clu_size + ZSTD_BLOCKSIZE_MAX + 64 = 2520776),
    which requires a 4M allocation.
    
    This is a very large contiguous allocation, and it has a high
    probability of failing. Let's fix this by switching to kvmalloc.
    
    note 1: It looks like we can't simply shrink the buffer instead, as
    the decompression buffer size must equal the cluster size.
    
    note 2: There are already several kvmalloc(GFP_NOIO) callers in the
    mainline kernel; it should be fine to use since commit
    451769ebb7e79 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc").
    
    note 3: Another option here is to switch to a custom memory pool for
    these allocations, but the downsides of that approach are:
      a) if we make the pool big, it always consumes a lot of memory,
      b) if we make the pool small, it may slow down compressed reads
         when there are multiple concurrent readers, and
      c) if we make the pools scalable, growing them in this stack would
         require kvmalloc(GFP_NOIO) anyway, so we would also need to
         implement a separate monitor to scale the buffers on the side,
         which might be overkill for this problem.
    
    https://virtuozzo.atlassian.net/browse/VSTOR-94596
    Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    Feature: dm-qcow2: ZSTD decompression
---
 drivers/md/dm-qcow2-map.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-qcow2-map.c b/drivers/md/dm-qcow2-map.c
index 6585f3fac6e7..112f6dde44af 100644
--- a/drivers/md/dm-qcow2-map.c
+++ b/drivers/md/dm-qcow2-map.c
@@ -3671,7 +3671,7 @@ static void process_compressed_read(struct qcow2 *qcow2, struct list_head *read_
 		dctxlen = zlib_inflate_workspacesize();
 
 
-	buf = kmalloc(qcow2->clu_size + dctxlen, GFP_NOIO);
+	buf = kvmalloc(qcow2->clu_size + dctxlen, GFP_NOIO);
 	if (!buf) {
 		end_qios(read_list, BLK_STS_RESOURCE);
 		return;
@@ -3681,7 +3681,7 @@ static void process_compressed_read(struct qcow2 *qcow2, struct list_head *read_
 		arg = zstd_init_dstream(qcow2->clu_size, buf + qcow2->clu_size, dctxlen);
 		if (!arg) {
 			end_qios(read_list, BLK_STS_RESOURCE);
-			kfree(buf);
+			kvfree(buf);
 			return;
 		}
 	} else {
@@ -3716,7 +3716,7 @@ static void process_compressed_read(struct qcow2 *qcow2, struct list_head *read_
 		list_add_tail(&qio->link, cow_list);
 	}
 
-	kfree(buf);
+	kvfree(buf);
 }
 
 static int prepare_sliced_data_write(struct qcow2 *qcow2, struct qio *qio,

