[Devel] [PATCH RHEL7 COMMIT] ms/ext4: fix SEEK_HOLE

Konstantin Khorenko khorenko at virtuozzo.com
Wed Jul 26 11:01:37 MSK 2017


The commit is pushed to "branch-rh7-3.10.0-514.26.1.vz7.33.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-514.26.1.vz7.33.17
------>
commit 2a13149e0b3241a8c8be5032a84a066035cd99f3
Author: Maxim Patlasov <mpatlasov at virtuozzo.com>
Date:   Wed Jul 26 12:01:37 2017 +0400

    ms/ext4: fix SEEK_HOLE
    
    Patchset description:
    ext4: backport SEEK_DATA/SEEK_HOLE patches from mainline
    
    QEMU uses SEEK_DATA/SEEK_HOLE to optimize QCOW2 snapshot COW. Before
    these patches it was possible to get "no extents found" from lseek(SEEK_DATA)
    even if there were valid extents present.
    
    https://jira.sw.ru/browse/PSBM-68292
    
    Maxim Patlasov (3):
          ext4: fix SEEK_HOLE
          ext4: fix off-by-in in loop termination in ext4_find_unwritten_pgoff()
          ext4: fix off-by-one on max nr_pages in ext4_find_unwritten_pgoff()
    
    ====================================
    This patch description:
    
    Backport 7d95eddf313c88b24f99d4ca9c2411a4b82fef33 from ml:
    
        ext4: fix SEEK_HOLE
    
        Currently, SEEK_HOLE implementation in ext4 may both return that there's
        a hole at some offset although that offset already has data and skip
        some holes during a search for the next hole. The first problem is
        demonstrated by:
    
        xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file
        wrote 57344/57344 bytes at offset 0
        56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec)
        Whence  Result
        HOLE    0
    
        Where we can see that SEEK_HOLE wrongly returned offset 0 as containing
        a hole although we have written data there. The second problem can be
        demonstrated by:
    
        xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
               -c "seek -h 0" file
    
        wrote 57344/57344 bytes at offset 0
        56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec)
        wrote 8192/8192 bytes at offset 131072
        8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec)
        Whence  Result
        HOLE    139264
    
        Where we can see that the hole at offsets 56k..128k has been ignored
        by the SEEK_HOLE call.
    
        The underlying problem is in the ext4_find_unwritten_pgoff() which is
        just buggy. In some cases it fails to update returned offset when it
        finds a hole (when no pages are found or when the first found page has
        higher index than expected), in some cases conditions for detecting hole
        are just missing (we fail to detect a situation where indices of
        returned pages are not contiguous).
    
        Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page
        indices and also handle all cases where we got fewer pages than
        expected in one place and handle it properly there.
    
        CC: stable at vger.kernel.org
        Fixes: c8c0df241cc2719b1262e627f999638411934f60
        CC: Zheng Liu <wenqing.lz at taobao.com>
        Signed-off-by: Jan Kara <jack at suse.cz>
        Signed-off-by: Theodore Ts'o <tytso at mit.edu>
    
    https://jira.sw.ru/browse/PSBM-68292
    
    Signed-off-by: Maxim Patlasov <mpatlasov at virtuozzo.com>
---
 fs/ext4/file.c | 50 ++++++++++++++------------------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)
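
For context, here is a minimal userspace sketch of the kind of
SEEK_DATA/SEEK_HOLE walk that a consumer such as QEMU depends on. It is not
QEMU's code: count_data_bytes() is a hypothetical helper, and the fallback
values 3 and 4 for SEEK_DATA/SEEK_HOLE are the Linux ones, defined only if the
libc headers do not expose them. The walk trusts the kernel to report every
hole and never to report a hole inside written data, which is the behaviour
this patch restores.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3   /* Linux value, used only if the headers lack it */
#endif
#ifndef SEEK_HOLE
#define SEEK_HOLE 4   /* Linux value, used only if the headers lack it */
#endif

/*
 * Hypothetical helper: add up the bytes that lseek() reports as data in
 * [0, size). Returns -1 on error or if the kernel reports an empty data
 * segment (hole == data), which would otherwise make the walk spin.
 */
static off_t count_data_bytes(int fd, off_t size)
{
    off_t pos = 0, total = 0;

    assert(size >= 0);
    while (pos < size) {
        off_t data, hole;

        data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)   /* no data at or after pos */
                break;
            return -1;
        }
        hole = lseek(fd, data, SEEK_HOLE);
        if (hole <= data)         /* error, or a hole reported inside data */
            return -1;
        total += hole - data;
        pos = hole;
    }
    return total;
}
```

On a filesystem without hole support the generic fallback treats the whole
file as data, so the result is only an upper bound on the bytes written.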

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index cea271f..da5851e 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -453,47 +453,27 @@ static int ext4_find_unwritten_pgoff(struct inode *inode,
 		num = min_t(pgoff_t, end - index, PAGEVEC_SIZE);
 		nr_pages = pagevec_lookup(&pvec, inode->i_mapping, index,
 					  (pgoff_t)num);
-		if (nr_pages == 0) {
-			if (whence == SEEK_DATA)
-				break;
-
-			BUG_ON(whence != SEEK_HOLE);
-			/*
-			 * If this is the first time to go into the loop and
-			 * offset is not beyond the end offset, it will be a
-			 * hole at this offset
-			 */
-			if (lastoff == startoff || lastoff < endoff)
-				found = 1;
-			break;
-		}
-
-		/*
-		 * If this is the first time to go into the loop and
-		 * offset is smaller than the first page offset, it will be a
-		 * hole at this offset.
-		 */
-		if (lastoff == startoff && whence == SEEK_HOLE &&
-		    lastoff < page_offset(pvec.pages[0])) {
-			found = 1;
+		if (nr_pages == 0)
 			break;
-		}
 
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
 			struct buffer_head *bh, *head;
 
 			/*
-			 * If the current offset is not beyond the end of given
-			 * range, it will be a hole.
+			 * If current offset is smaller than the page offset,
+			 * there is a hole at this offset.
 			 */
-			if (lastoff < endoff && whence == SEEK_HOLE &&
-			    page->index > end) {
+			if (whence == SEEK_HOLE && lastoff < endoff &&
+			    lastoff < page_offset(pvec.pages[i])) {
 				found = 1;
 				*offset = lastoff;
 				goto out;
 			}
 
+			if (page->index > end)
+				goto out;
+
 			lock_page(page);
 
 			if (unlikely(page->mapping != inode->i_mapping)) {
@@ -533,20 +513,18 @@ static int ext4_find_unwritten_pgoff(struct inode *inode,
 			unlock_page(page);
 		}
 
-		/*
-		 * The no. of pages is less than our desired, that would be a
-		 * hole in there.
-		 */
-		if (nr_pages < num && whence == SEEK_HOLE) {
-			found = 1;
-			*offset = lastoff;
+		/* The no. of pages is less than our desired, we are done. */
+		if (nr_pages < num)
 			break;
-		}
 
 		index = pvec.pages[i - 1]->index + 1;
 		pagevec_release(&pvec);
 	} while (index <= end);
 
+	if (whence == SEEK_HOLE && lastoff < endoff) {
+		found = 1;
+		*offset = lastoff;
+	}
 out:
 	pagevec_release(&pvec);
 	return found;
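
The control flow of the SEEK_HOLE path after this patch can be modelled in
userspace roughly as below. This is a sketch under stated assumptions, not the
kernel code: model_find_hole() and the MODEL_* constants are hypothetical
names, pages are assumed to be 4 KiB, every cached page is treated as fully
written (the real function also walks the buffer heads of each page), and the
batch size always asks for at least one page, which is a simplification of
mine. The sorted array pages[] stands in for the page cache lookup.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12UL                       /* assumed 4 KiB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_BATCH      14UL                       /* stand-in for PAGEVEC_SIZE */

/*
 * Return 1 and store the first hole offset in [startoff, endoff) in *offset,
 * or return 0 if the whole range is covered by pages. pages[] holds the
 * indices of cached pages in ascending order.
 */
static int model_find_hole(const unsigned long *pages, size_t npages,
                           unsigned long startoff, unsigned long endoff,
                           unsigned long *offset)
{
    unsigned long lastoff = startoff;
    unsigned long index, end;
    size_t pos = 0;

    assert(offset != NULL);
    if (startoff >= endoff)
        return 0;

    index = startoff >> MODEL_PAGE_SHIFT;
    end = (endoff - 1) >> MODEL_PAGE_SHIFT;

    do {
        unsigned long num = end - index + 1;
        size_t nr_pages = 0, i;

        if (num > MODEL_BATCH)
            num = MODEL_BATCH;

        /* Lookup model: up to num pages whose index is >= index. */
        while (pos < npages && pages[pos] < index)
            pos++;
        while (nr_pages < num && pos + nr_pages < npages)
            nr_pages++;

        if (nr_pages == 0)
            break;

        for (i = 0; i < nr_pages; i++) {
            unsigned long page_off = pages[pos + i] << MODEL_PAGE_SHIFT;

            /* A gap before this page is a hole at lastoff. */
            if (lastoff < endoff && lastoff < page_off) {
                *offset = lastoff;
                return 1;
            }
            if (pages[pos + i] > end)
                return 0;
            lastoff = page_off + MODEL_PAGE_SIZE;
        }

        /* Fewer pages than requested: nothing more to look up. */
        if (nr_pages < num)
            break;

        index = pages[pos + nr_pages - 1] + 1;
    } while (index <= end);

    /* Single place that handles "ran out of pages before endoff". */
    if (lastoff < endoff) {
        *offset = lastoff;
        return 1;
    }
    return 0;
}
```

With pages 0..13 plus pages 32 and 33 cached, which corresponds to the second
xfs_io scenario in the description, a search from offset 0 up to 256k reports
57344 in this model, i.e. the 56k..128k gap is no longer skipped.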

