[Devel] [PATCH RHEL7 COMMIT] ext4: fix race aio-dio vs freeze_fs

Konstantin Khorenko khorenko at virtuozzo.com
Mon Dec 7 04:30:12 PST 2015


The commit is pushed to "branch-rh7-3.10.0-229.7.2.vz7.9.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-229.7.2.vz7.9.15
------>
commit ee99dc6c7593728c6e0b4838c188794db46d4ece
Author: Dmitry Monakhov <dmonakhov at openvz.org>
Date:   Mon Dec 7 16:30:11 2015 +0400

    ext4: fix race aio-dio vs freeze_fs
    
    After freeze_fs was reworked (by Jan Kara), page write-back completion
    is deferred before unwritten conversion, so the explicit
    flush_unwritten_io() call was removed in c724585b62411.
    
    But we may still face deferred conversion in the aio-dio case:
    # Trivial testcase
    for ((i=0;i<60;i++));do fsfreeze -f /mnt ;sleep 1;fsfreeze -u /mnt;done &
    fio --bs=4k --ioengine=libaio --iodepth=128 --size=1g --direct=1 \
        --runtime=60 --filename=/mnt/file --name=rand-write --rw=randwrite
    
    NOTE: A sane testcase should be integrated into xfstests, but that requires
    changes in common/* code, so let's use this test for the moment.
    
    To fix this race we have to guard the journal transaction with explicit
    sb_{start,end}_intwrite(), as is done for ext4_evict_inode() in 8e8ad8a5.
    
    https://jira.sw.ru/browse/PSBM-39352
    
    Signed-off-by: Dmitry Monakhov <dmonakhov at openvz.org>
---
 fs/ext4/extents.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index bac9339..3d288ba 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -5148,6 +5148,12 @@ int ext4_convert_unwritten_extents(handle_t *handle, struct inode *inode,
 	max_blocks = ((EXT4_BLOCK_ALIGN(len + offset, blkbits) >> blkbits) -
 		      map.m_lblk);
 	/*
+	 * Protect us against freezing - AIO-DIO case. Caller didn't have to
+	 * have any protection against it
+	 */
+	sb_start_intwrite(inode->i_sb);
+
+	/*
 	 * This is somewhat ugly but the idea is clear: When transaction is
 	 * reserved, everything goes into it. Otherwise we rather start several
 	 * smaller transactions for conversion of each extent separately.
@@ -5191,6 +5197,7 @@ int ext4_convert_unwritten_extents(handle_t *handle, struct inode *inode,
 	}
 	if (!credits)
 		ret2 = ext4_journal_stop(handle);
+	sb_end_intwrite(inode->i_sb);
 	return ret > 0 ? ret2 : ret;
 }
 

