[Devel] [PATCH RHEL7 COMMIT] ext4: fix dir corruption when ext4_dx_add_entry() fails

Konstantin Khorenko khorenko at virtuozzo.com
Tue Apr 21 17:32:53 MSK 2026


The commit is pushed to "branch-rh7-3.10.0-1160.129.1.vz7.226.x-ovz" and will appear at git at bitbucket.org:openvz/vzkernel.git
after rh7-3.10.0-1160.129.1.vz7.226.2
------>
commit 5dd12e9f3302560ef3c86ac25689f57bd5bf64b0
Author: Zhihao Cheng <chengzhihao1 at huawei.com>
Date:   Sat Mar 21 12:30:48 2026 +0100

    ext4: fix dir corruption when ext4_dx_add_entry() fails
    
    The following process may lead to fs corruption:
    1. ext4_create(dir/foo)
     ext4_add_nondir
      ext4_add_entry
       ext4_dx_add_entry
         a. add_dirent_to_buf
          ext4_mark_inode_dirty
          ext4_handle_dirty_metadata   // dir inode bh is recorded into journal
         b. ext4_append    // dx_get_count(entries) == dx_get_limit(entries)
           ext4_bread(EXT4_GET_BLOCKS_CREATE)
            ext4_getblk
             ext4_map_blocks
              ext4_ext_map_blocks
                ext4_mb_new_blocks
                 dquot_alloc_block
                  dquot_alloc_space_nodirty
                   inode_add_bytes    // update dir's i_blocks
                ext4_ext_insert_extent
                 ext4_ext_dirty  // record extent bh into journal
                  ext4_handle_dirty_metadata(bh)
                  // record new block into journal
           inode->i_size += inode->i_sb->s_blocksize   // new size(in mem)
         c. ext4_handle_dirty_dx_node(bh2)
            // record dir's new block(dx_node) into journal
         d. ext4_handle_dirty_dx_node((frame - 1)->bh)
         e. ext4_handle_dirty_dx_node(frame->bh)
         f. do_split    // ret err!
         g. add_dirent_to_buf
             ext4_mark_inode_dirty(dir)  // update raw_inode on disk(skipped)
    2. fsck -a /dev/sdb
     drops the last block (dx_node), which lies beyond dir's i_size.
      /dev/sdb: recovering journal
      /dev/sdb contains a file system with errors, check forced.
      /dev/sdb: Inode 12, end of extent exceeds allowed value
            (logical block 128, physical block 3938, len 1)
    3. fsck -fn /dev/sdb
     dx_node->entry[i].blk > dir->i_size
      Pass 2: Checking directory structure
      Problem in HTREE directory inode 12 (/dir): bad block number 128.
      Clear HTree index? no
      Problem in HTREE directory inode 12: block #3 has invalid depth (2)
      Problem in HTREE directory inode 12: block #3 has bad max hash
      Problem in HTREE directory inode 12: block #3 not referenced
    
    Fix it by marking inode dirty directly inside ext4_append().
    A reproducer is available at the bugzilla Link below.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=216466
    Cc: stable at vger.kernel.org
    Signed-off-by: Zhihao Cheng <chengzhihao1 at huawei.com>
    Reviewed-by: Jan Kara <jack at suse.cz>
    Link: https://lore.kernel.org/r/20220911045204.516460-1-chengzhihao1@huawei.com
    Signed-off-by: Theodore Ts'o <tytso at mit.edu>
    
    [khorenko: backport to 3.10.
    
     The problem: ext4_append() updates i_size and i_disksize in memory
     but does not journal the inode via ext4_mark_inode_dirty().  If a
     later step in ext4_dx_add_entry() (e.g. do_split()) fails, the new
     i_size never reaches disk.  The extent for the newly allocated block
     is already journaled, but i_size does not cover it - after journal
     recovery fsck drops that block and corrupts the htree.
    
     The fix: add ext4_mark_inode_dirty(handle, inode) in ext4_append()
     right after updating i_size/i_disksize, so the inode metadata is
     always journaled together with the block allocation.
    
     Adapted to the older ext4_append() that uses
     ext4_journal_get_write_access(handle, bh) without sb/JTR args.
     Refactored error handling to use a common 'out' label with brelse()
     and ext4_std_error(), matching upstream.]
    
    https://virtuozzo.atlassian.net/browse/PSBM-161670
    (cherry picked from commit 7177dd009c7c04290891e9a534cd47d1b620bd04)
    Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
    Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
---
 fs/ext4/namei.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
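
The failure sequence from the commit message can be modeled in userspace with a toy program. This is only a sketch under stated assumptions, not kernel code: struct toy_dir, the toy_* functions and TOY_BLOCK_SIZE are hypothetical names, and "journaled" here simply means "what a journal replay would restore". It shows that when the split step fails, the journal holds an extent for a block that the journaled i_size does not cover, unless the inode is marked dirty inside the append step.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BLOCK_SIZE 4096UL  /* assumed block size, for illustration only */

/* Toy model of a directory inode: in-memory state vs. journaled state. */
struct toy_dir {
	unsigned long mem_size;         /* in-memory i_size */
	unsigned long journaled_size;   /* i_size last recorded in the journal */
	unsigned long journaled_blocks; /* blocks whose extents are journaled */
};

/* Stands in for ext4_mark_inode_dirty(): journal the in-memory inode. */
static void toy_mark_inode_dirty(struct toy_dir *d)
{
	d->journaled_size = d->mem_size;
}

/* Stands in for ext4_append(): allocate one block and grow i_size. */
static void toy_append(struct toy_dir *d, bool fixed)
{
	d->journaled_blocks++;           /* extent is journaled at allocation */
	d->mem_size += TOY_BLOCK_SIZE;   /* size grows in memory only */
	if (fixed)
		toy_mark_inode_dirty(d); /* the fix: journal the inode here */
}

/* Stands in for ext4_dx_add_entry(), optionally failing in the split step. */
static int toy_dx_add_entry(struct toy_dir *d, bool fixed, bool split_fails)
{
	toy_append(d, fixed);
	if (split_fails)
		return -1;               /* error path never marks inode dirty */
	toy_mark_inode_dirty(d);         /* success path journals the inode */
	return 0;
}

/* Count journaled blocks that lie beyond the journaled i_size. */
static unsigned long toy_blocks_beyond_size(const struct toy_dir *d)
{
	unsigned long covered;

	assert(d->journaled_size % TOY_BLOCK_SIZE == 0);
	covered = d->journaled_size / TOY_BLOCK_SIZE;
	return d->journaled_blocks > covered ? d->journaled_blocks - covered : 0;
}
```

Starting from a 128-block directory, a failing toy_dx_add_entry() with fixed == false leaves one block beyond the journaled size, which corresponds to the block fsck drops in step 2 of the commit message. With fixed == true the count stays at zero.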

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index bacdd0630b2c0..4f8636c0ea273 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -68,14 +68,19 @@ static struct buffer_head *ext4_append(handle_t *handle,
 		return ERR_PTR(err);
 	inode->i_size += inode->i_sb->s_blocksize;
 	EXT4_I(inode)->i_disksize = inode->i_size;
+	err = ext4_mark_inode_dirty(handle, inode);
+	if (err)
+		goto out;
 	BUFFER_TRACE(bh, "get_write_access");
 	err = ext4_journal_get_write_access(handle, bh);
-	if (err) {
-		brelse(bh);
-		ext4_std_error(inode->i_sb, err);
-		return ERR_PTR(err);
-	}
+	if (err)
+		goto out;
 	return bh;
+
+out:
+	brelse(bh);
+	ext4_std_error(inode->i_sb, err);
+	return ERR_PTR(err);
 }
 
 static int ext4_dx_csum_verify(struct inode *inode,
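
The error-handling refactoring in the hunk above follows the common "single exit label" idiom. A minimal userspace sketch of that idiom is given here, under assumptions: struct toy_buf, toy_release(), toy_get_buf() and toy_release_count are hypothetical, and the error is returned through an out-parameter rather than ERR_PTR() as the kernel code does.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_buf {
	int data;
};

/* Counts releases so callers can observe that cleanup ran exactly once. */
static int toy_release_count;

/* Stands in for brelse(): drop the buffer. */
static void toy_release(struct toy_buf *b)
{
	free(b);
	toy_release_count++;
}

/*
 * Acquire a buffer, then run two steps that may each fail. step1_err and
 * step2_err simulate the return codes of those steps (0 means success).
 * Both failure points share one cleanup path at the 'out' label.
 */
static struct toy_buf *toy_get_buf(int step1_err, int step2_err, int *errp)
{
	struct toy_buf *b;
	int err;

	assert(errp != NULL);

	b = malloc(sizeof(*b));
	if (!b) {
		*errp = -ENOMEM;
		return NULL;
	}

	err = step1_err;   /* first fallible step */
	if (err)
		goto out;

	err = step2_err;   /* second fallible step */
	if (err)
		goto out;

	*errp = 0;
	return b;          /* success: caller owns the buffer */

out:
	toy_release(b);    /* single cleanup path for every failure */
	*errp = err;
	return NULL;
}
```

Adding a second fallible step to such a function costs two lines, and the cleanup cannot diverge between the failure points because it exists only once.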
