[Devel] [PATCH rh7 2/3] ext4: fix dir corruption when ext4_dx_add_entry() fails

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Mon Apr 20 13:01:43 MSK 2026


Reviewed-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>

On 4/9/26 15:56, Konstantin Khorenko wrote:
> From: Zhihao Cheng <chengzhihao1 at huawei.com>
> 
> The following process may lead to fs corruption:
> 1. ext4_create(dir/foo)
>  ext4_add_nondir
>   ext4_add_entry
>    ext4_dx_add_entry
>      a. add_dirent_to_buf
>       ext4_mark_inode_dirty
>       ext4_handle_dirty_metadata   // dir inode bh is recorded into journal
>      b. ext4_append    // dx_get_count(entries) == dx_get_limit(entries)
>        ext4_bread(EXT4_GET_BLOCKS_CREATE)
>         ext4_getblk
>          ext4_map_blocks
>           ext4_ext_map_blocks
>             ext4_mb_new_blocks
>              dquot_alloc_block
>               dquot_alloc_space_nodirty
>                inode_add_bytes    // update dir's i_blocks
>             ext4_ext_insert_extent
>              ext4_ext_dirty  // record extent bh into journal
>               ext4_handle_dirty_metadata(bh)
>               // record new block into journal
>        inode->i_size += inode->i_sb->s_blocksize   // new size(in mem)
>      c. ext4_handle_dirty_dx_node(bh2)
>         // record dir's new block(dx_node) into journal
>      d. ext4_handle_dirty_dx_node((frame - 1)->bh)
>      e. ext4_handle_dirty_dx_node(frame->bh)
>      f. do_split    // ret err!
>      g. add_dirent_to_buf
>          ext4_mark_inode_dirty(dir)  // update raw_inode on disk(skipped)
> 2. fsck -a /dev/sdb
>  drops the last block (dx_node), which lies beyond the dir's i_size.
>   /dev/sdb: recovering journal
>   /dev/sdb contains a file system with errors, check forced.
>   /dev/sdb: Inode 12, end of extent exceeds allowed value
>         (logical block 128, physical block 3938, len 1)
> 3. fsck -fn /dev/sdb
>  dx_node->entry[i].blk > dir->i_size
>   Pass 2: Checking directory structure
>   Problem in HTREE directory inode 12 (/dir): bad block number 128.
>   Clear HTree index? no
>   Problem in HTREE directory inode 12: block #3 has invalid depth (2)
>   Problem in HTREE directory inode 12: block #3 has bad max hash
>   Problem in HTREE directory inode 12: block #3 not referenced
> 
> Fix it by marking the inode dirty directly inside ext4_append().
> A reproducer is available in [Link].
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216466
> Cc: stable at vger.kernel.org
> Signed-off-by: Zhihao Cheng <chengzhihao1 at huawei.com>
> Reviewed-by: Jan Kara <jack at suse.cz>
> Link: https://lore.kernel.org/r/20220911045204.516460-1-chengzhihao1@huawei.com
> Signed-off-by: Theodore Ts'o <tytso at mit.edu>
> 
> [khorenko: backport to 3.10.
> 
>  The problem: ext4_append() updates i_size and i_disksize in memory
>  but does not journal the inode via ext4_mark_inode_dirty().  If a
>  later step in ext4_dx_add_entry() (e.g. do_split()) fails, the new
>  i_size never reaches disk.  The extent for the newly allocated block
>  is already journaled, but i_size does not cover it - after journal
>  recovery fsck drops that block and corrupts the htree.
> 
>  The fix: add ext4_mark_inode_dirty(handle, inode) in ext4_append()
>  right after updating i_size/i_disksize, so the inode metadata is
>  always journaled together with the block allocation.
> 
>  Adapted to the older ext4_append() that uses
>  ext4_journal_get_write_access(handle, bh) without sb/JTR args.
>  Refactored error handling to use a common 'out' label with brelse()
>  and ext4_std_error(), matching upstream.]
> 
> https://virtuozzo.atlassian.net/browse/PSBM-161670
> (cherry picked from commit 7177dd009c7cb6c5ca4c8cefc49942bf9830daea)
> Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
> ---
>  fs/ext4/namei.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index bacdd0630b2c..4f8636c0ea27 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -68,14 +68,19 @@ static struct buffer_head *ext4_append(handle_t *handle,
>  		return ERR_PTR(err);
>  	inode->i_size += inode->i_sb->s_blocksize;
>  	EXT4_I(inode)->i_disksize = inode->i_size;
> +	err = ext4_mark_inode_dirty(handle, inode);
> +	if (err)
> +		goto out;
>  	BUFFER_TRACE(bh, "get_write_access");
>  	err = ext4_journal_get_write_access(handle, bh);
> -	if (err) {
> -		brelse(bh);
> -		ext4_std_error(inode->i_sb, err);
> -		return ERR_PTR(err);
> -	}
> +	if (err)
> +		goto out;
>  	return bh;
> +
> +out:
> +	brelse(bh);
> +	ext4_std_error(inode->i_sb, err);
> +	return ERR_PTR(err);
>  }
>  
>  static int ext4_dx_csum_verify(struct inode *inode,
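
For anyone reading along, the failure mode described in the backport note can be sketched as a small userspace model. This is only an illustration under simplifying assumptions, not kernel code: struct toy_dir, the toy_* helpers and the 4096-byte block size are all hypothetical, and the "journal" is reduced to two counters (the i_size last recorded for the inode and the number of blocks whose extents were recorded).

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLOCK_SIZE 4096u /* assumed block size, for the model only */

/* Hypothetical toy model of a directory inode; not a kernel structure. */
struct toy_dir {
	uint64_t mem_size;         /* i_size as seen in memory */
	uint64_t journaled_size;   /* i_size as last recorded in the journal */
	uint64_t journaled_blocks; /* blocks whose extents are in the journal */
};

/* Stand-in for ext4_mark_inode_dirty(): record the in-memory size. */
static void toy_mark_inode_dirty(struct toy_dir *d)
{
	d->journaled_size = d->mem_size;
}

/*
 * Stand-in for ext4_append(): the block allocation journals the extent
 * right away and i_size grows in memory. With dirty_in_append set, the
 * inode is journaled here too, which is what the patch adds.
 */
static void toy_append(struct toy_dir *d, bool dirty_in_append)
{
	d->journaled_blocks++;
	d->mem_size += TOY_BLOCK_SIZE;
	if (dirty_in_append)
		toy_mark_inode_dirty(d);
}

/* After journal replay, every recorded block must be covered by i_size. */
static bool toy_consistent_after_replay(const struct toy_dir *d)
{
	return d->journaled_blocks * TOY_BLOCK_SIZE <= d->journaled_size;
}

/*
 * Model of ext4_dx_add_entry() where do_split() fails after the append,
 * so the later ext4_mark_inode_dirty(dir) is never reached.
 */
static bool toy_failed_add_is_consistent(bool dirty_in_append)
{
	struct toy_dir d = {
		.mem_size = 3 * TOY_BLOCK_SIZE,
		.journaled_size = 3 * TOY_BLOCK_SIZE,
		.journaled_blocks = 3,
	};

	toy_append(&d, dirty_in_append);
	/* do_split() fails here; nothing else touches the inode. */
	return toy_consistent_after_replay(&d);
}
```

In this model toy_failed_add_is_consistent(false) returns false (the journal holds four blocks but an i_size covering only three), while toy_failed_add_is_consistent(true) returns true, which mirrors why journaling the inode inside ext4_append() closes the window.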

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


