[Devel] [PATCH RH7 0/3] ext4/jbd2: port data corruption fixes from ms

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Thu Aug 22 12:47:36 MSK 2019


While investigating the data corruption in the vzt-ploop-check test, where
we detect one page in a file which contains wrong data, we were lucky to
get exactly the same pattern in the bad page each time. So we added a
small debugging check that fails when a dirty bit is set for a page
containing the pattern, at the beginning of __set_page_dirty and
__set_page_dirty_nobuffers.
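
For readers who want to picture such a check, here is a minimal userspace
sketch. It is not the debugging patch itself: the names page_has_pattern()
and sketch_set_page_dirty(), the 4096-byte page size and the "pattern may
start at any offset" rule are all assumptions made for illustration. The
kernel hook traps (see the invalid_op frames in the backtrace), while the
sketch just returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical helper: true if the byte pattern occurs anywhere in the page. */
static bool page_has_pattern(const uint8_t *page, const uint8_t *pat,
			     size_t pat_len)
{
	if (pat_len == 0 || pat_len > SKETCH_PAGE_SIZE)
		return false;

	/* Naive scan: compare the pattern at every possible start offset. */
	for (size_t off = 0; off + pat_len <= SKETCH_PAGE_SIZE; off++) {
		if (memcmp(page + off, pat, pat_len) == 0)
			return true;
	}
	return false;
}

/*
 * Stand-in for the dirtying path: check the page first, and only then
 * set the dirty flag. Returns false (and leaves *dirty untouched) when
 * the page carries the corruption pattern.
 */
static bool sketch_set_page_dirty(const uint8_t *page, const uint8_t *pat,
				  size_t pat_len, bool *dirty)
{
	assert(page && pat && dirty);

	if (page_has_pattern(page, pat, pat_len))
		return false;	/* the real hook would crash the kernel here */

	*dirty = true;
	return true;
}
```

Placing the check before the flag is set means the resulting backtrace
points at whoever is about to dirty the bad page, which is the
information the investigation needs.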

We got a crash which looks related to the ported patches:

crash> bt
PID: 17855  TASK: ffff8cfb19144000  CPU: 3   COMMAND: "jbd2/ploop45613"
 #0 [ffff8cfcb6fdf8a0] machine_kexec at ffffffff9e2643c4
 #1 [ffff8cfcb6fdf900] __crash_kexec at ffffffff9e32d672
 #2 [ffff8cfcb6fdf9d0] crash_kexec at ffffffff9e32d760
 #3 [ffff8cfcb6fdf9e8] oops_end at ffffffff9e99f858
 #4 [ffff8cfcb6fdfa10] die at ffffffff9e22f88b
 #5 [ffff8cfcb6fdfa40] do_trap at ffffffff9e99eee0
 #6 [ffff8cfcb6fdfa90] do_invalid_op at ffffffff9e22c1d4
 #7 [ffff8cfcb6fdfb40] invalid_op at ffffffff9e9a928e
    [exception RIP: page_check_corruption_pattern+397]
    RIP: ffffffff9e3d719d  RSP: ffff8cfcb6fdfbf8  RFLAGS: 00010246
    RAX: ffff8cfcb6fdffd8  RBX: 00007303b0607000  RCX: 000000010025603b
    RDX: 0000000000000190  RSI: 0000000000000000  RDI: 0000000000000206
    RBP: ffff8cfcb6fdfc10   R8: ffff8cfbe6c19e00   R9: 0000000000000001
    R10: 0000000000000004  R11: 0000000000000005  R12: ffffe079873e7e40
    R13: ffff8cfca7bc3ab0  R14: ffff8cfc351d5a90  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff8cfcb6fdfbf0] page_check_corruption_pattern at ffffffff9e3d70d6
 #9 [ffff8cfcb6fdfc18] __set_page_dirty at ffffffff9e499cb5
 #10 [ffff8cfcb6fdfc50] mark_buffer_dirty at ffffffff9e499efa
 #11 [ffff8cfcb6fdfc70] __jbd2_journal_temp_unlink_buffer at ffffffffc048893a [jbd2]
 #12 [ffff8cfcb6fdfc80] __jbd2_journal_refile_buffer at ffffffffc048ac08 [jbd2]
 #13 [ffff8cfcb6fdfca8] jbd2_journal_commit_transaction at ffffffffc048c1e0 [jbd2]
 #14 [ffff8cfcb6fdfe48] kjournald2 at ffffffffc0491f79 [jbd2]
 #15 [ffff8cfcb6fdfec8] kthread at ffffffff9e2c4661

Before ("jbd2: clear dirty flag when revoking a buffer from an older
transaction"), a revoked buffer/page can be wrongly marked dirty and later
be wrongly written to disk. Other patches from the same series might
also be helpful.

https://jira.sw.ru/browse/PSBM-96719

zhangyi (F) (3):
  jbd2: clear dirty flag when revoking a buffer from an older
    transaction
  jbd2: discard dirty data when forgetting an un-journalled buffer
  ext4: cleanup clean_bdev_aliases() calls

 fs/ext4/extents.c     | 21 +--------------
 fs/ext4/inode.c       | 13 ----------
 fs/ext4/page-io.c     |  4 +--
 fs/jbd2/transaction.c | 59 ++++++++++++++++++++++++++++++++++++-------
 4 files changed, 52 insertions(+), 45 deletions(-)

-- 
2.20.1


