[Devel] [VZ7 PATCH 1/2] iomap: report collisions between directio and buffered writes to userspace

Valeriy Vdovin valeriy.vdovin at virtuozzo.com
Thu Jan 28 17:06:12 MSK 2021


If two programs simultaneously try to write to the same part of a file
via direct IO and buffered IO, there's a chance that the pagecache
invalidation done after the dio write will fail on the dirty page.  When
this happens, the dio write has already succeeded, which means that the
page cache is no longer coherent with the disk!

Programs are not supposed to mix IO types and this is a clear case of
data corruption, so store an EIO which will be reflected to userspace
during the next fsync.  Replace the WARN_ON with a ratelimited pr_crit
so that the developers have /some/ kind of breadcrumb to track down the
offending program(s) and file(s) involved.

Signed-off-by: Darrick J. Wong <darrick.wong at oracle.com>
Reviewed-by: Liu Bo <bo.li.liu at oracle.com>

(cherry-picked from 5a9d929d6e13278df62bd9e3d3ceae8c87ad1eea)
Backport change: file_path() replaced with d_path().
https://jira.sw.ru/browse/PSBM-124609

Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>
---
 fs/direct-io.c     | 24 +++++++++++++++++++++++-
 include/linux/fs.h |  1 +
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index f5fd6ff..886989d 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -256,6 +256,27 @@ static void dio_iodone2_helper(struct dio *dio, loff_t offset,
 	}
 }
 
+/*
+ * Warn about a page cache invalidation failure during a direct io write.
+ */
+void dio_warn_stale_pagecache(struct file *filp)
+{
+	static DEFINE_RATELIMIT_STATE(_rs, 86400 * HZ, DEFAULT_RATELIMIT_BURST);
+	char pathname[128];
+	struct inode *inode = file_inode(filp);
+	char *path;
+
+	errseq_set(&inode->i_mapping->wb_err, -EIO);
+	if (__ratelimit(&_rs)) {
+		path = d_path(&filp->f_path, pathname, sizeof(pathname));
+		if (IS_ERR(path))
+			path = "(unknown)";
+		pr_crit("Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!\n");
+		pr_crit("File: %s PID: %d Comm: %.20s\n", path, current->pid,
+			current->comm);
+	}
+}
+
 /**
  * dio_complete() - called when all DIO BIO I/O has been completed
  * @offset: the byte offset in the file of the completed operation
@@ -312,7 +333,8 @@ static ssize_t dio_complete(struct dio *dio, loff_t offset, ssize_t ret,
 		err = invalidate_inode_pages2_range(dio->inode->i_mapping,
 					offset >> PAGE_SHIFT,
 					(offset + ret - 1) >> PAGE_SHIFT);
-		WARN_ON_ONCE(err);
+		if (err)
+			dio_warn_stale_pagecache(dio->iocb->ki_filp);
 	}
 
 	/*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aee8adf..bc5417f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3316,6 +3316,7 @@ enum {
 };
 
 void dio_end_io(struct bio *bio, int error);
+void dio_warn_stale_pagecache(struct file *filp);
 
 ssize_t __blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
 	struct block_device *bdev, struct iov_iter *iter, loff_t offset,
-- 
1.8.3.1
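
For readers wondering how the stored EIO becomes visible, the following
is a minimal userspace sketch, not part of the patch. It assumes the
usual errseq behaviour, where an error recorded in the mapping's wb_err
is returned by a later fsync() on a file descriptor that has not yet
seen it. The helper name buffered_write_and_sync() is hypothetical. A
program doing buffered writes would be expected to see -EIO from it
after a collision with a direct I/O write to the same range.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write len bytes to path using
 * buffered I/O, then fsync().  Returns 0 on success or a negative errno.
 *
 * Assumption: with the patch applied, a failed post-DIO page cache
 * invalidation on this file is reported here as -EIO by fsync().
 */
static int buffered_write_and_sync(const char *path, const void *buf,
				   size_t len)
{
	const char *p = buf;
	int fd, ret = 0;

	fd = open(path, O_WRONLY | O_CREAT, 0644);
	if (fd < 0)
		return -errno;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			ret = -errno;
			goto out;
		}
		if (n == 0) {
			ret = -EIO;	/* no progress, give up */
			goto out;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Errors recorded against the mapping are reported by fsync(). */
	if (fsync(fd) < 0)
		ret = -errno;
out:
	if (close(fd) < 0 && ret == 0)
		ret = -errno;
	return ret;
}
```

A caller that gets -EIO back should treat the file contents as suspect
and check the kernel log for the "Page cache invalidation failure"
message, which names the file and the task that issued the direct I/O.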
