[Devel] [PATCH RHEL COMMIT] overlayfs: add mnt_id paths options

Konstantin Khorenko khorenko at virtuozzo.com
Mon Oct 4 20:39:05 MSK 2021


The commit is pushed to "branch-rh9-5.14.vz9.1.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after ark-5.14
------>
commit 0f01030446b3afcecc8846eee744a5a3c5743d2a
Author: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
Date:   Mon Oct 4 20:39:05 2021 +0300

    overlayfs: add mnt_id paths options
    
    This patch adds the OVERLAY_FS_PATH_OPTIONS_MNT_ID compile-time config
    option and the "mnt_id_path_opts" runtime module option.
    If enabled, the user can see the mnt_ids of the lowerdir and upperdir
    paths in mountinfo, in the separate lowerdir_mnt_id/upperdir_mnt_id options.
    
    This patch helps checkpoint/restore of overlayfs mounts when there
    are overmounts on the lowerdir, workdir, or upperdir paths.
    
    https://jira.sw.ru/browse/PSBM-58614
    
    Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
    
    =====================
    Patchset description:
    overlayfs: C/R enhancements
    
    This patchset aims to make C/R of overlayfs mounts with CRIU possible.
    We introduce two new overlayfs module options: dyn_path_opts and
    mnt_id_path_opts. If enabled, these options make it possible to see the
    real *full* paths in the lowerdir, workdir, and upperdir options, as well
    as the mnt_ids for the corresponding paths.
    
    These changes should not break anything: to show mnt_ids we simply
    introduce new show-time mount options, and for paths we simply *always*
    provide *full paths* instead of relative paths in mountinfo.
    
    BEFORE
    
    overlay on /var/lib/docker/overlay2/XYZ/merged type overlay (rw,relatime,
    lowerdir=/var/lib/docker/overlay2/XYZ-init/diff:/var/lib/docker/overlay2/
    ABC/diff,upperdir=/var/lib/docker/overlay2/XYZ/diff,workdir=/var/lib/docker
    /overlay2/XYZ/work)
    none on /sys type sysfs (rw,relatime)
    
    AFTER
    
    overlay on /var/lib/docker/overlay2/XYZ/merged type overlay (rw,relatime,
    lowerdir=/var/lib/docker/overlay2/XYZ-init/diff:/var/lib/docker/overlay2/
    ABC/diff,upperdir=/var/lib/docker/overlay2/XYZ/diff,workdir=/var/lib/docker
    /overlay2/XYZ/work,lowerdir_mnt_id=175:175,upperdir_mnt_id=175)
    none on /sys type sysfs (rw,relatime)
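
A checkpoint/restore tool could read the new options back out of the option string shown in the AFTER example. The following is only a minimal userspace sketch under stated assumptions: the helper name parse_mnt_ids() and its return convention are hypothetical and are not part of this patch or of CRIU. It relies only on the format shown above, a comma-separated option list in which each mnt_id option carries a colon-separated list of integers.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): find "<key>=" in a
 * comma-separated mount option string and parse its colon-separated
 * integer list into ids[].
 *
 * Returns the number of ids parsed, 0 if the key is absent, or -1 if
 * the list is malformed or holds more than max entries.
 */
static int parse_mnt_ids(const char *opts, const char *key, int *ids, int max)
{
	size_t klen = strlen(key);
	const char *p = opts;
	int n = 0;

	/* The key must start an option: at the beginning or right after ','. */
	while ((p = strstr(p, key)) != NULL) {
		if ((p == opts || p[-1] == ',') && p[klen] == '=')
			break;
		p += klen;
	}
	if (!p)
		return 0;
	p += klen + 1;		/* skip "<key>=" */

	for (;;) {
		char *end;
		long v = strtol(p, &end, 10);

		if (end == p || n >= max)
			return -1;
		ids[n++] = (int)v;
		if (*end == ':') {	/* another id follows */
			p = end + 1;
			continue;
		}
		/* The list must end at the next option or at end of string. */
		return (*end == ',' || *end == '\0') ? n : -1;
	}
}
```

Called with the option string from the AFTER example and the key "lowerdir_mnt_id", this sketch would fill ids[] with one entry per lower layer, in the same order as the paths in lowerdir.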
    
    Alexander Mikhalitsyn (2):
      overlayfs: add dynamic path resolving in mount options
      overlayfs: add mnt_id paths options
    
    =====================
    Rebase to RHEL8.3 kernel-4.18.0-240.1.1.el8_3 note:
    - original patch from vz8 kernel has been dropped (did not apply):
      c38c281cbe49 ("overlayfs: add mnt_id paths options")
    
    - a patchset developed for mainstream has been applied
      (it has not been accepted in mainstream yet):
      https://lore.kernel.org/lkml/20200604161133.20949-1-alexander.mikhalitsyn@virtuozzo.com/
    
    +++
    fs/overlayfs: Fixed default value for parameter 'mnt_id_path_opts'
    
    The default value is computed with the IS_ENABLED macro, but the CONFIG_
    prefix was missing from its argument, so it always evaluated to false.
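
Why a missing CONFIG_ prefix goes unnoticed at build time can be illustrated with a simplified userspace re-creation of the preprocessor trick behind IS_ENABLED. This is a sketch, not the kernel header: all SKETCH_* names and CONFIG_SKETCH_FEATURE are hypothetical, and only the built-in ("=y") case is modelled. An argument that is not defined to 1 silently expands to 0 instead of producing an error.

```c
/*
 * Simplified re-creation of the idea behind the kernel's IS_ENABLED()
 * (built-in case only). All names here are hypothetical stand-ins.
 */
#define SKETCH_ARG_PLACEHOLDER_1 0,
#define SKETCH_TAKE_SECOND(ignored, val, ...) val
#define SKETCH_IS_DEFINED(x) SKETCH_IS_DEFINED_(x)
#define SKETCH_IS_DEFINED_(val) SKETCH_IS_DEFINED__(SKETCH_ARG_PLACEHOLDER_##val)
/*
 * If the option expanded to 1, the placeholder becomes "0," and shifts
 * the literal 1 into the second argument slot. Otherwise the junk token
 * stays glued to the 1 and the second argument is 0.
 */
#define SKETCH_IS_DEFINED__(arg1_or_junk) \
	SKETCH_TAKE_SECOND(arg1_or_junk 1, 0, unused)
#define SKETCH_IS_ENABLED(option) SKETCH_IS_DEFINED(option)

/* Stand-in for what the build system defines for an option set to "y". */
#define CONFIG_SKETCH_FEATURE 1

/* Correct usage: the full macro name, including the CONFIG_ prefix. */
static const int with_prefix = SKETCH_IS_ENABLED(CONFIG_SKETCH_FEATURE);

/* Prefix omitted: no such macro exists, so this quietly becomes 0. */
static const int without_prefix = SKETCH_IS_ENABLED(SKETCH_FEATURE);
```

Both initializers compile without a diagnostic, which is why a default computed this way can stay disabled without anyone noticing.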
    
    mFixes: c38c281cbe49 ("overlayfs: add mnt_id paths options")
    
    Signed-off-by: Valeriy.Vdovin <valeriy.vdovin at virtuozzo.com>
    
    +++
    fs/ovelayfs: Fix crash on overlayfs mount
    
    Kdump kernel fails to load because of crash on mount of overlayfs:
    
     BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
    ....
     Call Trace:
      seq_path+0x64/0xb0
      print_paths_option+0x79/0xa0
      ovl_show_options+0x3a/0x320
      show_mountinfo+0x1ee/0x290
      seq_read+0x2f8/0x400
      vfs_read+0x9d/0x150
      ksys_read+0x4f/0xb0
      do_syscall_64+0x5b/0x1a0
    
    This is caused by an out-of-bounds access of ofs->lowerpaths.
    We pass ofs->numlayer to print_paths_option() as the size of the
    ->lowerpaths array, but that is not its size.
    
    The correct number of lowerpaths elements is ->numlower in 'struct ovl_entry'.
    So move lowerpaths there and use oe->numlower as array size.
    
    mFixes: 17fc61697f73 ("overlayfs: add dynamic path resolving in mount options")
    mFixes: 2191d729083d ("overlayfs: add mnt_id paths options")
    
    https://jira.sw.ru/browse/PSBM-123508
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
    
    Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
    
    +++
    fs/overlayfs: Fix crash on overlayfs mount
    
    [  261.403900] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    [  261.412847] Call Trace:
    [  261.413463]  seq_path+0x3c/0xa0
    [  261.414090]  print_paths_option+0x8c/0xa0
    [  261.414736]  ovl_show_options+0x41/0x320
    [  261.415378]  show_mountinfo+0x1df/0x2b0
    [  261.416019]  seq_read+0x26e/0x3d0
    [  261.416644]  vfs_read+0x89/0x140
    [  261.417269]  ksys_read+0x52/0xc0
    [  261.418918]  do_syscall_64+0x5b/0x1e0
    [  261.419580]  entry_SYSCALL_64_after_hwframe+0x65/0xca
    [  261.420256] RIP: 0033:0x7f20b59f28e4
    
    The problem is that we take the overlayfs lower layer info from a
    dentry other than the root dentry. Non-root dentries can have fewer
    layers than the root dentry.
    
    Crash reproducer:
    mkdir {lower,upper,work,merged}
    touch lower/lower
    touch upper/upper
    touch lowermnt
    touch uppermnt
    mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work merged
    mount --bind merged/upper uppermnt
    mount --bind merged/lower lowermnt
    
    mFixes: 4267859a0 ("fs/ovelayfs: Fix crash on overlayfs mount")
    
    https://jira.sw.ru/browse/PSBM-129333
    
    Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn at virtuozzo.com>
    
    (cherry picked from vz8 commit d001a4d7b50a13b2f459a307f4542e3beb1ed1fd)
    Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko at virtuozzo.com>
---
 fs/overlayfs/Kconfig     | 26 ++++++++++++++++++++++++++
 fs/overlayfs/overlayfs.h |  3 +++
 fs/overlayfs/super.c     | 15 +++++++++++++++
 fs/overlayfs/util.c      | 21 +++++++++++++++++++++
 4 files changed, 65 insertions(+)

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index be733bcd4c00..1ba9411d9a6a 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -155,3 +155,29 @@ config OVERLAY_FS_DYNAMIC_RESOLVE_PATH_OPTIONS
 	  For more information, see Documentation/filesystems/overlayfs.txt
 
 	  If unsure, say N.
+
+config OVERLAY_FS_PATH_OPTIONS_MNT_ID
+	bool "Overlayfs: show mnt_id for all mount paths options"
+	default y
+	depends on OVERLAY_FS
+	help
+	  This option helps checkpoint/restore of overlayfs mounts.
+	  If N is selected, the old behavior is kept.
+
+	  If this config option is enabled then in overlay filesystems mount
+	  options you will be able to see additional parameters lowerdir_mnt_id/
+	  upperdir_mnt_id with corresponding mnt_ids.
+
+	  It's also possible to change this behavior on overlayfs module loading or
+	  through sysfs (mnt_id_path_opts parameter).
+
+	  Disable this to get a configuration that is backward compatible with
+	  previous kernels, but in this case checkpoint/restore functionality
+	  for overlayfs mounts may not fully work.
+
+	  If backward compatibility is not an issue, then it is safe and
+	  recommended to say Y here.
+
+	  For more information, see Documentation/filesystems/overlayfs.txt
+
+	  If unsure, say N.
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d30e097fcea5..b4a7d2f72186 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -355,6 +355,9 @@ static inline bool ovl_test_flag(unsigned long flag, struct inode *inode)
 void print_path_option(struct seq_file *m, const char *name, struct path *path);
 void print_paths_option(struct seq_file *m, const char *name,
 			struct path *paths, unsigned int num);
+void print_mnt_id_option(struct seq_file *m, const char *name, struct path *path);
+void print_mnt_ids_option(struct seq_file *m, const char *name,
+			struct path *paths, unsigned int num);
 
 static inline bool ovl_is_impuredir(struct super_block *sb,
 				    struct dentry *dentry)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index fdb0d9a45104..065b7778e720 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -57,6 +57,10 @@ static bool ovl_dyn_path_opts = IS_ENABLED(CONFIG_OVERLAY_FS_DYNAMIC_RESOLVE_PAT
 module_param_named(dyn_path_opts, ovl_dyn_path_opts, bool, 0644);
 MODULE_PARM_DESC(dyn_path_opts, "dyn_path_opts feature enabled");
 
+static bool ovl_mnt_id_path_opts = IS_ENABLED(CONFIG_OVERLAY_FS_PATH_OPTIONS_MNT_ID);
+module_param_named(mnt_id_path_opts, ovl_mnt_id_path_opts, bool, 0644);
+MODULE_PARM_DESC(mnt_id_path_opts, "mnt_id_path_opts feature enabled");
+
 static void ovl_entry_stack_free(struct ovl_entry *oe)
 {
 	unsigned int i;
@@ -382,6 +386,17 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 			seq_show_option(m, "workdir", ofs->config.workdir);
 		}
 	}
+
+	if (ovl_mnt_id_path_opts) {
+		print_mnt_ids_option(m, "lowerdir_mnt_id", oe->lowerpaths, oe->numlower);
+		/*
+		 * We don't need to show mnt_id for workdir because it is
+		 * on the same mount as upperdir.
+		 */
+		if (ofs->config.upperdir)
+			print_mnt_id_option(m, "upperdir_mnt_id", &ofs->upperpath);
+	}
+
 	if (ofs->config.default_permissions)
 		seq_puts(m, ",default_permissions");
 	if (strcmp(ofs->config.redirect_mode, ovl_redirect_mode_def()) != 0)
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 29cf1947fd00..0dd8356e1145 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -14,6 +14,7 @@
 #include <linux/namei.h>
 #include <linux/ratelimit.h>
 #include <linux/seq_file.h>
+#include "../mount.h"
 #include "overlayfs.h"
 
 int ovl_want_write(struct dentry *dentry)
@@ -997,3 +998,23 @@ void print_paths_option(struct seq_file *m, const char *name,
 		seq_path(m, &paths[i], ", \t\n\\");
 	}
 }
+
+void print_mnt_id_option(struct seq_file *m, const char *name, struct path *path)
+{
+	seq_show_option(m, name, "");
+	seq_printf(m, "%i", real_mount(path->mnt)->mnt_id);
+}
+
+void print_mnt_ids_option(struct seq_file *m, const char *name,
+			struct path *paths, unsigned int num)
+{
+	int i;
+
+	seq_show_option(m, name, "");
+
+	for (i = 0; i < num; i++) {
+		if (i)
+			seq_putc(m, ':');
+		seq_printf(m, "%i", real_mount(paths[i].mnt)->mnt_id);
+	}
+}
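
The text that print_mnt_ids_option() emits can be mirrored in userspace to make the format explicit: a leading comma, the option name, an equals sign, then the ids joined by colons. This is a rough sketch with a hypothetical name, format_mnt_ids(), that writes into a caller-supplied buffer instead of a seq_file. It uses an unsigned loop counter to match the type of num.

```c
#include <stdio.h>

/*
 * Hypothetical userspace mirror of print_mnt_ids_option(): writes
 * ",<name>=<id>:<id>..." into buf.
 *
 * Returns the number of characters written (excluding the trailing
 * NUL), or -1 if the output does not fit into size bytes.
 */
static int format_mnt_ids(char *buf, size_t size, const char *name,
			  const int *ids, unsigned int num)
{
	unsigned int i;
	int len, total;

	/* seq_show_option(m, name, "") produces ",<name>=". */
	total = snprintf(buf, size, ",%s=", name);
	if (total < 0 || (size_t)total >= size)
		return -1;

	for (i = 0; i < num; i++) {
		/* Every id after the first is preceded by ':'. */
		len = snprintf(buf + total, size - total, "%s%d",
			       i ? ":" : "", ids[i]);
		if (len < 0 || (size_t)len >= size - total)
			return -1;
		total += len;
	}
	return total;
}
```

With num equal to 0 the sketch emits the option with an empty value, which is also what the loop in the patch would do.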

