[CRIU] [PATCH 1/2] finalize cgroups correctly

Saied Kazemi saied at google.com
Wed Mar 25 16:46:05 PDT 2015


Sorry for the delay in getting back.  Here are the answers I owe you:

1. Re kernel, I am running 3.13 kernel.  I also have a 3.16 kernel, but
nothing higher.

2. Re breaking things, we won't break anything to the best of my
knowledge.  So it's safe to disallow image directory inside root.

3. Re umount_cgyard(), it works with the following patch.  It's a quick
hack because it assumes root is a subset of imgs_dir.  What's the best way
to generalize this?

diff --git a/cgroup.c b/cgroup.c
index b1a2f92..913f480 100644
--- a/cgroup.c
+++ b/cgroup.c
@@ -939,6 +939,22 @@ int prepare_task_cgroup(struct pstree_item *me)
        return move_in_cgroup(se);
 }

+void umount_cgyard(void)
+{
+       char path[PATH_MAX];
+       int n;
+
+       if (!cg_yard)
+               return;
+
+       n = strlen(opts.root);
+       if (!strncmp(opts.imgs_dir, opts.root, n)) {
+               snprintf(path, PATH_MAX, "%s/%s", &opts.imgs_dir[n],
cg_yard);
+               if (umount2(path, MNT_DETACH) == -1)
+                       pr_perror("Unable to unmount cg_yard %s", path);
+       }
+}
+
 void fini_cgroup(void)
 {
        if (!cg_yard)
diff --git a/crtools.c b/crtools.c
index 0b3c497..202acd8 100644
--- a/crtools.c
+++ b/crtools.c
@@ -278,6 +278,7 @@ int main(int argc, char *argv[], char *envp[])
                        opts.restore_sibling = true;
                        break;
                case 'D':
+                       opts.imgs_dir = optarg;
                        imgs_dir = optarg;
                        break;
                case 'W':
diff --git a/include/cr_options.h b/include/cr_options.h
index d6ce2ae..90eb0ed 100644
--- a/include/cr_options.h
+++ b/include/cr_options.h
@@ -60,6 +60,7 @@ struct cr_options {
        char                    *new_global_cg_root;
        struct list_head        new_cgroup_roots;
        bool                    aufs;           /* auto-deteced, not via
cli */
+       char                    *imgs_dir;
 };

 extern struct cr_options opts;
diff --git a/mount.c b/mount.c
index e59cb0d..df668ce 100644
--- a/mount.c
+++ b/mount.c
@@ -2013,12 +2013,16 @@ static int do_restore_task_mnt_ns(struct ns_id
*nsid)

 int restore_task_mnt_ns(struct pstree_item *current)
 {
+       extern void umount_cgyard(void);
+
        if (current->ids && current->ids->has_mnt_ns_id) {
                unsigned int id = current->ids->mnt_ns_id;
                struct ns_id *nsid;

-               if (root_item->ids->mnt_ns_id == id)
+               if (root_item->ids->mnt_ns_id == id) {
+                       umount_cgyard();
                        return 0;
+               }

                nsid = lookup_ns_by_id(id, &mnt_ns_desc);
                if (nsid == NULL) {
@@ -2026,6 +2030,7 @@ int restore_task_mnt_ns(struct pstree_item *current)
                        return -1;
                }

+               umount_cgyard();
                if (do_restore_task_mnt_ns(nsid))
                        return -1;
        }

--Saied


On Tue, Mar 24, 2015 at 11:52 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:

> On 03/24/2015 09:30 PM, Saied Kazemi wrote:
> >
> >
> > On Tue, Mar 24, 2015 at 10:57 AM, Pavel Emelyanov <xemul at parallels.com
> <mailto:xemul at parallels.com>> wrote:
> >
> >     On 03/24/2015 08:25 PM, Saied Kazemi wrote:
> >     > For the easy case of comparing pathname strings on the command
> line (-D and --root), I
> >     > think it's better to error and exit instead of restoring with all
> the cgyard mounts
> >     > leaked into the restored process tree.
> >
> >     I rather agree with this. It's quite uncommon (from my perspective)
> that images sit
> >     in the container's root, so preventing this usage doesn't
> (shouldn't) break anything
> >     critical. Does it?
> >
> >
> > It wouldn't break anything because nothing has been merged yet.
>
> I meant -- right now we allow for images to be under root, but we plan
> to ban this and if we do, nothing should get broken.
>
> >     > For a more general solution, is it possible to make sure that
> after fini_cgroup() the
> >     > cgyard mount points have been successfully removed from the
> restored tree?
> >
> >     What if we call umount with detach flag on cgyard dir inside
> restore_task_mnt_ns()?
> >     Would this resolve this?
> >
> >
> > I just quickly tried the following patch, first calling umount_cgyard()
> before if and then
> > after if.  Neither case worked :(
>
> The cg_yard variable is just a ".criu.cgyard.XXXXX" relative path. When
> calling umount2 on it kernel would try to resolve one relative to task's
> cwd.
>
> As far "calling this after if" is concerned -- yes, it's supposed to fail
> since restoring task's mntns implies calling pivot_roo() which moves task's
> cwd into new root and thus the cg_yard starts pointing to non-existing
> directory.
>
> But "before if"... I have no explanation for it. With what error does the
> umount fail in this case?
>
> > --Saied
> >
> > diff --git a/cgroup.c b/cgroup.c
> > index b1a2f92..fa94853 100644
> > --- a/cgroup.c
> > +++ b/cgroup.c
> > @@ -951,6 +951,12 @@ void fini_cgroup(void)
> >         cg_yard = NULL;
> >  }
> >
> > +void umount_cgyard(void)
> > +{
> > +       if (cg_yard)
> > +               umount2(cg_yard, MNT_DETACH);
> > +}
> > +
> >  static int restore_cgroup_prop(const CgroupPropEntry * cg_prop_entry_p,
> >                                char *path, int off)
> >  {
> > diff --git a/include/cgroup.h b/include/cgroup.h
> > index 14fef67..5bcf028 100644
> > --- a/include/cgroup.h
> > +++ b/include/cgroup.h
> > @@ -10,6 +10,7 @@ int prepare_cgroup(void);
> >  /* Restore things like cpu_limit in known cgroups. */
> >  int prepare_cgroup_properties(void);
> >  void fini_cgroup(void);
> > +void umount_cgyard(void);
> >
> >  struct cg_controller;
> >
> > diff --git a/mount.c b/mount.c
> > index e59cb0d..b0b34cb 100644
> > --- a/mount.c
> > +++ b/mount.c
> > @@ -2028,6 +2028,7 @@ int restore_task_mnt_ns(struct pstree_item
> *current)
> >
> >                 if (do_restore_task_mnt_ns(nsid))
> >                         return -1;
> > +               umount_cgyard();
> >         }
> >
> >         return 0;
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20150325/c911dfd0/attachment-0001.html>


More information about the CRIU mailing list