[CRIU] [PATCH 1/2] finalize cgroups correctly
Saied Kazemi
saied at google.com
Wed Mar 25 16:46:05 PDT 2015
Sorry for the delay in getting back. Here are the answers I owe you:
1. Re kernel, I am running 3.13 kernel. I also have a 3.16 kernel, but
nothing higher.
2. Re breaking things, we won't break anything to the best of my
knowledge. So it's safe to disallow image directory inside root.
3. Re umount_cgyard(), it works with the following patch. It's a quick
hack because it assumes root is a subset of imgs_dir. What's the best way
to generalize this?
diff --git a/cgroup.c b/cgroup.c
index b1a2f92..913f480 100644
--- a/cgroup.c
+++ b/cgroup.c
@@ -939,6 +939,22 @@ int prepare_task_cgroup(struct pstree_item *me)
return move_in_cgroup(se);
}
+void umount_cgyard(void)
+{
+ char path[PATH_MAX];
+ int n;
+
+ if (!cg_yard)
+ return;
+
+ n = strlen(opts.root);
+ if (!strncmp(opts.imgs_dir, opts.root, n)) {
+ snprintf(path, PATH_MAX, "%s/%s", &opts.imgs_dir[n],
cg_yard);
+ if (umount2(path, MNT_DETACH) == -1)
+ pr_perror("Unable to unmount cg_yard %s", path);
+ }
+}
+
void fini_cgroup(void)
{
if (!cg_yard)
diff --git a/crtools.c b/crtools.c
index 0b3c497..202acd8 100644
--- a/crtools.c
+++ b/crtools.c
@@ -278,6 +278,7 @@ int main(int argc, char *argv[], char *envp[])
opts.restore_sibling = true;
break;
case 'D':
+ opts.imgs_dir = optarg;
imgs_dir = optarg;
break;
case 'W':
diff --git a/include/cr_options.h b/include/cr_options.h
index d6ce2ae..90eb0ed 100644
--- a/include/cr_options.h
+++ b/include/cr_options.h
@@ -60,6 +60,7 @@ struct cr_options {
char *new_global_cg_root;
struct list_head new_cgroup_roots;
bool aufs; /* auto-deteced, not via
cli */
+ char *imgs_dir;
};
extern struct cr_options opts;
diff --git a/mount.c b/mount.c
index e59cb0d..df668ce 100644
--- a/mount.c
+++ b/mount.c
@@ -2013,12 +2013,16 @@ static int do_restore_task_mnt_ns(struct ns_id
*nsid)
int restore_task_mnt_ns(struct pstree_item *current)
{
+ extern void umount_cgyard(void);
+
if (current->ids && current->ids->has_mnt_ns_id) {
unsigned int id = current->ids->mnt_ns_id;
struct ns_id *nsid;
- if (root_item->ids->mnt_ns_id == id)
+ if (root_item->ids->mnt_ns_id == id) {
+ umount_cgyard();
return 0;
+ }
nsid = lookup_ns_by_id(id, &mnt_ns_desc);
if (nsid == NULL) {
@@ -2026,6 +2030,7 @@ int restore_task_mnt_ns(struct pstree_item *current)
return -1;
}
+ umount_cgyard();
if (do_restore_task_mnt_ns(nsid))
return -1;
}
--Saied
On Tue, Mar 24, 2015 at 11:52 AM, Pavel Emelyanov <xemul at parallels.com>
wrote:
> On 03/24/2015 09:30 PM, Saied Kazemi wrote:
> >
> >
> > On Tue, Mar 24, 2015 at 10:57 AM, Pavel Emelyanov <xemul at parallels.com
> <mailto:xemul at parallels.com>> wrote:
> >
> > On 03/24/2015 08:25 PM, Saied Kazemi wrote:
> > > For the easy case of comparing pathname strings on the command
> line (-D and --root), I
> > > think it's better to error and exit instead of restoring with all
> the cgyard mounts
> > > leaked into the restored process tree.
> >
> > I rather agree with this. It's quite uncommon (from my perspective)
> that images sit
> > in the container's root, so preventing this usage doesn't
> (shouldn't) break anything
> > critical. Does it?
> >
> >
> > It wouldn't break anything because nothing has been merged yet.
>
> I meant -- right now we allow for images to be under root, but we plan
> to ban this and if we do, nothing should get broken.
>
> > > For a more general solution, is it possible to make sure that
> after fini_cgroup() the
> > > cgyard mount points have been successfully removed from the
> restored tree?
> >
> > What if we call umount with detach flag on cgyard dir inside
> restore_task_mnt_ns()?
> > Would this resolve this?
> >
> >
> > I just quickly tried the following patch, first calling umount_cgyard()
> before if and then
> > after if. Neither case worked :(
>
> The cg_yard variable is just a ".criu.cgyard.XXXXX" relative path. When
> calling umount2 on it kernel would try to resolve one relative to task's
> cwd.
>
> As far "calling this after if" is concerned -- yes, it's supposed to fail
> since restoring task's mntns implies calling pivot_roo() which moves task's
> cwd into new root and thus the cg_yard starts pointing to non-existing
> directory.
>
> But "before if"... I have no explanation for it. With what error does the
> umount fail in this case?
>
> > --Saied
> >
> > diff --git a/cgroup.c b/cgroup.c
> > index b1a2f92..fa94853 100644
> > --- a/cgroup.c
> > +++ b/cgroup.c
> > @@ -951,6 +951,12 @@ void fini_cgroup(void)
> > cg_yard = NULL;
> > }
> >
> > +void umount_cgyard(void)
> > +{
> > + if (cg_yard)
> > + umount2(cg_yard, MNT_DETACH);
> > +}
> > +
> > static int restore_cgroup_prop(const CgroupPropEntry * cg_prop_entry_p,
> > char *path, int off)
> > {
> > diff --git a/include/cgroup.h b/include/cgroup.h
> > index 14fef67..5bcf028 100644
> > --- a/include/cgroup.h
> > +++ b/include/cgroup.h
> > @@ -10,6 +10,7 @@ int prepare_cgroup(void);
> > /* Restore things like cpu_limit in known cgroups. */
> > int prepare_cgroup_properties(void);
> > void fini_cgroup(void);
> > +void umount_cgyard(void);
> >
> > struct cg_controller;
> >
> > diff --git a/mount.c b/mount.c
> > index e59cb0d..b0b34cb 100644
> > --- a/mount.c
> > +++ b/mount.c
> > @@ -2028,6 +2028,7 @@ int restore_task_mnt_ns(struct pstree_item
> *current)
> >
> > if (do_restore_task_mnt_ns(nsid))
> > return -1;
> > + umount_cgyard();
> > }
> >
> > return 0;
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20150325/c911dfd0/attachment-0001.html>
More information about the CRIU
mailing list