[CRIU] [PATCH 1/4] Support for dumping/restoring user namespaces

Sophie Blee-Goldman ableegoldman at google.com
Mon Aug 11 22:49:21 PDT 2014


On Mon, Aug 11, 2014 at 8:50 AM, Andrew Vagin <avagin at parallels.com> wrote:

> On Mon, Aug 11, 2014 at 05:24:05PM +0400, Pavel Emelyanov wrote:
> > On 08/11/2014 05:17 PM, Andrew Vagin wrote:
> > > Hi Sophie,
> > >
> > > On Fri, Aug 08, 2014 at 10:21:19PM -0700, Sophie Blee-Goldman wrote:
> > >> Adds basic support for user namespaces by dumping and restoring
> > >> the namespace itself and the uid/gid maps of the root process.
> > >
> > > How do you test your patches?
> >
> > I have the same question. It's OK if the initial version of userns
> > only supports some limited stuff, but we should know what it is :)
> >
> > > ZDTM test suite can execute tests in
> > > namespaces, but the current version knows nothing about userns. Have
> you
> > > try to add userns in ZDTM lib?
> > >
>

I wrote a short script for testing that starts several processes
(with different sets of credentials) in a namespace jail, then dumps and
restores
the process tree.


> > >>
> > >> Currently depends on a kernel patch to avoid failing on the prctl
> > >> syscall by checking for CAP_SYS_RESOURCE in the user namespace
> > >> instead of in the global one.
> > >
> > > It isn't so simple.
> > > Kirill is trying to fix this issue: https://lkml.org/lkml/2014/8/4/570
> > >
> > > We have a number of other kernel issues, which are described here:
> > > http://criu.org/UserNamespace
> > >
> > > Have you seen my patches for userns?
> > > http://lists.openvz.org/pipermail/criu/2014-February/012399.html
> > >
> > > and here is updated version:
> > > https://github.com/avagin/criu/tree/userns2
> > >
> > > I suggest to find the difference between our patch sets and make a new
> one,
> > > which will contain best things from both ones.
> >
> > Andrey, can you suggest which things are best in both sets? :)
>
> For example what I see now. My version:
> * doesn't have comments, which I enumerated for this patch
> * handles ghost uids
> * fixes zdtm lib to test user namespaces
> * prepares the root mount for pivot_root
> * set PR_SET_DUMPABLE to have access to proc files
> * extends ZDTM lib to execute tests in userns
>
> Sophie's version:
> * dumps capability from parasite. I don't know which capabilities are
>   shown in /proc/PID/status.

* ...


> Sophie, could you continue  this list or comment my list.
>

I think you've listed all the major points. The thing we seem to handle
in a significantly different way is getting the ids inside the namespace.
I think there's something to be said for dumping those in the parasite
alongside the capabilities, which would avoid the need to store the
maps as a global variable (Also, it's possible I overlooked something but
I think you may be leaking the memory malloc'ed for these maps).

That said, the simplicity of just calling getresuid and getresgid from the
parasite doesn't extend to getting the ghost files' ids, since it looks like
a lot more would have to be moved to the parasite. So maybe it's best to
use your method and just translate the ids. On that note, one thing I
noticed
in your code is that during a dump you translate the ids before calling
may_dump(), which could be problematic if criu is ever to support dumping
user namespaces by non-root users. A similar issue arises for may_restore().


 {u,g}id_map files can be written only once. If you use fprintf, you
> don't know how many times write will be called.
> kernel/user_namespace.c:
> static ssize_t map_write(struct file *file, const char __user *buf,
> ...
>         ret = -EPERM;
>         /* Only allow one successful write to the map */
>         if (map->nr_extents != 0)
>                 goto out;
> ...

My mistake, I had thought the restriction was that the id_maps could only
be opened for writing once.

> +int dump_user_ns(int ns_pid, int ns_id)
> > +{
> > +     int fd, ret, n_uid_entries, n_gid_entries;
> > +     char map_fname[PATH_MAX];
> > +
> > +     LIST_HEAD(uid_list);
> Why do we need this list? Can we write uid_maps in an image without
> collection in the list?


>
I felt that reading the map entries into a list first was cleaner than
reading
them directly into the array and resizing it while parsing the uid/gid
maps. But
either way of doing it seems reasonable, so I wouldn't say we *need* this
list.



> > +     /* read uid map */
> > +     sprintf(map_fname, "/proc/%d/uid_map", ns_pid);
>
> can we use fopen_proc here?
>

I hadn't noticed the 'open_proc' set of functions, but yes that would be
better.

Thanks,
Sophie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvz.org/pipermail/criu/attachments/20140811/473cfb89/attachment.html>


More information about the CRIU mailing list