[CRIU] [PATCH 1/2] usernsd: The way to restore priviledged stuff in userns
Andrew Vagin
avagin at parallels.com
Thu Feb 12 14:02:01 PST 2015
On Thu, Feb 12, 2015 at 01:39:15PM +0300, Pavel Emelyanov wrote:
> We have collected a good set of calls that cannot be done inside
> user namespaces, but we need to [1]. Some of them have already
> been addressed, like the prctl mm bits restore, but some have not.
>
> I'm pretty sceptical about the ability to relax the security
> checks on quite a lot of them (e.g. open-by-handle is indeed a
> very dangerous operation if allowed to an unprivileged user), so
> we need some way to call those things even in user namespaces.
>
> The good news is that all the calls I've found operate on file
> descriptors one way or another. So if we had a process that
> lived outside of the user namespace, we could ask it to do the
> privileged operation we need and exchange the affected file
> descriptor via a unix socket.
>
> So usernsd is the one doing exactly this. It starts before we
> create the user namespace and accepts requests via a unix socket.
> Clients (the processes we restore) send it the function they
> want to call, the descriptor they want to operate on and the
> arguments blob. Optionally, they can request some file descriptor
> back after the call.
>
> In the non-user-namespace case the daemon is not started and the
> calls are done right in the requestor's process environment.
>
> In the next patch there's an example of how to use this daemon
> to do the privileged SO_SNDBUFFORCE/_RCVBUFFORCE sockopt on
> a socket.
>
> [1] http://criu.org/UserNamespace
>
> Signed-off-by: Pavel Emelyanov <xemul at parallels.com>
....
> +static inline void unsc_msg_init(struct unsc_msg *m, uns_call_t *c,
> + int *x, void *arg, size_t asize, int fd)
> +{
> + m->h.msg_iov = m->iov;
> + m->h.msg_iovlen = 2;
> +
> + m->iov[0].iov_base = c;
> + m->iov[0].iov_len = sizeof(*c);
> + m->iov[1].iov_base = x;
> + m->iov[1].iov_len = sizeof(*x);
> +
> + if (arg) {
> + m->iov[2].iov_base = arg;
> + m->iov[2].iov_len = asize;
> + m->h.msg_iovlen++;
> + }
> +
> + m->h.msg_name = NULL;
> + m->h.msg_namelen = 0;
> + m->h.msg_flags = 0;
> +
> + if (fd == -1) {
We also save a return code in fd, so I think it's better to check for
any negative value (fd < 0) here instead of only -1.
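To illustrate the check I mean, here is a minimal standalone sketch. It is not the patch's code: struct fd_msg and fd_msg_init() are hypothetical, simplified stand-ins for struct unsc_msg and unsc_msg_init(), carrying a single int payload. The only point is that the SCM_RIGHTS control message is attached for fd >= 0 and skipped for any negative value.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical, simplified stand-in for the patch's struct unsc_msg. */
struct fd_msg {
	struct msghdr h;
	struct iovec iov;
	union {
		struct cmsghdr align;	/* forces cmsghdr alignment of buf */
		char buf[CMSG_SPACE(sizeof(int))];
	} c;
};

/* Prepare a message carrying *val, plus fd as SCM_RIGHTS if fd is valid. */
void fd_msg_init(struct fd_msg *m, int *val, int fd)
{
	memset(m, 0, sizeof(*m));

	m->iov.iov_base = val;
	m->iov.iov_len = sizeof(*val);
	m->h.msg_iov = &m->iov;
	m->h.msg_iovlen = 1;

	if (fd < 0) {
		/* Covers -1 as well as any negative error code. */
		m->h.msg_control = NULL;
		m->h.msg_controllen = 0;
	} else {
		struct cmsghdr *ch;

		m->h.msg_control = m->c.buf;
		m->h.msg_controllen = sizeof(m->c.buf);
		ch = CMSG_FIRSTHDR(&m->h);
		ch->cmsg_len = CMSG_LEN(sizeof(int));
		ch->cmsg_level = SOL_SOCKET;
		ch->cmsg_type = SCM_RIGHTS;
		memcpy(CMSG_DATA(ch), &fd, sizeof(fd));
	}
}
```

With "fd == -1" a value such as -EINVAL would fall into the else branch, and sendmsg() would then be asked to pass an invalid descriptor.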
> + m->h.msg_control = NULL;
> + m->h.msg_controllen = 0;
> + } else {
> + struct cmsghdr *ch;
> +
> + m->h.msg_control = &m->c;
> + m->h.msg_controllen = sizeof(m->c);
> + ch = CMSG_FIRSTHDR(&m->h);
> + ch->cmsg_len = CMSG_LEN(sizeof(int));
> + ch->cmsg_level = SOL_SOCKET;
> + ch->cmsg_type = SCM_RIGHTS;
> + *((int *)CMSG_DATA(ch)) = fd;
> + }
> +}
...
> +int start_usernsd(void)
> +{
> + int sk[2];
> +
> + if (!(root_ns_mask & CLONE_NEWUSER))
> + return 0;
> +
> + /*
> + * Seqpacket to
> + *
> + * a) Help the daemon distinguish individual requests from
> + * each other easily. Stream sockets require manual
> + * message boundaries.
> + *
> + * b) Make callers notice the daemon's death by seeing the
> + * disconnected socket. In case of a dgram socket
> + * callers would just get stuck in receiving the
> + * response.
> + */
> +
> + if (socketpair(PF_UNIX, SOCK_SEQPACKET, 0, sk)) {
> + pr_perror("Can't make usernsd socket");
> + return -1;
> + }
> +
> + usernsd_pid = fork();
We need to handle fork() errors here (usernsd_pid < 0).
> + if (usernsd_pid == 0) {
> + int ret;
> +
> + close(sk[0]);
> + ret = usernsd(sk[1]);
> + exit(ret);
> + }
> +
> + close(sk[1]);
> + install_service_fd(USERNSD_SK, sk[0]);
and here: the return value of install_service_fd() is not checked.
> + close(sk[0]);
> +
> + return 0;
> +}
> +
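For the two error paths in start_usernsd(), something along these lines is what I have in mind. This is only a sketch under assumptions, not a drop-in replacement: start_helper(), helper_fn and install_fn are hypothetical names, the callbacks stand in for usernsd() and install_service_fd(), and I assume the install step reports failure with a negative return value. Unlike the patch, the sketch returns the child's pid so the caller can reap it.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callbacks standing in for usernsd() and install_service_fd(). */
typedef int (*helper_fn)(int sk);
typedef int (*install_fn)(int fd);

/* Returns the helper's pid on success, -1 on any failure. */
pid_t start_helper(helper_fn helper, install_fn install)
{
	int sk[2];
	pid_t pid;

	if (socketpair(PF_UNIX, SOCK_SEQPACKET, 0, sk)) {
		perror("Can't make helper socket");
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		/* fork() failed: no child exists, release both ends. */
		perror("Can't fork helper");
		close(sk[0]);
		close(sk[1]);
		return -1;
	}

	if (pid == 0) {
		close(sk[0]);
		_exit(helper(sk[1]));
	}

	close(sk[1]);
	if (install(sk[0]) < 0) {
		/* Installing the fd failed: stop and reap the helper. */
		close(sk[0]);
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}
	close(sk[0]);

	return pid;
}
```

The key points are that both socket ends are closed when fork() fails, and that the helper is not left running when the parent cannot keep its end of the socket.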