[CRIU] Patch for unnamed unix sockets
Pavel Emelyanov
xemul at parallels.com
Wed Jul 22 03:11:17 PDT 2015
On 07/21/2015 12:56 PM, artem.kuzmitskiy at lge.com wrote:
> Hi all,
Welcome :)
> This patch for dumping and restoring unnamed unix socket, details about feature
> you can find -> http://criu.org/External_UNIX_socket#What_to_do_with_socketpair.28.29-s.3F.
Cool! Thanks for the feature, it's really tempting.
> Review please.
Yup, I have several comments, please find them inline.
First of all, please, split the patch into smaller pieces. Minimally I
would suggest having separate patches for -x option extension (dump part)
and --inherit-fd support for sockets (restore part) and for test. I.e.
3 patches. More suggestions for separate patches are inline.
>>From a41fde482f5cb0a08191b4fcc05f04a0a67f77f6 Mon Sep 17 00:00:00 2001
> From: Artem Kuzmitskiy <artem.kuzmitskiy at lge.com>
> Date: Tue, 21 Jul 2015 12:15:21 +0300
> Subject: [PATCH] Added functionality for dumping\restoring unnamed unix
> sockets.
Please, format the e-mail(s) so that it doesn't contain trash. The example
of patch message is here:
http://criu.org/How_to_submit_patches#An_example_of_patch_message
> When we call CRIU with dump option, for unnamed socket we should pass it inode
> into --ext-unix-sk in next format: socket[<inode_value>]. When we call
> CRIU with restore option, we using inherit functionality and pass socket's
> inode in same format(put socket instead of pipe).
> Details about this problem described in
> http://criu.org/External_UNIX_socket#What_to_do_with_socketpair.28.29-s.3F.
> Simple example available in test/socketpair directory.
> Usage example:
> For dump, criu dump -D images -o dump.log -v4 -x socket:[4529709] -t 13506
I guess the "socket:" part is excessive here, isn't it? We know that we're
dumping the socket, so why writing it one more time? :)
> For restore, criu restore -d -D images -o restore.log --pidfile restore.pid \
> -v4 -x --inherit-fd fd[3]:socket:[4529709]
>
> Signed-off-by: Artem Kuzmitskiy <artem.kuzmitskiy at lge.com>
> ---
> crtools.c | 6 +-
> files.c | 19 +-
> include/cr_options.h | 1 +
> include/files.h | 2 +-
> include/sockets.h | 3 +
> lib/criu.h | 9 +
> sk-unix.c | 144 ++++++++--
> test/libcriu/run.sh | 1 +
> test/socketpairs/Makefile | 12 +
> test/socketpairs/socketpair.c | 595 ++++++++++++++++++++++++++++++++++++++++++
> 10 files changed, 768 insertions(+), 24 deletions(-)
> create mode 100644 test/socketpairs/Makefile
> create mode 100644 test/socketpairs/socketpair.c
>
> diff --git a/crtools.c b/crtools.c
> index b085d33..1012426 100644
> --- a/crtools.c
> +++ b/crtools.c
> @@ -50,6 +50,7 @@ void init_opts(void)
>
> /* Default options */
> opts.final_state = TASK_DEAD;
> + INIT_LIST_HEAD(&opts.ext_unixsk_ids);
> INIT_LIST_HEAD(&opts.veth_pairs);
> INIT_LIST_HEAD(&opts.scripts);
> INIT_LIST_HEAD(&opts.ext_mounts);
Tabs were corrupted into spaces while sending the patch :(
> @@ -184,7 +185,7 @@ int main(int argc, char *argv[], char *envp[])
> int log_level = LOG_UNSET;
> char *imgs_dir = ".";
> char *work_dir = NULL;
> - static const char short_opts[] = "dSsRf:F:t:p:hcD:o:n:v::xVr:jlW:L:M:";
> + static const char short_opts[] = "dSsRf:F:t:p:hcD:o:n:v::x::Vr:jlW:L:M:";
> static struct option long_opts[] = {
> { "tree", required_argument, 0, 't' },
> { "pid", required_argument, 0, 'p' },
> @@ -201,7 +202,7 @@ int main(int argc, char *argv[], char *envp[])
> { "log-file", required_argument, 0, 'o' },
> { "namespaces", required_argument, 0, 'n' },
> { "root", required_argument, 0, 'r' },
> - { USK_EXT_PARAM, no_argument, 0, 'x' },
> + { USK_EXT_PARAM, optional_argument, 0, 'x' },
Plz, update the help text below. And add the "adding the necessary bits to the RPC
code" task to your todo list :)
> { "help", no_argument, 0, 'h' },
> { SK_EST_PARAM, no_argument, 0, 1042 },
> { "close", required_argument, 0, 1043 },
> @@ -278,6 +279,7 @@ int main(int argc, char *argv[], char *envp[])
> opts.final_state = TASK_ALIVE;
> break;
> case 'x':
> + if (optarg && unix_sk_ids_parse(optarg) < 0) return 1;
Styling: if/while bodies should go on separate line.
> opts.ext_unix_sk = true;
> break;
> case 'p':
> diff --git a/files.c b/files.c
> index 3e69be7..6c3472c 100644
> --- a/files.c
> +++ b/files.c
> @@ -1381,6 +1381,19 @@ static int inherit_fd_lookup_id(char *id)
> return ret;
> }
>
> +bool inherit_fd_lookup_desc(struct file_desc *d)
> +{
> + char buf[32], *id_str;
> +
> + if (!d->ops->name)
> + return -1;
> +
> + id_str = d->ops->name(d, buf, sizeof(buf));
> + int ret = inherit_fd_lookup_id(id_str);
> +
> + return (ret < 0 ? false : true);
> +}
> +
> bool inherited_fd(struct file_desc *d, int *fd_p)
> {
> char buf[32], *id_str;
> @@ -1398,10 +1411,12 @@ bool inherited_fd(struct file_desc *d, int *fd_p)
> return true;
>
> *fd_p = dup(i_fd);
> - if (*fd_p < 0)
> + if (*fd_p < 0) {
> pr_perror("Inherit fd DUP failed");
> + return false;
> + }
> else
> - pr_info("File %s will be restored from fd %d duped "
> + pr_info("File %s will be restored from fd %d dumped "
> "from inherit fd %d\n", id_str, *fd_p, i_fd);
This hunk deserves separate patch with comment why you think that reporting
dup() failure explicitly is nice.
> return true;
> }
> diff --git a/include/cr_options.h b/include/cr_options.h
> index 9ab8bba..62233c3 100644
> --- a/include/cr_options.h
> +++ b/include/cr_options.h
> @@ -45,6 +45,7 @@ struct cr_options {
> };
> bool restore_sibling;
> bool ext_unix_sk;
> + struct list_head ext_unixsk_ids;
> bool shell_job;
> bool handle_file_locks;
> bool tcp_established_ok;
> diff --git a/include/files.h b/include/files.h
> index db7e108..25d69ad 100644
> --- a/include/files.h
> +++ b/include/files.h
> @@ -174,7 +174,7 @@ extern int inherit_fd_add(int fd, char *key);
> extern void inherit_fd_log(void);
> extern int inherit_fd_resolve_clash(int fd);
> extern int inherit_fd_fini(void);
> -
> +extern bool inherit_fd_lookup_desc(struct file_desc *);
> extern bool inherited_fd(struct file_desc *, int *fdp);
>
> #endif /* __CR_FILES_H__ */
> diff --git a/include/sockets.h b/include/sockets.h
> index a3010e1..deb00a3 100644
> --- a/include/sockets.h
> +++ b/include/sockets.h
> @@ -60,6 +60,9 @@ extern int inet_collect_one(struct nlmsghdr *h, int family, int type);
> extern int unix_receive_one(struct nlmsghdr *h, void *);
> extern int netlink_receive_one(struct nlmsghdr *hdr, void *arg);
>
> +extern int unix_sk_ids_parse(char *optarg);
> +extern int unix_sk_id_add(ino_t ino);
> +
> extern int do_dump_opt(int sk, int level, int name, void *val, int len);
> #define dump_opt(s, l, n, f) do_dump_opt(s, l, n, f, sizeof(*f))
> extern int do_restore_opt(int sk, int level, int name, void *val, int len);
> diff --git a/lib/criu.h b/lib/criu.h
> index 1655c02..0711313 100644
> --- a/lib/criu.h
> +++ b/lib/criu.h
> @@ -22,11 +22,16 @@
> #include <stdbool.h>
> #include "rpc.pb-c.h"
>
> +#ifdef __GNUG__
> +extern "C" {
> +#endif
Please, describe this change and send as separate patch.
> +
> enum criu_service_comm {
> CRIU_COMM_SK,
> CRIU_COMM_FD
> };
>
> +
> void criu_set_service_address(char *path);
> void criu_set_service_fd(int fd);
>
> @@ -188,4 +193,8 @@ int criu_local_restore(criu_opts *opts);
> int criu_local_restore_child(criu_opts *opts);
> int criu_local_dump_iters(criu_opts *opts, int (*more)(criu_predump_info pi));
>
> +#ifdef __GNUG__
> +}
> +#endif
> +
> #endif /* __CRIU_LIB_H__ */
> diff --git a/sk-unix.c b/sk-unix.c
> index 6c9ec25..203a6f4 100644
> --- a/sk-unix.c
> +++ b/sk-unix.c
> @@ -65,6 +65,11 @@ struct unix_sk_listen_icon {
> struct unix_sk_listen_icon *next;
> };
>
> +struct unix_sk_exception {
> + struct list_head unix_sk_list;
> + ino_t unix_sk_ino;
> +};
> +
> #define SK_HASH_SIZE 32
>
> static struct unix_sk_listen_icon *unix_listen_icons[SK_HASH_SIZE];
> @@ -129,6 +134,22 @@ static int can_dump_unix_sk(const struct unix_sk_desc *sk)
> return 1;
> }
>
> +static bool unix_sk_exception_lookup_id(ino_t ino)
> +{
> + bool ret = false;
> + struct unix_sk_exception *sk;
> +
> + list_for_each_entry(sk, &opts.ext_unixsk_ids, unix_sk_list) {
> + if (sk->unix_sk_ino == ino) {
> + pr_debug("Found ino %u in exception unix sk list\n", (unsigned int)ino);
> + ret = true;
> + break;
> + }
> + }
> +
> + return ret;
> +}
> +
> static int write_unix_entry(struct unix_sk_desc *sk)
> {
> int ret;
> @@ -559,16 +580,22 @@ static int dump_external_sockets(struct unix_sk_desc *peer)
> return -1;
> }
>
> - if (peer->type != SOCK_DGRAM) {
> - show_one_unix("Ext stream not supported", peer);
> - pr_err("Can't dump half of stream unix connection.\n");
> - return -1;
> + if (!peer->name && unix_sk_exception_lookup_id(sk->sd.ino)) {
> + pr_debug("found exception for unix name-less external socket.\n");
> }
> + else {
I don't quite get this if. I thought that ANY socket specified with -x socket:...
will be treated as external, why checking for !peer->name here?
> + if (peer->type != SOCK_DGRAM) {
> + show_one_unix("Ext stream not supported", peer);
> + pr_err("Can't dump half of stream unix connection.\n");
> + return -1;
> + }
>
> - if (!peer->name) {
> - show_one_unix("Ext dgram w/o name", peer);
> - pr_err("Can't dump name-less external socket.\n");
> - return -1;
> + if (!peer->name) {
> + show_one_unix("Ext dgram w/o name", peer);
> + pr_err("Can't dump name-less external socket.\n");
> + pr_err("%d\n", sk->fd);
> + return -1;
> + }
> }
> } else if (ret < 0)
> return -1;
> @@ -691,21 +718,24 @@ static int post_open_unix_sk(struct file_desc *d, int fd)
> if (ui->ue->uflags & USK_CALLBACK)
> return 0;
>
> - pr_info("\tConnect %#x to %#x\n", ui->ue->ino, peer->ue->ino);
> -
> /* Skip external sockets */
> if (!list_empty(&peer->d.fd_info_head))
> futex_wait_while(&peer->prepared, 0);
>
> - memset(&addr, 0, sizeof(addr));
> - addr.sun_family = AF_UNIX;
> - memcpy(&addr.sun_path, peer->name, peer->ue->name.len);
> + if (!inherit_fd_lookup_desc(d)) {
It was already looked-up in open_unix_sk(). Put a flag onto unix_sk_info
object in the ->open and check one here. We have plenty of free bits on
the unix_sk_info.flags.
>
> - if (connect(fd, (struct sockaddr *)&addr,
> - sizeof(addr.sun_family) +
> - peer->ue->name.len) < 0) {
> - pr_perror("Can't connect %#x socket", ui->ue->ino);
> - return -1;
> + memset(&addr, 0, sizeof(addr));
> + addr.sun_family = AF_UNIX;
> + memcpy(&addr.sun_path, peer->name, peer->ue->name.len);
> +
> + pr_info("\tConnect %#x to %#x\n", ui->ue->ino, peer->ue->ino);
> +
> + if (connect(fd, (struct sockaddr *)&addr,
> + sizeof(addr.sun_family) +
> + peer->ue->name.len) < 0) {
> + pr_perror("Can't connect %#x socket", ui->ue->ino);
> + return -1;
> + }
> }
>
> if (restore_sk_queue(fd, peer->ue->id))
> @@ -981,7 +1011,15 @@ static int open_unix_sk(struct file_desc *d)
> struct unix_sk_info *ui;
>
> ui = container_of(d, struct unix_sk_info, d);
> - if (ui->flags & USK_PAIR_MASTER)
> +
> + if (inherit_fd_lookup_desc(d)) {
> + int sk = -1;
> + if (inherited_fd(d, &sk))
Presumably single call to inherit_fd_lookup_desc(), without
inherit_fd_lookup_desc() in advance, would be enough.
> + return sk;
> + else
> + return -1;
> + }
> + else if (ui->flags & USK_PAIR_MASTER)
> return open_unixsk_pair_master(ui);
> else if (ui->flags & USK_PAIR_SLAVE)
> return open_unixsk_pair_slave(ui);
> @@ -989,11 +1027,27 @@ static int open_unix_sk(struct file_desc *d)
> return open_unixsk_standalone(ui);
> }
>
> +static char *socket_d_name(struct file_desc *d, char *buf, size_t s)
> +{
> + struct unix_sk_info *ui;
> +
> + ui = container_of(d, struct unix_sk_info, d);
> +
> + if (snprintf(buf, s, "socket:[%d]", ui->ue->ino) >= s) {
> + pr_err("Not enough room for unixsk %d identifier string\n",
> + ui->ue->ino);
> + return NULL;
> + }
> +
> + return buf;
> +}
> +
> static struct file_desc_ops unix_desc_ops = {
> .type = FD_TYPES__UNIXSK,
> .open = open_unix_sk,
> .post_open = post_open_unix_sk,
> .want_transport = unixsk_should_open_transport,
> + .name = socket_d_name,
> };
>
> static int collect_one_unixsk(void *o, ProtobufCMessage *base)
> @@ -1105,3 +1159,55 @@ int resolve_unix_peers(void)
> return 0;
> }
>
> +int unix_sk_ids_parse(char *optarg)
> +{
> + /* parsing option of the following form: --ext-unix-sk=socket:[<inode value>],
> + * socket:[<inode value>]... or short form -x socket:[<inode value>],
> + * socket:[<inode value>]...*/
Style: comments like
/*
* text
* text
*/
are preferred over
/* text
* text */
> +
> + char *iter = optarg;
> + char *token = NULL;
> + int success = 0;
> + const char *keyword= "socket:[";
> + while (iter != NULL){
Style: space between ) and {.
> + if ((token = strstr(iter, keyword)) != NULL) {
> + token += strlen(keyword);
> + char *begin = token;
> + char *end = strchr(token, ']');
> + if (end == NULL) {
> + success = 0;
> + break;
> + }
> + iter = end;
> + ino_t ino = (ino_t)strtoul(begin, &end, 10);
> + if (ino > 0 && unix_sk_id_add(ino) > -1)
> + success++;
> + else {
> + success = 0;
> + break;
> + }
> + }
> + else break;
> + }
> +
> + if (!success){
> + pr_err("Can't parse unix socket inode from optarg: %s\n", optarg);
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +int unix_sk_id_add(ino_t ino)
This routine should be static.
> +{
> + struct unix_sk_exception *unix_sk;
> +
> + /* TODO: may validate inode here, but how?*/
> +
> + unix_sk = xmalloc(sizeof *unix_sk);
> + if (unix_sk == NULL) return -1;
> + unix_sk->unix_sk_ino = ino;
> + list_add_tail(&unix_sk->unix_sk_list, &opts.ext_unixsk_ids);
> +
> + return 0;
> +}
> diff --git a/test/libcriu/run.sh b/test/libcriu/run.sh
> index e38c76f..d97c518 100755
> --- a/test/libcriu/run.sh
> +++ b/test/libcriu/run.sh
> @@ -43,5 +43,6 @@ run_test test_errno
>
> echo "== Stopping service"
> kill -TERM $(cat wdir/s/pidfile)
> +unlink libcriu.so.1
Plz, describe this change.
> [ $RESULT -eq 0 ] && echo "Success" || echo "FAIL"
> exit $RESULT
> diff --git a/test/socketpairs/Makefile b/test/socketpairs/Makefile
Huge thanks for the test! Plz, add one to the test/Makefile's TEST
variable to that the other: target runs one. In this case it will
be auto-tested dayly and we'll find breakges early.
> new file mode 100644
> index 0000000..19330cf
> --- /dev/null
> +++ b/test/socketpairs/Makefile
That's it.
-- Pavel
More information about the CRIU
mailing list