[CRIU] Process Migration using Sockets v3 - Patch 1/2
Rodrigo Bruno
rbruno at gsd.inesc-id.pt
Thu Nov 19 06:41:23 PST 2015
Hi,
On Thu, 19 Nov 2015 16:30:45 +0300
Pavel Emelyanov <xemul at parallels.com> wrote:
> Please, find the comments inline. These are mostly (but not all) cosmetic,
> so fixing them should relatively easy.
>
> I've also checked the patch with Linux's checkpatch.pl scrip, it reported
> quite a lot of 'space indent', 'no space before (' and alike. Please, fix.
Okey.
>
> Also, we're doing the feature freeze today, so proxy-cache thing will go
> into the 1.9 after 7 Dec.
No problem.
>
> I would also ask you to write tests for proxy-cache thing, otherwise
> chances that it will be occasionally broken increase drastically.
Okey. Is it okey for the tests to be all local (i.e., run all daemons
locally)?
>
>
> > diff --git a/crtools.c b/crtools.c
> > index ea8b889..eb19ade 100644
> > --- a/crtools.c
> > +++ b/crtools.c
> > @@ -43,6 +43,8 @@
> >
> > #include "setproctitle.h"
> >
> > +#include "img-remote.h"
> > +
> > struct cr_options opts;
> >
> > void init_opts(void)
> > @@ -62,6 +64,8 @@ void init_opts(void)
> > opts.cpu_cap = CPU_CAP_DEFAULT;
> > opts.manage_cgroups = CG_MODE_DEFAULT;
> > opts.ps_socket = -1;
> > + opts.addr = PROXY_FWD_HOST;
> > + opts.ps_port = CACHE_WRITE_PORT;
>
> These fields are used by page-server and dump/pre-dump, so would you pick some
> more generic names for these macros?
Okey.
>
> > opts.ghost_limit = DEFAULT_GHOST_LIMIT;
> > }
> >
> > @@ -680,6 +696,8 @@ usage:
> > " dedup remove duplicates in memory dump\n"
> > " cpuinfo dump writes cpu information into image file\n"
> > " cpuinfo check validates cpu information read from image file\n"
> > +" image-cache launch image-cache, used for process live migration\n"
> > +" image-proxy launch image-proxy, used for process live migration\n"
>
> Plz, add few words about on which side of the process each is used. E.g.
Okey.
>
> image-cache launch destination-side cache for images
> image-proxy launch source-side proxy to send images to destination mode
>
> > );
> >
> > if (usage_error) {
> > @@ -706,6 +724,7 @@ usage:
> > " restore making it the parent of the restored process\n"
> > " --freeze-cgroup\n"
> > " use cgroup freezer to collect processes\n"
> > +" --remote dump images directly to remote node\n"
>
> Not directly, but via image-proxy, right?
Yes. I will fix the text.
>
> > "\n"
> > "* Special resources support:\n"
> > " -x|--" USK_EXT_PARAM "inode,.." " allow external unix connections (optionally can be assign socket's inode that allows one-sided dump)\n"
> > diff --git a/image.c b/image.c
> > index dc9d6a1..a519788 100644
> > --- a/image.c
> > +++ b/image.c
> > @@ -306,18 +307,50 @@ static int do_open_image(struct cr_img *img, int dfd, int type, unsigned long of
> > {
> > int ret, flags;
> >
> > - flags = oflags & ~(O_NOBUF | O_SERVICE);
> > + flags = oflags & ~(O_NOBUF | O_SERVICE | O_FORCE_LOCAL);
> >
> > - ret = openat(dfd, path, flags, CR_FD_PERM);
> > - if (ret < 0) {
> > - if (!(flags & O_CREAT) && (errno == ENOENT)) {
> > - pr_info("No %s image\n", path);
> > + if(opts.remote && !(oflags & O_FORCE_LOCAL)) {
> > + char* snapshot_id = NULL;
> > +
> > + /* Note: if dfd is the service fd then we need the current
> > + * snapshot_id. Else we need a parent snapshot_id. */
> > + if (dfd == get_service_fd(IMG_FD_OFF))
> > + snapshot_id = get_curr_snapshot_id();
> > + else
> > + snapshot_id = get_snapshot_id_from_idx(dfd);
>
> Can you move this if () into get_snapshot_id_from_idx()? So that here
> we just have
>
> snapshot_id = get_snapshot_id_from_idx(dfd)
>
> ?
>
Yes.
> > +
> > + if(snapshot_id == NULL) {
> > + ret = -1;
> > + }
>
> Braces around one-line bodies (especially that simple ones) are not welcome.
Okey, I thought that when we have 'else' blocks after an if we should include braces.
I will fix that.
>
> > + else if (flags == O_RDONLY) {
> > + pr_info("do_open_remote_image RDONLY path=%s snapshot_id=%s\n",
> > + path, snapshot_id);
> > + ret = read_remote_image_connection(snapshot_id, path);
> > + }
> > + else {
> > + pr_info("do_open_remote_image WDONLY path=%s snapshot_id=%s\n",
> > + path, snapshot_id);
> > + ret = write_remote_image_connection(snapshot_id, path, O_WRONLY);
> > + }
> > +
> > + if (ret < 0) {
>
> This is not correct. Empty image should be set only if the error is ENOENT,
> but any other open/setup error should be reported back.
Okey, I will look into that.
>
> > + pr_info("No %s (dfd=%d) image\n", path, dfd);
> > img->_x.fd = EMPTY_IMG_FD;
> > goto skip_magic;
> > }
> > -
> > - pr_perror("Unable to open %s", path);
> > - goto err;
> > + }
> > + else {
> > + ret = openat(dfd, path, flags, CR_FD_PERM);
> > + if (ret < 0) {
> > + if (!(flags & O_CREAT) && (errno == ENOENT)) {
> > + pr_info("No %s image\n", path);
> > + img->_x.fd = EMPTY_IMG_FD;
> > + goto skip_magic;
> > + }
> > +
> > + pr_perror("Unable to open %s", path);
> > + goto err;
> > + }
> > }
> >
> > img->_x.fd = ret;
> > diff --git a/img-remote.c b/img-remote.c
> > new file mode 100644
> > index 0000000..3c0affe
> > --- /dev/null
> > +++ b/img-remote.c
> > @@ -0,0 +1,245 @@
> > +#include <unistd.h>
> > +#include <stdlib.h>
> > +#include <sys/types.h>
> > +#include <sys/socket.h>
> > +#include <netinet/in.h>
> > +#include <netdb.h>
> > +#include "xmalloc.h"
> > +#include "criu-log.h"
> > +#include "img-remote.h"
> > +#include "img-remote-proto.h"
> > +#include "protobuf/remote-image.pb-c.h"
> > +#include "protobuf-desc.h"
> > +#include <fcntl.h>
> > +
> > +#define PB_LOCAL_IMAGE_SIZE PATHLEN
> > +
> > +static char* snapshot_id = NULL;
> > +
> > +LIST_HEAD(snapshot_head);
> > +
> > +/* A snapshot is a dump or pre-dump operation. Each snapshot is identified by an
> > + * ID which corresponds to the working directory specefied by the user. */
> > +struct snapshot {
> > + char snapshot_id[PATHLEN];
> > + struct list_head l;
> > +};
> > +
> > +struct snapshot* new_snapshot(char* snapshot_id)
> > +{
> > + struct snapshot* s = malloc(sizeof(struct snapshot));
> > + if (!s) {
> > + pr_perror("Failed to allocate snapshot structure");
> > + }
> > + strncpy(s->snapshot_id, snapshot_id, PATHLEN);
> > + return s;
> > +}
> > +
> > +void add_snapshot(struct snapshot* snapshot)
> > +{
> > + list_add_tail(&(snapshot->l), &snapshot_head);
> > +}
> > +
> > +int read_remote_image_connection(char* snapshot_id, char* path)
> > +{
> > + int error;
> > + int sockfd = setup_UNIX_client_socket(READ_IMG_PATH);
>
> Here (and below) the paths to proxy-cache sockets are hard-coded. Is it going
> to stay forever?
You are referring to the defines in img-remote.h, right? Maybe it should be better
to allow the user to modify the paths but also provide defaults if the user does
not provide them. Do you agree?
>
> > + if (sockfd < 0) {
> > + return -1;
> > + }
> > +
> > + if (write_header(sockfd, snapshot_id, path, O_RDONLY) < 0) {
> > + pr_perror("Error writing header for %s:%s", path, snapshot_id);
> > + return -1;
> > + }
> > +
> > + if (read_reply_header(sockfd, &error) < 0) {
> > + pr_perror("Error reading reply header for %s:%s", path, snapshot_id);
> > + return -1;
> > + }
> > +
> > + if (!error) {
> > + pr_info("Image does exist (%s:%s)\n", path, snapshot_id);
> > + return sockfd;
> > + }
> > + else if (error == ENOENT) {
> > + pr_info("Image does not exist (%s:%s)\n", path, snapshot_id);
> > + close(sockfd);
> > + return -1;
> > + }
> > + else {
> > + pr_perror("Unexpected error returned: %d (%s:%s)\n", error, path, snapshot_id);
> > + close(sockfd);
> > + return -1;
> > + }
> > +}
> > +
> > diff --git a/include/img-remote.h b/include/img-remote.h
> > new file mode 100644
> > index 0000000..f3bb810
> > --- /dev/null
> > +++ b/include/img-remote.h
> > @@ -0,0 +1,95 @@
> > +#include <limits.h>
> > +
> > +#ifndef IMAGE_REMOTE_H
> > +#define IMAGE_REMOTE_H
> > +
> > +#define PATHLEN PATH_MAX
> > +#define DUMP_FINISH "DUMP_FINISH"
> > +#define PARENT_IMG "parent"
> > +#define NULL_SNAPSHOT_ID "null"
> > +
> > +/* This flag is used to enable local debugging (dump + proxy + cache + restore)
> > + * on the same machine. The idea is that both dump and restore processes are
> > + * orthogonal to this. */
> > +#define LOCAL_DEVEL 1
> > +
> > +#if LOCAL_DEVEL
> > +/* In local Devel, the image-proxy will open a local socket on WRITE_IMG_PATH and
> > + * the image-cache will open a local socket on READ_IMG_PATH. */
> > +#define PROXY_IMG_PATH "/tmp/criu-remote-proxy.sock"
> > +#define CACHE_IMG_PATH "/tmp/criu-remote-cache.sock"
> > +#define READ_IMG_PATH CACHE_IMG_PATH
> > +#define WRITE_IMG_PATH PROXY_IMG_PATH
> > +#else
> > +#define IMG_PATH "/tmp/criu-remote.sock"
> > +#define READ_IMG_PATH IMG_PATH
> > +#define WRITE_IMG_PATH IMG_PATH
> > +#define PROXY_IMG_PATH IMG_PATH
> > +#define CACHE_IMG_PATH IMG_PATH
> > +#endif
> > +
> > +/* Only the this channel is TCP.*/
> > +#define CACHE_WRITE_PORT 9996
> > +#define PROXY_FWD_PORT CACHE_WRITE_PORT
> > +#define PROXY_FWD_HOST "localhost"
> > +
> > +/* Warning: This may be problematic because of double evaluation... */
> > +#define MIN(x, y) (((x) < (y)) ? (x) : (y))
>
> We have min() and min_t macros in compiler.h
Okey, I will use it then.
>
> > +
> > +/* Called by restore to get the fd correspondent to a particular path. This call
> > + * will block until the connection is received. */
> > +int read_remote_image_connection(char* snapshot_id, char* path);
> > +
>
> > diff --git a/include/protobuf-desc.h b/include/protobuf-desc.h
> > index ab7e4f2..ef7544d 100644
> > --- a/include/protobuf-desc.h
> > +++ b/include/protobuf-desc.h
> > @@ -55,16 +55,20 @@ enum {
> > PB_CPUINFO,
> > PB_USERNS,
> > PB_NETNS,
> > + PB_REMOTE_IMAGE,
> > + PB_LOCAL_IMAGE, /* 50 */
> > + PB_LOCAL_IMAGE_REPLY,
> > + PB_SNAPSHOT_ID,
>
> Plz, describe these images with comments in this enum.
Okey.
>
> >
> > /* PB_AUTOGEN_STOP */
> >
> > PB_PAGEMAP_HEAD,
> > - PB_IDS, /* 50 */
> > + PB_IDS,
> > PB_SIGACT,
> > PB_NETDEV,
> > PB_REMAP_FPATH,
> > PB_SK_QUEUES,
> > - PB_IPCNS_MSG,
> > + PB_IPCNS_MSG, /* 60 */
> > PB_IPCNS_MSG_ENT,
> >
> > PB_MAX,
>
> > diff --git a/page-read.c b/page-read.c
> > index 832c057..40c38bc 100644
> > --- a/page-read.c
> > +++ b/page-read.c
> > @@ -195,9 +204,24 @@ static int try_open_parent(int dfd, int pid, struct page_read *pr, int pr_flags)
> > int pfd, ret;
> > struct page_read *parent = NULL;
> >
> > - pfd = openat(dfd, CR_PARENT_LINK, O_RDONLY);
> > - if (pfd < 0 && errno == ENOENT)
> > - goto out;
> > + if(opts.remote) {
> > + /* Note: we are replacing a real directory FD for a snapshot_id
> > + * index. Since we need the parent of the current snapshot_id,
> > + * we want the current snapshot_id index minus one.
>
> Is it 100% guaranteed that parent image is -1 from the current one?
Yes (unless there is a bug...). Snapshots are taken sequentially and the ids are
also stored one after another.
>
> > + * possible that dfd is already a snapshot_id index. We test it
> > + * by comparing it to the service FD. When opening an image (see
> > + * do_open_image) we convert the snapshot_id index into a real
> > + * snapshot_id. */
> > + pfd = dfd == get_service_fd(IMG_FD_OFF) ?
> > + get_curr_snapshot_id_idx() - 1 : dfd - 1;
> > + if(pfd < 0)
> > + goto out;
> > + }
> > + else {
> > + pfd = openat(dfd, CR_PARENT_LINK, O_RDONLY);
> > + if (pfd < 0 && errno == ENOENT)
> > + goto out;
> > + }
> >
> > parent = xmalloc(sizeof(*parent));
> > if (!parent)
>
> -- Pavel
>
--
Rodrigo Bruno <rbruno at gsd.inesc-id.pt>
More information about the CRIU
mailing list