[CRIU] Process Migration Using Sockets - PATCH

Rodrigo Bruno rbruno at gsd.inesc-id.pt
Wed Sep 2 13:24:39 PDT 2015


The patch is listed below. The idea is to migrate processes without using disk-backed
images. Files used by these processes still need to be shared (over NFS, for example) to
enable full live migration. In the future, these files could also be transferred over
sockets.

Two new entities are introduced: the image-proxy and the image-cache. The image-proxy
receives the image files from the dump process and forwards them to the image-cache.
The image-cache holds them and waits for requests from the restore process.

Example:

Target Node:
criu image-cache -vvv -o /tmp/image-cache.log --port <cache port> < /dev/null &
sudo criu restore -D /tmp/dump -d -vvvv -o /tmp/restore.log --remote && echo OK

Source Node:
criu image-proxy -vvv -o /tmp/image-proxy.log --port <cache port> --address <target node> < /dev/null &
sudo criu pre-dump -D /tmp/pre-dump -d -vvvv -o /tmp/pre-dump.log -t $pid --remote
sudo criu dump -D /tmp/dump -d -vvvv -o /tmp/dump.log -t $pid --remote --prev-images-dir /tmp/pre-dump --track-mem
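
With --remote, the directories given with -D (/tmp/pre-dump and /tmp/dump above) are
not read from disk. The patch treats each one as a "namespace" name: every pre-dump or
dump appends its directory to an ordered list (the parents array in image-remote.c),
and the position in that list is used where a directory fd would otherwise be used.
The sketch below shows that bookkeeping in isolation, under my own assumptions: struct
ns_chain, ns_chain_push, ns_chain_find, NS_LEN and NS_MAX are hypothetical names and
limits, not taken from the patch.

```c
#include <string.h>

#define NS_LEN 256 /* assumed maximum namespace name length, terminator included */
#define NS_MAX 16  /* assumed maximum number of chained dumps */

/* Ordered list of image directories ("namespaces"), oldest first. */
struct ns_chain {
	char names[NS_MAX][NS_LEN];
	int count;
};

/* Append a namespace; returns its index, or -1 if it does not fit. */
int ns_chain_push(struct ns_chain *c, const char *dir)
{
	size_t len = strlen(dir);

	if (c->count >= NS_MAX || len >= NS_LEN)
		return -1;
	memcpy(c->names[c->count], dir, len + 1);
	return c->count++;
}

/* Return the index of a namespace, or -1 if it is not in the chain. */
int ns_chain_find(const struct ns_chain *c, const char *dir)
{
	int i;

	for (i = 0; i < c->count; i++)
		if (!strncmp(c->names[i], dir, NS_LEN))
			return i;
	return -1;
}
```

For the commands above, the chain would hold "/tmp/pre-dump" at index 0 and "/tmp/dump"
at index 1. Both bounds are checked before writing, so a full chain is reported as an
error instead of overrunning the array.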

The code is also available at https://github.com/rodrigo-bruno/criu (forked from CRIU).

You can also test it locally. I have been using this to migrate OpenJDK processes.
If you ever decide to use this code, I would be glad to help, provide bug fixes, etc.

Signed-off-by: Rodrigo Bruno <rbruno at gsd.inesc-id.pt>

diff -uprN criu-source/cr-dedup.c criu-patch/cr-dedup.c
--- criu-source/cr-dedup.c	2015-09-01 20:34:37.042773339 +0100
+++ criu-patch/cr-dedup.c	2015-09-02 02:22:45.725920125 +0100
@@ -11,6 +11,7 @@
 
 static int cr_dedup_one_pagemap(int pid);
 
+// TODO - Eventually patch this for remote usage?
 int cr_dedup(void)
 {
 	int close_ret, ret = 0;
diff -uprN criu-source/cr-dump.c criu-patch/cr-dump.c
--- criu-source/cr-dump.c	2015-09-01 20:34:37.050773528 +0100
+++ criu-patch/cr-dump.c	2015-09-02 02:37:15.993970004 +0100
@@ -83,6 +83,8 @@
 
 #include "asm/dump.h"
 
+#include "image-remote.h"
+
 static char loc_buf[PAGE_SIZE];
 
 static void close_vma_file(struct vma_area *vma)
@@ -1550,6 +1552,10 @@ err:
 	if (disconnect_from_page_server())
 		ret = -1;
 
+        if (opts.remote) {
+            finish_remote_dump();
+        }
+
 	close_cr_imgset(&glob_imgset);
 
 	if (bfd_flush_images())
diff -uprN criu-source/cr-restore.c criu-patch/cr-restore.c
--- criu-source/cr-restore.c	2015-09-01 20:34:37.050773528 +0100
+++ criu-patch/cr-restore.c	2015-09-02 02:46:57.443338876 +0100
@@ -94,6 +94,8 @@
 
 #include "pie/pie-relocs.h"
 
+#include "image-remote.h" 
+
 #ifndef arch_export_restore_thread
 #define arch_export_restore_thread	__export_restore_thread
 #endif
diff -uprN criu-source/crtools.c criu-patch/crtools.c
--- criu-source/crtools.c	2015-09-01 20:34:37.054773617 +0100
+++ criu-patch/crtools.c	2015-09-02 03:05:47.229581153 +0100
@@ -42,6 +42,8 @@
 
 #include "setproctitle.h"
 
+#include "image-remote.h"
+
 struct cr_options opts;
 
 void init_opts(void)
@@ -60,6 +62,8 @@ void init_opts(void)
 	opts.cpu_cap = CPU_CAP_DEFAULT;
 	opts.manage_cgroups = CG_MODE_DEFAULT;
 	opts.ps_socket = -1;
+	opts.addr = PROXY_FWD_HOST;
+	opts.ps_port = CACHE_PUT_PORT;
 	opts.ghost_limit = DEFAULT_GHOST_LIMIT;
 }
 
@@ -247,6 +251,9 @@ int main(int argc, char *argv[], char *e
 		{ "enable-fs",			required_argument,	0, 1065 },
 		{ "enable-external-sharing", 	no_argument, 		0, 1066 },
 		{ "enable-external-masters", 	no_argument, 		0, 1067 },
+		{ "remote",                     no_argument, 		0, 1070 },
+		{ "image-cache",                no_argument, 		0, 1071 },
+		{ "image-proxy",                required_argument,  0, 1072 },
 		{ "freeze-cgroup",		required_argument,	0, 1068 },
 		{ "ghost-limit",		required_argument,	0, 1069 },
 		{ },
@@ -479,6 +486,9 @@ int main(int argc, char *argv[], char *e
 		case 1067:
 			opts.enable_external_masters = true;
 			break;
+		case 1070:
+			opts.remote = true;
+			break;
 		case 1068:
 			opts.freeze_cgroup = optarg;
 			break;
@@ -589,6 +599,8 @@ int main(int argc, char *argv[], char *e
 
 	if (!strcmp(argv[optind], "dump")) {
 		preload_socket_modules();
+		if(opts.remote && push_namespace() < 0)
+			return 1;
 
 		if (!tree_id)
 			goto opt_pid_missing;
@@ -596,6 +608,9 @@ int main(int argc, char *argv[], char *e
 	}
 
 	if (!strcmp(argv[optind], "pre-dump")) {
+		if(opts.remote && push_namespace() < 0)
+			return 1;
+                
 		if (!tree_id)
 			goto opt_pid_missing;
 
@@ -633,6 +648,12 @@ int main(int argc, char *argv[], char *e
 
 	if (!strcmp(argv[optind], "page-server"))
 		return cr_page_server(opts.daemon_mode, -1) > 0 ? 0 : 1;
+        
+	if (!strcmp(argv[optind], "image-cache"))
+		return image_cache(opts.ps_port);
+    
+	if (!strcmp(argv[optind], "image-proxy"))
+		return image_proxy(opts.addr, opts.ps_port);
 
 	if (!strcmp(argv[optind], "service"))
 		return cr_service(opts.daemon_mode);
@@ -660,6 +681,8 @@ usage:
 "  criu page-server\n"
 "  criu service [<options>]\n"
 "  criu dedup\n"
+"  criu image-cache [<options>]\n"
+"  criu image-proxy [<options>]\n"
 "\n"
 "Commands:\n"
 "  dump           checkpoint a process/tree identified by pid\n"
@@ -672,6 +695,8 @@ usage:
 "  dedup          remove duplicates in memory dump\n"
 "  cpuinfo dump   writes cpu information into image file\n"
 "  cpuinfo check  validates cpu information read from image file\n"
+"  image-cache    launch image-cache service, used for process live migration\n"
+"  image-proxy    launch image-proxy service, used for process live migration\n"
 	);
 
 	if (usage_error) {
@@ -752,7 +777,7 @@ usage:
 "                        when used on restore, as soon as page is restored, it\n"
 "                        will be punched from the image.\n"
 "\n"
-"Page/Service server options:\n"
+"Page/Service/image-cache/image-proxy server options:\n"
 "  --address ADDR        address of server or service\n"
 "  --port PORT           port of page server\n"
 "  -d|--daemon           run in the background after creating socket\n"
diff -uprN criu-source/image.c criu-patch/image.c
--- criu-source/image.c	2015-09-01 20:34:37.058773708 +0100
+++ criu-patch/image.c	2015-09-02 02:57:48.502419478 +0100
@@ -12,6 +12,7 @@
 #include "protobuf.h"
 #include "protobuf/inventory.pb-c.h"
 #include "protobuf/pagemap.pb-c.h"
+#include "image-remote.h"
 
 bool fdinfo_per_id = false;
 bool ns_per_id = false;
@@ -218,6 +219,7 @@ struct cr_imgset *cr_glob_imgset_open(in
 }
 
 static int do_open_image(struct cr_img *img, int dfd, int type, unsigned long flags, char *path);
+static int do_open_remote_image(struct cr_img *img, int dfd, int type, unsigned long flags, char *path);
 
 struct cr_img *open_image_at(int dfd, int type, unsigned long flags, ...)
 {
@@ -251,10 +253,19 @@ struct cr_img *open_image_at(int dfd, in
 	} else
 		img->fd = EMPTY_IMG_FD;
 
-	if (do_open_image(img, dfd, type, oflags, path)) {
-		close_image(img);
-		return NULL;
-	}
+	if(opts.remote && 
+		strcmp(path, "stats-dump") && strcmp(path, "stats-restore")) {
+		if (do_open_remote_image(img, dfd, type, oflags, path)) {
+			close_image(img);
+			return NULL;
+		}
+	}
+	else {
+		if (do_open_image(img, dfd, type, oflags, path)) {
+			close_image(img);
+			return NULL;
+		}
+	}
 
 	return img;
 }
@@ -336,6 +347,72 @@ static int do_open_image(struct cr_img *
 	if (imgset_template[type].magic == RAW_IMAGE_MAGIC)
 		goto skip_magic;
 
+	if (flags == O_RDONLY) {
+		ret = img_check_magic(img, oflags, type, path);
+        }
+	else {
+		ret = img_write_magic(img, oflags, type);
+        }
+	if (ret)
+		goto err;
+
+skip_magic:
+	return 0;
+
+err:
+	return -1;
+}
+
+static int do_open_remote_image(struct cr_img *img, int dfd, int type, unsigned long oflags, char *path)
+{
+	int ret, flags;
+
+	flags = oflags & ~(O_NOBUF | O_SERVICE);
+        
+        if(dfd == get_service_fd(IMG_FD_OFF) || dfd == -1)
+            dfd = get_current_namespace_fd();
+        
+        // TODO - fix this. Find out what is the purpose of this file.
+        if(!strcmp("irmap-cache", path)) {
+            ret = -1;
+        }
+        else if(get_namespace(dfd) == NULL) {
+            ret = -1;
+        }
+        else if (flags == O_RDONLY) {
+            pr_info("do_open_remote_image RDONLY path=%s namespace=%s\n", 
+                    path, get_namespace(dfd));
+            ret = get_remote_image_connection(get_namespace(dfd), path);
+        }
+        else {
+            pr_info("do_open_remote_image WRONLY path=%s namespace=%s\n", 
+                    path, get_namespace(dfd));
+            ret = open_remote_image_connection(get_namespace(dfd), path);
+        }
+        
+        if (ret < 0) {
+            pr_info("No %s (dfd=%d) image\n", path, dfd);
+            img->_x.fd = EMPTY_IMG_FD;
+            goto skip_magic;
+	}
+        
+
+	img->_x.fd = ret;
+	if (oflags & O_NOBUF)
+		bfd_setraw(&img->_x);
+	else {
+		if (flags == O_RDONLY)
+			ret = bfdopenr(&img->_x);
+		else
+			ret = bfdopenw(&img->_x);
+
+		if (ret)
+			goto err;
+	}
+
+	if (imgset_template[type].magic == RAW_IMAGE_MAGIC)
+		goto skip_magic;
+
 	if (flags == O_RDONLY)
 		ret = img_check_magic(img, oflags, type, path);
 	else
@@ -352,17 +429,25 @@ err:
 
 int open_image_lazy(struct cr_img *img)
 {
-	int dfd;
+	int dfd, ret;
 	char *path = img->path;
 
 	img->path = NULL;
 
 	dfd = get_service_fd(IMG_FD_OFF);
-	if (do_open_image(img, dfd, img->type, img->oflags, path)) {
+        
+        if(opts.remote && 
+                strcmp(path, "stats-dump") && strcmp(path, "stats-restore")) {
+            ret = do_open_remote_image(img, dfd, img->type, img->oflags, path);
+        } 
+        else {
+            ret = do_open_image(img, dfd, img->type, img->oflags, path);
+        }
+        
+        if(ret) {
 		xfree(path);
 		return -1;
 	}
-
 	xfree(path);
 	return 0;
 }
@@ -411,12 +496,19 @@ int open_image_dir(char *dir)
 	fd = ret;
 
 	if (opts.img_parent) {
-		ret = symlinkat(opts.img_parent, fd, CR_PARENT_LINK);
-		if (ret < 0 && errno != EEXIST) {
-			pr_perror("Can't link parent snapshot");
-			goto err;
-		}
-	}
+                if(opts.remote)
+                        init_namespace(dir, opts.img_parent);
+                else {
+                        ret = symlinkat(opts.img_parent, fd, CR_PARENT_LINK);
+                        if (ret < 0 && errno != EEXIST) {
+                                pr_perror("Can't link parent snapshot");
+                                goto err;
+                        }
+                }
+	}
+        else if(opts.remote) {
+                init_namespace(dir, NULL);
+        }
 
 	return 0;
 
diff -uprN criu-source/image-cache.c criu-patch/image-cache.c
--- criu-source/image-cache.c	1970-01-01 01:00:00.000000000 +0100
+++ criu-patch/image-cache.c	2015-09-01 20:24:22.544637095 +0100
@@ -0,0 +1,56 @@
+#include <unistd.h>
+
+#include "image-remote.h"
+#include "image-remote-pvt.h"
+#include "criu-log.h"
+
+void* cache_remote_image(void* ptr) 
+{
+        remote_image* rimg = (remote_image*) ptr;
+        
+        if (!strncmp(rimg->path, DUMP_FINISH, sizeof (DUMP_FINISH))) 
+        {
+                close(rimg->src_fd);
+                return NULL;
+        }
+    
+        prepare_put_rimg();
+    
+        recv_remote_image(rimg->src_fd, rimg->path, &rimg->buf_head);
+        
+        finalize_put_rimg(rimg);
+        
+        return NULL;
+}
+
+int image_cache(unsigned short cache_put_port) 
+{    
+        pthread_t get_thr, put_thr;
+        int put_fd, get_fd;
+        
+        pr_info("Put Port %d, Get Port %d\n", cache_put_port, CACHE_GET_PORT);
+
+        put_fd = prepare_server_socket(cache_put_port);
+        get_fd = prepare_server_socket(CACHE_GET_PORT);
+
+        if(init_cache()) 
+                return -1;
+       
+        if (pthread_create(
+            &put_thr, NULL, accept_put_image_connections, (void*) &put_fd)) {
+                pr_perror("Unable to create put thread");
+                return -1;
+        }
+        if (pthread_create(
+            &get_thr, NULL, accept_get_image_connections, (void*) &get_fd)) {
+                pr_perror("Unable to create get thread");
+                return -1;
+        }
+
+        join_workers();
+        
+        // NOTE: these joins will never return...
+        pthread_join(put_thr, NULL);
+        pthread_join(get_thr, NULL);
+        return 0;
+}
\ No newline at end of file
diff -uprN criu-source/image-proxy.c criu-patch/image-proxy.c
--- criu-source/image-proxy.c	1970-01-01 01:00:00.000000000 +0100
+++ criu-patch/image-proxy.c	2015-09-02 02:07:02.536240316 +0100
@@ -0,0 +1,75 @@
+#include <unistd.h>
+
+#include "image-remote.h"
+#include "image-remote-pvt.h"
+#include "criu-log.h"
+
+static char* dst_host;
+static unsigned short dst_port;
+
+void* proxy_remote_image(void* ptr)
+{
+        remote_image* rimg = (remote_image*) ptr;
+        rimg->dst_fd = prepare_client_socket(dst_host, dst_port);
+        if (rimg->dst_fd < 0) {
+                pr_perror("Unable to open recover image socket");
+                return NULL;
+        }
+
+        if(write_header(rimg->dst_fd, rimg->namespace, rimg->path) < 0) {
+                pr_perror("Error writing header for %s:%s", 
+                        rimg->path, rimg->namespace);
+                return NULL;
+        }  
+
+        prepare_put_rimg();
+        
+        if (!strncmp(rimg->path, DUMP_FINISH, sizeof(DUMP_FINISH))) 
+        {
+            close(rimg->dst_fd);
+            finalize_put_rimg(rimg);
+            return NULL;
+        }
+        if (recv_remote_image(rimg->src_fd, rimg->path, &(rimg->buf_head)) < 0) {
+                return NULL;
+        }
+        finalize_put_rimg(rimg);
+        send_remote_image(rimg->dst_fd, rimg->path, &(rimg->buf_head)); 
+    return NULL;
+}
+
+int image_proxy(char* fwd_host, unsigned short fwd_port) 
+{
+        pthread_t get_thr, put_thr;
+        int put_fd, get_fd;
+        
+        dst_host = fwd_host;
+        dst_port = fwd_port;
+        
+        pr_info("Proxy Get Port %d, Put Port %d, Destination Host %s:%hu\n", 
+                PROXY_GET_PORT, PROXY_PUT_PORT, fwd_host, fwd_port);
+
+        put_fd = prepare_server_socket(PROXY_PUT_PORT);
+        get_fd = prepare_server_socket(PROXY_GET_PORT);
+        
+        if(init_proxy()) 
+                return -1;
+
+        if (pthread_create(
+            &put_thr, NULL, accept_put_image_connections, (void*) &put_fd)) {
+                pr_perror("Unable to create put thread");
+                return -1;
+        }
+        if (pthread_create(
+            &get_thr, NULL, accept_get_image_connections, (void*) &get_fd)) {
+                pr_perror("Unable to create get thread");
+                return -1;
+        }
+
+        join_workers();
+        
+        // NOTE: these joins will never return...
+        pthread_join(put_thr, NULL);
+        pthread_join(get_thr, NULL);
+        return 0;
+}
diff -uprN criu-source/image-remote.c criu-patch/image-remote.c
--- criu-source/image-remote.c	1970-01-01 01:00:00.000000000 +0100
+++ criu-patch/image-remote.c	2015-09-02 02:18:33.548099686 +0100
@@ -0,0 +1,281 @@
+#include <unistd.h>
+#include <stdlib.h>
+#include <sys/types.h> 
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <netdb.h>
+
+#include <pthread.h>
+#include <semaphore.h>
+
+#include "criu-log.h"
+#include "image-remote.h"
+
+// TODO - fix space limitation
+static char parents[PATHLEN][PATHLEN]; 
+static int  parents_occ = 0;
+static char* namespace = NULL;
+// TODO - not used for now. It will be used if we implement a shared cache and proxy.
+static char* parent = NULL; 
+
+int setup_local_client_connection(int port) 
+{
+        int sockfd;
+        struct sockaddr_in serv_addr;
+        struct hostent *server;
+
+        sockfd = socket(AF_INET, SOCK_STREAM, 0);
+        if (sockfd < 0) {
+                pr_perror("Unable to open remote image socket to img cache");
+                return -1;
+        }
+
+        server = gethostbyname(DEFAULT_HOST);
+        if (server == NULL) {
+                pr_perror("Unable to get host by name (%s)", DEFAULT_HOST);
+                return -1;
+        }
+
+        bzero((char *) &serv_addr, sizeof (serv_addr));
+        serv_addr.sin_family = AF_INET;
+        bcopy((char *) server->h_addr,
+              (char *) &serv_addr.sin_addr.s_addr,
+              server->h_length);
+        serv_addr.sin_port = htons(port);
+
+        if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
+                pr_perror("Unable to connect to remote restore host %s", DEFAULT_HOST);
+                return -1;
+        }
+
+        return sockfd;
+}
+
+int write_header(int fd, char* namespace, char* path)
+{
+        if (write(fd, path, PATHLEN) < 1) {
+                pr_perror("Unable to send path to remote image connection");
+                return -1;
+        }
+
+        if (write(fd, namespace, PATHLEN) < 1) {
+                pr_perror("Unable to send namespace to remote image connection");
+                return -1;
+        } 
+        return 0;
+}
+
+int read_header(int fd, char* namespace, char* path)
+{
+        int n = read(fd, path, PATHLEN);
+        if (n < 0) {
+                pr_perror("Error reading from remote image socket");
+                return -1;
+        } else if (n == 0) {
+                pr_perror("Remote image socket closed before receiving path");
+                return -1;
+        }
+        n = read(fd, namespace, PATHLEN);
+        if (n < 0) {
+                pr_perror("Error reading from remote image socket");
+                return -1;
+        } else if (n == 0) {
+                pr_perror("Remote image socket closed before receiving namespace");
+                return -1;
+        }
+    return 0;
+}
+
+int get_remote_image_connection(char* namespace, char* path) 
+{
+        int sockfd;
+        char path_buf[PATHLEN], ns_buf[PATHLEN];
+
+        sockfd = setup_local_client_connection(CACHE_GET_PORT);
+        if(sockfd < 0) {
+               return -1;
+        }
+
+        if(write_header(sockfd, namespace, path) < 0) {
+                pr_perror("Error writing header for %s:%s", path, namespace);
+                return -1;
+        }    
+
+        if(read_header(sockfd, ns_buf, path_buf) < 0) {
+                pr_perror("Error reading header for %s:%s", path, namespace);
+                return -1;
+        }
+
+        if(!strncmp(path_buf, path, PATHLEN) && !strncmp(ns_buf, namespace, PATHLEN)) {
+                pr_info("Image cache does have %s:%s\n", path, namespace);
+                return sockfd;
+        }
+        else if(!strncmp(path_buf, DUMP_FINISH, PATHLEN)) {
+                pr_info("Image cache does not have %s:%s\n", path, namespace);
+                close(sockfd);
+                return -1;
+        }
+        else {
+                pr_perror("Image cache returned erroneous name %s\n", path);
+                close(sockfd);
+                return -1;
+        }
+}
+
+int open_remote_image_connection(char* namespace, char* path)
+{
+        int sockfd = setup_local_client_connection(PROXY_PUT_PORT);
+        if(sockfd < 0) {
+                return -1;
+        }
+
+        if(write_header(sockfd, namespace, path) < 0) {
+                pr_perror("Error writing header for %s:%s", path, namespace);
+                return -1;
+        }
+        
+        return sockfd;
+}
+
+int finish_remote_dump() 
+{
+        pr_info("Dump side is calling finish\n");
+        int fd = open_remote_image_connection(NULL_NAMESPACE, DUMP_FINISH);
+        if (fd == -1) {
+                pr_perror("Unable to open finish dump connection");
+                return -1;
+        }
+        close(fd);
+        return 0;
+}
+
+int skip_remote_bytes(int fd, unsigned long len)
+{
+    static char buf[4096];
+    int n = 0;
+    unsigned long curr = 0;
+    
+    for(; curr < len; ) { 
+            n = read(fd, buf, MIN(len - curr, 4096));
+            if(n == 0) {
+                pr_perror("Unexpected end of stream (skipping %lx/%lx bytes)", 
+                        curr, len);
+                return -1;
+            }
+            else if(n > 0) {
+                    curr += n;
+            }
+            else {
+                pr_perror("Error while skipping bytes from stream (%lx/%lx)", 
+                        curr, len);
+                return -1;
+            }
+    }
+    if( curr != len) {
+            pr_perror("Unable to skip the current number of bytes: %lx instead of %lx",
+                    curr, len);
+            return -1;
+    }
+    return 0;
+}
+
+static int push_namespaces() 
+{
+        int n;
+        int sockfd = open_remote_image_connection(NULL_NAMESPACE, PARENT_IMG); 
+        if(sockfd < 0) {
+                pr_perror("Unable to open namespace push connection");
+                return -1;
+        }
+        for(n = 0; n < parents_occ; n++) {
+                if (write(sockfd, parents[n], PATHLEN) < 1) {
+                        pr_perror("Could not write namespace %s to socket", parents[n]);
+                        close(sockfd);
+                        return -1;
+                }
+        }
+        
+        close(sockfd);
+        return 0;    
+}
+
+static int fetch_namespaces() {
+        int n, sockfd;
+        parents_occ = 0;
+        // Read namespace hierarchy
+        sockfd = get_remote_image_connection(NULL_NAMESPACE, PARENT_IMG);
+        if(sockfd < 0) {
+                pr_perror("Unable to open namespace get connection");
+                return -1;
+        }
+        while(1) {
+                n = read(sockfd, parents[parents_occ], PATHLEN);
+                if(n == 0) {
+                        close(sockfd);
+                        break;
+                }
+                else if(n > 0) {
+                        if(++parents_occ >= PATHLEN) {
+                                pr_perror("Parent sequence above the size limit");
+                                return -1;
+                        }
+                }
+                else {
+                        pr_perror("Failed to read namespace from socket");
+                }
+        }       
+        return parents_occ;    
+}
+
+int push_namespace() 
+{
+    if(fetch_namespaces() < 0) {
+            pr_perror("Failed to fetch namespaces");
+            return -1;
+    }
+    strncpy(parents[parents_occ++], namespace, PATHLEN);
+    if(push_namespaces()) {
+            pr_perror("Failed to push namespaces");
+            return -1;        
+    }
+    return parents_occ;
+}
+
+void init_namespace(char* ns, char* p)
+{
+        namespace = ns;
+        parent = p;
+}
+
+int get_current_namespace_fd()
+{
+        int i = 0;
+
+        if(parents_occ == 0) {
+                if(fetch_namespaces() < 0) {
+                    return -1;
+                }
+        }
+
+        for(; i < parents_occ; i++) {
+            if(!strncmp(parents[i], namespace, PATHLEN))
+                return i;
+        }
+        pr_perror("Error, could not find current namespace fd"); 
+        return -1;
+}
+
+char* get_namespace(int dfd)
+{
+        if(parents_occ == 0) {
+                if(fetch_namespaces() < 0) {
+                        pr_perror("No namespace in parent hierarchy (%s:%s)",
+                                namespace, parent);
+                        return NULL;
+                }
+        }    
+        if(dfd >= parents_occ || dfd < 0)
+                return NULL;
+        else
+                return parents[dfd];
+}
diff -uprN criu-source/image-remote-pvt.c criu-patch/image-remote-pvt.c
--- criu-source/image-remote-pvt.c	1970-01-01 01:00:00.000000000 +0100
+++ criu-patch/image-remote-pvt.c	2015-09-02 03:18:18.831058561 +0100
@@ -0,0 +1,451 @@
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <semaphore.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <netdb.h>
+
+#include "image-remote-pvt.h"
+#include "criu-log.h"
+
+typedef struct wthread {
+    pthread_t tid;
+    struct list_head l;
+} worker_thread;
+
+static LIST_HEAD(rimg_head);
+static pthread_mutex_t rimg_lock;
+static sem_t rimg_semph;
+
+static LIST_HEAD(workers_head);
+static pthread_mutex_t workers_lock;
+static sem_t workers_semph;
+
+static int finished = 0;
+static int putting = 0;
+
+static void* (*get_func)(void*);
+static void* (*put_func)(void*);
+
+static remote_image* get_rimg_by_name(const char* namespace, const char* path) 
+{
+        remote_image* rimg = NULL;
+        pthread_mutex_lock(&rimg_lock);
+        list_for_each_entry(rimg, &rimg_head, l) {
+                if( !strncmp(rimg->path, path, PATHLEN) && 
+                    !strncmp(rimg->namespace, namespace, PATHLEN)) {
+                        pthread_mutex_unlock(&rimg_lock);
+                        return rimg;
+                }
+        }
+        pthread_mutex_unlock(&rimg_lock);
+        return NULL;
+}
+
+int init_sync_structures() 
+{
+        if (pthread_mutex_init(&rimg_lock, NULL) != 0) {
+                pr_perror("Remote image connection mutex init failed");
+                return -1;
+        }
+
+        if (sem_init(&rimg_semph, 0, 0) != 0) {
+                pr_perror("Remote image connection semaphore init failed");
+                return -1;
+        }
+        
+        if (pthread_mutex_init(&workers_lock, NULL) != 0) {
+                pr_perror("Workers mutex init failed");
+                return -1;
+        }
+
+        if (sem_init(&workers_semph, 0, 0) != 0) {
+                pr_perror("Workers semaphore init failed");
+                return -1;
+        }
+        return 0;
+}
+
+void* get_remote_image(void* ptr) 
+{
+        remote_image* rimg = (remote_image*) ptr;
+        send_remote_image(rimg->dst_fd, rimg->path, &rimg->buf_head);
+        return NULL;
+}
+
+void prepare_put_rimg() 
+{
+        pthread_mutex_lock(&rimg_lock);
+        putting++;
+        pthread_mutex_unlock(&rimg_lock);    
+}
+
+void finalize_put_rimg(remote_image* rimg) 
+{
+        pthread_mutex_lock(&rimg_lock);
+        list_add_tail(&(rimg->l), &rimg_head);
+        putting--;
+        pthread_mutex_unlock(&rimg_lock);
+        sem_post(&rimg_semph);
+}
+
+int init_proxy() 
+{
+        get_func = get_remote_image;
+        put_func = proxy_remote_image;
+        return init_sync_structures();
+}
+
+int init_cache() 
+{
+        get_func = get_remote_image;
+        put_func = cache_remote_image;
+        return init_sync_structures();
+}
+
+int prepare_server_socket(int port) 
+{
+        struct sockaddr_in serv_addr;
+        int sockopt = 1;
+
+        int sockfd = socket(AF_INET, SOCK_STREAM, 0);
+        if (sockfd < 0) {
+                pr_perror("Unable to open image socket");
+                return -1;
+        }
+
+        bzero((char *) &serv_addr, sizeof (serv_addr));
+        serv_addr.sin_family = AF_INET;
+        serv_addr.sin_addr.s_addr = INADDR_ANY;
+        serv_addr.sin_port = htons(port);
+
+        if (setsockopt(
+            sockfd, SOL_SOCKET, SO_REUSEADDR, &sockopt, sizeof (sockopt)) == -1) {
+                pr_perror("Unable to set SO_REUSEADDR");
+                return -1;
+        }
+
+        if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof (serv_addr)) < 0) {
+                pr_perror("Unable to bind image socket");
+                return -1;
+        }
+
+        if (listen(sockfd, DEFAULT_LISTEN)) {
+                pr_perror("Unable to listen image socket");
+                return -1;
+        }
+
+        return sockfd;
+}
+
+int prepare_client_socket(char* hostname, int port)
+{
+        struct hostent *server;
+        struct sockaddr_in serv_addr;
+        
+        int sockfd = socket(AF_INET, SOCK_STREAM, 0); 
+        if (sockfd < 0) {
+                pr_perror("Unable to open recover image socket");
+                return -1;
+        }
+
+        server = gethostbyname(hostname);
+        if (server == NULL) {
+                pr_perror("Unable to get host by name (%s)", hostname);
+                return -1;
+        }
+
+        bzero((char *) &serv_addr, sizeof (serv_addr));
+        serv_addr.sin_family = AF_INET;
+        bcopy((char *) server->h_addr,
+              (char *) &serv_addr.sin_addr.s_addr,
+              server->h_length);
+        serv_addr.sin_port = htons(port);
+
+        if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
+                pr_perror("Unable to connect to remote restore host %s", hostname);
+                return -1;
+        }  
+        
+        return sockfd;
+}
+
+static void add_worker(pthread_t tid)
+{
+        worker_thread* wthread = malloc(sizeof(worker_thread));
+        if(!wthread) {
+                pr_perror("Unable to allocate worker thread structure");
+        }
+        wthread->tid = tid;
+        pthread_mutex_lock(&workers_lock);
+        list_add_tail(&(wthread->l), &workers_head);
+        pthread_mutex_unlock(&workers_lock);
+        sem_post(&workers_semph);
+}
+
+void join_workers()
+{
+        worker_thread* wthread = NULL;
+        while(1) {
+            if(list_empty(&workers_head)) {
+                    sem_wait(&workers_semph);
+                    continue;
+            }
+            wthread = list_entry(workers_head.next, worker_thread, l);
+            if(pthread_join(wthread->tid, NULL)) {
+                    pr_perror("Could not join thread %lu", (unsigned long) wthread->tid);
+            }
+            else {
+                    //pr_info("Joined thread %lu\n", (unsigned long) wthread->tid);
+                    list_del(&(wthread->l));
+                    free(wthread);
+            }
+            
+        }
+}
+
+static remote_image* wait_for_image(int cli_fd, char* namespace, char* path) 
+{
+        remote_image *result;
+    
+        while (1) {
+                result = get_rimg_by_name(namespace, path);
+                // The file exists
+                if(result != NULL) {
+                        if(write_header(cli_fd, namespace, path) < 0) {
+                                pr_perror("Error writing header for %s:%s", 
+                                        path, namespace);
+                                close(cli_fd);
+                                return NULL;
+                        }
+                        return result;
+                }
+                // The file does not exist and we do not expect new files
+                if(finished && !putting) {
+                        if(write_header(cli_fd, NULL_NAMESPACE, DUMP_FINISH) < 0) {
+                                pr_perror("Error writing header for %s:%s", 
+                                        DUMP_FINISH, NULL_NAMESPACE);
+                        }
+                        close(cli_fd);
+                        return NULL;
+                }
+                // The file does not exist but the request is for a parent file.
+                // A parent file may not exist for the first process.
+                if(!putting && !strncmp(path, PARENT_IMG, PATHLEN)) {
+                    if(write_header(cli_fd, namespace, path) < 0) {
+                            pr_perror("Error writing header for %s:%s", 
+                                        path, namespace);
+                    }
+                    close(cli_fd);
+                    return NULL;
+                }
+                sem_wait(&rimg_semph);
+        }
+}
+
+void* accept_get_image_connections(void* port) 
+{
+        socklen_t clilen;
+        int cli_fd;
+        pthread_t tid;
+        int get_fd = *((int*) port);
+        struct sockaddr_in cli_addr;
+        clilen = sizeof (cli_addr);
+        char path_buf[PATHLEN];
+        char namespace_buf[PATHLEN];
+        remote_image* rimg;
+
+        while (1) {
+        
+                cli_fd = accept(get_fd, (struct sockaddr *) &cli_addr, &clilen);
+                if (cli_fd < 0) {
+                        pr_perror("Unable to accept get image connection");
+                        return NULL;
+                }
+
+                if(read_header(cli_fd, namespace_buf, path_buf) < 0) {
+                    pr_perror("Error reading header");
+                    close(cli_fd);
+                    continue;
+                }
+                pr_info("Received GET for %s:%s.\n", path_buf, namespace_buf);
+
+                rimg = wait_for_image(cli_fd, namespace_buf, path_buf);
+                if(!rimg) {
+                        continue;
+                }
+
+                rimg->dst_fd = cli_fd;
+
+                if (pthread_create(
+                    &tid, NULL, get_func, (void*) rimg)) {
+                        pr_perror("Unable to create get thread");
+                        return NULL;
+                }
+                
+                pr_info("Serving Get request for %s:%s (tid=%lu)\n", 
+                        rimg->path, rimg->namespace, (unsigned long) tid);
+                
+                add_worker(tid);
+        }
+}
+
+void* accept_put_image_connections(void* port) 
+{
+        socklen_t clilen;
+        int cli_fd;
+        pthread_t tid;
+        int put_fd = *((int*) port);
+        struct sockaddr_in cli_addr;
+        clilen = sizeof(cli_addr);
+        char path_buf[PATHLEN];
+        char namespace_buf[PATHLEN];
+    
+        while (1) {
+
+                cli_fd = accept(put_fd, (struct sockaddr *) &cli_addr, &clilen);
+                if (cli_fd < 0) {
+                        pr_perror("Unable to accept put image connection");
+                        return NULL;
+                }
+
+                if(read_header(cli_fd, namespace_buf, path_buf) < 0) {
+                    pr_perror("Error reading header");
+                    close(cli_fd);
+                    continue;
+                }
+                remote_image* rimg = get_rimg_by_name(namespace_buf, path_buf);
+                
+                pr_info("Received PUT request for %s:%s\n", path_buf, namespace_buf);
+                                
+                if(rimg == NULL) {
+                        rimg = malloc(sizeof (remote_image));
+                        if (rimg == NULL) {
+                                pr_perror("Unable to allocate remote_image structures");
+                                return NULL;
+                        }
+
+                        remote_buffer* buf = malloc(sizeof (remote_buffer));
+                        if(buf == NULL) {
+                                pr_perror("Unable to allocate remote_buffer structures");
+                                return NULL;
+                        }
+
+                        strncpy(rimg->path, path_buf, PATHLEN);
+                        strncpy(rimg->namespace, namespace_buf, PATHLEN);
+                        buf->nbytes = 0;
+                        INIT_LIST_HEAD(&(rimg->buf_head));
+                        list_add_tail(&(buf->l), &(rimg->buf_head));
+                }
+                // NOTE: we implement a PUT by clearing the previous file.
+                else {
+                    pr_info("Clearing previous images for %s:%s\n", 
+                            path_buf, namespace_buf);
+                        pthread_mutex_lock(&rimg_lock);
+                        list_del(&(rimg->l)); 
+                        pthread_mutex_unlock(&rimg_lock);
+                        while(!list_is_singular(&(rimg->buf_head))) {
+                                list_del(rimg->buf_head.prev);
+                        }
+                        list_entry(rimg->buf_head.next, remote_buffer, l)->nbytes = 0;
+                }
+                rimg->src_fd = cli_fd;
+                rimg->dst_fd = -1;
+
+                if (pthread_create(
+                    &tid, NULL, put_func, (void*) rimg)) {
+                        pr_perror("Unable to create put thread");
+                        return NULL;
+                } 
+                
+                pr_info("Serving PUT request for %s:%s (tid=%lu)\n", 
+                        rimg->path, rimg->namespace, (unsigned long) tid);
+                
+                add_worker(tid);
+                
+                if (!strncmp(path_buf, DUMP_FINISH, sizeof (DUMP_FINISH))) {
+                        finished = 1;
+                        pr_info("Received DUMP FINISH\n");
+                        sem_post(&rimg_semph);
+                }
+        }
+}
+
+int recv_remote_image(int fd, char* path, struct list_head* rbuff_head) 
+{
+        remote_buffer* curr_buf = list_entry(rbuff_head->next, remote_buffer, l);
+        int n, nblocks;
+       
+        nblocks = 0;
+        while(1) {
+                n = read(fd, 
+                         curr_buf->buffer + curr_buf->nbytes, 
+                         BUF_SIZE - curr_buf->nbytes);
+                if (n == 0) {
+                        pr_info("Finished receiving %s (%d full blocks, %d bytes on last block)\n", 
+                                path, nblocks, curr_buf->nbytes);
+                        close(fd);
+                        return nblocks*BUF_SIZE + curr_buf->nbytes;
+                }
+                else if (n > 0) {
+                        curr_buf->nbytes += n;
+                        if(curr_buf->nbytes == BUF_SIZE) {
+                                remote_buffer* buf = malloc(sizeof(remote_buffer));
+                                if(buf == NULL) {
+                                        pr_perror("Unable to allocate remote_buffer structures");
+                                        return -1;
+                                }
+                                buf->nbytes = 0;
+                                list_add_tail(&(buf->l), rbuff_head);
+                                curr_buf = buf;
+                                nblocks++;
+                        }            
+                }
+                else {
+                        pr_perror("Read on %s socket failed", path);
+                        return -1;
+                }
+        }
+}
+
+int send_remote_image(int fd, char* path, struct list_head* rbuff_head) 
+{
+        remote_buffer* curr_buf = list_entry(rbuff_head->next, remote_buffer, l);
+        int n, curr_offset, nblocks;
+    
+        nblocks = 0;
+        curr_offset = 0;
+    
+        while(1) {
+                n = send(
+                    fd, 
+                    curr_buf->buffer + curr_offset, 
+                    MIN(BUF_SIZE, curr_buf->nbytes) - curr_offset,
+                    MSG_NOSIGNAL);
+                if(n > -1) {
+                        curr_offset += n;
+                        if(curr_offset == BUF_SIZE) {
+                                curr_buf = 
+                                    list_entry(curr_buf->l.next, remote_buffer, l);
+                                nblocks++;
+                                curr_offset = 0;
+                        }
+                        else if(curr_offset == curr_buf->nbytes) {
+                                pr_info("Finished forwarding %s (%d full blocks, %d bytes on last block)\n", 
+                                        path, nblocks, curr_offset);
+                                close(fd);
+                                return nblocks*BUF_SIZE + curr_buf->nbytes;
+                        }
+                }
+                else if(errno == EPIPE || errno == ECONNRESET) {
+                        pr_warn("Connection for %s was closed earlier than expected\n", 
+                                path);
+                        return 0;
+                }
+                else {
+                        pr_perror("Write on %s socket failed", path);
+                        return -1;
+                }
+        }
+}
diff -uprN criu-source/include/cr_options.h criu-patch/include/cr_options.h
--- criu-source/include/cr_options.h	2015-09-01 20:34:37.062773807 +0100
+++ criu-patch/include/cr_options.h	2015-09-01 20:54:32.902295137 +0100
@@ -85,6 +85,7 @@ struct cr_options {
 	bool			enable_external_sharing;
 	bool			enable_external_masters;
 	bool			aufs;		/* auto-deteced, not via cli */
+	bool			remote;
 	bool			overlayfs;
 	size_t			ghost_limit;
 };
diff -uprN criu-source/include/image-remote.h criu-patch/include/image-remote.h
--- criu-source/include/image-remote.h	1970-01-01 01:00:00.000000000 +0100
+++ criu-patch/include/image-remote.h	2015-09-02 14:07:48.197937228 +0100
@@ -0,0 +1,78 @@
+/* 
+ * File:   image-remote.h
+ * Author: underscore
+ *
+ * Created on July 8, 2015, 1:06 AM
+ */
+
+#ifndef IMAGE_REMOTE_H
+#define	IMAGE_REMOTE_H
+
+#define DEFAULT_HOST "localhost"
+#define PATHLEN 32
+#define DUMP_FINISH "DUMP_FINISH"
+#define PARENT_IMG "parent"
+#define NULL_NAMESPACE "null"
+
+#define PROXY_GET_PORT 9995
+#define PROXY_PUT_PORT 9996
+#define CACHE_PUT_PORT 9997 // can be overwritten by main
+#define CACHE_GET_PORT 9998
+#define PROXY_FWD_PORT CACHE_PUT_PORT
+#define PROXY_FWD_HOST "localhost" // can be overwritten by main
+
+// TODO - this may be problematic because of double evaluation...
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+/* Called by restore to get the fd corresponding to a particular path. This call
+ * will block until the connection is received. */
+extern int get_remote_image_connection(char* namespace, char* path);
+
+/* Called by dump to create a socket connection to the restore side. The socket
+ * fd is returned for further writing operations. */
+extern int open_remote_image_connection(char* namespace, char* path );
+
+/* Called by dump when everything is dumped. This function creates a new 
+ * connection with a special control name. The restore side uses it to learn
+ * that no more files are coming. */
+extern int finish_remote_dump();
+
+/* Starts an image proxy daemon (dump side). It receives image files through 
+ * socket connections and forwards them to the image cache (restore side). */
+extern int image_proxy(char* cache_host, unsigned short cache_port);
+
+/* Starts an image cache daemon (restore side). It receives image files through
+ * socket connections and caches them until they are requested by the restore
+ * process. */
+extern int image_cache(unsigned short cache_port);
+
+/* Reads (discards) 'len' bytes from fd. This is used to emulate the function
+ * lseek, which is used to advance the file offset. */
+int skip_remote_bytes(int fd, unsigned long len);
+
+/* To support iterative migration (multiple pre-dumps before the final dump
+ * and subsequent restore), the concept of namespace is introduced. Each image
+ * is tagged with one namespace and we build a hierarchy of namespaces to 
+ * represent the dependency between pagemaps. Currently, the images dir is 
+ * used as namespace when the operation is marked as remote. */
+
+/* Sets the current namespace and parent namespace. */
+void init_namespace(char* namespace, char* parent);
+
+/* Returns an integer (virtual fd) representing the current namespace. */
+int get_current_namespace_fd();
+
+/* Returns the namespace associated with the virtual fd (given as argument). */
+char* get_namespace(int dfd);
+
+/* Pushes the current namespace into the namespace hierarchy. The hierarchy is
+ * read, modified, and written. */
+int push_namespace();
+
+/* Two functions used to read and write remote images' headers.*/
+int write_header(int fd, char* namespace, char* path);
+int read_header(int fd, char* namespace, char* path);
+
+
+
+#endif	/* IMAGE_REMOTE_H */
diff -uprN criu-source/include/image-remote-pvt.h criu-patch/include/image-remote-pvt.h
--- criu-source/include/image-remote-pvt.h	1970-01-01 01:00:00.000000000 +0100
+++ criu-patch/include/image-remote-pvt.h	2015-09-02 03:18:33.151390467 +0100
@@ -0,0 +1,60 @@
+#ifndef IMAGE_REMOTE_INTERNAL_H
+#define	IMAGE_REMOTE_INTERNAL_H
+
+#include <pthread.h>
+#include "list.h"
+#include "image-remote.h"
+
+#define DEFAULT_LISTEN 50
+#define PAGESIZE 4096
+#define BUF_SIZE PAGESIZE
+
+/*
+ * This header is used by both the image-proxy and the image-cache.
+ */
+
+// TODO - if we want to implement shared cache and proxy, we might need to clean
+// image files from memory. Otherwise we will hold on to lots of memory unnecessarily.
+
+typedef struct rbuf {
+    char buffer[BUF_SIZE];
+    int nbytes; // How many bytes are in the buffer.
+    struct list_head l;
+} remote_buffer;
+
+typedef struct rimg {
+    char path[PATHLEN];
+    char namespace[PATHLEN];
+    int src_fd;
+    int dst_fd;
+    struct list_head l;
+    struct list_head buf_head;
+    
+} remote_image;
+
+int init_cache();
+int init_proxy();
+
+void join_workers();
+
+void prepare_put_rimg();
+void finalize_put_rimg(remote_image* rimg);
+
+void* accept_get_image_connections(void* port);
+void* accept_put_image_connections(void* port);
+
+void* cache_remote_image(void* rimg);
+void* proxy_remote_image(void* rimg);
+
+int send_remote_image(int fd, char* path, struct list_head* rbuff_head);
+int recv_remote_image(int fd, char* path, struct list_head* rbuff_head);
+
+int prepare_server_socket(int port);
+int prepare_client_socket(char* server, int port);
+
+#if GC_COMPRESSION
+void* get_proxied_image(void* rimg);
+#endif
+
+#endif	/* IMAGE_REMOTE_INTERNAL_H */
+
diff -uprN criu-source/Makefile.crtools criu-patch/Makefile.crtools
--- criu-source/Makefile.crtools	2015-09-01 20:34:37.018772787 +0100
+++ criu-patch/Makefile.crtools	2015-09-01 20:36:27.677316385 +0100
@@ -6,6 +6,10 @@ obj-y	+= crtools.o
 obj-y	+= security.o
 obj-y	+= image.o
 obj-y	+= image-desc.o
+obj-y	+= image-remote.o
+obj-y	+= image-proxy.o
+obj-y	+= image-cache.o
+obj-y	+= image-remote-pvt.o
 obj-y	+= net.o
 obj-y	+= tun.o
 obj-y	+= proc_parse.o
diff -uprN criu-source/page-read.c criu-patch/page-read.c
--- criu-source/page-read.c	2015-09-01 20:34:37.082774260 +0100
+++ criu-patch/page-read.c	2015-09-02 02:21:29.616164017 +0100
@@ -10,6 +10,8 @@
 #include "protobuf.h"
 #include "protobuf/pagemap.pb-c.h"
 
+#include "image-remote.h"
+
 #ifndef SEEK_DATA
 #define SEEK_DATA	3
 #define SEEK_HOLE	4
@@ -90,8 +92,17 @@ static void skip_pagemap_pages(struct pa
 		return;
 
 	pr_debug("\tpr%u Skip %lx bytes from page-dump\n", pr->id, len);
-	if (!pr->pe->in_parent)
-		lseek(img_raw_fd(pr->pi), len, SEEK_CUR);
+	if (!pr->pe->in_parent) {
+            if(opts.remote) {
+                    if(skip_remote_bytes(img_raw_fd(pr->pi), len) < 0)
+                            pr_perror("Unable to seek remote bytes");
+            }
+            else {
+                    if(lseek(img_raw_fd(pr->pi), len, SEEK_CUR) < 0)
+                            pr_perror("Unable to lseek");
+            }
+            	
+        }
 	pr->cvaddr += len;
 }
 
@@ -146,7 +157,10 @@ static int read_pagemap_page(struct page
 			return ret;
 	} else {
 		int fd = img_raw_fd(pr->pi);
-		off_t current_vaddr = lseek(fd, 0, SEEK_CUR);
+                // TODO - this only brings problems if we use auto_dedup in 
+                // restore. Need to provide better solution!
+                //off_t current_vaddr = lseek(fd, 0, SEEK_CUR);
+		off_t current_vaddr = 0;
 		pr_debug("\tpr%u Read page %lx from self %lx/%"PRIx64"\n", pr->id,
 				vaddr, pr->cvaddr, current_vaddr);
 		ret = read(fd, buf, PAGE_SIZE);
@@ -195,9 +209,19 @@ static int try_open_parent(int dfd, int
 	int pfd, ret;
 	struct page_read *parent = NULL;
 
-	pfd = openat(dfd, CR_PARENT_LINK, O_RDONLY);
-	if (pfd < 0 && errno == ENOENT)
-		goto out;
+        if(opts.remote) {
+                // NOTE: dfd is either the service fd or a virtual namespace,
+                pfd = dfd == get_service_fd(IMG_FD_OFF) ? 
+                    get_current_namespace_fd() : dfd;
+                pfd -= 1;
+                if(get_namespace(pfd) == NULL)
+                        goto out;
+        }
+        else {
+                pfd = openat(dfd, CR_PARENT_LINK, O_RDONLY);
+                if (pfd < 0 && errno == ENOENT)
+                        goto out;
+        }
 
 	parent = xmalloc(sizeof(*parent));
 	if (!parent)
@@ -212,7 +236,8 @@ static int try_open_parent(int dfd, int
 		parent = NULL;
 	}
 
-	close(pfd);
+	if(!opts.remote)
+		close(pfd);
 out:
 	pr->parent = parent;
 	return 0;
@@ -220,7 +245,8 @@ out:
 err_free:
 	xfree(parent);
 err_cl:
-	close(pfd);
+	if(!opts.remote)
+		close(pfd);
 	return -1;
 }
 
diff -uprN criu-source/page-xfer.c criu-patch/page-xfer.c
--- criu-source/page-xfer.c	2015-09-01 20:34:37.082774260 +0100
+++ criu-patch/page-xfer.c	2015-09-02 02:21:44.968518366 +0100
@@ -17,6 +17,8 @@
 #include "protobuf.h"
 #include "protobuf/pagemap.pb-c.h"
 
+#include "image-remote.h"
+
 struct page_server_iov {
 	u32	cmd;
 	u32	nr_pages;
@@ -728,13 +730,21 @@ static int open_page_local_xfer(struct p
 		int ret;
 		int pfd;
 
-		pfd = openat(get_service_fd(IMG_FD_OFF), CR_PARENT_LINK, O_RDONLY);
-		if (pfd < 0 && errno == ENOENT)
-			goto out;
+		if(opts.remote) {
+                        pfd = get_current_namespace_fd() - 1;
+                        if(get_namespace(pfd) == NULL)
+                                goto out;
+                }
+                else {
+                        pfd = openat(get_service_fd(IMG_FD_OFF), CR_PARENT_LINK, O_RDONLY);
+                        if (pfd < 0 && errno == ENOENT)
+                                goto out;
+                }
 
 		xfer->parent = xmalloc(sizeof(*xfer->parent));
 		if (!xfer->parent) {
-			close(pfd);
+			if(!opts.remote)
+				close(pfd);
 			return -1;
 		}
 
@@ -743,10 +753,12 @@ static int open_page_local_xfer(struct p
 			pr_perror("No parent image found, though parent directory is set");
 			xfree(xfer->parent);
 			xfer->parent = NULL;
-			close(pfd);
+			if(!opts.remote)
+                                close(pfd);
 			goto out;
 		}
-		close(pfd);
+		if(!opts.remote)
+                        close(pfd);
 	}
 
 out:
@@ -777,20 +789,25 @@ int check_parent_local_xfer(int fd_type,
 	struct stat st;
 	int ret, pfd;
 
-	pfd = openat(get_service_fd(IMG_FD_OFF), CR_PARENT_LINK, O_RDONLY);
-	if (pfd < 0 && errno == ENOENT)
-		return 0;
-
-	snprintf(path, sizeof(path), imgset_template[fd_type].fmt, id);
-	ret = fstatat(pfd, path, &st, 0);
-	if (ret == -1 && errno != ENOENT) {
-		pr_perror("Unable to stat %s", path);
-		close(pfd);
-		return -1;
-	}
-
-	close(pfd);
-	return (ret == 0);
+        if(opts.remote) {
+                pfd = get_current_namespace_fd() - 1;
+                return get_namespace(pfd) == NULL ? 0 : 1;
+        }
+        else {
+                pfd = openat(get_service_fd(IMG_FD_OFF), CR_PARENT_LINK, O_RDONLY);
+                if (pfd < 0 && errno == ENOENT)
+                        return 0;
+                snprintf(path, sizeof(path), imgset_template[fd_type].fmt, id);
+                ret = fstatat(pfd, path, &st, 0);
+                if (ret == -1 && errno != ENOENT) {
+                        pr_perror("Unable to stat %s", path);
+                        close(pfd);
+                        return -1;
+                }
+
+                close(pfd);
+                return (ret == 0);
+        }
 }
 
 static int page_server_check_parent(int sk, struct page_server_iov *pi)

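A note on skip_remote_bytes(), which the page-read.c hunk above calls in place of lseek() when --remote is set: lseek() fails with ESPIPE on sockets and pipes, so the only way to "seek forward" is to read and throw away the bytes. The body of skip_remote_bytes() is not part of this excerpt, so the following is only a minimal sketch of that idea under my own assumptions. The name skip_bytes_sketch and the 4096-byte scratch buffer are hypothetical and are not taken from the patch.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, NOT the patch's skip_remote_bytes(): discard 'len'
 * bytes from a stream fd, because lseek() fails with ESPIPE on sockets
 * and pipes. Returns 0 on success, -1 on a read error or early EOF. */
int skip_bytes_sketch(int fd, unsigned long len)
{
        char scratch[4096];     /* assumed chunk size, any size works */

        while (len > 0) {
                size_t want = len < sizeof(scratch) ? len : sizeof(scratch);
                ssize_t n = read(fd, scratch, want);

                if (n < 0) {
                        if (errno == EINTR)
                                continue;       /* interrupted, retry */
                        return -1;
                }
                if (n == 0)
                        return -1;      /* stream ended before 'len' bytes */
                len -= (unsigned long) n;
        }
        return 0;
}
```

Unlike lseek(), this can only move forward, and the skipped bytes are gone for good. That is likely the reason the current-offset query in read_pagemap_page() (lseek(fd, 0, SEEK_CUR)) is stubbed out to 0 in the hunk above rather than emulated.
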
On Fri, 31 Jul 2015 15:35:53 +0300
Pavel Emelyanov <xemul at parallels.com> wrote:

> On 07/31/2015 04:06 AM, Rodrigo Bruno wrote:
> > On Thu, 30 Jul 2015 18:04:20 +0300
> > Pavel Emelyanov <xemul at parallels.com> wrote:
> > 
> >> On 07/30/2015 03:42 AM, Rodrigo Bruno wrote:
> >>> Hi,
> >>>
> >>> I am using CRIU and I extended it to support process live migration using sockets.
> >>
> >> Have you looked at the p.haul stuff we use for the same?
> > 
> > No, I will take a look.
> 
> Yup. It's not yet 100% functional, but demonstrates the intention.
> https://github.com/xemul/p.haul
> 
> >>
> >>> The idea is to write to a file descriptor which corresponds to a socket instead of a file.
> >>
> >> You mean write the image files into a socket, don't you?
> > 
> > Yes.
> 
> OK. Yes, this is cool feature that is sometimes asked about.
> 
> >>
> >>> The amount of code needed for this modification is very small.
> >>>
> >>> So far my experiments are running smoothly. With this, I do not need a background NFS
> >>> deployment and performance is much better. The user only needs to specify, in the command
> >>> line args, that this migration is done using sockets. For now I am using SSH tunnels to
> >>> redirect and cipher the data between different hosts.
> >>>
> >>> I don't know if this will break any other functionality though.
> >>
> >> Well, if it's all about image files, then two things to keep in mind.
> >>
> >> First, the contents of the pages.img files can already be sent to
> >> sockets using page server (http://criu.org/Disk-less_migration).
> > 
> > I only realized that after having my solution working... However, as far as I 
> > understood, this does not give a full live migration because other img files 
> > still need to get transferred.
> 
> :)
> 
> >>
> >> Second, image objects are read from images in different order from the
> >> one they were written to. So right now it's not easily possible to
> >> pipe-line CRIU dump into CRIU restore.
> > 
> > Right. I solved this problem with in-memory file caches (separate process) that hold files' 
> > contents. On the dump side, the cache receives the files' contents and simply redirects it 
> > to the restore side cache.
> 
> Do you hold them in some hand-made cache, or use tmpfs for this?
> 
> > The restore side cache holds all files in memory until they are 
> > requested by the CRIU restore process. This enables multiple files to be sent concurrently,
> > allowing the restore mechanism to start while the dump mechanism is still running.
> 
> That's interesting :) So you also "lock" reading from images when more objects
> are requested, but they have not yet arrived, don't you?
> 
> >>
> >>> I am sending this mail to ask you if this contribution is of any interest for the project.
> >>
> >> Of course!
> >>
> >>> If it is, I will be glad to help, providing a patch or whatever you need.
> >>
> >> Sure! The patch is always welcome.
> > 
> > I will wrap up my modifications and submit a patch. =)
> 
> Looking forward to see them :)
> 
> -- Pavel
> 

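On the last question in the quoted thread (whether reads block when an image has not arrived yet): in the patch, wait_for_image() loops on a semaphore until the image shows up, or until DUMP_FINISH has been received and no PUT is in flight. The sketch below illustrates the same waiting rule in a simplified form. Note that it swaps the semaphore for a mutex and condition variable, keeps only image names in a small fixed array, and that every identifier in it (cache_put, cache_finish, cache_get_blocking, MAX_IMAGES, NAME_LEN) is hypothetical and not part of the patch.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_IMAGES 16   /* assumed limits, for illustration only */
#define NAME_LEN 32

/* Toy registry standing in for the image-cache list. */
static char names[MAX_IMAGES][NAME_LEN];
static int nr_images;
static int dump_finished;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;

/* PUT side: register an image name and wake every waiting GET. */
int cache_put(const char *name)
{
        int ret = -1;

        pthread_mutex_lock(&lock);
        if (nr_images < MAX_IMAGES) {
                strncpy(names[nr_images], name, NAME_LEN - 1);
                names[nr_images][NAME_LEN - 1] = '\0';
                nr_images++;
                ret = 0;
        }
        pthread_cond_broadcast(&changed);
        pthread_mutex_unlock(&lock);
        return ret;
}

/* Dump side: signal that no more images will arrive. */
void cache_finish(void)
{
        pthread_mutex_lock(&lock);
        dump_finished = 1;
        pthread_cond_broadcast(&changed);
        pthread_mutex_unlock(&lock);
}

/* GET side: block until 'name' is present or the dump has finished.
 * Returns the stored name, or NULL if the image will never arrive. */
const char *cache_get_blocking(const char *name)
{
        const char *found = NULL;

        pthread_mutex_lock(&lock);
        for (;;) {
                for (int i = 0; i < nr_images && !found; i++)
                        if (!strcmp(names[i], name))
                                found = names[i];
                if (found || dump_finished)
                        break;
                pthread_cond_wait(&changed, &lock);
        }
        pthread_mutex_unlock(&lock);
        return found;
}
```

A restore-side thread calling cache_get_blocking() sleeps until a PUT thread registers the image or the dump side calls cache_finish(). This is what lets restore start while dump is still sending files.
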

-- 
Rodrigo Bruno <rbruno at gsd.inesc-id.pt>

