[CRIU] [PATCH 1/4] mem: Introduce image-proxy/image-cache & remote option

Katerina Koukiou k.koukiou at googlemail.com
Tue Aug 9 04:24:57 PDT 2016


On Thu, Aug 4, 2016 at 12:52 PM, Mike Rapoport <rppt at linux.vnet.ibm.com> wrote:
> On Wed, Aug 03, 2016 at 09:50:18PM +0000, Katerina Koukiou wrote:
>> On Wed, Aug 3, 2016 at 12:42 PM, Mike Rapoport <rppt at linux.vnet.ibm.com>
>> wrote:
>>
>> > On Tue, Aug 02, 2016 at 04:39:37PM +0000, Katerina Koukiou wrote:
>> > > This patch introduces the --remote option and the image-proxy/image-cache
>> > > processes.
>> > > This lets the user decide whether the checkpoint data is stored on disk
>> > > or sent through a socket to the image-proxy.
>> > > The image-proxy forwards the data to the destination node, where the
>> > > image-cache receives it.
>> > >
>> > > The overall communication is performed as follows:
>> > >
>> > > src_node: CRIU dump -> (sends images using a local socket)      -> image-proxy
>> > >                                                                         |
>> > >                                                                         V
>> > > dst_node: CRIU restore <- (receives images from a local socket) <- image-cache
>> > >
>> > > Running criu with --remote option is like this:
>> > >
>> > > dst_node# criu image-cache --port <port> -o /tmp/image-cache.log --local-cache-path <local_cache_path> ...
>> > > dst_node# criu restore --remote -o /tmp/image-cache.log --local-cache-path <local_cache_path> ...
>> > > src_node# criu image-proxy --port <port> --address <dst_node> -o /tmp/image-proxy.log --local-proxy-path <local_proxy_path> ...
>> > > src_node# criu dump -t <pid> --remote -o /tmp/dump.log --local-proxy-path <local_proxy_path> ...
>> > >
>> > > Signed-off-by: Rodrigo Bruno <rbruno at gsd.inesc-id.pt>
>> > > Signed-off-by: Katerina Koukiou <k.koukiou at gmail.com>
>> > > ---
>> > >  criu/Makefile.crtools        |   4 +
>> > >  criu/cr-dump.c               |  15 +++
>> > >  criu/crtools.c               |  30 ++++-
>> > >  criu/image-desc.c            |   4 +-
>> > >  criu/image.c                 |  26 ++++-
>> > >  criu/img-remote.c            | 272 +++++++++++++++++++++++++++++++++++++++++++
>> > >  criu/include/cr_options.h    |   3 +
>> > >  criu/include/image.h         |   1 +
>> > >  criu/include/protobuf-desc.h |   4 +
>> > >  criu/include/util.h          |   1 +
>> > >  criu/page-xfer.c             |  27 ++++-
>> > >  criu/pagemap.c               |  48 ++++++--
>> > >  criu/protobuf-desc.c         |   1 +
>> > >  criu/util.c                  |  15 +++
>> > >  images/Makefile              |   1 +
>> > >  images/remote-image.proto    |  20 ++++
>> > >  16 files changed, 449 insertions(+), 23 deletions(-)
>> > >  create mode 100644 criu/img-remote.c
>> > >  create mode 100644 images/remote-image.proto
>> > >
>> > > +int read_remote_image_connection(char *snapshot_id, char *path)
>> > > +{
>> > > +     int error;
>> > > +     int sockfd = setup_UNIX_client_socket(get_local_img_path());
>> > > +
>> > > +     if (sockfd < 0) {
>> > > +             pr_perror("Error opening local connection for %s:%s", path, snapshot_id);
>> > > +             return -1;
>> > > +     }
>> > > +
>> > > +     if (write_header(sockfd, snapshot_id, path, O_RDONLY) < 0) {
>> > > +             pr_perror("Error writing header for %s:%s", path, snapshot_id);
>> > > +             return -1;
>> > > +     }
>> > > +
>> > > +     if (read_reply_header(sockfd, &error) < 0) {
>> > > +             pr_perror("Error reading reply header for %s:%s", path, snapshot_id);
>> > > +             return -1;
>> > > +     }
>> > > +     errno = error;
>> >
>> > Can you please explain why you assign the errno value?
>> >
>> Yes. The code is not mine; I just rebased it, but I can tell you what I
>> understand.
>> When an image does not exist, read_remote_image_connection returns -1.
>> In criu/image.c we want to handle that case separately, i.e. when
>> read_remote_image_connection returns -1 and the reason is ENOENT.
>> Because read_reply_header stores the code in the error variable rather
>> than in errno, we do the "errno = error" assignment ourselves.
>
> Maybe it would be better to return -ENOENT here and add a check for
> ret == -ENOENT in criu/image.c ?
>
Ok.
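
A minimal sketch of the convention suggested above, not the actual CRIU
code: the function returns the negated error code instead of -1, and the
caller compares the return value with -ENOENT rather than reading errno.
The names open_remote_image, classify_open_result and enum img_status are
hypothetical and exist only for this illustration.

```c
#include <errno.h>

/* Hypothetical result categories used by the caller below. */
enum img_status { IMG_OPENED, IMG_MISSING, IMG_FAILED };

/*
 * Hypothetical stand-in for read_remote_image_connection().
 * reply_error plays the role of the code filled in by read_reply_header().
 * Returns the socket descriptor on success or a negative errno value on
 * failure, so the caller does not have to look at errno. A real
 * implementation would also close sockfd on the error path.
 */
static int open_remote_image(int reply_error, int sockfd)
{
	if (!reply_error)
		return sockfd;
	return -reply_error;
}

/* Hypothetical caller-side check, as it might look in criu/image.c. */
static enum img_status classify_open_result(int ret)
{
	if (ret >= 0)
		return IMG_OPENED;
	if (ret == -ENOENT)
		return IMG_MISSING;	/* image does not exist: not fatal */
	return IMG_FAILED;		/* any other error */
}
```

With this shape the "errno = error" assignment is no longer needed, because
the reason for the failure travels in the return value.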
>> > > +     if (!error)
>> > > +             return sockfd;
>> > > +     else if (error == ENOENT) {
>> > > +             pr_info("Image does not exist (%s:%s)\n", path, snapshot_id);
>> > > +             close(sockfd);
>> > > +             return -1;
>> > > +     }
>> > > +     pr_perror("Unexpected error returned: %d (%s:%s)\n", error, path, snapshot_id);
>> > > +     close(sockfd);
>> > > +     return -1;
>> > > +}
>> > > +
>> > > @@ -458,7 +483,7 @@ int open_page_read_at(int dfd, int pid, struct page_read *pr, int pr_flags)
>> > >               return -1;
>> > >       }
>> > >
>> > > -     if (init_pagemaps(pr)) {
>> > > +     if (!opts.remote && init_pagemaps(pr)) {
>> >
>> > Is there anything that prevents using init_pagemaps with opts.remote?
>> > Why cannot we just read the entire pagemap from the socket as we read it
>> > from local file?
>> >
>>
>> Initially I did not disable it, but then I got an error. I am not sure
>> what the exact problem is.
>> What I see is that img_raw_size(pr->pmi) inside init_pagemaps returns 0.
>> Tell me if you understand why that happens.
>
> img_raw_size uses the underlying file descriptor to get the image size.
> Obviously, with a socket it won't quite work :)
> The pagemap image size is used to estimate the amount of memory required
> to hold the entire pagemap in memory and to avoid over/under allocations
> as much as possible.
> So, we either need to make the pagemap size available on the restore side
> early enough, or choose some magic number and hope for the best :)
>
If we choose a "magic number", could you please suggest one that would work
for most cases?
Otherwise, could you give me an implementation hint on how to get the exact
pagemap size at that point?
Thanks.
>> >
>> > >               close_page_read(pr);
>> > >               return -1;
>> > >       }
>
> --
> Sincerely yours,
> Mike.
>