[CRIU] checkpointing a docker container and restoring the process to a new container

Ross Boucher rboucher at gmail.com
Fri Apr 24 08:43:42 PDT 2015


Using a checkpointed file system doesn't seem to make a difference.
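
For anyone who wants to rule the filesystem out more rigorously, one option is to hash every regular file in both root filesystems and compare the results. The sketch below is only an illustration, not something taken from this thread: the function names fs_manifest and fs_same are made up, it assumes find and sha256sum are available, and it compares regular-file contents only (not permissions, symlinks or device nodes).

```shell
# Print "<sha256>  <relative path>" for every regular file under directory $1,
# sorted by path so two manifests can be compared as plain text.
# (fs_manifest is a hypothetical helper name.)
fs_manifest() {
    (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
}

# Return 0 if both directory trees hold the same regular files with the same
# contents, 1 if they differ, 2 if either tree could not be read.
# (fs_same is a hypothetical helper name.)
fs_same() {
    a=$(fs_manifest "$1") || return 2
    b=$(fs_manifest "$2") || return 2
    [ "$a" = "$b" ]
}
```

Assuming both containers' filesystems have been unpacked into two directories (for example from docker export tarballs; the paths here are hypothetical), `fs_same /tmp/rootfs-src /tmp/rootfs-dst && echo identical` reports whether any file content differs.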

On Fri, Apr 24, 2015 at 8:26 AM, Ross Boucher <rboucher at gmail.com> wrote:

> The containers are started from the same image and don't write to the
> filesystem (though I suppose something somewhere could be writing without
> my knowledge).
>
> My next step was to use docker commit to checkpoint the filesystem as
> well, and then create the new container based on that image. I'll try that
> and see if it changes anything, even though I don't expect it to.
>
> On Fri, Apr 24, 2015 at 8:23 AM, Pavel Emelyanov <xemul at parallels.com>
> wrote:
>
>> On 04/24/2015 06:12 PM, Ross Boucher wrote:
>> > Yeah, but I think there are other problems as well. I'm trying the same restore process with
>> > a more complex program and seeing odd behavior: the process gets restored, but it seems to be
>> > hung. I have a thread in this program that just prints in a loop every second and it never
>> > prints after being restored (again, this works fine if I restore into the same container).
>>
>> Hm... How do you make sure the filesystem of the container you restore into equals
>> the filesystem of the container you dumped from?
>>
>> The thing is -- if at least one byte in some library changes, criu doesn't notice it
>> (as it doesn't mess with filesystems) and maps them back into processes. They _can_
>> break due to this. E.g. if you have prelink running in the container, it can do very
>> nasty stuff :)
>>
>> -- Pavel
>>
>> > On Fri, Apr 24, 2015 at 6:59 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>> >
>> >     On 04/24/2015 04:47 PM, Ross Boucher wrote:
>> >     > inherit_fd is being used -- this example works fine if I restore to the same container,
>> >     > it's only breaking now that I'm attempting to restore into a completely different container.
>> >
>> >     So the pipe doesn't get inherited when you restore into a different container?
>> >
>> >     > On Fri, Apr 24, 2015 at 4:50 AM, Pavel Emelyanov <xemul at parallels.com> wrote:
>> >     >
>> >     >     On 04/24/2015 12:11 AM, Ross Boucher wrote:
>> >     >     > Another update: I was intrigued by the exit code (which implies SIGPIPE?), since the docker process
>> >     >     > I was running was indeed piping:
>> >     >     >
>> >     >     >     /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 3; done'
>> >     >     >
>> >     >     > I tried the same process of checkpointing in one container and restoring to another by writing to a file instead:
>> >     >     >
>> >     >     >     /bin/sh -c 'i=0; while true; do echo $i > /tmp/foo; i=$(expr $i + 1); sleep 3; done'
>> >     >     >
>> >     >     > And this worked correctly! So I've narrowed it down some more, and I'll continue to look into it.
>> >     >
>> >     >     If these are pipes indeed (docker terminals?) then the --inherit-fd option should be used.
>> >     >     Saied (from Google) did some work doing this for docker+criu, he can shed more light, but
>> >     >     he's on vacation right now :)
>> >     >
>> >     >     -- Pavel
>> >     >
>> >     >
>> >
>> >
>>
>>
>
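
On the exit code guess near the bottom of the thread: a POSIX shell reports a child killed by signal N as exit status 128 + N, and SIGPIPE is signal 13 on Linux, so a status of 141 would point at SIGPIPE. The actual status is not quoted in this message, so the following is only a sketch of how one might decode such a status; the function name signal_from_status is made up for illustration.

```shell
# Map a shell exit status to the name of the signal that ended the process.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
# (signal_from_status is a hypothetical helper name.)
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l with a number prints the matching signal name, without "SIG"
        kill -l "$((status - 128))"
    else
        echo "none"
    fi
}
```

For instance, running `sh -c 'kill -PIPE $$'` and then `signal_from_status $?` should name PIPE, which matches what a writer sees when the read end of its stdout pipe has gone away.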