[CRIU] Alternative to hacky resume detection

Saied Kazemi saied at google.com
Wed May 13 03:27:11 PDT 2015


You're right.  Docker needs to know that a container is being
checkpointed to correctly maintain its internal state.

If self dump turns out to be the best/easiest way to do what you want,
we could patch Docker to handle checkpoint requests from its
containers, but I doubt that upstream will like this approach because
it raises the question: why limit the request to just checkpoint?
Why not also handle other commands like stop, pause, kill, etc.,
which would require a new container-to-Docker API?

Since some of the container's file descriptors change with each
restore (via --inherit-fd), how about checking for such a change to
detect a restore?
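
A minimal sketch of that idea (hypothetical helper names; which fd to
watch depends on your setup, and the assumption is that the inherited
descriptor refers to a different underlying object after restore):

#include <stdbool.h>
#include <sys/stat.h>

static struct stat fd_baseline;

/* Call once at startup to remember the descriptor's identity. */
static int remember_fd_identity(int fd)
{
    return fstat(fd, &fd_baseline);
}

/* Call periodically or after a failed read; true means the descriptor
 * now refers to a different object, which we would take as a restore. */
static bool fd_identity_changed(int fd)
{
    struct stat now;

    if (fstat(fd, &now) < 0)
        return true;   /* descriptor gone entirely: also a strong hint */

    return now.st_dev != fd_baseline.st_dev ||
           now.st_ino != fd_baseline.st_ino;
}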

--Saied


On Tue, May 12, 2015 at 2:48 PM, Ross Boucher <rboucher at gmail.com> wrote:
> The container is only running the one process, but I have pools of identical
> containers, and checkpoint/restore into them unpredictably -- so the
> underlying things like mount points and file descriptors would change, which
> is what I'm using docker to manage.
>
> On Tue, May 12, 2015 at 2:46 PM, Ruslan Kuprieiev <kupruser at gmail.com>
> wrote:
>>
>> Oh, so the whole container is being dumped and not only that one process?
>> Hm, you might be able to just call criu_dump on the whole container
>> from within that process, just as I showed you in the code below (but specify
>> the container pid), and get the same results. The way the "return 1" in
>> criu_dump works is that criu puts a proper response packet into that service
>> socket when restoring a process tree, so everything should work.
>>
>>
>> On 05/13/2015 12:36 AM, Ross Boucher wrote:
>>
>> That's an interesting idea. Though, my process is inside of a docker
>> container, and I think it would get upset by being restored into a different
>> container. I think I need the coordination docker is doing in order for my
>> system to work.
>>
>> On Tue, May 12, 2015 at 2:27 PM, Ruslan Kuprieiev <kupruser at gmail.com>
>> wrote:
>>>
>>> I'm saying that you might want to consider calling criu_dump() from the
>>> process that you are trying to dump. We call it self dump [1]. For example,
>>> using criu_dump() from libcriu, it might look like:
>>>
>>> ...
>>> while (1) {
>>>     ret = criu_dump();
>>>     if (ret < 0) {
>>>         /* error */
>>>     } else if (ret == 0) {
>>>         /* dump succeeded (this is the original, checkpointed process) */
>>>     } else if (ret == 1) {
>>>         /* this process has just been restored: reestablish the connection
>>>          * or do whatever needs to be done in case of a broken connection */
>>>     }
>>>     /* accept a connection and evaluate code */
>>> }
>>> ...
>>>
>>> [1] http://criu.org/Self_dump
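
Not shown in the snippet: before such a loop can run, libcriu has to be
initialized and pointed at the criu service socket and an images
directory. A rough sketch of that setup (placeholder paths, not taken
from this thread) could look like:

#include <fcntl.h>
#include <stdbool.h>
#include <criu/criu.h>

/* Hypothetical helper; adjust the paths to your environment. */
int setup_criu(void)
{
    int dir_fd;

    if (criu_init_opts() < 0)
        return -1;

    /* Where the "criu service" daemon listens. */
    criu_set_service_address("/var/run/criu_service.socket");

    /* Directory that will hold the dump images. */
    dir_fd = open("/tmp/criu-images", O_DIRECTORY);
    if (dir_fd < 0)
        return -1;
    criu_set_images_dir_fd(dir_fd);

    criu_set_log_file("dump.log");

    /* Keep the caller alive after a successful dump so the ret == 0
     * branch in the loop above can continue running. */
    criu_set_leave_running(true);

    return 0;
}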
>>>
>>>
>>>
>>> On 05/12/2015 11:25 PM, Ross Boucher wrote:
>>>
>>> I'm not sure I follow. You're saying, the process that actually calls
>>> restore would get notified? Or, are you saying that somehow in the restored
>>> process I can access something set by criu?
>>>
>>> Assuming the former, I don't think that's necessary -- I already know
>>> that I've just restored the process. I could try to send a signal from the
>>> coordinating process and then use that signal to cancel the read thread,
>>> which would be mostly the same thing. But because that would have to travel
>>> through quite a few layers, it seems like it would be better and more
>>> performant to do it from within the restored process itself.
>>>
>>> Perhaps I am just misunderstanding your suggestion though.
>>>
>>>
>>> On Tue, May 12, 2015 at 12:37 PM, Ruslan Kuprieiev <kupruser at gmail.com>
>>> wrote:
>>>>
>>>> Hi, Ross
>>>>
>>>> When restoring using RPC or libcriu, the response message contains a
>>>> "restored" field set to true, which helps a process detect that it was
>>>> restored. You say that every time you restore, the connection is broken,
>>>> right? So maybe you could utilize the "restored" flag?
>>>>
>>>> Thanks,
>>>> Ruslan
>>>>
>>>> On 05/12/2015 09:59 PM, Ross Boucher wrote:
>>>>
>>>> In order to get checkpoint/restore support working in my application, I've
>>>> resorted to a hack that works but is almost certainly not the best way to do
>>>> things. I'm interested in whether anyone has suggestions for a better way.
>>>> First, let me explain how it works.
>>>>
>>>> The process I'm checkpointing is a node.js process that opens a socket
>>>> and waits for a connection on that socket. Once the connection is
>>>> established, the connecting process sends code for the node.js process to
>>>> evaluate, in a loop. The node process is checkpointed between every message
>>>> containing new code to evaluate.
>>>>
>>>> Now, when we restore, it is always a completely new process sending code
>>>> to the node.js process, so the built-in TCP socket restoration won't work.
>>>> We had lots of difficulty figuring out how to detect that the socket
>>>> connection had been broken. Ultimately, the hack we ended up using was to
>>>> simply loop forever on a separate thread, checking the time and noticing
>>>> whether an unexplained huge gap in time has occurred. The looping thread
>>>> looks like this:
>>>>
>>>>
>>>> void * canceler(void * threadPointer)
>>>> {
>>>>     pthread_t thread = *(pthread_t *)threadPointer;
>>>>
>>>>     time_t start, end;
>>>>     time(&start);
>>>>
>>>>     while (true)
>>>>     {
>>>>         usleep(1000);
>>>>         time(&end);
>>>>         double diff = difftime(end, start);
>>>>
>>>>         if (diff > 1.0) {
>>>>             // more than a second elapsed across a 1ms sleep:
>>>>             // THIS IS ALMOST CERTAINLY A RESTORE
>>>>             break;
>>>>         }
>>>>
>>>>         // no gap seen; slide the reference point forward so only a
>>>>         // sudden jump (not accumulated runtime) trips the check
>>>>         start = end;
>>>>     }
>>>>
>>>>     // cancel the read thread blocked on the now-dead connection
>>>>     pthread_cancel(thread);
>>>>
>>>>     return NULL;
>>>> }
>>>>
>>>>
>>>>
>>>> Elsewhere, in the code that actually does the reading, we spawn this
>>>> thread with a handle to the read thread:
>>>>
>>>> // pass &readThread: canceler dereferences its argument as a pthread_t *
>>>> pthread_create(&cancelThread, NULL, canceler, (void *)&readThread);
>>>>
>>>>
>>>>
>>>> The rest of our code understands how to deal with a broken connection and
>>>> is able to seamlessly reconnect. This is all working well, but it seems like
>>>> there is probably a better way, so I wanted to ask for suggestions. I also
>>>> tried getting things to work with a file-based socket rather than a TCP
>>>> socket, but that proved even more difficult (and was far more complicated in
>>>> our architecture anyway, so I'd prefer not to go back down that path).
>>>>
>>>> - Ross
>>>>
>>>> [1] From my other email thread, this video might help illustrate the
>>>> actual process going on, if my description isn't that clear:
>>>>
>>>> https://www.youtube.com/watch?v=F2L6JLFuFWs&feature=youtu.be
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
>

