<div dir="ltr">Yes, thanks, that was the issue and I pushed a fix which is in the libcontainer criu branch. I have my images restoring into completely new containers reliably now. </div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 12, 2015 at 3:55 AM, Saied Kazemi <span dir="ltr"><<a href="mailto:saied@google.com" target="_blank">saied@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Ross,<br>
<br>
As Pavel mentioned, I was out on vacation and have just started<br>
catching up with a ton of email...<br>
<br>
I assume that you have fixed the issue by now (I haven't looked at<br>
GitHub yet). FWIW, however, the new container exits because its<br>
standard descriptors (pipes) are not properly set up. This is because<br>
--inherit-fd replaces the "old fd" with the new one to be inherited.<br>
Since you are restoring to a brand new container, there is no "old fd",<br>
so --inherit-fd doesn't do anything. The process's standard file<br>
descriptors are therefore not properly set up, hence the SIGPIPE.<br>
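<br>
To illustrate the failure mode described above, here is a minimal Python sketch (not CRIU or libcontainer code) of a process whose stdout is a pipe with no reader. The function name and the child script are made up for the example. The child restores the default SIGPIPE action explicitly because the Python interpreter ignores SIGPIPE at startup.<br>

```python
import os
import signal
import subprocess
import sys

# Hypothetical child program: one write to stdout, like the `echo $i` loop.
CHILD = (
    "import os, signal\n"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # undo Python's SIG_IGN\n"
    "os.write(1, b'0\\n')\n"
)

def run_writer_with_dead_pipe():
    """Run CHILD with stdout attached to a pipe that has no reader.

    Returns the child's wait status as reported by subprocess: a negative
    number means the child was killed by that signal.
    """
    read_end, write_end = os.pipe()
    # Close the only read end *before* the child starts, so its first
    # write is guaranteed to hit a pipe nobody can read from.
    os.close(read_end)
    child = subprocess.Popen([sys.executable, "-c", CHILD], stdout=write_end)
    os.close(write_end)
    return child.wait()

if __name__ == "__main__":
    status = run_writer_with_dead_pipe()
    if status == -signal.SIGPIPE:
        print("child was killed by SIGPIPE")
    else:
        print("child exited with status", status)
```

A shell would report a process killed this way with exit status 128 + the signal number, which is why an exit code can hint at SIGPIPE.<br>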
<br>
Sorry for the rant if you've already resolved the issue.<br>
<br>
--Saied<br>
<div><div class="h5"><br>
<br>
On Mon, Apr 27, 2015 at 3:55 PM, Ross Boucher <<a href="mailto:rboucher@gmail.com">rboucher@gmail.com</a>> wrote:<br>
> Just wanted to follow up here. The issue turned out to be that I was<br>
> providing the wrong pipe id to inherit_fd (resolved here:<br>
> <a href="https://github.com/docker/libcontainer/pull/557" target="_blank">https://github.com/docker/libcontainer/pull/557</a>)<br>
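<br>
For readers wondering what a "pipe id" looks like here: on Linux a pipe is identified by its inode, shown in /proc as pipe:[inode]. The sketch below is only an illustration under assumptions. The helper names are hypothetical, and the fd[N]:pipe:[inode] string reflects my understanding of the format the criu command-line --inherit-fd option expects, so check the CRIU documentation before relying on it. It is not the code from the pull request above.<br>

```python
import os
import stat

def pipe_id(fd):
    """Return the Linux-style identifier ("pipe:[inode]") of the pipe open at fd."""
    info = os.fstat(fd)
    if not stat.S_ISFIFO(info.st_mode):
        raise ValueError(f"fd {fd} is not a pipe")
    return f"pipe:[{info.st_ino}]"

def inherit_fd_arg(new_fd, dumped_pipe_id):
    """Format an --inherit-fd style value (assumed format, see lead-in).

    new_fd is the descriptor open in the restoring process; dumped_pipe_id
    is the identifier the pipe had when the process was checkpointed, not
    the identifier of the freshly created pipe.
    """
    return f"fd[{new_fd}]:{dumped_pipe_id}"

if __name__ == "__main__":
    # Stand-in for the pipe the container had at checkpoint time.
    read_end, write_end = os.pipe()
    recorded = pipe_id(write_end)
    print(inherit_fd_arg(1, recorded))
    os.close(read_end)
    os.close(write_end)
```

The point of splitting the two helpers is that the identifier has to be recorded at dump time, while the descriptor number comes from the restore side.<br>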
><br>
> On Fri, Apr 24, 2015 at 8:43 AM, Ross Boucher <<a href="mailto:rboucher@gmail.com">rboucher@gmail.com</a>> wrote:<br>
>><br>
>> Using a checkpointed file system doesn't seem to make a difference.<br>
>><br>
>> On Fri, Apr 24, 2015 at 8:26 AM, Ross Boucher <<a href="mailto:rboucher@gmail.com">rboucher@gmail.com</a>> wrote:<br>
>>><br>
>>> The containers are started from the same image and don't write to the<br>
>>> filesystem (though I suppose something somewhere could be writing without my<br>
>>> knowledge).<br>
>>><br>
>>> My next step was to use docker commit to checkpoint the filesystem as<br>
>>> well, and then create the new container based on that image. I'll try that<br>
>>> and see if it changes anything, even though I don't expect it to.<br>
>>><br>
>>> On Fri, Apr 24, 2015 at 8:23 AM, Pavel Emelyanov <<a href="mailto:xemul@parallels.com">xemul@parallels.com</a>><br>
>>> wrote:<br>
>>>><br>
>>>> On 04/24/2015 06:12 PM, Ross Boucher wrote:<br>
>>>> > Yeah, but I think there are other problems as well. I'm trying the same<br>
>>>> > restore process with a more complex program and seeing odd behavior: the<br>
>>>> > process gets restored, but it seems to be hung. I have a thread in this<br>
>>>> > program that just prints in a loop every second, and it never prints after<br>
>>>> > being restored (again, this works fine if I restore into the same container).<br>
>>>><br>
>>>> Hm... How do you make sure the filesystem of the container you restore into<br>
>>>> equals the filesystem of the container you dumped from?<br>
>>>><br>
>>>> The thing is -- if even one byte in some library changes, criu doesn't notice<br>
>>>> it (as it doesn't mess with filesystems) and maps the files back into the<br>
>>>> processes. They _can_ break due to this. E.g. if you have prelink running in<br>
>>>> the container, it can do very nasty stuff :)<br>
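<br>
One way to check the filesystem question above is to hash every regular file under both container root directories and list the paths that differ. This is a rough sketch under assumptions: both roots are reachable from the host, the function names are made up, and symlinks, permissions and special files are ignored.<br>

```python
import hashlib
import os

def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            digests[os.path.relpath(path, root)] = digest.hexdigest()
    return digests

def diff_trees(root_a, root_b):
    """Return sorted relative paths that are missing on one side or differ in content."""
    a = tree_digests(root_a)
    b = tree_digests(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))

if __name__ == "__main__":
    import sys
    # Usage: python difftrees.py /path/to/rootfs_a /path/to/rootfs_b
    for changed in diff_trees(sys.argv[1], sys.argv[2]):
        print(changed)
```

An empty result only says that file contents match; it does not prove the two containers are otherwise identical.<br>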
>>>><br>
>>>> -- Pavel<br>
>>>><br>
>>>> > On Fri, Apr 24, 2015 at 6:59 AM, Pavel Emelyanov <<a href="mailto:xemul@parallels.com">xemul@parallels.com</a>> wrote:<br>
>>>> ><br>
>>>> > On 04/24/2015 04:47 PM, Ross Boucher wrote:<br>
>>>> > > inherit_fd is being used -- this example works fine if I restore to the same container,<br>
>>>> > > it's only breaking now that I'm attempting to restore into a completely different container.<br>
>>>> ><br>
>>>> > So the pipe doesn't get inherited when you restore into a different container?<br>
>>>> ><br>
>>>> > > On Fri, Apr 24, 2015 at 4:50 AM, Pavel Emelyanov <<a href="mailto:xemul@parallels.com">xemul@parallels.com</a>> wrote:<br>
>>>> > ><br>
>>>> > > On 04/24/2015 12:11 AM, Ross Boucher wrote:<br>
>>>> > > > Another update: I was intrigued by the exit code (which implies SIGPIPE?), since the docker process<br>
>>>> > > > I was running was indeed piping:<br>
>>>> > > ><br>
>>>> > > > /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 3; done'<br>
>>>> > > ><br>
>>>> > > > I tried the same process of checkpointing in one container and restoring to another by writing to a file instead:<br>
>>>> > > ><br>
>>>> > > > /bin/sh -c 'i=0; while true; do echo $i > /tmp/foo; i=$(expr $i + 1); sleep 3; done'<br>
>>>> > > ><br>
>>>> > > > And this worked correctly! So I've narrowed it down some more, and I'll continue to look into it.<br>
>>>> > ><br>
>>>> > > If these are pipes indeed (docker terminals?) then the --inherit-fd option should be used.<br>
>>>> > > Saied (from Google) did some work doing this for docker+criu, he can shed more light, but<br>
>>>> > > he's on vacation right now :)<br>
>>>> > ><br>
>>>> > > -- Pavel<br>
>>>> > ><br>
>>>> > ><br>
>>>> ><br>
>>>> ><br>
>>>><br>
>>><br>
>><br>
><br>
><br>
</div></div>> _______________________________________________<br>
> CRIU mailing list<br>
> <a href="mailto:CRIU@openvz.org">CRIU@openvz.org</a><br>
> <a href="https://lists.openvz.org/mailman/listinfo/criu" target="_blank">https://lists.openvz.org/mailman/listinfo/criu</a><br>
><br>
</blockquote></div><br></div>