[CRIU] Criu issue : Error parsing proc fdinfo

Andrey Wagin avagin at gmail.com
Sun Jan 26 01:37:11 PST 2014


Hello Smain,

2014-01-22 Smain Kahlouch <smainklh at gmail.com>:
> Hello Andrew,
>
> I don't have a precise use case.
> I just wanted to test the CRIU live migration feature.

You can try to migrate an LXC container. Here are instructions on how to create one:
http://wiki.criu.org/LXC#Create_and_start_a_container

Or you can look at our tests:
https://github.com/xemul/criu/tree/master/test

We have a shell script to execute them:
$ bash test/zdtm.sh ns/static/env00

>
> I thought it was possible to dump any level of the processes tree.

No, it isn't possible yet. For now we are trying to support the most
popular use cases. If you have a real use case, we will try to
support it. This project is too young to support everything. Currently
our main target is live migration of Linux containers. I
don't think that users will have to call criu directly for that; they
will call lxc-checkpoint or vzctl suspend, and these utilities will
call criu with the correct options, plugins, action scripts, ...

CRIU isn't an end-user tool, so you should be ready to overcome some
obstacles. A non-standard situation can be handled with the help of
action scripts and plugins:
http://criu.org/Plugins
http://criu.org/Action_scripts
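
To give a rough idea of what an action script looks like, here is a
minimal sketch. It assumes that criu exports the current stage in the
CRTOOLS_SCRIPT_ACTION environment variable and that the stage names below
match your criu version (please check the wiki page above). The function
name on_action and the echo lines are placeholders I made up.

```shell
#!/bin/sh
# Sketch of a CRIU action script (not taken from the CRIU sources).
# Assumption: criu sets CRTOOLS_SCRIPT_ACTION to the name of the stage
# it is currently in before calling this script.

on_action() {
    case "$1" in
        network-lock)   echo "block container traffic here" ;;
        network-unlock) echo "unblock container traffic here" ;;
        post-dump)      echo "dump finished, images are on disk" ;;
        post-restore)   echo "tasks restored, re-attach external resources here" ;;
        *)              : ;;  # ignore stages this script does not handle
    esac
}

# Outside of criu the variable is unset, so this call does nothing.
on_action "${CRTOOLS_SCRIPT_ACTION:-}"
```

Such a script would be passed to criu with the --action-script option.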

Thank you for your interest in CRIU and good luck with future attempts ;).

> I mean, if I wanted to migrate only the varnish process, I thought it was
> feasible.
>
> Actually, I am very new to namespaces. I tried different levels of the
> following tree:
>
> init(1)-+-collectdmon(12847)---collectd(12849)-+-{collectd}(12854)
>         |                                      |-{collectd}(12855)
>         |                                      |-{collectd}(12856)
>         |                                      |-{collectd}(12857)
>         |                                      |-{collectd}(12858)
>         |                                      `-{collectd}(12860)
>
>         |-docker(5802)-+-lxc-start(15502)---sh(15510)-+-bash(15536)---sshd(16187)
>         |              |                              |-cc-node(15674)---{cc-node}(15677)
>         |              |                              |-collectdmon(15722)---collectd(15723)-+-{collectd}(15725)
>         |              |                              |                                      |-{collectd}(15726)
>         |              |                              |                                      |-{collectd}(15727)
>         |              |                              |                                      |-{collectd}(15728)
>         |              |                              |                                      |-{collectd}(15729)
>         |              |                              |                                      `-{collectd}(15730)
>         |              |                              |-rsyslogd(15546)-+-{rsyslogd}(15547)
>         |              |                              |                 |-{rsyslogd}(15548)
>         |              |                              |                 `-{rsyslogd}(15549)
>         |              |                              |-ruby(15744)---{ruby}(17173)
>         |              |                              `-varnishd(15694)---varnishd(15695)-+-{
>
>
> I got different errors:
>
> criu dump --tree 15510 --images-dir /data --leave-stopped
> (00.105553) Error (cr-dump.c:1472): A session leader of 15510(1) is outside of its pid namespace
> (00.106356) Error (cr-dump.c:1811): Dumping FAILED.
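
This first error is about sessions: criu says that the session leader of
15510 is not inside the pid namespace it is dumping. If you want to see
which session a task belongs to, here is a small sketch that reads the
session field from /proc/PID/stat. The name session_of is made up for this
example, and it assumes a Linux procfs mounted on /proc.

```shell
#!/bin/sh
# Print the session ID of a task, taken from /proc/<pid>/stat.
session_of() {
    stat=$(cat "/proc/$1/stat") || return 1
    # The command name in parentheses may contain spaces, so drop
    # everything up to the last ") " before splitting into fields.
    rest=${stat##*) }
    # Remaining fields: state, ppid, pgrp, session, ...
    set -- $rest
    echo "$4"
}

# Example: show the session of the current shell.
session_of "$$"
```

Run on the host with 15510 as the argument, this prints the session ID
as the host sees it, which you can compare with the PIDs in your tree.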
>
> criu dump --tree 15502 --images-dir /data --leave-stopped
> (00.009229) Error (namespaces.c:155): Can't dump nested pid namespace for 15510
> (00.009272) Error (namespaces.c:321): Can't make pidns id
> (00.009546) Error (cr-dump.c:1811): Dumping FAILED.
>
> criu dump --tree 5802 --images-dir /data --leave-stopped
> (00.012216) Error (namespaces.c:155): Can't dump nested pid namespace for 15510
> (00.012253) Error (namespaces.c:321): Can't make pidns id
> (00.012553) Error (cr-dump.c:1811): Dumping FAILED.
>
> criu dump --tree 12847 --images-dir /data --leave-stopped
> (00.027698) Error (sk-inet.c:139): Connected TCP socket, consider using tcp-established option.
> (00.027742) Error (cr-dump.c:1491): Dump files (pid: 12849) failed with -1
> (00.028319) Error (cr-dump.c:1811): Dumping FAILED.
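
For this last attempt the log itself names the missing piece: the tree
has an established TCP connection, and criu only dumps those when it is
asked to with --tcp-established. A minimal sketch follows; dump_cmd is a
made-up helper that only prints the command line, so running the snippet
does not dump anything.

```shell
#!/bin/sh
# Build (but do not run) a dump command for a tree that holds
# established TCP connections.
#   $1 - PID of the root of the tree to dump
#   $2 - directory for the image files
dump_cmd() {
    printf 'criu dump --tree %s --images-dir %s --leave-stopped --tcp-established\n' "$1" "$2"
}

# The collectdmon tree from the report above.
dump_cmd 12847 /data
```

The same option has to be given again on restore.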
>
> Maybe you could guide me: what kind of tests could I do, please?
>
> Regards,
> Smana
>
>
> 2014/1/22 Andrew Vagin <avagin at parallels.com>
>>
>> Hi Smain,
>>
>> On Wed, Jan 22, 2014 at 10:06:12AM +0100, Smain Kahlouch wrote:
>> > Hello all and thank you for your answer.
>> >
>> > My "unshare" version doesn't support the same options as yours.
>>
>> You can update util-linux from
>> https://git.kernel.org/cgit/utils/util-linux/util-linux.git/
>>
>> BTW: Is your container executed in a separate user namespace?
>>
>> >
>> > unshare --help
>> >
>> > Usage:
>> >  unshare [options] <program> [args...]
>> >
>> > Options:
>> >  -h, --help        usage information (this)
>> >  -m, --mount       unshare mounts namespace
>> >  -u, --uts         unshare UTS namespace (hostname etc)
>> >  -i, --ipc         unshare System V IPC namespace
>> >  -n, --net         unshare network namespace
>> >
>> > Anyway, I'll wait for the next criu version, which will hopefully support
>> > namespaces :)
>>
>> I thought about this for a while and realized that I need more
>> information about your use case.
>>
>> CRIU supports pidns, but only if it's dumped with all tasks.
>>
>> e.g: "criu dump --tree 4601" will dump pidns and all processes of your
>> LXC container.
>>
>> You are trying to dump a subtree of processes from the CT (container), and
>> you are doing this from the host system. In this case the CT's pidns is an
>> external resource.
>>
>> http://criu.org/What_cannot_be_checkpointed#External_resources
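
To check whether a task lives in a different pid namespace than the shell
you run criu from, you can compare the /proc/PID/ns/pid links. This is
only a sketch: pidns_of and same_pidns are names made up for the example,
it needs a kernel that provides /proc/PID/ns/pid, and reading the link of
another user's task usually requires root.

```shell
#!/bin/sh
# Print the pid namespace identifier of a task, e.g. "pid:[4026531836]".
pidns_of() {
    readlink "/proc/$1/ns/pid"
}

# Succeed only when both tasks can be inspected and share a pid namespace.
same_pidns() {
    a=$(pidns_of "$1") || return 1
    b=$(pidns_of "$2") || return 1
    [ "$a" = "$b" ]
}

# Example: compare this shell with itself. Replace the second argument
# with the PID you want to dump.
if same_pidns "$$" "$$"; then
    echo "same pid namespace"
else
    echo "different pid namespace (external for criu)"
fi
```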
>>
>> Each external resource must be handled separately. Before adding
>> support for one more type of external resource to CRIU, we want to be
>> sure that we have a real use case for it.
>>
>> So could you describe your use case in detail?
>>
>> Thanks.
>>
>> >
>> > Regards,
>> >
>> >
>> > 2014/1/22 Andrew Vagin <avagin at parallels.com>
>> >
>> >     On Tue, Jan 21, 2014 at 02:40:16PM +0100, Smain Kahlouch wrote:
>> >     > Hello guys,
>> >     >
>> >     > I don't know if it's the right place to post this issue but I didn't find
>> >     > another way to contact you.
>> >     >
>> >     > I'm currently testing lxc features ("docker" to be precise) and I'm facing the
>> >     > following message when I try to use criu:
>> >     >
>> >     >
>> >     > 1 - check if CRIU is working with my kernel :
>> >     >
>> >     > criu check --ms
>> >     > (00.012246) Warn  (tun.c:55): Skipping tun support check
>> >     > Looks good.
>> >     >
>> >     > 2 - Identify what i want to dump, for example varnish:
>> >     >
>> >     > init,1
>> >     >   ├─acpid,3886
>> >     >   ├─atd,3648
>> >     >   ├─auditd,3591
>> >     >   │   ├─audispd,3593
>> >     >   │   │   └─{audispd},3598
>> >     >   │   └─{auditd},3592
>> >     >   ├─cron,3781
>> >     >   ├─dbus-daemon,3883 --system
>> >     >   ├─docker,3732 -d -p /var/run/docker.pid -r=false -s devicemapper
>> >     >   │   ├─lxc-start,4592 -n 9f81f7bc33c83fc1369b9355c959b6f0d8c87c6758bb75af3ae726ba2bad053a -f...
>> >     >   │   │   └─sh,4601 -c /bin/bash -c '/usr/local/sbin/runservices.sh; /usr/sbin/sshd -D'
>> >     >   │   │       ├─bash,4685 -c /usr/local/sbin/runservices.sh; /usr/sbin/sshd -D
>> >     >   │   │       │   └─sshd,5022 -D
>> >     >   │   │       ├─cc-node,4774 /usr/bin/cc-node -d -p /var/run/cc-node.pid
>> >     >   │   │       │   └─{cc-node},4776
>> >     >   │   │       ├─collectdmon,4820 -P /var/run/collectdmon.pid -- -C /etc/collectd/collectd.conf
>> >     >   │   │       │   └─collectd,4822 -C /etc/collectd/collectd.conf -f
>> >     >   │   │       │       ├─{collectd},4823
>> >     >   │   │       │       ├─{collectd},4824
>> >     >   │   │       │       ├─{collectd},4825
>> >     >   │   │       │       ├─{collectd},4826
>> >     >   │   │       │       ├─{collectd},4827
>> >     >   │   │       │       └─{collectd},4828
>> >     >   │   │       ├─rsyslogd,4729 -c5
>> >     >   │   │       │   ├─{rsyslogd},4737
>> >     >   │   │       │   ├─{rsyslogd},4738
>> >     >   │   │       │   └─{rsyslogd},4739
>> >     >   │   │       ├─ruby,4837 /usr/bin/collectd-interface-daemon -p 5000 -l /var/log -P /var/run -I ...
>> >     >   │   │       │   └─{ruby},7084
>> >     >   │   │       └─varnishd,4792 -P /var/run/varnishd.pid -a :8000 -T :6082 -f /etc/varnish/default.vcl -p ...
>> >     >   │   │           └─varnishd,4794 -P /var/run/varnishd.pid -a :8000 -T :6082 -f /etc/varnish/default.vcl -p ...
>> >     >   │   │               ├─{varnishd},4795
>> >     >   │   │               ├─{varnishd},4796
>> >     >   │   │               ├─{varnishd},4797
>> >     >   │   │               ├─{varnishd},4798
>> >     >   │   │               ├─{varnishd},4800
>> >     >   │   │               ├─{varnishd},4801
>> >     >   │   │               ├─{varnishd},4802
>> >     >   │   │               ├─{varnishd},4803
>> >     >   │   │               ├─{varnishd},4804
>> >     >   │   │               ├─{varnishd},4805
>> >     >   │   │               ├─{varnishd},4806
>> >     >   │   │               ├─{varnishd},4807
>> >     >   │   │               ├─{varnishd},4808
>> >     >   │   │               ├─{varnishd},4809
>> >     >   │   │               ├─{varnishd},4810
>> >     >   │   │               ├─{varnishd},4811
>> >     >   │   │               ├─{varnishd},4812
>> >     >   │   │               └─{varnishd},4813
>> >     >
>> >     > 3 - try to dump
>> >     >
>> >     > criu dump --tree 4792 --images-dir /data/ --leave-stopped
>> >     > pie: Error (pie/parasite.c:243): mount failed (-1)
>> >     > pie: Error (pie/parasite.c:474): Close the control socket for writing
>> >     > >
>> >     > (00.095409) Error (parasite-syscall.c:787): Can't retrieve FD from socket
>> >     > (00.095454) Error (parasite-syscall.c:297): Message reply from daemon is trimmed (12/0)
>> >     > (00.095468) Error (cr-dump.c:1441): Can't get proc fd (pid: 4792)
>> >     > (00.096172) Error (cr-dump.c:1811): Dumping FAILED.
>> >     >
>> >     >
>> >     > I built a custom kernel from 3.12.6 debian sources.
>> >     >
>> >     > I followed the instructions on your website.
>> >     >
>> >     > Could you please help me to fix that?
>> >
>> >     You are trying to dump a task from another pidns. Unfortunately, that is
>> >     not supported yet. I'm going to fix that. I think it will not require too
>> >     much time. Thank you for the report.
>> >
>> >     For now you can work around this issue by entering this pidns and
>> >     mounting procfs on /proc:
>> >
>> >     unshare -m nsenter -p -t PID criu.sh
>> >
>> >     # cat criu.sh
>> >     set -e
>> >     mount --make-rprivate /
>> >     mount -t proc proc /proc
>> >     criu dump --tree PID --images-dir /data/ --leave-stopped
>> >
>> >     >
>> >     > Thanks,
>> >     > Smana
>> >
>> >     > _______________________________________________
>> >     > CRIU mailing list
>> >     > CRIU at openvz.org
>> >     > https://lists.openvz.org/mailman/listinfo/criu
>> >
>> >
>> >


