[CRIU] lxc - cgroup related restore error

Wed Jul 13 10:06:39 PDT 2016

On Wed, Jul 13, 2016 at 05:56:29PM +0200, Adrian Reber wrote:
> On Wed, Jul 13, 2016 at 09:30:02AM -0600, Tycho Andersen wrote:
> > On Wed, Jul 13, 2016 at 05:17:24PM +0200, Adrian Reber wrote:
> > > On Wed, Jul 13, 2016 at 08:27:42AM -0600, Tycho Andersen wrote:
> > > > On Wed, Jul 13, 2016 at 12:49:07PM +0200, Adrian Reber wrote:
> > > > > On Wed, Jul 13, 2016 at 01:41:34PM +0300, Cyrill Gorcunov wrote:
> > > > > > On Wed, Jul 13, 2016 at 12:29:01PM +0200, Adrian Reber wrote:
> > > > > > > 
> > > > > > > If I am trying to migrate a process while a LXC container is running on
> > > > > > > the source system the migration fails during restore on the destination
> > > > > > > system with:
> > > > > > > 
> > > > > > > Error (cgroup.c:1193): cg: Failed writing 0-3 to cpuset//lxc/c7/cpuset.cpus: Numerical result out of range
> > > > > > > Error (cgroup.c:1470): cg: Restoring special cpuset props failed!
> > > > > > > 
> > > > > > > This happens with CRIU 2.3 and latest GIT.
> > > > > > > 
> > > > > > > If I am running a LXC container on the destination system I still get
> > > > > > > this error. If I am stopping the LXC container on the source system the
> > > > > > > error disappears. This is again on a RHEL7 system with a 3.10.something
> > > > > > > kernel.
> > > > > > 
> > > > > > > Looks like you're migrating to a machine with fewer CPUs?
> > > > > 
> > > > > Yes, that's true. Haven't checked that before. I am using two virtual
> > > > > machines and it seems like I have forgotten that I changed the specs.
> > > > > 
> > > > > But as the migration works when LXC is stopped it would be nice to have
> > > > > it working with LXC running. Migrating the container from one system to
> > > > > another also works without errors. Only migrating a process unrelated to
> > > > > the LXC container does not work.
> > > > 
> > > > Sorry, I'm not sure I understand this paragraph. What does it mean to
> > > > migrate when LXC is stopped?
> > > 
> > > I meant, I cannot migrate a process when a LXC container is running as I
> > > get the cgroup error from above. When no LXC container is running the
> > > cgroup error does not happen. More understandable now?
> > 
> > Hmm. So is the LXC container contained in the process's subtree? What
> > cpuset cgroup is it in (cat /proc/pid/cgroup for the task you're
> > trying to migrate)?
> 
> My test process is called 'minimal'. It malloc()s a page and reads from
> that page in a loop with sleeps in-between. That is the cgroup
> information of that:

Hmm. I think what's probably happening is that our "empty" cgroup
preserving code is being hit here: we preserve cgroups that we consider
"empty" (i.e. none of the tasks criu is checkpointing are in them, but
they are children of the task's current cgroup).

Since your task is in cpuset /, criu is trying to preserve *all* of
the cgroups in cpuset, which is why you're getting the write errors.
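
To make that reasoning concrete, here is a rough sketch. It is not
criu's code, only an illustration of the behaviour described above. It
assumes cgroup v1 style /proc/<pid>/cgroup lines; the helper names
cpuset_path and descendants, and the list of cpuset directories, are
made up for the example.

```python
def cpuset_path(proc_cgroup_text):
    """Return the task's path in the cpuset hierarchy, or None.

    Expects the contents of /proc/<pid>/cgroup in cgroup v1 format:
    "id:controller[,controller...]:path", one hierarchy per line.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and "cpuset" in parts[1].split(","):
            return parts[2]
    return None


def descendants(task_path, all_paths):
    """Return the cgroups in all_paths that lie strictly below task_path."""
    prefix = task_path.rstrip("/") + "/"
    return [p for p in all_paths if p != task_path and p.startswith(prefix)]


if __name__ == "__main__":
    text = "6:pids:/\n5:cpuset:/\n4:blkio:/user.slice\n"
    # Hypothetical set of directories in the cpuset hierarchy.
    hierarchy = ["/", "/lxc", "/lxc/c7", "/foo"]
    print(descendants(cpuset_path(text), hierarchy))
    print(descendants("/foo", hierarchy))
```

With the task in cpuset /, every other cpuset cgroup (including
/lxc/c7) counts as a child; with the task in /foo, nothing does.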

I'm not sure what the right way to fix this is (I don't know that
there really is a fix, other than having your tasks use cgroup
namespaces). You can probably work around it by putting the task
you're checkpointing in a /foo cgroup in the cpuset controller.
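
A minimal sketch of that workaround, untested against your setup. It
assumes a cgroup v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset
and root privileges; the function name move_to_cpuset and its
parameters are hypothetical, not part of criu or LXC.

```python
import os


def move_to_cpuset(pid, name="foo", root="/sys/fs/cgroup/cpuset"):
    """Create cpuset cgroup <root>/<name> and move process <pid> into it.

    Assumes `root` is a mounted cgroup v1 cpuset hierarchy and that the
    caller is allowed to write there (normally root).
    """
    child = os.path.join(root, name)
    os.makedirs(child, exist_ok=True)

    # A new v1 cpuset normally starts with empty cpus/mems and rejects
    # tasks until both are set, so copy the parent's values.
    for knob in ("cpuset.cpus", "cpuset.mems"):
        with open(os.path.join(root, knob)) as src:
            value = src.read().strip()
        with open(os.path.join(child, knob), "w") as dst:
            dst.write(value)

    # Writing a PID to cgroup.procs moves the whole process.
    with open(os.path.join(child, "cgroup.procs"), "w") as procs:
        procs.write(str(pid))
    return child
```

After something like move_to_cpuset(15950), the cpuset line in
/proc/15950/cgroup should read /foo instead of /.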

Tycho

> # cat /proc/15950/cgroup 
> 11:memory:/user.slice
> 10:hugetlb:/
> 9:devices:/user.slice
> 8:freezer:/
> 7:cpuacct,cpu:/user.slice
> 6:pids:/
> 5:cpuset:/
> 4:blkio:/user.slice
> 3:net_prio,net_cls:/
> 2:perf_event:/
> 1:name=systemd:/user.slice/user-0.slice/session-2.scope
> 
> This is the process tree of my container, which is unrelated to the
> process above:
> 
> 19440 pts/0    S      0:00 [lxc monitor] /var/lib/lxc c7
> 19445 ?        Ss     0:00  \_ /sbin/init
> 19476 ?        Ss     0:00      \_ /sbin/dhclient -H c7 -1 -q -lf /var/lib/dhclient/dhclient--eth0.lease -pf /var/run/dhclient-eth0.pid eth0
> 19477 ?        S      0:10      \_ /usr/bin/postgres -D /var/lib/pgsql/data -p 5432
> 19502 ?        Ss     0:01      |   \_ postgres: stats collector process   
> 19503 ?        Ss     0:00      |   \_ postgres: autovacuum launcher process   
> 19504 ?        Ss     0:00      |   \_ postgres: wal writer process   
> 19505 ?        Ss     0:00      |   \_ postgres: writer process   
> 19506 ?        Ss     0:00      |   \_ postgres: checkpointer process   
> 19507 ?        Ss     0:00      |   \_ postgres: logger process   
> 19478 ?        Ss     0:00      \_ /usr/sbin/rsyslogd -n
> 19479 ?        Ss     0:00      \_ /usr/sbin/sshd -D
> 19480 ?        Ss     0:00      \_ /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
> 19481 ?        Ss     0:00      \_ /usr/lib/systemd/systemd-logind
> 19482 ?        Ssl    0:24      \_ /usr/lib/jvm/jre/bin/java -Djava.security.egd=file:/dev/./urandom -classpath /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar:/usr/share/java/commons-daemon.jar 
> 19483 ?        Ss     0:00      \_ /usr/lib/systemd/systemd-journald
> 
> The process 'minimal' and the container 'c7' should be completely
> unrelated.
> 
> 		Adrian