[CRIU] Failing second checkpoint with iptables-restore v1.8.0 (nf_tables)
Andrei Vagin
avagin at virtuozzo.com
Fri Sep 7 21:59:05 MSK 2018
On Fri, Sep 07, 2018 at 09:16:41AM +0200, Adrian Reber wrote:
> On Thu, Sep 06, 2018 at 06:04:13PM +0200, Adrian Reber wrote:
> > On Thu, Sep 06, 2018 at 10:38:01AM +0200, Adrian Reber wrote:
> > > On Wed, Sep 05, 2018 at 04:14:04PM +0200, Adrian Reber wrote:
> > > > On Wed, Sep 05, 2018 at 11:14:14AM +0200, Adrian Reber wrote:
> > > > > I got a report of a checkpoint failure that happens when dumping an
> > > > > already restored runc container.
> > > > >
> > > > > On the second 'runc checkpoint' the log says:
> > > > >
> > > > > ip6tables-restore v1.8.0 (nf_tables):
> > > > > line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
> > > > > line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
> > > > > (00.132118) Error (criu/util.c:811): exited, status=4
> > > > > (00.132144) Unfreezing tasks into 1
> > > > > (00.132154) Unseizing 15729 into 1
> > > > > (00.132175) Error (criu/cr-dump.c:1720): Dumping FAILED.
> > > > >
> > > > > This is criu 3.10 on a 4.18 kernel. This is the first time I am seeing a
> > > > > system with 'iptables-restore v1.8.0 (nf_tables)'. Not sure if that is
> > > > > related.
> > > > >
> > > > > I cannot reproduce it with runc currently. So right now I just wanted to
> > > > > ask whether anybody has already seen this.
> > > >
> > > > I was able to reproduce this error and it is related to iptables 1.8.0
> > > > which has a multi call binary:
> > > >
> > > > /usr/sbin/iptables-restore -> xtables-nft-multi
> > > >
> > > > and the command line options of this iptables-restore program differ
> > > > from (and offer fewer features than) those of iptables-restore 1.6.*.
> > > >
> > > > I started to talk with one of the iptables maintainers.
> > >
> > > Upstream provided a fix:
> > >
> > > https://marc.info/?l=netfilter-devel&m=153616769327109&w=2
> > >
> > > But it still fails. I tried to simulate what CRIU does and I think it is
> > > something like this:
> > >
> > > echo -e "*filter\n:CRIU - [0:0]\n-I INPUT -j CRIU\n-A CRIU -m mark --mark 42 -j ACCEPT\nCOMMIT\n" | iptables-restore -w 1 --noflush
> > >
> > > With iptables 1.4.x and 1.6.x I can run the command multiple times and
> > > it just works. It keeps on adding the same rules to iptables.
> > >
> > > With iptables 1.8.0 this fails when running the second time with:
> > >
> > > iptables-restore v1.8.0 (nf_tables):
> > > line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
> > > line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
> > >
> > > But CRIU seems to add the rules only once, so it is unclear why I get
> > > the same error as when adding them twice.
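
The reproduction described above could be wrapped in a small script. This is
only a sketch: the function names criu_ruleset and repro are made up for
illustration, the ruleset is the one from the one-liner (minus its trailing
blank line), and the flags are the ones quoted in the thread. It needs root
and changes the live filter table, so it does nothing unless called with
--run.

```shell
#!/bin/sh
# Hypothetical helper: print the ruleset used in the one-liner above.
criu_ruleset() {
    printf '%s\n' \
        '*filter' \
        ':CRIU - [0:0]' \
        '-I INPUT -j CRIU' \
        '-A CRIU -m mark --mark 42 -j ACCEPT' \
        'COMMIT'
}

# Hypothetical helper: load the same ruleset twice and report the exit
# status of iptables-restore for each run.
# WARNING: needs root and modifies the live filter table.
repro() {
    for run in 1 2; do
        criu_ruleset | iptables-restore -w 1 --noflush
        echo "run $run: exit status $?"
    done
}

# Only touch iptables when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
    repro
fi
```

Going by the report above, both runs should succeed with 1.4.x and 1.6.x,
while the second run should fail with the unpatched 1.8.0 (nf_tables) build.
A throwaway VM or network namespace is the safer place to try it.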
> > >
> > > Is this related to some iptables locking? Maybe someone who still
> > > remembers the code can help out.
> >
> > So I really love answering my own emails ;)
> >
> > Another upstream iptables patch which brings us closer to a working
> > iptables-restore again:
> >
> > https://marc.info/?l=netfilter-devel&m=153624401918001&w=2
> >
> > Still not fixed completely as I still get errors from iptables, but we
> > are getting closer.
>
> There is a V2 of the patch above:
>
> https://marc.info/?l=netfilter-devel&m=153625521421967&w=2
>
> And now 'runc checkpoint' works again.
Thank you for working on this issue!
>
> Adrian
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu