[CRIU] Failing second checkpoint with iptables-restore v1.8.0 (nf_tables)
Adrian Reber
adrian at lisas.de
Thu Sep 6 11:38:01 MSK 2018
On Wed, Sep 05, 2018 at 04:14:04PM +0200, Adrian Reber wrote:
> On Wed, Sep 05, 2018 at 11:14:14AM +0200, Adrian Reber wrote:
> > I got a report of a checkpoint failure that happens when dumping an
> > already restored runc container.
> >
> > On the second 'runc checkpoint' the log says:
> >
> > ip6tables-restore v1.8.0 (nf_tables):
> > line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
> > line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
> > (00.132118) Error (criu/util.c:811): exited, status=4
> > (00.132144) Unfreezing tasks into 1
> > (00.132154) Unseizing 15729 into 1
> > (00.132175) Error (criu/cr-dump.c:1720): Dumping FAILED.
> >
> > This is criu 3.10 on a 4.18 kernel. This is the first time I am seeing a
> > system with 'iptables-restore v1.8.0 (nf_tables)'. Not sure if that is
> > related.
> >
> > I cannot reproduce it with runc currently. So right now I just wanted to
> > ask whether this is something anybody has already seen.
>
> I was able to reproduce this error and it is related to iptables 1.8.0
> which has a multi call binary:
>
> /usr/sbin/iptables-restore -> xtables-nft-multi
>
> and the command line options of this iptables-restore program are
> different (and offer fewer features) than the iptables-restore from 1.6.*.
>
> I started to talk with one of the iptables maintainers.
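To tell quickly which backend is in use, the "(nf_tables)" marker from the error banner can be matched directly. This is only a minimal sketch: the helper name iptables_backend is made up for illustration, and it assumes the 1.8.x convention of appending "(nf_tables)" or "(legacy)" to the version string, while older releases print no marker at all.

```shell
#!/bin/sh
# Hypothetical helper: classify a version banner by its backend marker.
# Assumes iptables 1.8.x style banners; 1.4.x/1.6.x print no marker,
# so they end up as "unknown" here.
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nft ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;
    esac
}
```

It can be fed the output of the installed binary, for example: iptables_backend "$(iptables --version)"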
Upstream provided a fix:
https://marc.info/?l=netfilter-devel&m=153616769327109&w=2
But it still fails with the fix applied. I tried to simulate what CRIU
does, and I think it is something like this:

echo -e "*filter\n:CRIU - [0:0]\n-I INPUT -j CRIU\n-A CRIU -m mark --mark 42 -j ACCEPT\nCOMMIT\n" | iptables-restore -w 1 --noflush
With iptables 1.4.x and 1.6.x I can run the command multiple times and
it just works. It keeps on adding the same rules to iptables.
With iptables 1.8.0 the second run fails with:

iptables-restore v1.8.0 (nf_tables):
line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
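The repeated-load test can be wrapped in a small script so that the per-run result is visible. This is a sketch under assumptions, not what CRIU itself runs: the function names criu_ruleset and load_n_times are invented here, and the restore command is passed in as arguments so the loop can also be tried with a harmless stand-in.

```shell
#!/bin/sh
# Print the rule set from the one-liner above, one entry per line.
criu_ruleset() {
    printf '%s\n' \
        '*filter' \
        ':CRIU - [0:0]' \
        '-I INPUT -j CRIU' \
        '-A CRIU -m mark --mark 42 -j ACCEPT' \
        'COMMIT'
}

# Hypothetical helper: pipe the rule set into the given command N times
# and report the exit status of every run.
# Usage: load_n_times N command [args...]
load_n_times() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        if criu_ruleset | "$@"; then
            echo "run $i: ok"
        else
            # $? still holds the status of the failed pipeline here
            echo "run $i: failed (status $?)"
        fi
        i=$((i + 1))
    done
}
```

As root, "load_n_times 2 iptables-restore -w 1 --noflush" repeats the experiment described above. Note that each successful run leaves the CRIU chain and the jump rule in place.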
But CRIU does not seem to add the rules twice, so it is unclear to me
why it fails with the same error that I get when loading the rules a
second time.
Is this related to some iptables locking? Maybe someone who still
remembers the code can help out.
Adrian