[CRIU] Failing second checkpoint with iptables-restore v1.8.0 (nf_tables)

Adrian Reber adrian at lisas.de
Fri Sep 7 10:23:26 MSK 2018


On Thu, Sep 06, 2018 at 09:40:39PM +0100, Radostin Stoyanov wrote:
> On 06/09/18 17:04, Adrian Reber wrote:
> > On Thu, Sep 06, 2018 at 10:38:01AM +0200, Adrian Reber wrote:
> >> On Wed, Sep 05, 2018 at 04:14:04PM +0200, Adrian Reber wrote:
> >>> On Wed, Sep 05, 2018 at 11:14:14AM +0200, Adrian Reber wrote:
> >>>> I got a report of a checkpoint failure that happens when dumping an
> >>>> already restored runc container.
> >>>>
> >>>> On the second 'runc checkpoint' the log says:
> >>>>
> >>>> ip6tables-restore v1.8.0 (nf_tables): 
> >>>> line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
> >>>> line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
> >>>> (00.132118) Error (criu/util.c:811): exited, status=4
> >>>> (00.132144) Unfreezing tasks into 1
> >>>> (00.132154) 	Unseizing 15729 into 1
> >>>> (00.132175) Error (criu/cr-dump.c:1720): Dumping FAILED.
> >>>>
> >>>> This is criu 3.10 on a 4.18 kernel. This is the first time I am seeing a
> >>>> system with 'iptables-restore v1.8.0 (nf_tables)'. Not sure if that is
> >>>> related.
> >>>>
> >>>> I cannot reproduce it with runc currently. So right now I just wanted to
> >>>> ask whether anybody has already seen this.
> >>> I was able to reproduce this error and it is related to iptables 1.8.0
> >>> which has a multi call binary:
> >>>
> >>> /usr/sbin/iptables-restore -> xtables-nft-multi
> >>>
> >>> and the command line options of this iptables-restore program are
> >>> different (and offer fewer features) than the iptables-restore from 1.6.*.
> >>>
> >>> I started to talk with one of the iptables maintainers.
> >> Upstream provided a fix:
> >>
> >> https://marc.info/?l=netfilter-devel&m=153616769327109&w=2
> >>
> >> But it still fails. I tried to simulate what CRIU does and I think it is
> >> something like this:
> >>
> >> echo -e "*filter\n:CRIU - [0:0]\n-I INPUT -j CRIU\n-A CRIU -m mark --mark 42 -j ACCEPT\nCOMMIT\n" | iptables-restore -w 1 --noflush
> >>
> >> With iptables 1.4.x and 1.6.x I can run the command multiple times and
> >> it just works. It keeps on adding the same rules to iptables.
> >>
> >> With iptables 1.8.0 this fails when running the second time with:
> >>
> >> iptables-restore v1.8.0 (nf_tables): 
> >> line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
> >> line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
> >>
> >> But CRIU seems to add the rules only once, so it is unclear why I get
> >> the same error as when adding the rules twice.
> >>
> >> Is this related to some iptables locking? Maybe someone who still
> >> remembers the code can help out.
> > So I really love answering my own emails ;)
> >
> > Another upstream iptables patch which brings us closer to a working
> > iptables-restore again:
> >
> > https://marc.info/?l=netfilter-devel&m=153624401918001&w=2
> >
> > Still not fixed completely as I still get errors from iptables, but we
> > are getting closer.
> 
> From this announcement at https://lwn.net/Articles/759184/
> 
>     We currently recommend that distributions install the 'legacy' versions
>     by default for stable/production releases.
> 
>     For experimental releases we recommend that distributors make the
>     nf_tables commands available as an alternative so that the iptables,
>     ip6tables, iptables-restore, etc.  commands are created as symbolic
>     links to xtables-nft-multi.
> 
>     Advantages of the 'nf_tables' variant:
>      - No need to use the --wait option to iptables to avoid
>        concurrency issues (--wait is a no-op in the nf_tables versions)
> 
> 
> Does this issue exist with iptables v1.8.0 (legacy)?

I have not tested it, but it would not really help as I have to use what
the distribution provides. But it is fixed upstream now, which is good
for CRIU once the new iptables is more widely used.

		Adrian

