[CRIU] Failing second checkpoint with iptables-restore v1.8.0 (nf_tables)

Radostin Stoyanov rstoyanov1 at gmail.com
Thu Sep 6 23:40:39 MSK 2018


On 06/09/18 17:04, Adrian Reber wrote:
> On Thu, Sep 06, 2018 at 10:38:01AM +0200, Adrian Reber wrote:
>> On Wed, Sep 05, 2018 at 04:14:04PM +0200, Adrian Reber wrote:
>>> On Wed, Sep 05, 2018 at 11:14:14AM +0200, Adrian Reber wrote:
>>>> I got a report of a checkpoint failure that is happening when dumping an
>>>> already restored runc container.
>>>>
>>>> On the second 'runc checkpoint' the log says:
>>>>
>>>> ip6tables-restore v1.8.0 (nf_tables): 
>>>> line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
>>>> line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
>>>> (00.132118) Error (criu/util.c:811): exited, status=4
>>>> (00.132144) Unfreezing tasks into 1
>>>> (00.132154) 	Unseizing 15729 into 1
>>>> (00.132175) Error (criu/cr-dump.c:1720): Dumping FAILED.
>>>>
>>>> This is criu 3.10 on a 4.18 kernel. This is the first time I am seeing a
>>>> system with 'iptables-restore v1.8.0 (nf_tables)'. Not sure if that is
>>>> related.
>>>>
>>>> I cannot reproduce it with runc currently. So right now I just wanted to
>>>> reach out and ask whether anybody has already seen this.
>>> I was able to reproduce this error and it is related to iptables 1.8.0
>>> which has a multi-call binary:
>>>
>>> /usr/sbin/iptables-restore -> xtables-nft-multi
>>>
>>> and the command-line options of this iptables-restore program are
>>> different (and offer fewer features) than those of the iptables-restore
>>> from 1.6.*.
>>>
>>> I started to talk with one of the iptables maintainers.
>> Upstream provided a fix:
>>
>> https://marc.info/?l=netfilter-devel&m=153616769327109&w=2
>>
>> But it still fails. I tried to simulate what CRIU does and I think it is
>> something like this:
>>
>> echo -e "*filter\n:CRIU - [0:0]\n-I INPUT -j CRIU\n-A CRIU -m mark --mark 42 -j ACCEPT\nCOMMIT\n" | iptables-restore -w 1 --noflush
>>
>> With iptables 1.4.x and 1.6.x I can run the command multiple times and
>> it just works. It keeps on adding the same rules to iptables.
>>
>> With iptables 1.8.0 this fails when running the second time with:
>>
>> iptables-restore v1.8.0 (nf_tables): 
>> line 2: CHAIN_USER_FLUSH failed (Device or resource busy): chain CRIU
>> line 2: CHAIN_USER_ADD failed (File exists): chain CRIU
>>
>> But CRIU does not seem to add the rules twice, so it is unclear why I
>> get the same error as when the rules are added twice.
>>
>> Is this related to some iptables locking? Maybe someone who still
>> remembers the code can help out.
> So I really love answering my own emails ;)
>
> Another upstream iptables patch which brings us closer to a working
> iptables-restore again:
>
> https://marc.info/?l=netfilter-devel&m=153624401918001&w=2
>
> Still not fixed completely as I still get errors from iptables, but we
> are getting closer.
Hi Adrian,

From this announcement at https://lwn.net/Articles/759184/:

    We currently recommend that distributions install the 'legacy' versions
    by default for stable/production releases.

    For experimental releases we recommend that distributors make the
    nf_tables commands available as an alternative so that the iptables,
    ip6tables, iptables-restore, etc.  commands are created as symbolic
    links to xtables-nft-multi.

    Advantages of the 'nf_tables' variant:
     - No need to use the --wait option to iptables to avoid
       concurrency issues (--wait is a no-op in the nf_tables versions)


Does this issue exist with iptables v1.8.0 (legacy)?
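
One way to check which variant a system is running is to look at the
suffix in the version banner, the same "(nf_tables)" string that appears
in the error output above. The following is only a rough sketch: the
function name iptables_variant is made up for illustration, and it
assumes that the legacy build prints "(legacy)" in the same position and
that pre-1.8 releases print no suffix at all.

```shell
#!/bin/sh
# Hypothetical helper: classify an iptables version banner by its suffix.
# Assumption: 1.8.x prints "(nf_tables)" or "(legacy)" after the version
# number; anything else is reported as "unknown".
iptables_variant() {
    case "$1" in
        *"(nf_tables)"*) echo "nf_tables" ;;
        *"(legacy)"*)    echo "legacy" ;;
        *)               echo "unknown" ;;
    esac
}

# Only query the local binary if one is installed.
if command -v iptables >/dev/null 2>&1; then
    iptables_variant "$(iptables --version 2>&1)"
fi
```

If this reports nf_tables, the reproducer quoted above could be run again
through the legacy binary, provided the distribution ships one alongside
the nf_tables variant.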

Radostin

