[CRIU] checkpoint/restore of a 32bit application on arm64

andi andi.platschek at gmail.com
Mon Oct 15 14:55:34 MSK 2018


Hi all,

over the weekend I got it to a point where I can actually dump+restore a 32bit application on my
64bit arm (with 32bit criu). HOWEVER, the application that I am able to dump is very minimalistic:

int main(int argc, char* argv[])
{
         int count = 0;

         while (1) {
                 count++;
         }
}

as soon as I add a library call in the loop, the restored application crashes:

[  256.988318] hello[1931]: unhandled level 3 translation fault (11) at 0x00000000, esr 0x82000007
[  256.988321] pgd = ffff80083d2a2000
[  256.988323] [00000000] *pgd=000000083d2a3003
[  256.988325] , *pud=000000083d2a4003
[  256.988326] , *pmd=000000083d2a5003
[  256.988328] , *pte=0000000000000000
[  256.988329]
[  256.988330]
[  256.988335] CPU: 0 PID: 1931 Comm: hello Not tainted 4.9.0-artech #108
[  256.988337] Hardware name: rwzweiCAB4GS08W001p01_f_f (DT)
[  256.988339] task: ffff80087017e000 task.stack: ffff80083ca74000
[  256.988343] PC is at 0x0
[  256.988345] LR is at 0xf7572d40
[  256.988347] pc : [<0000000000000000>] lr : [<00000000f7572d40>] pstate: 00000010
[  256.988348] sp : 00000000ffc39db0
[  256.988350] x12: 00000000000f4240
[  256.988352] x11: 00000000ffc39dd4 x10: 00000000f7616fac
[  256.988356] x9 : 0000000000000000 x8 : 0000000000000000
[  256.988360] x7 : 0000000000000000 x6 : 0000000000010348
[  256.988363] x5 : 0000000000000000 x4 : 0000000000000000
[  256.988367] x3 : 0000000000000000 x2 : 00000000000003e8
[  256.988370] x1 : 0000000000000000 x0 : 0000000000000000
[  256.988373]


*unless* the call is before the while(1) loop ... so this way I now have an application that I can
use to compare before/after restore. So far I have identified a couple of problems:

-> it seems that argv is not restored (/proc/<PID>/cmdline is wrong) -- any pointers where to look?
    With a quick search for "cmdline" and "argv" I could not find where the restore happens!?

-> the auxv was messed up -- I fixed this one, comparing /proc/<PID>/auxv before/after restore looks good.

-> VmFlags slightly changed (is this expected behavior?) -- most of them are correct, here is the
    diff between before/after restore:

     # diff -y hello_before hello_after | grep VmFlags | grep "|"
     VmFlags: rd ex mr mw me dw sd                          | VmFlags: rd ex mr mw me ac sd
     VmFlags: rd mr mw me dw ac sd                          | VmFlags: rd mr mw me ac sd
     VmFlags: rd wr mr mw me dw ac sd                       | VmFlags: rd wr mr mw me ac sd
     VmFlags: rd ex mr mw me sd                             | VmFlags: rd ex mr mw me ac sd
     VmFlags: mr mw me sd                                   | VmFlags: mr mw me ac sd
     VmFlags: rd ex mr mw me dw sd                          | VmFlags: rd ex mr mw me ac sd
     VmFlags: rd mr mw me dw ac sd                          | VmFlags: rd mr mw me ac sd
     VmFlags: rd wr mr mw me dw ac sd                       | VmFlags: rd wr mr mw me ac sd
     VmFlags: rd wr mr mw me gd ac                          | VmFlags: rd wr mr mw me gd ac sd


     so it's mostly dw dropped and/or ac added and one additional sd (last line) -- so my first guess
     would be, that this is a problem that should be fixed, but won't lead to the above crash?

-> the second problem in the mappings is, that RSS, PSS, etc. values are quite different:

     # diff -y hello_before hello_after | grep -v VmFlags | grep "|"
     Anonymous:             0 kB                      | Anonymous:             4 kB
     Rss:                 788 kB                      | Rss:                   0 kB
     Pss:                 135 kB                      | Pss:                   0 kB
     Shared_Clean:        788 kB                      | Shared_Clean:          0 kB
     Referenced:          788 kB                      | Referenced:            0 kB
     Rss:                   8 kB                      | Rss:                   0 kB
     Pss:                   8 kB                      | Pss:                   0 kB
     Private_Dirty:         8 kB                      | Private_Dirty:         0 kB
     Referenced:            8 kB                      | Referenced:            0 kB
     Anonymous:             8 kB                      | Anonymous:             0 kB
     Rss:                   4 kB                      | Rss:                   0 kB
     Pss:                   4 kB                      | Pss:                   0 kB
     Private_Dirty:         4 kB                      | Private_Dirty:         0 kB
     Referenced:            4 kB                      | Referenced:            0 kB
     Anonymous:             4 kB                      | Anonymous:             0 kB
     Rss:                   8 kB                      | Rss:                   0 kB
     Pss:                   8 kB                      | Pss:                   0 kB
     Private_Dirty:         8 kB                      | Private_Dirty:         0 kB
     Referenced:            8 kB                      | Referenced:            0 kB
     Anonymous:             8 kB                      | Anonymous:             0 kB
     Rss:                 128 kB                      | Rss:                   0 kB
     Pss:                  18 kB                      | Pss:                   0 kB
     Shared_Clean:        128 kB                      | Shared_Clean:          0 kB
     Referenced:          128 kB                      | Referenced:            0 kB
     Rss:                   8 kB                      | Rss:                   0 kB
     Pss:                   8 kB                      | Pss:                   0 kB
     Private_Dirty:         8 kB                      | Private_Dirty:         0 kB
     Referenced:            8 kB                      | Referenced:            0 kB
     Anonymous:             8 kB                      | Anonymous:             0 kB
     Rss:                   4 kB                      | Rss:                   0 kB
     Pss:                   4 kB                      | Pss:                   0 kB
     Private_Dirty:         4 kB                      | Private_Dirty:         0 kB
     Referenced:            4 kB                      | Referenced:            0 kB
     Anonymous:             4 kB                      | Anonymous:             0 kB
     Rss:                   4 kB                      | Rss:                   0 kB
     Pss:                   4 kB                      | Pss:                   0 kB
     Private_Dirty:         4 kB                      | Private_Dirty:         0 kB
     Referenced:            4 kB                      | Referenced:            0 kB
     Anonymous:             4 kB                      | Anonymous:             0 kB
     Rss:                   8 kB                      | Rss:                   4 kB
     Pss:                   8 kB                      | Pss:                   4 kB
     Private_Dirty:         8 kB                      | Private_Dirty:         4 kB
     Referenced:            8 kB                      | Referenced:            4 kB
     Anonymous:             8 kB                      | Anonymous:             4 kB

     I *think* this is the problem to fix first to get going

[  294.336197] BUG: Bad rss-counter state mm:ffff800871f463c0 idx:1 val:-18
[  294.336201] BUG: Bad rss-counter state mm:ffff800871f463c0 idx:3 val:-1

any hints on any of those issues are much appreciated!
many thanks!
andi

On 10/9/18 6:15 PM, Dmitry Safonov wrote:
> On Tue, 9 Oct 2018 at 14:53, Cyrill Gorcunov <gorcunov at gmail.com> wrote:
>> On Tue, Oct 09, 2018 at 01:36:30PM +0200, andi wrote:
>>> Hi all,
>>>
>>> I was wondering if anyone has ever looked into the possibility of supporting CONFIG_COMPAT for arm?
>> Hi Andi. No one I know of is currently working on compat mode for arm.
>>
>>> P.S.: In case someone is wondering at what point the restore fails:
>>>
>>> (00.055449) pie: 376: Error (criu/pie/restorer.c:1482): sys_prctl(PR_SET_MM, PR_SET_MM_MAP) failed with -14
>>> (00.055471) pie: 376: Error (criu/pie/restorer.c:1700): Restorer fail 376
>>> (00.055515) Error (criu/cr-restore.c:2266): Restoring FAILED.
>> Seems like it is EFAULT, so some value could not be copied from userspace.
>> Hard to tell what exactly is happening since someone with good arm knowledge
>> is needed here.
> Just for the reference another thread about it a year ago:
> https://lists.openvz.org/pipermail/criu/2017-December/040025.html
>


