[CRIU] checkpoint/restore of a 32bit application on arm64

andi andi.platschek at gmail.com
Mon Oct 22 08:19:22 MSK 2018


Hi,

On 10/20/18 8:53 PM, Cyrill Gorcunov wrote:
> On Mon, Oct 15, 2018 at 2:55 PM andi <andi.platschek at gmail.com> wrote:
>
>     Hi all,
>
>
> Hi, really sorry for the delay in responding.
>
>     over the weekend I got it to a point where I can actually dump+restore a 32bit application on my
>     64bit arm system (with 32bit criu). HOWEVER, the application that I am able to dump is very minimal:
>
>     int main(int argc, char* argv[])
>     {
>              int count = 0;
>
>              while (1) {
>                      count++;
>              }
>     }
>
>     as soon as I add a library call in the loop, the restored application crashes:
>
>     [  256.988318] hello[1931]: unhandled level 3 translation fault (11) at 0x00000000, esr 0x82000007
>     [  256.988321] pgd = ffff80083d2a2000
>     [  256.988323] [00000000] *pgd=000000083d2a3003
>     [  256.988325] , *pud=000000083d2a4003
>     [  256.988326] , *pmd=000000083d2a5003
>     [  256.988328] , *pte=0000000000000000
>     [  256.988329]
>     [  256.988330]
>     [  256.988335] CPU: 0 PID: 1931 Comm: hello Not tainted 4.9.0-artech #108
>     [  256.988337] Hardware name: rwzweiCAB4GS08W001p01_f_f (DT)
>     [  256.988339] task: ffff80087017e000 task.stack: ffff80083ca74000
>     [  256.988343] PC is at 0x0
>     [  256.988345] LR is at 0xf7572d40
>     [  256.988347] pc : [<0000000000000000>] lr : [<00000000f7572d40>] pstate: 00000010
>     [  256.988348] sp : 00000000ffc39db0
>     [  256.988350] x12: 00000000000f4240
>     [  256.988352] x11: 00000000ffc39dd4 x10: 00000000f7616fac
>     [  256.988356] x9 : 0000000000000000 x8 : 0000000000000000
>     [  256.988360] x7 : 0000000000000000 x6 : 0000000000010348
>     [  256.988363] x5 : 0000000000000000 x4 : 0000000000000000
>     [  256.988367] x3 : 0000000000000000 x2 : 00000000000003e8
>     [  256.988370] x1 : 0000000000000000 x0 : 0000000000000000
>     [  256.988373]
>
>
> This SIGSEGV is due to missing page data. I know quite little about the arm64 implementation, so we need someone from the arm64 camp to help you debug this problem.
Yup, that's the problem -- the reason is that TASK_SIZE for a 32bit application on 32bit Linux differs from the one on 64bit Linux.

My mistake was that I only checked whether /proc/<pid>/smaps looked OK and overlooked that the pages.img file was much too small. After fixing the task size I can
dump and restore simple applications, and for failures I now get better traces with a backtrace! YEAY! ;-)

Anyhow: in compel/arch/arm/src/lib/infect.c criu assumes that TASK_SIZE_MAX is 0xbf000000 -- which does not hold on
64bit Linux (TASK_SIZE_32 is defined as 0x100000000 in linux/arch/arm64/include/asm/memory.h).
kerndat.task_size is also used in a few other places, which is not correct either.
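
To make the effect concrete, here is a minimal sketch that checks the user-space addresses from the register dump quoted above (LR, x10, sp) against both limits. It assumes that anything at or above the task-size limit is simply not considered during the dump; that is an assumption for illustration, not something taken from the criu sources. The helper and macro names (below_task_size, count_excluded, ARM32_TASK_SIZE_MAX, ARM64_TASK_SIZE_32) are hypothetical and do not exist in criu or the kernel under these names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit assumed for a 32bit ARM kernel (the 0xbf000000 value mentioned above). */
#define ARM32_TASK_SIZE_MAX UINT64_C(0xbf000000)
/* Limit for a 32bit task on an arm64 kernel (the TASK_SIZE_32 value mentioned above). */
#define ARM64_TASK_SIZE_32  UINT64_C(0x100000000)

/* Hypothetical helper: true if addr lies below the given task-size limit. */
static bool below_task_size(uint64_t addr, uint64_t task_size)
{
	return addr < task_size;
}

/* Hypothetical helper: count how many of the n addresses the limit excludes. */
static unsigned count_excluded(const uint64_t *addrs, unsigned n, uint64_t task_size)
{
	unsigned excluded = 0;

	for (unsigned i = 0; i < n; i++)
		if (!below_task_size(addrs[i], task_size))
			excluded++;
	return excluded;
}

/* User-space addresses from the register dump: LR, x10 and sp. */
static const uint64_t crash_addrs[] = {
	UINT64_C(0xf7572d40), /* LR, the return address of the faulting call */
	UINT64_C(0xf7616fac), /* x10 */
	UINT64_C(0xffc39db0), /* sp, i.e. the stack */
};
```

With the 0xbf000000 limit all three addresses fall outside the range, while with 0x100000000 none do. That fits the symptom: the stack and the memory LR points into sit above 0xbf000000, so a dump using that limit could miss them, whereas the minimal loop without library calls still restored fine.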

So there is more work to do, but at least it looks a bit more promising now. I'll post progress updates + the problems I run into every now and then.

many thanks!
best regards,
andi


