[CRIU] checkpoint/restore of a 32bit application on arm64

andi andi.platschek at gmail.com
Mon Oct 22 10:35:40 MSK 2018


Hi,

On 10/22/18 9:12 AM, marco xu wrote:
> Hi andi,
>
> thanks a lot for the information!!
>
> "My problem was that I only checked whether /proc/<pid>/smaps looked ok, and overlooked that the pages.img file was much too small. So after fixing the task_size I can dump&restore simple applications, and for failures I do get better traces with backtrace! YEAY! ;-)
> anyhow: criu assumes in compel/arch/arm/src/lib/infect.c that TASK_SIZE_MAX is 0xbf000000 -- which does not hold on 64bit linux (TASK_SIZE_32 is defined as 0x100000000 in linux/arch/arm64/include/asm/memory.h). And in a few other places kerndat.task_size is used, which is not correct either."
> I also noticed the TASK_SIZE issue, and I still get a dump failure after removing the task_size check in mmap_seized:
>         // if (err < 0 || map > kdat.task_size)
>         //        map = 0;
> How did you fix the task_size problem?

there are a few spots -- I am not yet sure if this is an exhaustive list, but:

-> changed it in compel/arch/arm/src/lib/infect.c in compel_task_size()

-> criu/pie/restorer.c in unmap_old_vmas()

-> criu/mem.c -- there are a couple of checks against kdat.task_size

>
> btw, what is your criu/kernel version? What I am using is criu 1.6 and a 4.4.22 kernel. Looks like there is a lot of patching and upgrade work.

1.6 is really old (June 2015) -- you should probably upgrade to a newer version.

I am currently working on v3.9 and a 4.9 series kernel.

best!
andi
>
> best regards,
> marco
>
> On Mon, Oct 22, 2018 at 1:21 PM andi <andi.platschek at gmail.com> wrote:
>
>     Hi,
>
>     On 10/20/18 8:53 PM, Cyrill Gorcunov wrote:
>     > On Mon, Oct 15, 2018 at 2:55 PM andi <andi.platschek at gmail.com> wrote:
>     >
>     >     Hi all,
>     >
>     >
>     > Hi, really sorry for delay in response.
>     >
>     >     over the weekend I got it to a point where I can actually dump+restore a 32bit application in my
>     >     64bit arm (with 32bit criu). HOWEVER the application that I am able to dump is very minimalistic:
>     >
>     >     int main(int argc, char* argv[])
>     >     {
>     >              int count = 0;
>     >
>     >              while (1) {
>     >                      count++;
>     >              }
>     >     }
>     >
>     >     as soon as I add a library call in the loop, the restored application crashes:
>     >
>     >     [  256.988318] hello[1931]: unhandled level 3 translation fault (11) at 0x00000000, esr 0x82000007
>     >     [  256.988321] pgd = ffff80083d2a2000
>     >     [  256.988323] [00000000] *pgd=000000083d2a3003
>     >     [  256.988325] , *pud=000000083d2a4003
>     >     [  256.988326] , *pmd=000000083d2a5003
>     >     [  256.988328] , *pte=0000000000000000
>     >     [  256.988329]
>     >     [  256.988330]
>     >     [  256.988335] CPU: 0 PID: 1931 Comm: hello Not tainted 4.9.0-artech #108
>     >     [  256.988337] Hardware name: rwzweiCAB4GS08W001p01_f_f (DT)
>     >     [  256.988339] task: ffff80087017e000 task.stack: ffff80083ca74000
>     >     [  256.988343] PC is at 0x0
>     >     [  256.988345] LR is at 0xf7572d40
>     >     [  256.988347] pc : [<0000000000000000>] lr : [<00000000f7572d40>] pstate: 00000010
>     >     [  256.988348] sp : 00000000ffc39db0
>     >     [  256.988350] x12: 00000000000f4240
>     >     [  256.988352] x11: 00000000ffc39dd4 x10: 00000000f7616fac
>     >     [  256.988356] x9 : 0000000000000000 x8 : 0000000000000000
>     >     [  256.988360] x7 : 0000000000000000 x6 : 0000000000010348
>     >     [  256.988363] x5 : 0000000000000000 x4 : 0000000000000000
>     >     [  256.988367] x3 : 0000000000000000 x2 : 00000000000003e8
>     >     [  256.988370] x1 : 0000000000000000 x0 : 0000000000000000
>     >     [  256.988373]
>     >
>     >
>     > This sigsegv is due to lack of page data. I know rather little about the arm64 implementation, so we need someone from the arm64 camp to help you debug this problem.
>     jup, that's the problem -- the reason is that TASK_SIZE for a 32bit application on 32bit Linux is different from that on 64bit Linux.
>
>     My problem was that I only checked whether /proc/<pid>/smaps looked ok, and overlooked that the pages.img file was much too small. So after fixing the task_size I can
>     dump&restore simple applications, and for failures I do get better traces with backtrace! YEAY! ;-)
>
>     anyhow: criu assumes in compel/arch/arm/src/lib/infect.c that TASK_SIZE_MAX is 0xbf000000 -- which does not hold on
>     64bit linux (TASK_SIZE_32 is defined as 0x100000000 in linux/arch/arm64/include/asm/memory.h).
>     And in a few other places kerndat.task_size is used, which is not correct either.
>
>     So there is more work to do, but at least it looks a bit more promising now. I'll post progress updates + the problems I run into every now and then.
>
>     many thanks!
>     best regards,
>     andi
>
>     _______________________________________________
>     CRIU mailing list
>     CRIU at openvz.org
>     https://lists.openvz.org/mailman/listinfo/criu
>


