[CRIU] [PATCH] [RFC] perf: Here is an example how to trace syscalls

Wed Oct 8 05:19:11 PDT 2014

Hi Andrew,

On 10/06/2014 10:25 AM, Andrew Vagin wrote:
> On Mon, Oct 06, 2014 at 09:30:02AM -0400, Christopher Covington wrote:
>> On 10/01/2014 04:03 AM, Andrew Vagin wrote:
>>> On Tue, Sep 30, 2014 at 01:23:55PM -0400, Christopher Covington wrote:

>> How do you differentiate parasite versus pre-parasite/post-parasite original code?
> 
> Here is an example of output:
> $ perf trace report -i /root/git/criu/test/dump/static/env00/1127/1/perf.dump
> ....
>     29.946 ( 0.012 ms): criu/1146 write(fd: 1023, buf: 0x6b5940, count: 44                              ) = 44
>     29.959 ( 0.012 ms): criu/1146 recvfrom(fd: 31, ubuf: 0x7fffa48f4e60, size: 12, flags: WAITALL       ) = 12
>     29.973 ( 0.013 ms): criu/1146 write(fd: 1023, buf: 0x6b5940, count: 31                              ) = 31
>     29.999 ( 0.088 ms): env00/1127 recvfrom(fd: 3, ubuf: 0x7fee6cdf0998, size: 12, flags: WAITALL        ) = 12
>     30.013 ( 0.013 ms): env00/1127 write(fd: 4, buf: 0x7fee6cdf0878, count: 26                           ) = 26
>     30.016 ( 0.002 ms): env00/1127 rt_sigaction(sig: HUP, oact: 0x7fee6cde6c60, sigsetsize: 8            ) = 0
>     30.018 ( 0.001 ms): env00/1127 rt_sigaction(sig: INT, oact: 0x7fee6cde6c80, sigsetsize: 8            ) = 0
> ...
> 
> You can see how criu sends a command to parasite and then it handles the command.
> 
> And ont more example is how to get top 10 of syscalls:
> [root at avagin-fc19-cr criu]# perf trace report -s -i /root/git/criu/test/dump/static/env00/1127/1/perf.dump | sort -n -k 2 -r | head -n 10
>    syscall            calls      min       avg       max      stddev
>                                (msec)    (msec)    (msec)        (%)
>    write                764     0.012     0.017     0.110      2.62%
>    close                165     0.001     0.001     0.009      3.39%
>    openat               147     0.012     0.038     0.211      5.19%
>    read                  93     0.001     0.012     0.077     13.78%
>    ptrace                86     0.002     0.015     0.263     26.51%
>    rt_sigaction          62     0.001     0.002     0.015     15.55%
>    rt_sigaction          62     0.001     0.001     0.010     11.14%
>    fstat                 45     0.001     0.002     0.027     26.85%
>    prctl                 42     0.001     0.001     0.002      1.93%
>    prctl                 42     0.001     0.001     0.001      1.75%
> 

This looks very useful for examining CRIU behavior given a relatively quiet
dumped process (dumpee?). However if the process being dumped (dumpee) is
noisily making many of its own system calls, won't these be indistinguishably
mixed with the parasite's system calls?

The reason I ask is because I think a solution to this, such as some kind of
instrumentation, trigger, or both to ensure only the parasite is measured
might help solve I problem I have a very crude solution to at the moment. That
problem is essentially the opposite one: profiling only the restored dumped
process (dumpee), making sure no parasite code, signal handling, nor ptrace
side effects are included. The very crude approach I'm currently using is
checkpoint early and run long enough that the system call frequency should be
back to normal restored process (dumpee) levels before starting perf.

Thanks,
Christopher

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.