[CRIU] [PATCH 3/4] arm64: provide own memcpy for pie
Kir Kolyshkin
kir at virtuozzo.com
Tue Nov 8 15:35:25 PST 2016
On 11/08/2016 11:18 AM, Dmitry Safonov wrote:
> On 11/08/2016 09:51 PM, Kir Kolyshkin wrote:
>>
>>
>> On 11/08/2016 04:09 AM, Dmitry Safonov wrote:
>>> On 11/08/2016 12:23 AM, Kir Kolyshkin wrote:
>>>> Chris, Dmitry,
>>>>
>>>> Can you please take a look at this (and, ideally, test on amd64 with
>>>> both clang and gcc)?
>>>
>>> Well, looks ok to me, the small nip is that we can omit if(n) check
>>> here, if I read asm rightly:
>>>
>>>>> + if (n)
>>>>> + memcpy_arm64(to, from, n);
>>>>> + return to;
>>
>> This part I just copied from ppc64 implementation by Laurent.
>> I guess the reason is to save on a function call if n==0. If this
>> doesn't make sense, let me know and I'll drop it.
>
> Ok, keep it.
>
>>> I don't have arm64 board, but have a qemu VM.
>>> Anyway, it's quite big pile of code and if we can't see gain - and
>>> don't have arm64 boards to prove it's perf impact (I don't think it
>>> will be visible on qemu), what the fuss about?
>>> Don't get me wrong - I don't object the patch, it's ok to me, but
>>> why not simplify at this moment and just use for(...) a[i] = b[i]?
>>
>> Well, I don't like the complexity of arm64 memcpy() code either,
>> but I am not an expert so I can't complain.
>>
>> Now, the reason why asm is used is described in commit 214e280
>> ("pie: provide own memcpy for x86"), to which I referred in my commit.
>> Let me quote:
>>
>>> Another argument in favor of (3) is we already have implementation of
>>> builtin_memcpy() optimized for x86.
>>>
>>> The only problem is it is not named memcpy(). Using assembler file (.S)
>>> we can have a function with two names (entry points).
>>
>> It's just I don't know a way to achieve the same effect in C.
>
> Well, currently builtin_* functions are always_inline, so if just add
> i.e. memcmp(...) { return builtin_memcmp(...); } that will be like a
> synonym.
>
> For the other choice: de-inline those functions and customize PIE's
> linker script, somehow like:
>
> > --- a/criu/pie/pie-reloc.lds.S.in
> > +++ b/criu/pie/pie-reloc.lds.S.in
> > @@ -3,6 +3,10 @@ SECTIONS
> > .text : {
> > *(.head.text)
> > *(.text*)
> > + memcmp = builtin_memcmp ;
> > + memcpy = builtin_memcpy ;
> > + memset = builtin_memset ;
> > + strncmp = builtin_strncmp ;
> > *(.compel.init)
> > *(.compel.exit)
>
> So it'll be the real synonym.
>
> The only difference will be if this is a function for compiler-inserted
> calls or for all calls.
That's an interesting approach indeed, unfortunately I failed to make it
work, it always complains like this:
ld -r -T criu/pie/pie.lds-native.S -o criu/pie/restorer.built-in.bin.o
criu/pie/restorer.built-in.o criu/pie/native.lib.a
criu/pie/pie.lds-native.S:8: undefined symbol `builtin_memcmp'
referenced in expression
Andrey suggested using alias function attribute. I will play with it and
let you know.
>
>>> I mean, if anything got broken in this asm (by any reason, you could
>>> see the history on the source link had some fixes already) - the people
>>> start comming with "help me, criu segfaults/corrupts memory/silently
>>> dies on arm64", which in my POV is harder to fix than for() loop,
>>> which I'm sure is rightly handled by any compiler.
>>>
>>> So, again I don't object, I'm ok if the patch applied - just don't get
>>> a reason for it.
>
More information about the CRIU
mailing list