Why do gcc and clang generate mov reg,-1?

I am using Compiler Explorer to look at the assembly that gcc and clang emit for some code. Recently I looked at the output for this function:

#include <stdint.h>

int compare_int64(int64_t left, int64_t right)
{
    return (left < right) ? -1 : (left > right) ? 1 : 0;
}

The point of this exercise is not C++, where this code might be inlined anyway, but cases where such a function is actually called.
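To make the "actually called" case concrete, here is a minimal sketch of one scenario where the call usually cannot be inlined away: the comparison is reached through a function pointer passed to the standard qsort. The adapter compare_int64_ptr and the helper sort_int64 are hypothetical names used only for illustration. The question does not say which caller is involved.

```c
#include <stdint.h>
#include <stdlib.h>

/* The function from the question. */
int compare_int64(int64_t left, int64_t right)
{
    return (left < right) ? -1 : (left > right) ? 1 : 0;
}

/* Hypothetical adapter: qsort hands over pointers to the elements,
   so dereference them and forward to compare_int64. */
static int compare_int64_ptr(const void *a, const void *b)
{
    return compare_int64(*(const int64_t *)a, *(const int64_t *)b);
}

/* Hypothetical helper: sorts the array in place in ascending order.
   qsort calls the comparison through a function pointer. */
void sort_int64(int64_t *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_int64_ptr);
}
```

Calling sort_int64 on an array holding 3, -1, 2 should leave it as -1, 2, 3. Because the function compares instead of subtracting, it also gives the right answer for extreme inputs such as INT64_MIN and INT64_MAX.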

With -O3 this is the output:

clang:

xor     ecx, ecx
cmp     rdi, rsi
setg    cl
mov     eax, -1
cmovge  eax, ecx
ret

gcc:

xor     eax, eax
cmp     rdi, rsi
mov     edx, -1
setg    al
cmovl   eax, edx
ret

I noticed that this code is 17 bytes in size, which is just 1 byte over a nice 16 bytes (another, non-C++ compiler I am using aligns code to 16 bytes by default on x64). For the gcc code shown, I was thinking of replacing the mov with either lea edx,[eax-1] or or edx,-1 (the latter before the cmp, of course, since or modifies the flags) to reduce the code size. Interestingly, with -Os gcc emits a jl instruction instead, which is pretty disastrous for the performance of that function.
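The byte counting above can be sketched as a small table. The encodings below are the ones I would expect a typical assembler to pick for the gcc sequence and the candidate replacements. They are assumptions that should be checked against a disassembly (for example with objdump), and the struct and function names are made up for this illustration. Note that lea edx,[eax-1] uses a 32-bit address register, which in 64-bit mode costs an extra 0x67 address-size prefix compared to lea edx,[rax-1].

```c
#include <stddef.h>
#include <stdint.h>

/* One instruction: its text and its assumed machine-code bytes. */
struct insn {
    const char *text;
    uint8_t bytes[8];
    size_t len;
};

/* gcc -O3 sequence from the question (encodings assumed, see above). */
static const struct insn gcc_seq[] = {
    { "xor eax, eax",   { 0x31, 0xC0 },                   2 },
    { "cmp rdi, rsi",   { 0x48, 0x39, 0xF7 },             3 },
    { "mov edx, -1",    { 0xBA, 0xFF, 0xFF, 0xFF, 0xFF }, 5 },
    { "setg al",        { 0x0F, 0x9F, 0xC0 },             3 },
    { "cmovl eax, edx", { 0x0F, 0x4C, 0xC2 },             3 },
    { "ret",            { 0xC3 },                         1 },
};

/* Candidate replacements for the mov edx, -1 entry (index 2). */
static const struct insn alt_or    = { "or edx, -1",       { 0x83, 0xCA, 0xFF },       3 };
static const struct insn alt_lea64 = { "lea edx, [rax-1]", { 0x8D, 0x50, 0xFF },       3 };
static const struct insn alt_lea32 = { "lea edx, [eax-1]", { 0x67, 0x8D, 0x50, 0xFF }, 4 };

/* Sum of the encoded lengths of a sequence. */
size_t total_size(const struct insn *seq, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += seq[i].len;
    return total;
}

/* Size of the gcc sequence if the mov is swapped for alt. */
size_t size_with(const struct insn *alt)
{
    size_t count = sizeof gcc_seq / sizeof gcc_seq[0];
    return total_size(gcc_seq, count) - gcc_seq[2].len + alt->len;
}
```

Under these assumed encodings the original sequence sums to 17 bytes, the or and lea edx,[rax-1] variants to 15, and the lea edx,[eax-1] variant to 16. This only counts bytes and says nothing about whether the replacements perform the same.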

I am no expert, but I looked into Agner Fog's instruction tables and, if I am not mistaken, the timings and latencies for mov, lea and or are equal.

So the actual questions: Why do both compilers use a 5-byte instruction instead of a shorter 3- or 4-byte one? And would it be harmless to replace the mov reg,-1 with lea reg,[eax-1] or or reg,-1?



from Recent Questions - Stack Overflow https://ift.tt/3xn7BO0