Why do gcc and clang generate mov reg,-1?
I am using Compiler Explorer to see what assembly gcc and clang emit for some code. Recently I looked at the output for this function:
    #include <stdint.h>

    int compare_int64(int64_t left, int64_t right)
    {
        return (left < right) ? -1 : (left > right) ? 1 : 0;
    }
I am not interested in the C++ case, where this code might be inlined anyway, but in the case where such a function is actually called.
With -O3 this is the output:
clang:
    xor     ecx, ecx
    cmp     rdi, rsi
    setg    cl
    mov     eax, -1
    cmovge  eax, ecx
    ret

gcc:
    xor     eax, eax
    cmp     rdi, rsi
    mov     edx, -1
    setg    al
    cmovl   eax, edx
    ret
I noticed this code is 17 bytes in size, which is just 1 byte over a nice 16 bytes (the default code alignment for x64 in another, non-C++ compiler I am using is 16). For the gcc code shown, I was thinking of replacing mov edx,-1 with either lea edx,[eax-1] or or edx,-1 (the latter before the cmp, of course, because or modifies the flags) to reduce the code size. Interestingly, with -Os gcc inserts a jl instruction, which is kinda disastrous for the performance of that function.
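To make the byte counting concrete, here is a small C sketch that adds up the instruction lengths of the gcc listing. The byte values are one valid encoding of each instruction as I understand the x86-64 encoding rules (an assembler may pick an equivalent form of the same length), and all array and function names are made up for this illustration.

```c
#include <stddef.h>

/* Assumed encodings of the gcc listing (one valid form per instruction). */
static const unsigned char xor_eax_eax[]   = {0x31, 0xC0};                   /* xor eax, eax   */
static const unsigned char cmp_rdi_rsi[]   = {0x48, 0x39, 0xF7};             /* cmp rdi, rsi   */
static const unsigned char mov_edx_m1[]    = {0xBA, 0xFF, 0xFF, 0xFF, 0xFF}; /* mov edx, -1    */
static const unsigned char setg_al[]       = {0x0F, 0x9F, 0xC0};             /* setg al        */
static const unsigned char cmovl_eax_edx[] = {0x0F, 0x4C, 0xC2};             /* cmovl eax, edx */
static const unsigned char ret_near[]      = {0xC3};                         /* ret            */

/* Assumed encodings of the candidate replacements for mov edx, -1. */
static const unsigned char or_edx_m1[]   = {0x83, 0xCA, 0xFF};       /* or edx, -1 (imm8)               */
static const unsigned char lea_edx_rax[] = {0x8D, 0x50, 0xFF};       /* lea edx, [rax-1]                */
static const unsigned char lea_edx_eax[] = {0x67, 0x8D, 0x50, 0xFF}; /* lea edx, [eax-1] (0x67 prefix)  */

/* Total size of the function as emitted by gcc. */
size_t gcc_listing_size(void)
{
    return sizeof xor_eax_eax + sizeof cmp_rdi_rsi + sizeof mov_edx_m1
         + sizeof setg_al + sizeof cmovl_eax_edx + sizeof ret_near;
}

/* Total size if the 5-byte mov is swapped for an instruction of the given length. */
static size_t size_with_replacement(size_t replacement_size)
{
    return gcc_listing_size() - sizeof mov_edx_m1 + replacement_size;
}

size_t size_with_or(void)      { return size_with_replacement(sizeof or_edx_m1); }
size_t size_with_lea_rax(void) { return size_with_replacement(sizeof lea_edx_rax); }
size_t size_with_lea_eax(void) { return size_with_replacement(sizeof lea_edx_eax); }
```

Under these assumptions the original comes to 17 bytes, the or and lea edx,[rax-1] variants to 15, and lea edx,[eax-1] to 16, because a 32-bit address register needs an address-size prefix in 64-bit mode.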
I am no expert, but I looked into the instruction tables by Agner Fog and, if I am not mistaken, the timings/latencies for mov, lea and or are equal.
So the actual question(s): Why do both compilers use a 5-byte instruction instead of a shorter 3- or 4-byte instruction? Would it be harmless to replace the mov reg,-1 with lea reg,[eax-1] or or reg,-1?
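For reference, this is what I mean by the three instructions being interchangeable, written as a rough C model of the value each one leaves in the 32-bit destination register. The function names are hypothetical, and the model covers only the resulting value; it says nothing about flags, register dependencies or timing.

```c
#include <stdint.h>

/* Models mov reg, -1: the result does not depend on any earlier register value. */
uint32_t model_mov_minus_one(void)
{
    return UINT32_C(0xFFFFFFFF);
}

/* Models or reg, -1: all bits end up set, whatever reg held before. */
uint32_t model_or_minus_one(uint32_t old_reg)
{
    return old_reg | UINT32_C(0xFFFFFFFF);
}

/* Models lea reg, [eax-1] with a 32-bit result: only yields -1 when eax is 0. */
uint32_t model_lea_eax_minus_one(uint32_t eax)
{
    return eax - UINT32_C(1);
}
```

So the lea variant relies on eax still being zero from the preceding xor eax,eax, which holds at the position of the mov in the gcc listing (before setg al).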
from Recent Questions - Stack Overflow https://ift.tt/3xn7BO0