This is a direct follow-up to my previous post on gcc optimizations. The main code example, integer division by 7, came from a session of the Compilers class I took this semester. The professor was demonstrating the use of gcc to show example assembly for certain operations: arithmetic, branching, etc. We were all expecting something like the very first assembly example, using the integer division instruction the 80486 provides, and we all had the same flabbergasted reaction to the code gcc actually produced. That course gave me a healthy respect for the fact that compiler writers are, by and large, the smartest people on the planet.
I had a similar moment with gcc producing something unexpected while I was writing the previous post. The benchmark code at the very bottom of it, specifically.
int main(int argc, char *argv[]) { int i = INT_MIN; do { divideBySeven(i); i++; } while (i != INT_MIN); return 0; }
This code calls the divideBySeven
function once for every single possible integer. The index starts with the very smallest, INT_MIN
, and goes up until it reaches the very largest, INT_MAX
. When it increments the index again, it overflows, wrapping back around to INT_MIN
again and exiting the loop.
But this is actually the second version of that loop that I wrote. The first version had a significant problem. Can you find it below?
int main(int argc, char *argv[]) { int i; for (i = INT_MIN; i <= INT_MAX; i++) { divideBySeven(i); } return 0; }
The problem is that INT_MAX
is the largest possible integer. Thus, by definition, every single integer is less than or equal to it. The exit condition will always be true, and so the program will never terminate.
I didn’t actually run this version. I assumed it worked and compiled to assembly so that I could insert the hand-written version of division by seven using the integer division instruction. While looking at the code, I saw that the for loop had been rendered as follows:
movl $-2147483648, -8(%ebp) # i = INT_MIN .L4: movl -8(%ebp), %eax movl %eax, (%esp) # set up argument for function call call divideBySeven addl $1, -8(%ebp) # i++ jmp .L4
Ordinarily, for a loop like this, you’d have a comparison instruction and a conditional jump back to the start based on whatever the condition is. For example, with this code, I might compare i
with INT_MAX
and use jle
to jump back to .L4
if i
is smaller1. However, gcc has determined that the condition cannot possibly be false, and so, instead of generating a comparison, output an unconditional jump, setting up an explicit infinite loop that saves a few clock cycles per iteration.
That’s right: gcc has optimized my infinite loop.
1: The code would also have to be changed so that the comparison is made before the loop body executes for the first time. This could be done by leaving the unconditional jump as-is, putting the comparison right after the label, and jumping out of the loop if the condition is false, or by changing the unconditional jump to a comparison and conditional jump to .L4
, adding a new label right before the comparison, and putting an unconditional jump to that new label right before the .L4
label. gcc produces the latter structure. ^
Post a Comment