Willus.com's 2011 Win32/64 C Compiler Benchmarks:
In my 2002 benchmarks, Intel did very well. On the two benchmarks common
to both 2002 and 2011 (BW1D and LAME), the average performance improvement
of the fastest Intel-compiled code over the fastest gcc-compiled code
was almost 80%. Intel had a similar performance margin over Microsoft in 2002.
In 2011, on the other hand, the result is similar, but the balance of power
has shifted. Intel is still the
overall winner based on a geometric mean of the performance scores, but
this time the margins are only 7 - 26% over gcc
(depending on the compile flags and 32/64-bit; the best results, 64-bit, averaged
18% faster than gcc's) and 13 - 22% over Microsoft,
and on six of the ten benchmarks, the fastest gcc-compiled executable is
either faster or equivalent to (within 6%) the fastest Intel-compiled executable
(MS VC 2010 only had one result within 6% of Intel's fastest score).
This sort of performance seems roughly in line with what
others are getting or have gotten recently between Intel and gcc
(Principled Technologies benchmarked Intel v9 vs. gcc 4.1.1 in Linux on
SPEC CPU2000 SPECfp_rate_base in 2006 and showed Intel to be ahead).
In addition, the fastest gcc 4.6.3 results show an average 20% improvement
over gcc 3.4.2 (32-bit to 32-bit).
Clearly the field has gained significant ground on the performance king,
and that's good news for FOSS fans.
The GCC team is to be commended, but they've still got work to do.
Intel remains the compiler to beat.
Digital Mars and Tiny CC
certainly leave a lot to be desired in terms of
run-time performance of their compiled executables (2.3X and 4.5X slower than Intel,
respectively, on average),
but for ease of setup and use, install size, and
fast compiling, they
are hard to beat, especially Tiny CC, which, on average, compiles and links C
applications nearly 4X faster than its closest competitor (Microsoft),
7X faster than gcc 4.6.3, and
over 10X faster than Intel! If your codes are small and/or already run plenty
fast enough, Tiny CC is an excellent, easy-to-use option. If you need C++,
too, then Digital Mars is your best option for installation compactness.
I also want to give a nod to one more compiler, even though it was not
included in the results because it had trouble compiling all of the benchmarks
(it is similar in capability and vintage to Digital Mars). It comes with
an excellent set of tools and a good install package if you are just starting
out and want to try out C programming with an easy-to-set-up, easy-to-use compiler.
Intel's Transcendent Transcendental Performance
Two of Intel's clearest-cut victories are in BW1D and TRANSCEND. Both of these
benchmarks make heavy use of sincos(), pow(), and other transcendental C math
functions, so I wrote some programs
to isolate the performance of these functions a la my
MinGW fast math function testing code.
Sure enough, Intel is definitely doing some
in-line transcendental magic. On my sincos() loop test, the Intel-compiled version
is 3 times faster than gcc's best result. On the pow() loop, Intel's margin
increases--over 4 times. But the biggest margin is the exp() test loop: the
Intel-compiled code is 8 times faster than gcc! I am sure this kind of margin
on transcendentals is what lifts Intel to its impressive performance in
BW1D and TRANSCEND. Hopefully some talented programmer can disassemble Intel's
lightning-fast transcendental C calls and mimic them in an open-source fast math library that
can be tightly coupled to gcc. With something like that, I think gcc would
be neck-and-neck with Intel for the best overall performance in these benchmarks.
[Update 1-19-2012: There is a discussion of the fast Intel math results in
this gcc mailing list thread.]
It sure doesn't look like the multi-threaded (automatic parallelization,
denoted by an X under the // column in the results tables)
compile options do anything on these examples (or I don't write code
that parallelizes well!).
In many cases, the performance gets worse, but when I watched
my CPU performance in Task Manager, I could see that BW1D, for example,
very occasionally utilized 2 threads with the Intel-compiled version (not so for
the gcc version).
I did some tests to convince myself that the compile options
really work as advertised. They make a huge
difference, for example, on the Intel-compiled versions of my transcendental function
tests (because these consisted of a simple loop)--increasing the performance
by a factor of 3 on my 4-thread CPU. But for these
same transcendental test loops as compiled by gcc, the automatic parallelization
had no effect. I was able to write an example of code that gcc could parallelize, but
I had to spell out the parallel loops very obviously for gcc to turn them
into multi-threaded loops.
Intel's compiler seems to have more advanced automatic parallelization capability.
IPO, Profiling, and 64 bits
Interestingly, Intel gets more mileage out of the inter-procedural optimization (IPO) flags
than gcc, but gcc gets a bit more out of the profiling flags. With
Intel-compiled codes, the average benefit from IPO (-Qipo) is 6 to 10%, compared to
0 - 2% with gcc-compiled codes (-flto). Microsoft scores a 5%
average IPO boost on 32-bit codes, but, strangely, 64-bit IPO with MS VC actually degraded
performance on average. With Intel-compiled codes, the average
benefit from profiling is 0 - 5%, compared to 4 - 9% with gcc-compiled codes.
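For concreteness, the flags discussed above look like this on the command line. These are illustrative invocations only (file and executable names are placeholders), and since the article doesn't spell out its exact profiling flags, the profile-guided build shown uses gcc's standard PGO options:

```shell
# Inter-procedural / link-time optimization
icl -O2 -Qipo bench.c              # Intel (IPO flag from the text)
gcc -O2 -flto bench.c -o bench     # gcc link-time optimization

# Profile-guided build with gcc: two passes plus a training run
gcc -O2 -fprofile-generate bench.c -o bench
./bench                            # run a representative workload
gcc -O2 -fprofile-use bench.c -o bench
```

The key practical difference is that PGO requires a representative training run between the two compiles, while IPO/LTO is a single (slower) build.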
The advantage of compiling to 64-bit executables is
significant--64-bit codes are faster than their 32-bit equivalents on every benchmark
for Intel and gcc--by 10% to 20% on average,
and in some cases by up to 70% (Crafty). A couple of MSVC's 64-bit results
were actually slower than the 32-bit equivalents, but overall the
performance improvement averaged around 10% for 64-bit VC 2010 codes.
These benchmarks did not make me want to switch from MinGW/gcc for my
day-to-day compiling. Intel
may have the performance edge, but the cleaner setup for gcc, its FOSS roots,
its much smaller disk footprint, its wide array of compiling options, and its
robust execution are all big draws for me. Plus gcc supports C++, Java, Fortran 95,
Objective C, and Ada, and it supports a number of platforms, which makes
porting my codes between Windows and Linux easier. I recommend my gcc 4.6.3 build.
As I pointed out in 2002, in the end, picking the right compiler should involve
more than just a few benchmark results.
You'll want to find out what you are
comfortable with, much of which will include how easy the compiler
is to use, what extra tools it comes with, how good the documentation is,
and exactly what you need it for.
If you are a serious programmer, I highly recommend that you try more
than one of these compilers.
You can find links to these compilers on my
Win32 C/C++ Compilers web page.
I'd again like to thank my very patient family for giving me the time to post this article.
Go to the next page if you would like to post a comment.