Willus.com's 2002 Win32 Compiler Benchmarks: I. BACKGROUND
1. Overview
First published January 30, 2002; last updated Dec 29, 2011 [see update
section below].
[Be sure to also check out my latest 2011 benchmarks.]
I've written a lot of programs (mostly in C) to do
numerical computations and simulations, and ever since I had to wait for
one to finish, I've been interested in faster computers and faster
compilers. As a graduate student, I grew to appreciate the
availability of free software, particularly the GNU C compiler, a superb
product that is the result of nearly 20 years of software
community work. My first 32-bit PC compiler was DJ Delorie's port of
GNU C to the MS-DOS environment, and it was very nice to be able to
compile code for a flat, 32-bit environment.
Since that time, almost ten years ago, I have migrated primarily to a
Win32 environment, and there are now several Win32 C/C++ compilers
available (some of which also compile FORTRAN), many of them free.
Because of my interest in the topic, I've tried to outline them on my
Win32 C/C++ Compilers web page, and now, for the first time, I've done
a set of benchmarks to give some sense of how the different compilers
compare.
Updates and Other Links
I am certainly not the first to benchmark compilers. There are numerous
other excellent reviews, including a
Win32 Fortran Compiler Comparison, and
The Great Computer Language Shootout. You can no doubt find many others using
google.com.
There is also a G4 spec2K discussion thread on Google Groups with some
relevance to compiler performance.
[Update 6/02:] Thanks to "Justus" for this link to a page on
Intel C++ vs. Gnu C.
[Update 8/02:] In their July 2002 issue,
HP World Magazine compared gcc 2.95, gcc 3.0.4, Intel's Win32 compiler
(Windows 2000 Server), and Intel's Linux compiler (SuSE Linux 8.0) on
a 600 MHz Pentium III HP Vectra PC. Scores (bigger=faster): gcc 2.95 scored
89, gcc 3.0.4 scored 100, Intel Win32 scored 136, and Intel Linux scored 162.
The benchmark is called the "OBLcpu benchmark 2.0," which, I guess, is
HP World's standard suite of C++ CPU benchmark programs.
[See 10/02 update on summary page for comments on
the -ffast-math flag and MinGW 2.0 performance.]
[Update 12/02:] I missed this article until now, but in July, Dean Kent
at Real World Technologies did a compiler comparison involving Borland,
Delphi, Metrowerks, and Microsoft.
[Update 2/03:] Here's a great link comparing Intel C++ v7.0 to
gcc 3.2.1 from coyotegulch.com. Note that the comparison is in Linux,
not Win32. There's also a long discussion thread about compiler
performance at developers.slashdot.org.
[Update 9/03:] AMD has published the first SPEC CFP2000 benchmark that
I am aware of where they use the -ffast-math flag with gcc (and they got
a very good score). This means that -ffast-math is a legitimate SPEC
compiler option.
[Update 10/03:] See my
MinGW/Gnu C Tips page for some in-line x87 math
routines (sincos, pow, exp, and atan2) which boost the performance
of the MinGW/Gnu C compilers on i386 architectures.
[Update 2/04:] Added results for 3.06 GHz Xeon
with 1 MB cache and Athlon 64 3000+ (laptop).
[Update 3/04:] Added results for Athlon 64 3200+
(desktop).
[Update 6/04:] Scott Robert Ladd at
coyotegulch.com is keeping up his
compiler comparisons between gcc and Intel
(unlike me!). This time he does gcc v3.4 and the yet to be released v3.5
(since renamed to v4.0)!
[Update 7/04:] Since I've read that Intel
is betting a big part of their future on the Pentium-M core (or
something like it), I added results for a 900 MHz Pentium-M
(my wife's Panasonic Toughbook W2 laptop). This required a whole new
category of processor since a Pentium-M is sort of between a P-III and
a P4 (it supports SSE-2 and has a 400 MHz FSB). See
the 1/05 update below, also.
[Update 8/04:] Added results for a 3.6 GHz Xeon (90 nm Nocona core w/EM64T) with 1 MB cache.
[Update 9/04:] Scott Robert Ladd is at it again,
testing GCC 4.0 against Intel 8.1.
[Update 1/05:] I was able to add results from
a 2 GHz Pentium-M (with 2 MB L2 cache) to my benchmarks.
Intel seems well justified in
focusing their efforts on bringing this chip to the desktop world
to replace the Netburst architecture. It
is definitely on par with a 3.6 GHz Pentium-4 and an
Athlon64 3200+. In fact, using the best scores on my five benchmarks
for each processor, averaged and normalized to the 3.6 GHz Xeon (higher
is slower), the results were 1.0 for the Xeon, 1.01 for the Pentium-M,
and 1.15 for the Athlon64 3200+. I would not have guessed that
the Pentium-M would perform so well. I can only guess that Intel
doesn't trumpet the performance more loudly because they don't want
to hurt their Pentium-4 sales.
[Update 5/05:] Scott Robert Ladd published a GCC 4.0 Review.
[Update 6/06:] Principled Technologies did a nice
Intel v9 vs. Gcc 4.1.1 comparison on Linux using SPEC CPU2000. Intel's performance is about 20% better.
[Update 11/06:] If you haven't viewed the
GCC benchmarks page, it's
worth a look. There are loads of plots of SPEC performance on a daily
basis, and there are some tantalizing jumps in the 32-bit performance
on Athlon systems of late (specfp2k). There are also comparisons to
Intel 7.1 and PGI.
[Update 3/07:] I checked through my
comments section on the summary page for the first time in over a year
and had to disable the posting mechanism because of spammers. I'll try to get
the original comments back up soon.
[Update 9/07:] I just found the Daresbury Laboratory distributed
computing site, which has several excellent benchmark results, including
a 2005 comparison between Pathscale, Intel, PGI, and GCC compilers for
the AMD64 platform.
[Update 1/10:] I just ran some quick
MinGW gcc benchmarks on Windows 7.
[Update 11/11:] I just added some new results to my
MinGW gcc benchmarks page.
[Update 12/11:] I posted my new
2011 compiler benchmarks.
Caveat Emptor!
The usual benchmark caveat goes something like this: these are only
benchmarks, and as such may not represent the results you will actually
get if you compare these compilers yourself. One could probably devise a
different code that would score best on each compiler. But I did try to
use honest-to-goodness simulations that I actually run (most of which
my colleagues or I wrote). Also, I'm a reasonable
programmer (I know what it means to "vectorize" C programs for a Cray), but I
don't pretend to be an expert on writing super-fast, super-efficient
code. Mostly I try to write code as elegantly as possible without doing
anything that will result in an obvious performance penalty. Several
mathematical routines in these simulations were taken from Numerical
Recipes in C, though the Numerical Recipes routines typically don't
account for most of the CPU time used.