Overview
This page has some useful links for the MinGW C compiler and lists some
useful information/tips that I've learned about using MinGW. Click on
any of the links in the left (blue) or right (green) tables.
Benchmarks
DEC 29, 2011 See my latest 2011 Win32/64 C Compiler Benchmarks.
I did my first compiler benchmarks in 2002. Below are some more recent ones that are more limited in scope.
NOV 19, 2011 (There are older 2010 benchmarks below. Scroll down for those.)
Today I benchmarked
MinGW gcc 4.5.2 (32-bit and 64-bit), gcc 4.4.x (32-bit and 64-bit), and
Fabrice Bellard's TCC, an amazingly small compiler that
emphasizes compile speed. All benchmarks were compiled and run on my home PC, a Core i5-670
(3.7 GHz turbo-boost speed) running 64-bit Windows 7 (SP 1). TCC lives up to its claim of
compiling almost 10 times faster than gcc but, as might be expected, that compile
speed comes with a
significant penalty in executable performance (3 - 4 times slower in these two benchmarks).
For what it is intended to be, a very lightweight, pseudo-scriptable C compiler,
TCC is a remarkable achievement.
Benchmark #1, Image manipulation: crop and re-size 163 images
Lines of code: Approx. 250 K
Compiler | Compiler Options | Exe Type | Compile Time | Exe Size | Exe Run Time
gcc 4.4.0 | -O3 -ffast-math -m32 | 32-bit | 70.0 s | 2.09 MiB | 105.4 s
gcc 4.4.5 | -O3 -ffast-math -m64 | 64-bit | 74.0 s | 2.33 MiB | 78.9 s
gcc 4.5.2 | -O3 -ffast-math -m32 | 32-bit | 93.5 s | 2.24 MiB | 105.7 s
gcc 4.5.2 | -O3 -ffast-math -m64 | 64-bit | 92.2 s | 2.28 MiB | 76.5 s
tcc 0.9.25 | (none) | 32-bit | 10.2 s | 2.22 MiB | 325.1 s
Benchmark #2, Beam-Wave Interaction Simulator (heavy use of floating point)
Lines of code: Approx. 350 K
Compiler | Compiler Options | Exe Type | Compile Time | Exe Size | Exe Run Time
gcc 4.4.0 | -O3 -ffast-math -m32 | 32-bit | 99.5 s | 3.54 MiB | 29.0 s
gcc 4.4.5 | -O3 -ffast-math -m64 | 64-bit | 102.3 s | 3.98 MiB | 25.3 s
gcc 4.5.2 | -O3 -ffast-math -m32 | 32-bit | 127.1 s | 3.79 MiB | 27.6 s
gcc 4.5.2 | -O3 -ffast-math -m64 | 64-bit | 126.4 s | 3.95 MiB | 25.8 s
tcc 0.9.25 | (none) | 32-bit | 13.0 s | 3.96 MiB | 93.8 s
FEB 2, 2010
I was able to benchmark some different versions of MinGW gcc with both 32-bit
and 64-bit executables, mostly under Windows 7 64-bit (freshly installed).
I ran a couple of benchmarks. Interestingly, for one benchmark, going from
gcc 3.X to gcc 4.X made a factor of two difference in the speed. For another,
it made almost no difference.
Other notes: both benchmarks are single-threaded;
using -march=native and
-mtune=native (-march=native already implies -mtune=native) didn't buy much over just
using -O3 -ffast-math; and 64-bit bought about a 15% - 25% improvement over 32-bit.
Benchmark #1, Crop and re-size 200 images
Compiler/Flags | Compiled on | Exe Type | Exe Run Time | Run on
gcc 3.4.2 -O3 -ffast-math | AMD 3200+ | 32-bit | 548 s | Win XP/AMD 3200+ 2.0 GHz
gcc 3.4.2 -O3 -ffast-math | AMD 3200+ | 32-bit | 451 s | Win 7/Core i5 3.46 GHz
gcc 4.4.3 (no optimization) | Core i5 | 64-bit | 420 s | Win 7/Core i5 3.46 GHz
gcc 3.4.2 -O3 -ffast-math | Core i5 | 32-bit | 254 s | Win 7/Core i5 3.46 GHz
gcc 4.4.0 -O3 -ffast-math | Core i5 | 32-bit | 115 s | Win 7/Core i5 3.46 GHz
gcc 4.4.3 -O3 -ffast-math | Core i5 | 64-bit | 90 s | Win 7/Core i5 3.46 GHz
gcc 4.4.3 -O3 -ffast-math -march=native -mtune=native | Core i5 | 64-bit | 90 s | Win 7/Core i5 3.46 GHz
Benchmark #2, Beam-Wave Interaction Simulator
Compiler/Flags | Compiled on | Exe Type | Exe Run Time | Run on
gcc 3.2 -O3 -ffast-math | AMD 3200+ | 32-bit | 79 s | Win XP/AMD 3200+ 2.0 GHz
gcc 3.2 -O3 -ffast-math | AMD 3200+ | 32-bit | 40 s | Win 7/Core i5 3.7 GHz*
gcc 3.4.2 -O3 -ffast-math | Core i5 | 32-bit | 28.9 s | Win 7/Core i5 3.7 GHz*
gcc 4.4.0 -O3 -ffast-math | Core i5 | 32-bit | 29.0 s | Win 7/Core i5 3.7 GHz*
gcc 4.4.3 -O3 -ffast-math | Core i5 | 64-bit | 25.8 s | Win 7/Core i5 3.7 GHz*
* - 3.7 GHz is clock speed in turbo boost mode.
Benchmark #3, Info-Zip's Zip 3.1/Unzip 6.0
All versions were run on a core-i5 670 with turbo-boost (Windows 7).
Run times were to zip and unzip (-t) 10.4 GB of mostly JPEG files.
Compiler/Flags | Compiled on | Exe Type | Zip Run Time | Unzip Run Time
gcc 3.2 -O3 | AMD 3200+ | 32-bit | 607 s | 110 s
gcc 3.4.2 -O3 | Core i5 | 32-bit | 569 s | 111 s
gcc 4.4.0 -O3 | Core i5 | 32-bit | 543 s | 109 s
gcc 4.4.3 -O3 | Core i5 | 64-bit | 513 s | 110 s
Benchmark #4, Info-Zip's Zip 3.1/Unzip 6.0 using bzip2 compression
All versions were run on a core-i5 670 with turbo-boost (Windows 7).
Run times were to zip (-Z bzip2) and unzip (-t) 2.3 GB of mostly JPEG files.
(Strange to note the large run time for gcc 4.4.0 32-bit.)
Compiler/Flags | Compiled on | Exe Type | Zip Run Time | Unzip Run Time
gcc 3.2 -O3 | AMD 3200+ | 32-bit | 448 s | 200 s
gcc 3.4.2 -O3 | Core i5 | 32-bit | 486 s | 197 s
gcc 4.4.0 -O3 | Core i5 | 32-bit | 711 s | 217 s
gcc 4.4.3 -O3 | Core i5 | 64-bit | 389 s | 181 s
Starter Links and References
There are several good links to MinGW documentation at the
MinGW site. Advanced users
can quickly find answers to many technical questions by searching through the
MinGW
users mail archive.
For introductions, try
Colin Peters' Programming
Win32 with GNU C and C++ page (local mirror here).
A development environment
for MinGW can be obtained from the folks at
bloodshed.net.
Compile Flags
I do a lot of numeric (double precision floating point) programming, and
my experience is that the -O3 -ffast-math compile flag combination
consistently yields just about the best result. Be warned that
-ffast-math
takes some math shortcuts and does not follow all IEEE error-handling
conventions, so if you use this flag, you should verify your results,
especially if you need very high accuracy. My experience is that it
is well worth the speed boost to use this flag. I have seen on the
MinGW users mail archive that
-Os (optimize for small code size) can
also yield the best results in some cases, perhaps because it allows the
code to fit better into the CPU's L1 or L2 cache. I also prefer to
use the -Wall flag, which enables most of the commonly useful warnings
(despite its name, not every warning gcc can emit). This is an excellent practice.
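Because -ffast-math allows the compiler to reorder floating-point arithmetic, one way to act on the "verify your results" advice is to build the same code with and without the flag and compare the outputs against a tolerance. The following is only a minimal sketch of such a self-check; the function names (sum_naive, sum_kahan, rel_diff) are hypothetical and not from any library. Kahan compensated summation is used here because its correction term is a well-known example of arithmetic that can be simplified away once reassociation is permitted.

```c
#include <stddef.h>

/* Plain left-to-right summation. */
double sum_naive(const double *v, size_t n)
{
    double s = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Kahan compensated summation: c carries the rounding error of each add.
 * Under -ffast-math the compiler may treat (t - s) - y as zero, which
 * silently turns this back into a plain sum (assumption to test, not a
 * guarantee for every gcc version). */
double sum_kahan(const double *v, size_t n)
{
    double s = 0.0, c = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        double y = v[i] - c;
        double t = s + y;
        c = (t - s) - y;
        s = t;
    }
    return s;
}

/* Relative difference |a - b| / max(|a|, |b|); returns 0 if both are 0. */
double rel_diff(double a, double b)
{
    double d  = a > b ? a - b : b - a;
    double ma = a < 0 ? -a : a;
    double mb = b < 0 ? -b : b;
    double m  = ma > mb ? ma : mb;
    return m == 0.0 ? 0.0 : d / m;
}
```

A possible use: compute the same quantity in a build with -O3 and a build with -O3 -ffast-math, then check that rel_diff of the two results stays below whatever tolerance your application needs.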
Some Fast Math Functions
[Note 2011: My inline math functions are essentially not necessary
anymore with recent versions of gcc, but I leave this page up out of historical
interest. See Note 3 below.]
When MinGW's
pow function became 10x slower in release 3.0 and caused some of my
codes which used it heavily to become much slower, I started investigating
ways to implement some faster math functions. I first patched the 3.0
pow() function to go back to how it was in 2.0, but then I decided to
be more aggressive.
The floating point unit in most modern Intel and AMD CPUs (e.g. Pentiums
and Athlons) has many built-in transcendental functions such as sine,
cosine, arc-tangent, etc. These built-ins are automatically used by
the Microsoft C run-time library DLL which MinGW links to by default,
but making calls to the DLL typically incurs significant overhead.
You can use the header file here to in-line some of these functions
for faster performance on Pentiums and Athlons. It requires use of
the -ffast-math compile flag. I took
some of the code from Chapter 14 (pp. 807-808) of the Art of Assembly
Language link below. Note that the exp() and atan2() in-line versions
are actually slower on a 64-bit Opteron compile (SuSE Linux 8.0).
Also note that these in-line functions do not do any error
checking or trapping of any kind.
NOTE! My in-line pow() function now returns correct
results if the first argument is zero (Rev 1.01).
NOTE 2! GCC
v4.0 will include a more complete set of fast math intrinsics for
x87-compatible processors, including fsincos.
NOTE 3! (4-11-2010) I've noticed lately that the difference
between my in-lines and the gcc 3.x/4.x defaults depends significantly on
what arguments are sent to the functions. Sometimes mine are faster; sometimes
the gcc defaults are faster. In general, with gcc 4.x, I've found that only
my sincos in-line gives me any benefit over the gcc default on Core 2 processors,
and it's not by much.
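To illustrate the in-lining idea described above, here is a minimal sketch of a sincos wrapper around the x87 fsincos instruction using GCC extended inline assembly. It is not the contents of x87inline.h; the name sincos_x87 is hypothetical, and the non-x86 fallback to the C library is an addition for portability. Like the in-lines discussed above, it does no error checking.

```c
#include <math.h>

/* Compute sin(x) and cos(x) in one call.
 * On x86 with a GCC-compatible compiler this issues the x87 fsincos
 * instruction directly; elsewhere it falls back to the C library.
 * No range or error checking: fsincos only handles |x| < 2^63. */
static inline void sincos_x87(double x, double *s, double *c)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    long double arg = x;
    long double sin_r, cos_r;
    /* fsincos replaces st(0) with sin(x) and then pushes cos(x), so
     * afterwards st(0) holds the cosine ("t" constraint) and st(1)
     * holds the sine ("u" constraint). */
    __asm__ ("fsincos" : "=t" (cos_r), "=u" (sin_r) : "0" (arg));
    *s = (double)sin_r;
    *c = (double)cos_r;
#else
    *s = sin(x);
    *c = cos(x);
#endif
}
```

Whether this beats the compiler's own code depends on the gcc version, the CPU, and the arguments, as Note 3 says, so it is worth timing both before adopting it.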
x87inline.h | x87test.c | Art of Assembly
In-line Assy How-To | In-line Assy Linux Docs | Gnu C In-line Assy docs
Results: PIII | P4 Xeon | Opteron (32-bit) | Opteron (64-bit)
MinGW and Win32 DLLs
A very cool feature in MinGW is that you can put
DLL files directly in the compile/link command
(just like .o files) to be linked into your program without the
need to create library stub files. For more about how to do that,
try
Colin
Peters' DLL Page (local mirror here)
(from Colin Peters' Win32 Programming Page--local mirror here).
Colin Peters started MinGW.
Also, regarding function name mangling (decorating) in MinGW, try
Wu YongWei's Page (local archived copy).
No Globbing
By default, if you run a MinGW-compiled command-line utility and
pass it a wildcard argument such as *.c, it acts exactly like a Unix
utility: the program's startup code looks for every file ending in .c in your
current directory and replaces the *.c argument with the names of
those files, so your program never actually sees the *.c. (In Unix/Linux,
it's actually the shell that does this work.) To prevent MinGW C programs
from mimicking this "globbing" (e.g. to actually see "*.c" as the command
argument), put CRT_noglob.o
(in the MinGW library directory) at the beginning of your link list when linking.
Or put this at file scope in your C code to turn globbing on or off from inside your program:
int _CRT_glob = 0; /* 0 turns off globbing; 1 turns it on */
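A small way to see the effect is to print the arguments the program actually receives. The sketch below is hypothetical (print_args is an illustrative name, not part of MinGW); the _CRT_glob definition is guarded so the file still compiles on other toolchains. MinGW-w64 runtimes may use a different variable name, so check your toolchain.

```c
#include <stdio.h>

#ifdef __MINGW32__
/* File-scope definition read by the MinGW startup code:
 * 0 disables wildcard expansion of command-line arguments. */
int _CRT_glob = 0;
#endif

/* Print each command-line argument on its own line, skipping argv[0]
 * (the program name). Returns the number of arguments printed. */
int print_args(int argc, char **argv, FILE *out)
{
    int i;
    for (i = 1; i < argc; i++)
        fprintf(out, "arg %d: %s\n", i, argv[i]);
    return argc > 1 ? argc - 1 : 0;
}
```

Calling print_args(argc, argv, stdout) from main and running the program with *.c from cmd.exe should then list the literal *.c as a single argument when globbing is off, and one line per matching file when it is on.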
No Console Window
To make sure your application doesn't open a console window, use
the -mwindows flag when linking.
Binary stdout
If you want the output from the stdout FILE stream to be
binary (no translation of CR/LF chars):
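#ifdef __MINGW32__
/* Required header files */
#include <stdio.h>  /* stdout, _fileno */
#include <fcntl.h>  /* _O_BINARY */
#include <io.h>     /* _setmode */
#endif
...
#ifdef __MINGW32__
/* Switch stdout to binary mode (do this before writing any output) */
_setmode(_fileno(stdout), _O_BINARY);
#endif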
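If the same source also has to build on non-Windows systems, the two #ifdef blocks can be folded into one helper. This is just a sketch under a couple of assumptions: the name set_binary_mode is hypothetical, and it tests _WIN32 rather than __MINGW32__ so that other Windows compilers providing _setmode are covered as well.

```c
#include <stdio.h>
#if defined(_WIN32)
#include <fcntl.h>  /* _O_BINARY */
#include <io.h>     /* _setmode */
#endif

/* Put a stdio stream into binary mode (no CR/LF translation).
 * Call it before doing any I/O on the stream.
 * Returns 0 on success, -1 on failure. On systems without a
 * text/binary distinction this is a no-op that returns 0. */
int set_binary_mode(FILE *fp)
{
#if defined(_WIN32)
    return _setmode(_fileno(fp), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)fp;  /* nothing to do */
    return 0;
#endif
}
```

A program would typically call set_binary_mode(stdout) (and set_binary_mode(stdin) if it reads binary input) as the first thing in main.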
Seeing all predefined macros
Here's how to see all predefined macros from gcc. Create a file test.c that has one line in it:
int main(void) {}
Then compile with this command:
gcc -dM -E test.c
Even easier (I got an e-mail tip on this):
echo . | gcc -dM -E -
(These work with any gcc port.)
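Once you know which macros your compiler predefines, you can use them to record which compiler and target produced an executable, which is handy when comparing builds like the ones benchmarked above. The sketch below is one hypothetical way to do it (build_info is an illustrative name); it relies on __TINYC__, __clang__, __GNUC__, __MINGW32__, __MINGW64__ and _WIN32, so confirm with the commands above that your toolchain defines the ones you care about.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Write a short description of the compiler and target into buf,
 * based only on predefined macros. Returns buf. */
const char *build_info(char *buf, size_t len)
{
    /* __TINYC__ and __clang__ are tested first because those
     * compilers may also define __GNUC__ for compatibility. */
#if defined(__TINYC__)
    const char *compiler = "tcc";
#elif defined(__clang__)
    const char *compiler = "clang";
#elif defined(__GNUC__)
    const char *compiler = "gcc";
#else
    const char *compiler = "unknown compiler";
#endif

    /* __MINGW64__ is tested before __MINGW32__ because 64-bit MinGW
     * targets define both. */
#if defined(__MINGW64__)
    const char *target = "MinGW 64-bit";
#elif defined(__MINGW32__)
    const char *target = "MinGW 32-bit";
#elif defined(_WIN32)
    const char *target = "Windows (non-MinGW)";
#else
    const char *target = "non-Windows";
#endif

    snprintf(buf, len, "%s, %s, %d-bit pointers",
             compiler, target, (int)(sizeof(void *) * CHAR_BIT));
    return buf;
}
```

Printing the result of build_info at program start (or in a --version option) makes it easy to tell a -m32 build from a -m64 build after the fact.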
Seeing all linker references (link map)
Here's how to see all function references and which library and object module they came from:
gcc -Wl,--print-map ...