From: eLinux.org
Compiler Optimization
Here’s a good overview on compiler optimizations:
http://en.wikipedia.org/wiki/Compiler_optimization
Here’s some info about GCC optimization techniques:
http://www.redhat.com/software/gnupro/technical/gnupro_gcc.html
Effects of optimization options are explained in this LJ
article.
A note of warning from Gentoo
wiki on optimization
flags:
-O3: This is the highest level of optimization possible, and also the
riskiest. It will take a longer time to compile your code with this
option, and in fact it should not be used system-wide with gcc 4.x. The
behavior of gcc has changed significantly since version 3.x. In 3.x, -O3
has been shown to lead to marginally faster execution times over -O2,
but this is no longer the case with gcc 4.x. Compiling all your packages
with -O3 will result in larger binaries that require more memory, and
will significantly increase the odds of compilation failure or
unexpected program behavior (including errors). The downsides outweigh
the benefits; remember the principle of diminishing returns. Using -O3
is not recommended for gcc 4.x.
In the following
e-mail, Jim Wilson,
who apparently supports gcc, writes:
From: Jim Wilson <wilson at specifixinc dot com>
Date: Thu, 29 Apr 2004 15:58:28 -0700
Subject: Re: optimization issue about -O2 and -Os
------------------------------------------------------------
...
The -Os option is buggy. You might want to report a bug into our bugzilla
bug database. See http://gcc.gnu.org/bugs.html for more info on reporting bugs.
Though the -Os option is based on the -O2 option, it is a different option, that
generates different code, and has different bugs.
Tim Riker: this is a bit overly
dramatic. -Os is widely used and widely supported. The link is to a
thread about general information and does not refer to any specific bug
from what I can see. Try -Os out. If you have issues, try -O2 instead.
In general -Os will work. Be very careful in tweaking kernel
optimizations. There is kernel code that only works with the existing
optimizations.
Gentoo also has a very good overview of Safe
Cflags for different
architectures and CPUs.
Link-time optimization (LTO)
- gcc front-ends (parsers) produce GIMPLE, which the optimizers process in “static single assignment” (SSA) form
- Then, gcc optimizes the code, and converts it to RTL (Register Transfer Language)
- RTL is converted to assembler by an architecture-specific back-end. Then the assembler is called to convert it to machine code
- Finally, the linker is called to combine object files
gcc LTO support
- if -flto is used, then LTO information (GIMPLE) is stored in a special ELF section of a .o file, and used at link time to perform more optimization
- You may need to use -fwhole-program in conjunction with -flto at link time in order to get the full set of optimizations
- Using this option requires a lot of memory and takes more time to build the kernel
- Some resources:
Linux kernel LTO support
Andi Kleen produced a set of patches to support LTO in the Linux kernel
(originally for version 3.6 of the kernel and gcc 4.7)
- Link-time optimization for the kernel (LWN.net)
- Code is available at: https://github.com/andikleen/linux-misc
  - see the lto-3.x branches
  - note that the code requires the const-sections patches, gcc 4.7 and a special binutils as well, in order to work
- as of August 2012, this code was considered highly experimental