Compilation
Language standard
Most programming languages are defined through norms or standards. For example, the C language has several standards:
- K&R C
- ANSI C (C89) – ISO C (C90)
- C99
- C11
- C embedded
These standards define the syntax of the language and the meaning of its syntactic constructs, which determines the behavior of the resulting machine code. Most of them are revisions of the previous standard, incorporating changes and new features.
The Intel Suite compilers, as well as the GNU Compiler Collection, support most releases of the C, C++ and Fortran languages. However, some compilation options and optimizations, as well as support for specific features of the most recent iterations of these standards, are compiler-specific. Please refer to their respective documentation for more information.
Available compilers
Several well-known compilers are available for the C, C++ and Fortran languages. The most common are:
- Intel Compiler suite (icc, icpc, ifort)
- GNU compiler suite (gcc, g++, gfortran)
To get a complete list of available compilers and versions, use the search parameter of module:
$ module search compiler
We recommend using the Intel Compiler Suite for better performance.
Here is the basic way to compile a serial code:
$ icc [options] -o serial_prog.exe serial_prog.c
$ icpc [options] -o serial_prog.exe serial_prog.cpp
$ ifort [options] -o serial_prog.exe serial_prog.f90
Intel
Compiler flags
The following sections are an overview of the most frequent options for each compiler.
C/C++
The Intel compilers icc and icpc use mostly the same options. Their behaviors differ slightly: icpc assumes that all source files are C++, whereas icc distinguishes between .c and .cpp filenames.
Basic flags:
- -o exe_file: names the executable exe_file.
- -c: generates the corresponding object file, without creating an executable.
- -g: compiles with the debug symbols.
- -I dir_name: specifies the path of the include files.
- -L dir_name: specifies the path of the libraries.
- -l bib: asks to link the libbib.a library.
Preprocessor:
- -E: preprocesses the files and sends the result to the standard output.
- -P: preprocesses the files and sends the result to file.i.
- -Dname=value: defines the name variable and assigns it the value value.
- -M: creates a list of dependencies.
Practical:
- -p: profiling with gprof (needed at compilation time).
- -mp, -mp1: IEEE arithmetic; -mp1 is a compromise between time and accuracy.
To tell the compiler to conform to a specific language standard:
- -std=val: where val can take the following values (cf. man icc or man icpc):
  - c++14: enables support for the 2014 ISO C++ standard features.
  - c++11: enables support for many C++11 (formerly known as C++0x) features.
  - c99: conforms to the ISO/IEC 9899:1999 International Standard for the C language.
  - …
Note
If the desired standard is not fully supported by the current version of the Intel compiler (some features of the standard are not yet implemented), it might be supported by the latest version of the GNU compilers. Since the Intel compilers rely on the system gcc (the one provided by the OS), loading a more recent version of the GNU compilers might solve this issue.
Fortran
The Intel compiler for Fortran is ifort.
Basic flags:
- -o exe_file: names the executable exe_file.
- -c: generates the object file without creating an executable.
- -g: compiles with the debug symbols.
- -I dir_name: adds dir_name to the list of directories where include files are looked for.
- -L dir_name: adds dir_name to the list of directories where libraries are looked for.
- -l bib: links the libbib.a library.
Run-time check:
- -C or -check: generates code with run-time checks, which stops with a ‘run time error’ when a problem is detected (e.g. an access that would otherwise cause a segmentation fault).
Preprocessor:
- -E: pre-processes the files and sends the result to the standard output.
- -P: pre-processes the files and sends the result to file.i.
- -Dname=value: assigns the value value to the variable name.
- -M: creates a list of dependencies.
- -fpp: pre-processes the files and compiles them.
Practical:
- -p: compiles for profiling with gprof. You will not be able to use gprof otherwise.
- -mp, -mp1: IEEE arithmetic (-mp1 is a compromise between time and accuracy).
- -i8: promotes integers to 64 bits by default.
- -r8: promotes reals to 64 bits by default.
- -module dir: writes and reads the *.mod files in the dir directory.
- -fp-model strict: strictly adheres to value-safe optimizations when implementing floating-point calculations, and enables floating-point exception semantics. This may slow down your program.
To tell the compiler to conform to a specific language standard:
- -stand=val: where val can take the following values (cf. man ifort):
  - f15: issues messages for language elements that are not standard in draft Fortran 2015.
  - f08: issues messages for language elements that are not standard in Fortran 2008.
  - f03: issues messages for language elements that are not standard in Fortran 2003.
  - …
Note
Please refer to the man pages for more information about the compilers.
Optimization flags
Compilers provide many optimization options: this section describes them.
Basic optimization options:
- -O0, -O1, -O2, -O3: optimization levels (default: -O2).
- -opt_report: writes an optimization report to stderr (-O3 required).
- -ip, -ipo: inter-procedural optimizations (mono and multi files). The command xiar must be used instead of ar to generate a static library file with objects compiled with the -ipo option.
- -fast: default high optimization level (-O3 -ipo -static).
- -ftz: flushes denormalized numbers (values too close to zero to be represented with full precision) to zero at runtime.
- -fp-relaxed: optimizes mathematical functions, at the cost of a small loss of accuracy.
- -pad: enables changes to the memory layout of variables and arrays (ifort only).
Warning
The -fast option is not allowed with MPI because the MPI context needs some libraries which only exist in dynamic mode. This is incompatible with the -static option. You need to replace -fast by -O3 -ipo.
Vectorization flags
Some options enable the use of specific vector instructions of Intel processors to optimize the code. They are compatible with most Intel processors. The compiler will try to generate these instructions if the processor allows them.
- -xcode: tells the compiler which processor features it may target, including which instruction sets and optimizations it may generate. “code” is one of the following:
  - CORE-AVX2
  - AVX
  - SSE4.2
  - SSE2
- -xHost: applies the highest level of vectorization supported by the processor on which the compilation is performed. The login nodes may not have the same level of support as the compute nodes, so this option is to be used only if the compilation is done on the targeted compute nodes.
- -axcode: tells the compiler to generate a single executable with multiple levels of vectorization. “code” is a comma-separated list of instruction sets.
The default level of vectorization is sse2. However, it is only activated at optimization level -O2 and above.
- -vec-report[=n]: depending on the value of n, the -vec-report option enables information reports by the vectorizer.
Warning
A code compiled for a given instruction set will not run on a processor that only supports a lower instruction set.
Default compilation flags
By default, each of the Intel compilers uses the -sox option, which saves all the options provided at compilation time in the comment section of the ELF binary file. To display the comment section:
$ icc -g -O3 hello.c -o helloworld
$ readelf -p .comment ./helloworld
String dump of section '.comment':
[ 0] GCC: (GNU) <x.y.z> (Red Hat <x.y.z>)
[ 2c] -?comment:Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version <x.y.z> Build <XXXXXX> : hello.c : -sox -g -O3 -o helloworld
GNU
Compiler flags
Basic flags:
- -o exe_file: names the executable exe_file.
- -c: generates the corresponding object file, without creating an executable.
- -g: compiles with the debug symbols.
- -I dir_name: specifies the path where the include files are located.
- -L dir_name: specifies the path where the libraries are located.
- -l bib: asks to link the libbib.a library.
To tell the compiler to conform to a specific language standard (g++/gcc/gfortran):
- -std=val: where val can take the following values (cf. man gcc/g++/gfortran):
  - c++14: enables support for the 2014 ISO C++ standard features.
  - c99: conforms to the ISO/IEC 9899:1999 International Standard.
  - f2003: tells the compiler to issue messages for language elements that are not standard in Fortran 2003.
  - f2008: tells the compiler to issue messages for language elements that are not standard in Fortran 2008.
  - …
Below are some specific flags for the gfortran commands.
Debugging:
- -Wall: short for “warn about all”, warns about usual causes of bugs, such as having a subroutine or function named like a built-in one, or passing the same variable as an intent(in) and an intent(out) argument of the same subroutine.
- -Wextra: used with -Wall, warns about even more potential problems, like unused subroutine arguments.
- -w: inhibits all warning messages (not recommended).
- -Werror: considers any warning as an error.
Optimization flags
Compilers provide many optimization options: this section describes them.
Basic optimization options:
- -O0, -O1, -O2, -O3: optimization levels (default: -O0).
Some options enable the use of specific instruction sets of Intel processors to optimize the code. They are compatible with most Intel processors. The compiler will try to use them if the processor allows them.
- -mavx2 / -mno-avx2: switches the use of the AVX2 instruction set on or off.
- -mavx / -mno-avx: same for the AVX instruction set.
- -msse4.2 / -mno-sse4.2: same for the SSE4.2 instruction set.
Available numerical libraries
MKL library
The Intel MKL library is integrated in the Intel package and contains:
- BLAS, SparseBLAS;
- LAPACK, ScaLAPACK;
- Sparse Solver, CBLAS ;
- Discrete Fourier and Fast Fourier transform
If you don’t need ScaLAPACK:
$ module load mkl
$ ifort -o myexe myobject.o ${MKL_LDFLAGS}
If you need ScaLAPACK:
$ module load scalapack
$ mpif90 -o myexe myobject.o ${SCALAPACK_LDFLAGS}
We provide multi-threaded versions for compiling with MKL:
$ module load feature/mkl/multi-threaded
$ module load mkl
$ ifort -o myexe myobject.o ${MKL_LDFLAGS}
$ module load feature/mkl/multi-threaded
$ module load scalapack
$ mpif90 -o myexe myobject.o ${SCALAPACK_LDFLAGS}
To use multi-threaded MKL, you have to set the OpenMP environment variable OMP_NUM_THREADS.
We strongly recommend using the MKL_XXX and SCALAPACK_XXX environment variables made available by the mkl and scalapack modules.
FFTW
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, with an arbitrary input size, and with both real and complex data. It is provided by the fftw3/gnu module. The variables FFTW3_CFLAGS (FFTW3_FFLAGS for Fortran) and FFTW3_LDFLAGS should be used to compile a code using FFTW routines.
$ module load fftw3/gnu
$ icc ${FFTW3_CFLAGS} -o test example_fftw3.c ${FFTW3_LDFLAGS}
$ ifort ${FFTW3_FFLAGS} -o test example_fftw3.f90 ${FFTW3_LDFLAGS}
Intel MKL also provides Fourier transform functions. FFTW3 wrappers make it possible to link programs so that they use the Intel MKL Fourier transforms instead of the FFTW3 library, without changing the source code. The correct compiling options are provided by the fftw3/mkl module.
$ module load fftw3/mkl
$ icc ${FFTW3_CFLAGS} -o test example_fftw3.c ${FFTW3_LDFLAGS}
$ ifort ${FFTW3_FFLAGS} -o test example_fftw3.f90 ${FFTW3_LDFLAGS}
Compiling for Skylake
With the -ax option, icc and ifort can generate code for several architectures.
For example, from a Broadwell login node, you can generate an executable with both the AVX2 (Broadwell) and AVX512 (Skylake) instruction sets. To do so, add the -axCORE-AVX2,CORE-AVX512 option to icc or ifort.
An executable compiled with -axCORE-AVX2,CORE-AVX512 can be run on both Broadwell and Skylake, as the best instruction set available on the architecture is chosen at run time.
Compiling for Rome/Milan
With the -m option, icc and ifort can generate specific instruction sets for Intel and non-Intel processors.
AMD Rome and Milan processors are able to run AVX2 instructions.
To generate AVX2 instructions for AMD processors, add the -mavx2 option to icc or ifort.
An executable compiled with -mavx2 can run on both AMD and Intel processors.
Note
- The -mavx2 option is compatible with gcc.
- Both the -mavx2 and -axCORE-AVX2,CORE-AVX512 options can be used simultaneously with icc and ifort to generate both specific instructions for Intel processors and more generic instructions for AMD processors.