Getting Started
Welcome to your journey into CEA’s high-performance computing! This guide will help you learn the basics of working with CEA’s supercomputers. It covers job execution, module management, and the effective use of storage systems and services.
Accessing the HPC System
To start utilizing our supercomputer, first establish an SSH connection as follows:
$ ssh <login>@|fdqn|
For more detailed instructions on accessing the system, please see the Interactive access section.
Setting up Your User Environment
Upon connecting to the supercomputer, you can set up your user environment with the Environment-Modules tool.
Modules is a tool that lets you control the software packages available in your environment. It allows you to dynamically adjust your environment to your needs at any given time, without conflicting with other users.
To see a list of all available software modules, you can use the module avail command:
$ module avail
To check which modules are currently loaded in your environment, use the module list command:
$ module list
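If the list of available modules is long, you can usually pass a name to module avail to narrow it down (the exact filtering behavior may vary between Environment-Modules versions):
$ module avail python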
The most common task is loading a module, which sets up your environment to use a specific software package. To load a module, use the module load command followed by the name of the software package:
$ module load python3
$ module load mpi
$ module load cuda
...
Once you have set up a suite of modules that fits your workflow, we recommend saving this setup using the module save command:
$ module save
With this, your module configuration will persist across different SSH connections, providing a consistent working environment.
You can unload a module with the module unload <module_name> command, or remove all loaded modules with module purge.
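As an illustrative sketch, a typical session combining these commands might look like the following (assuming the gcc, mpi and python3 modules exist on the machine; module restore is available in recent Environment-Modules versions):
$ module load gcc mpi python3
$ module list      # check what is loaded
$ module save      # save this set as your default collection
$ module purge     # later: start again from a clean environment
$ module restore   # reload the saved collection if it is not restored automatically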
For more detailed information on using software modules and setting up your environment, please refer to the Environment management section of this guide.
Compiling Code on the Cluster
Our cluster provides a variety of compilers through the module system, such as GCC, Intel, and NVHPC.
For instance, to compile a Fortran program with MPI using GCC (GNU Compiler Collection), you can use the following commands:
$ module load gcc
$ module load mpi
$ mpifort -o program_name program.f90
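Similarly, a C program using OpenMP could be compiled with GCC like this (a minimal sketch; the program name is a placeholder):
$ module load gcc
$ gcc -fopenmp -O2 -o omp_program omp_program.c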
For other programming languages and compilers, or for more details about developing and compiling code on the cluster, please see the Parallel programming section.
Using Python on the Cluster
Our cluster supports several Python distributions for different uses, as described in the Python section.
For instance, for Python programming with parallel processing using MPI (includes mpi4py):
$ module load mpi python3
Once the appropriate modules are loaded, you can start a Python interpreter by simply typing python in your terminal, or run a Python script:
$ python script.py
To list all available Python packages in your current environment, run pip3 list.
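For example, to verify that mpi4py is available in the currently loaded environment, you can try importing it or filter the package list (a quick sanity check; the reported version will differ):
$ python3 -c "import mpi4py; print(mpi4py.__version__)"
$ pip3 list | grep -i mpi4py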
For a more detailed guide on using Python on our cluster, including Machine learning and AI, refer to the Python section.
Submitting Your First Job
Working with our supercomputer involves submitting jobs. Login nodes are meant for preparatory tasks, not for calculations: computationally intensive tasks, even Python scripts, should run on the compute nodes.
Job submission involves creating a script and then submitting this to the job scheduler. This approach ensures efficient usage and fair access to computational resources.
Creating a Job Script
A job script is a simple text file containing directives for the job scheduler and the commands you want to run.
Here is a basic example using MPI:
$ nano job_mpi.sh
Inside the editor:
#!/bin/bash
#MSUB -r my_job_mpi # Job name
#MSUB -n 32 # Number of tasks to use
#MSUB -c 1 # Number of cores (or threads) per task to use
#MSUB -T 1800 # Elapsed time limit in seconds of the job (default: 7200)
#MSUB -o my_job_mpi_%I.o # Standard output. %I is the job id
#MSUB -e my_job_mpi_%I.e # Error output. %I is the job id
#MSUB -A <project> # Project ID
#MSUB -q |default_CPU_partition| # Partition name (see ccc_mpinfo)
set -x
cd ${BRIDGE_MSUB_PWD}
ccc_mprun ./a.out
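By comparison, a pure OpenMP job could use a script like the following sketch (same directives as above; whether OMP_NUM_THREADS is set automatically depends on the machine configuration, so it is set explicitly here to match the -c value):
#!/bin/bash
#MSUB -r my_job_omp # Job name
#MSUB -n 1 # Number of tasks to use
#MSUB -c 16 # Number of cores (or threads) per task to use
#MSUB -T 1800 # Elapsed time limit in seconds of the job
#MSUB -o my_job_omp_%I.o # Standard output. %I is the job id
#MSUB -e my_job_omp_%I.e # Error output. %I is the job id
#MSUB -A <project> # Project ID
#MSUB -q |default_CPU_partition| # Partition name (see ccc_mpinfo)
set -x
cd ${BRIDGE_MSUB_PWD}
export OMP_NUM_THREADS=16 # match the value given to #MSUB -c
ccc_mprun ./omp_program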
For more job script examples (OpenMP, CUDA…), please refer to the job submission scripts examples section.
Submitting the Job
Once your job script is ready, you can submit it to the scheduler:
$ ccc_msub job_mpi.sh
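Submission prints a job identifier. Assuming the usual ccc_* tools are installed alongside ccc_msub (as on TGCC machines), you can then follow your job and cancel it if necessary; the exact options are described in the corresponding man pages:
$ ccc_mpp # overview of running and pending jobs
$ ccc_mdel <job_id> # cancel a job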
For small tests you may use an interactive session as described in the interactive submission section.
For a more detailed guide on creating job scripts and submitting jobs, refer to the Job submission section.
Understanding Data Spaces
Our supercomputer offers multiple file systems, each optimized for different use cases.
Remember, choosing the right file system for your needs can significantly impact the performance and efficiency of your jobs. For more details, look at the Data spaces section.
HOME
The HOME directory is best used for storing small, essential files. Its space and performance are limited, and it is subject to snapshots for backup. Avoid storing large datasets here.
It is accessible through the $CCCHOME variable.
This space is always available.
SCRATCH
The SCRATCH space offers high performance and is designed for temporary storage during your jobs. Files in SCRATCH are purged every 60 days, so remember to move important data to a more permanent location.
It is accessible through the $CCCSCRATCHDIR variable.
By default this space is unavailable for your jobs. You need to add the #MSUB -m scratch directive at the beginning of your submission script to use it.
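For example, before the purge delay expires you might copy important results to a more permanent space (paths are placeholders):
$ cp -r ${CCCSCRATCHDIR}/my_results ${CCCWORKDIR}/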
WORK
The WORK space is similar to SCRATCH but offers lower performance. It is not subject to purging, making it suitable for installing products, storing documents, and holding work-in-progress data.
It is accessible through the $CCCWORKDIR variable.
By default this space is unavailable for your jobs. You need to add the #MSUB -m work directive at the beginning of your submission script to use it.
STORE
The STORE space provides extensive storage for archives, old results, and other large data that you do not need to access frequently.
Be aware that data in STORE may be migrated to magnetic tapes over time. Retrieving such data can take 20 to 40 minutes. For managing these situations, please refer to the Data Management on STORE section.
It is accessible through the $CCCSTOREDIR variable.
By default this file system is unavailable for your jobs. You need to add the #MSUB -m store directive at the beginning of your submission script to use it.
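Tape-backed storage handles a few large files much better than many small ones, so a common practice is to archive results before moving them to STORE (file and directory names are placeholders):
$ tar czf ${CCCSTOREDIR}/my_results.tar.gz my_results/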
TMP
TMP is a local file system that is not shared between nodes. It is ideal for transient files used by small, node-local operations.
It is accessible through the $TMPDIR variable.
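To make the SCRATCH, WORK, or STORE spaces visible to a single job, the corresponding #MSUB -m directives can be combined, for instance (a sketch; check the ccc_msub documentation for the exact syntax accepted on your machine):
#MSUB -m scratch,work,store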
Further Topics
Beyond the basics, our supercomputer supports advanced features that can enhance your computational work:
Debugging & Profiling
Optimize your code with debugging and profiling. See the Debugging section and Profiling section for details.
GPU Programming
Utilize our GPU accelerators to boost your work. Refer to the GPU-accelerated computing section for more.
Containers & Virtualization
Achieve portable and reproducible environments with containers and virtualization. Details in the Virtualization and containers section.
Quantum Computing
Explore new frontiers with quantum computing. Visit the Quantum section for more information.
Visualization Services
Our remote desktop service allows for fast, high-quality visualization of your results. Check out the Interactive access section for more.