
Quickstart: the gpu-3 partition

What is MIG?

MIG (Multi-Instance GPU) is an NVIDIA technology, available on data-center GPUs such as the A100, H100 and H200, that divides a physical GPU into several isolated instances (partitions). Each instance has its own dedicated compute units, HBM memory, caches and copy engines, so a "noisy" job running in one instance does not affect the others.

Tip

MIG keeps a single job from occupying an entire GPU, so several jobs can share one card and overall GPU utilization improves.

The gpu-3 partition

In the gpu-3 partition, each A100 is divided into three MIG instances of ~10 GB each (profile 2g.10gb). It is a good fit when your job needs a mid-sized GPU: more memory than a 5 GB instance offers, but less than a 20 GB one.
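
The profile name 2g.10gb follows NVIDIA's MIG naming scheme: the number before "g" is the count of compute slices and the number before "gb" is the instance memory in GB. As a minimal sketch, the name can be split with plain shell parameter expansion. The function describe_mig_profile is a hypothetical helper written for this illustration; it is not part of Slurm or the NVIDIA tools.

```shell
#!/bin/bash
# Split a MIG profile name such as "2g.10gb" into its two parts.
# describe_mig_profile is a hypothetical helper, for illustration only.
describe_mig_profile() {
    local profile="$1"
    local compute="${profile%%g.*}"   # text before "g." -> number of compute slices
    local memory="${profile#*g.}"     # text after "g."  -> e.g. "10gb"
    memory="${memory%gb}"             # drop the "gb" suffix
    echo "${compute} compute slices, ${memory} GB memory"
}

describe_mig_profile 2g.10gb
```

Read this way, three 2g.10gb instances use six compute slices and about 30 GB in total on one A100.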

When should you choose the gpu-3 partition?

  • If you are starting to use GPUs.
  • If your model needs ~10 GB of VRAM.
  • If you are running tests or proofs of concept.
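
To judge whether a model is in the ~10 GB range, a rough lower bound is the size of its weights: number of parameters times bytes per parameter. The sketch below is only that lower bound; training usually needs considerably more memory for gradients, optimizer state and activations. The function name estimate_weights_mib and the example model size are assumptions made for illustration.

```shell
#!/bin/bash
# Rough size of the model weights alone, in MiB (integer arithmetic).
# estimate_weights_mib is a hypothetical helper, not a cluster tool.
# Usage: estimate_weights_mib <num_parameters> <bytes_per_parameter>
estimate_weights_mib() {
    local params="$1"
    local bytes_per_param="$2"
    echo $(( params * bytes_per_param / 1024 / 1024 ))
}

# Example: 1.3 billion parameters stored in FP16 (2 bytes each)
estimate_weights_mib 1300000000 2
```

If the result is already close to 10 GB, the job probably needs a larger instance than gpu-3 provides.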

Examples

Quick interactive example:

salloc --partition=gpu-3 -n 1 --cpus-per-task=5 --mem=10G --gres=gpu:1
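
Once the allocation is granted, it can be worth confirming what the session actually received. The sketch below prints the job id and the GPU devices visible to the session, then lists devices with nvidia-smi -L if the tool is available. The function show_allocation is a hypothetical helper. Depending on how the cluster is configured, the shell opened by salloc may still run on the login node; in that case run the check through srun inside the allocation.

```shell
#!/bin/bash
# Print what Slurm allocated to the current session.
# show_allocation is a hypothetical helper, for illustration only.
show_allocation() {
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "not inside a Slurm allocation"
        return 1
    fi
    echo "job id: ${SLURM_JOB_ID}"
    echo "visible GPU devices: ${CUDA_VISIBLE_DEVICES:-<unset>}"
    # nvidia-smi -L lists the GPUs and, when MIG is enabled, their MIG devices
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L
    fi
}

show_allocation || true
```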

Batch template (copy, paste and adjust parameters):

#!/bin/bash
#SBATCH -p gpu-3
#SBATCH -n 1
#SBATCH --cpus-per-task=5
#SBATCH --mem=10G
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00

module purge
module load CUDA/12.4.0

# Show the GPU (MIG instance) assigned to this job
nvidia-smi
# Run your GPU code here, e.g. python train.py
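
A minimal sketch of submitting the template and checking on it, assuming it was saved as gpu3_job.sh (a hypothetical file name). It relies on sbatch --parsable, which prints the job id, optionally followed by a semicolon and the cluster name, and on Slurm's default output file name slurm-<jobid>.out. The helper parse_job_id is introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical file name for the batch template above.
JOB_SCRIPT="gpu3_job.sh"

# sbatch --parsable prints "jobid" or "jobid;cluster"; keep only the job id.
parse_job_id() {
    echo "${1%%;*}"
}

if command -v sbatch >/dev/null 2>&1 && [ -f "$JOB_SCRIPT" ]; then
    job_id=$(parse_job_id "$(sbatch --parsable "$JOB_SCRIPT")")
    echo "submitted job ${job_id}"
    squeue -j "$job_id"    # shows the job state (PENDING, RUNNING, ...)
    echo "output will be written to slurm-${job_id}.out"
else
    echo "sbatch or ${JOB_SCRIPT} not found; run this on a login node next to the job script"
fi
```

The output of nvidia-smi from the template ends up in that slurm-<jobid>.out file, which is a quick way to confirm the job saw a single ~10 GB instance.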